1
0
mirror of https://github.com/gsi-upm/senpy synced 2025-10-19 01:38:28 +00:00

Compare commits

..

109 Commits

Author SHA1 Message Date
militarpancho
83e2d415a1 Change name to split, according to issue #37 2017-06-13 19:44:40 +02:00
militarpancho
f8ca595bc9 Added chunker plugin to tokenize texts 2017-06-13 14:00:40 +02:00
J. Fernando Sánchez
312e7f7f12 Avoid python temporary files in pip tests 2017-06-12 21:50:51 +02:00
J. Fernando Sánchez
c555b9547e Non-interactive pip test 2017-06-12 21:27:02 +02:00
J. Fernando Sánchez
991ade8f4d Make sdist non-interactive non-tty 2017-06-12 21:20:07 +02:00
J. Fernando Sánchez
1104e816cb Push pip for tags without a preceding v 2017-06-12 21:06:34 +02:00
J. Fernando Sánchez
c19d03b41d Added SSH access to github fetch 2017-06-12 20:47:46 +02:00
J. Fernando Sánchez
42c9068991 Add pull policy to k8s deployment
* Add git fetch to (try to) fix github push from gitlab
2017-06-12 20:43:39 +02:00
J. Fernando Sánchez
96843827bd Removed __main__ from test coverage reports 2017-06-12 20:29:29 +02:00
J. Fernando Sánchez
d76e4618fe Removed python 3.4 from travis versions 2017-06-12 20:18:56 +02:00
J. Fernando Sánchez
c9bc485535 Merge branch '36-estimate-vad' 2017-06-12 20:10:21 +02:00
J. Fernando Sánchez
6d7575bbcd Merge branch '35-timeout-and-blocking-requests' 2017-06-12 19:57:28 +02:00
J. Fernando Sánchez
852bcc72ba Better centroid conversion
Also added **simple** tests for backward and forward conversion.
In future versions we should add thorough tests.

Should close gsi-upm/senpy#31
2017-06-12 19:52:00 +02:00
J. Fernando Sánchez
bf5ed1bd7d Merge remote-tracking branch 'drevicko/patch-6' 2017-06-12 18:14:15 +02:00
J. Fernando Sánchez
00da75153a Change conversion to Euclidean distance
* Added neutral point (if present)

Closes !gsi-upm/senpy#37 (Ian's)
2017-06-12 18:09:58 +02:00
J. Fernando Sánchez
fa082e11e7 Use flask's server by default
Using this server in production is discouraged, but to implement a
proper asynchronous server with tornado/gevent every blocking call would
have to be converted to a non-blocking call.

Failing to do so causes deadlocks like senpy/senpy#35

For now, it is easier to just use the default server.
2017-06-12 17:29:01 +02:00
J. Fernando Sánchez
6331d31b18 Merge branch '34-document-plugin-repo-creation' into 24-improve-docs
Closes #34
Closes #24
2017-06-12 12:53:24 +02:00
J. Fernando Sánchez
8ee324f566 Clearer docs 2017-06-12 09:31:42 +02:00
J. Fernando Sánchez
188c33332a Removed nbsphinx
It requires pandoc, which cannot be installed with pip.

We can either link to the nbfile or convert the file
manually/automatically:

```
nbconvert SenpyClientUse.ipynb --to rst
```
2017-06-12 09:31:42 +02:00
militarpancho
955e17eb2a Added travis, readthedocs and pypi badges 2017-06-12 09:31:42 +02:00
militarpancho
3e0f55dcff Improve docs. (Badges missing) 2017-06-12 09:31:38 +02:00
J. Fernando Sánchez
2ea01aef42 Fixed deployment IMAGENAME 2017-06-02 20:10:06 +02:00
J. Fernando Sánchez
147fd4a333 Fixed IMAGENAME 2017-06-02 20:02:27 +02:00
J. Fernando Sánchez
e31bca7016 Push to dockerhub instead of private registry 2017-06-02 19:42:22 +02:00
J. Fernando Sánchez
7956d54c35 K8s deployment with limits 2017-06-02 19:17:27 +02:00
militarpancho
5bab9a6a02 #34. Fixed some errors from plugins examples 2017-06-02 17:43:18 +02:00
militarpancho
69ac95bb08 Added example plugin in docs. #34 2017-06-02 17:39:27 +02:00
drevicko
6b843a4384 fixes typo in code 2017-05-29 12:15:35 +01:00
drevicko
65d6e47513 Implements Fernando's suggestion in #31
I've added a neutral point definition (in the converters senpy file) as used in pull request #29
2017-05-29 12:13:21 +01:00
drevicko
8d56a0b630 fixes #31
I've used euclidean metric instead of taxicab as I feel it makes more sense (taxicab has bizzare unintuitive effects for points far from the centroids).
2017-05-29 12:06:44 +01:00
drevicko
e7ac6e66b0 update _forward_conversion docstring + minor edits 2017-05-29 11:50:14 +01:00
J. Fernando Sánchez
0f8d1dff69 Fixed image repository 2017-05-19 19:59:45 +02:00
J. Fernando Sánchez
236183593c Hidden variables 2017-05-19 19:36:25 +02:00
J. Fernando Sánchez
7637498517 Added push to github 2017-05-19 19:33:00 +02:00
J. Fernando Sánchez
8c70433312 Added push to github 2017-05-19 18:54:57 +02:00
J. Fernando Sánchez
ce83fb3981 Added k8s deployment 2017-05-19 16:52:10 +02:00
J. Fernando Sánchez
28f29d159a Merge branch 'gh-34-broken-shelf' into 0.8.x 2017-05-17 17:39:14 +02:00
J. Fernando Sánchez
c803f60fd4 Merge branch 'drevicko/provide-analyse-traceback-in-log' into 0.8.x 2017-05-17 17:38:41 +02:00
J. Fernando Sánchez
12eae16e37 Merge branch 'drevicko-patch-5' into 0.8.x 2017-05-17 17:35:50 +02:00
J. Fernando Sánchez
f3372c27b6 Merge branch '32-update-module-dev-environment' into 0.8.x 2017-05-17 17:34:21 +02:00
Ian Wood
b6de72a143 chaged exception logging to 'exception()' in analysie() 2017-05-17 17:10:05 +02:00
J. Fernando Sánchez
0f89b92457 Fixed pickling error in py2.7 2017-05-17 16:51:01 +02:00
J. Fernando Sánchez
ea91e3e4a4 Add an option to force the load of shelf plugins
Closes gsi-upm/senpy#34
2017-05-17 16:30:01 +02:00
Ian Wood
f76b777b9f don't fail if shelf pickle file broken 2017-05-16 15:09:46 +01:00
drevicko
dcc965ea63 removed superfluous 'neutral' centroid
Neutral is included as an 'origin' field. This is partly because emoml has no vocab for "Neutral" in dimensional models.
2017-05-08 14:34:28 +01:00
drevicko
400f647b7b removed unneccessary defaultdict import 2017-05-08 14:32:53 +01:00
Ian Wood
ec1a2ff5f9 added 'origin' to VAD representation, incorporated into weighed sum for Cat->VAD conversion 2017-05-08 14:28:51 +01:00
J. Fernando Sánchez
e112dd55ce Pip install editable
Closes senpy/senpy#32
2017-05-05 17:27:12 +02:00
J. Fernando Sánchez
60ef304108 Analysis set as a python list
Closes senpy/senpy#31
2017-05-05 17:05:17 +02:00
Ian Wood
1a9dd07f7e Merge branch 'master' 0.8.7 into patch-6 2017-05-05 15:02:15 +01:00
Ian Wood
b80b0c7947 used more specific exception specifier (KeyError) 2017-04-11 11:25:50 +01:00
Ian Wood
1ca6ec52fd fixed weighted average, no explicit treatment of 'neutral' 2017-04-11 11:12:02 +01:00
J. Fernando Sánchez
7927cf1587 add option to show senpy version number
Merge branch 'patch-5' of https://github.com/drevicko/senpy into drevicko-patch-5
2017-04-10 21:03:17 +02:00
J. Fernando Sánchez
13cefbedfb Clean dev containers in makefile 2017-04-10 20:38:12 +02:00
J. Fernando Sánchez
4ba9535d56 Merge branches into 0.8.x
'25-validation-errors'
'27-add-method-to-get-list-of-plugins'
'28-fix-multiprocessing-issues'
2017-04-10 20:31:26 +02:00
J. Fernando Sánchez
e582ef07d4 Fix multiprocessing tests in python2.7
Closes #28 for python 2.

Apparently, process pools are not contexts in python 2.7.
On the other hand, in py2 you cannot pickle instance methods, so
you have to implement Pool tasks as independent functions.
2017-04-10 20:17:38 +02:00
J. Fernando Sánchez
ef40bdb545 Replace gevent with tornado
Closes #28

Added:

* Async test (still missing one that includes the IOLoop)
* Async plugin under tests. To manually try async functionalities:
```
senpy -f tests/
```
2017-04-10 18:16:45 +02:00
J. Fernando Sánchez
e0b4c76238 Add plugin method to client
Closes #28
2017-04-10 18:07:34 +02:00
J. Fernando Sánchez
14c86ec38c Set plugin list as a @set and fixed test case
It turns out setting "plugins" as a @list in the context causes the
"plugins" property to expand to its full name.
Removing the type causes a regression of #17, which I initially missed
because the test in #17 was wrong.

Closes #26
2017-04-10 17:24:39 +02:00
J. Fernando Sánchez
d3d05b3218 Fixed expansion of "plugins"
Closes #26

There was no need to add @list, and it was causing JSON-LD to expand the
URI of 'plugins'
2017-04-07 16:24:28 +02:00
J. Fernando Sánchez
eababcadb0 Analysis as strings or objects in results
Closes #25
2017-04-07 16:13:57 +02:00
drevicko
7efece0224 add option to show senpy version number 2017-04-07 10:04:51 +01:00
drevicko
53138e6942 Estimate VAD by weighted average
Does a weighted average of centroids.

If intensity sums to zero for a category, a 'neutral' category is used or 0 if it's not present. I'm not 100% sure this is the best approach, and the name of the "neutral" category perhaps should use some convention?

Note that if there are no categories present, then no VAD (or other dimensional) estimate is returned. It may be better to use the neutral centroid if it's present in this case also.
2017-04-04 15:37:07 +01:00
J. Fernando Sánchez
1302b0b93c Fixed pip tests (added version) 2017-04-04 11:42:18 +02:00
J. Fernando Sánchez
ad1092690b Merge branch '20-improve-docs' into 0.8.x 2017-04-04 11:26:33 +02:00
J. Fernando Sánchez
e35e810ede Rephrase info on demo plugins
Closes #20
2017-04-04 11:26:05 +02:00
militarpancho
d5ddcb8d3f Change repository url 2017-04-04 11:21:08 +02:00
militarpancho
54c0c9c437 demo doc changed 2017-04-04 11:14:51 +02:00
J. Fernando Sánchez
6e970d01f2 Merge branch '21-ascii-cant-encode' into 0.8.x 2017-04-04 11:12:39 +02:00
J. Fernando Sánchez
1d0a54ecd2 Merge branch '22-pip-screws-with-logging-config' into 0.8.x 2017-04-04 11:12:23 +02:00
J. Fernando Sánchez
800d4a9c2c Fixed typos in Ian's patch 2017-04-04 11:11:51 +02:00
drevicko
035ef98b7e removed broken "/api" link
In index.html, there is a suggestion to try out the api with a link to "/api". Clicking that link results in a json error report - not ideal. 
Instead, I added text suggesting that a use can find example api url's after clickgin "Analyse!".
2017-04-04 11:07:32 +02:00
J. Fernando Sánchez
d7e115d7c2 Encode HEADERS
Closes #21
2017-04-03 19:23:18 +02:00
J. Fernando Sánchez
548cb4c9ba Doc changes
* Alabaster theme
* Restructured
* Simplified introduction
* Reference to entries/models
* Fixed examples
2017-04-03 18:20:09 +02:00
J. Fernando Sánchez
7e5b55ff9c Run pip with Popen
Closes #22
2017-03-30 17:38:17 +02:00
militarpancho
8b2c3e8d40 Update readthedocs. Mainly Api and What is senpy section 2017-03-28 12:34:39 +02:00
J. Fernando Sánchez
0c8f98d466 Pre-0.8.6
* Improved debugging (back to using Flask's built-in mechanisms)
* Recursive model loading from json
* Added DEVPORT to Makefile
* Accept json-ld input. Closes #16
* Improved Exception handling in client
* Modified default plugin selection to only include analysis plugins
* More tests
2017-03-14 19:59:06 +01:00
J. Fernando Sánchez
cc298742ec Merge branch '17-...' into 0.8.x 2017-03-14 13:20:20 +01:00
J. Fernando Sánchez
250052fb99 Options as a set in the JSON-LD context
Closes #18
2017-03-14 13:17:47 +01:00
J. Fernando Sánchez
603e086606 Fix list of plugins
Closes #17
2017-03-14 13:05:52 +01:00
J. Fernando Sánchez
a8614bab0c Accept plugin pipelines
Closes #15
2017-03-13 21:08:21 +01:00
J. Fernando Sánchez
70ca74b03c Added instructions for developers 2017-03-09 00:04:02 +01:00
J. Fernando Sánchez
c9e6d78183 Fixed alises, added PAD and FSRE
Closes #13
2017-03-08 23:23:40 +01:00
J. Fernando Sánchez
1a582c0843 Filter conversion plugins
Closes #12

* Shows only analysis plugins by default on /api/plugins
* Adds a plugin_type parameter to get other types of plugins
* default_plugin chosen from analysis plugins
2017-03-08 22:54:57 +01:00
drevicko
0394bcd69c add make version to readme for pip install
pip install needs the VERSION file - `make version` will create that file

I also added the -U flag to pip install to force install (this is important if the user is playing with the code or trying out different older versions, as pip will not install if it thinks the git repo represents a version already installed or older than the one installed)
2017-03-02 11:08:02 +00:00
J. Fernando Sánchez
cbeb3adbdb Added fallback version '0.0'
Installing depends on the VERSION file, so it raies an error if it is
installed in some other way.

ReadTheDocs installs the package so it can generate code docs.
This commit adds a default version 0.0
2017-03-01 18:53:54 +01:00
J. Fernando Sánchez
efb305173e Removed future from __init__
Since __init__ is imported by setup.py, future may not be installed yet.

Other options would be:

* Read VERSION -> and that code has to be duplicated in setup.py and
  senpy (to avoid the import, once again)
* Eval version.py
* Do without versioning :)
2017-03-01 18:28:20 +01:00
J. Fernando Sánchez
2288b04c92 Remove iteritems for py2/3 compatibility 2017-03-01 18:14:44 +01:00
J. Fernando Sánchez
7899cb4d33 Fixed docker upload
Doing docker push without a tag makes the client upload **ALL** the
images it has for that repo.
2017-03-01 17:59:35 +01:00
J. Fernando Sánchez
62ddca79ac Fixed conversion docs 2017-03-01 17:56:17 +01:00
J. Fernando Sánchez
99403b3443 Fix for async
Should fix #11
2017-03-01 12:25:07 +01:00
J. Fernando Sánchez
a0ff528a4b Improved docs and client
* Client now raises an exception on error
* Added conversion to the documentation
2017-02-28 19:38:01 +01:00
J. Fernando Sánchez
97bd245dfc Changed data directory 2017-02-28 18:31:43 +01:00
J. Fernando Sánchez
d8b59d06a4 Converted Ekman2VAD to centroids
* Changed the way modules are imported -> we can now use dotted
  notation (e.g. senpy.plugins.conversion.centroids)
* Refactored ekman2vad's plugin -> generic centroids
* Added some basic tests
2017-02-28 05:28:55 +01:00
J. Fernando Sánchez
453b9f3257 Fixed bugs in Ekman2VAD 2017-02-28 04:01:05 +01:00
J. Fernando Sánchez
5fb858f5fc Fixed error when installing dependencies 2017-02-28 02:24:49 +01:00
J. Fernando Sánchez
bd984a1437 Fix 5 2017-02-27 21:22:10 +01:00
J. Fernando Sánchez
e741b565a1 Fix 4 2017-02-27 20:44:27 +01:00
J. Fernando Sánchez
668a803d89 Will anything break this time? We shall see 2017-02-27 20:38:55 +01:00
J. Fernando Sánchez
9daae8dda7 Please, please, please let it pass!
Am I a complete moron?
2017-02-27 20:22:55 +01:00
J. Fernando Sánchez
c72094b94b Fixed IMAGE names in GL CI 2017-02-27 20:08:10 +01:00
J. Fernando Sánchez
15d456d048 Testing docker in travis 2017-02-27 19:51:53 +01:00
J. Fernando Sánchez
fef06d4333 Fixed image creation issue with GL CI 2017-02-27 19:37:53 +01:00
J. Fernando Sánchez
454aa61fba Fixed CI problem 2017-02-27 19:31:52 +01:00
J. Fernando Sánchez
ba2e18125c Deployment changes
* Docker all the things!
* Make all the things!
* Fixed version.sh
2017-02-27 19:16:43 +01:00
J. Fernando Sánchez
9f6a6f5ecd Loads of changes!
* Added conversion plugins (API might change!)
* Added conversion to the analysis pipeline
* Changed behaviour of --default-plugins (it adds conversion plugins regardless)
* Added emotionModel [sic] and emotionConversion models

//TODO add conversion tests
//TODO add conversion to docs
2017-02-27 12:01:19 +01:00
J. Fernando Sánchez
3cea7534ef New versioning
Use git to automatically fetch the version
2017-02-17 16:21:44 +01:00
J. Fernando Sánchez
7eaf303124 Added coverage tests 2017-02-17 11:24:57 +01:00
J. Fernando Sánchez
b4ca5f4a7c Several fixes and changes
* Added interactive debugging
* Better exception logging
* More tests for errors
* Added ONBUILD to dockerfile
  Now creating new images based on senpy's is as easy as:
  ```from senpy:<version>```. This will automatically mount the code to
  /senpy-plugins and install all dependencies
* Added /data as a VOLUME
* Added `--use-wheel` to pip install both on the image and in the
  auto-install function.
* Closes #9

Break compatibilitity:

* Removed ability to (de)activate plugins through the web
2017-02-17 09:56:53 +01:00
103 changed files with 3408 additions and 1243 deletions

5
.gitignore vendored
View File

@@ -4,4 +4,7 @@
dist
build
README.html
__pycache__
__pycache__
VERSION
Dockerfile-*
Dockerfile

View File

@@ -1,73 +1,106 @@
image: docker:latest
# Uncomment if you want to use docker-in-docker
# image: gsiupm/dockermake:latest
# services:
# - docker:dind
# When using dind, it's wise to use the overlayfs driver for
# improved performance.
variables:
DOCKER_DRIVER: overlay
DOCKERFILE: Dockerfile
stages:
- test
- images
- release
- push
- deploy
- clean
before_script:
- docker login -u $HUB_USER -p $HUB_PASSWORD
.test: &test_definition
variables:
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.eggs"
cache:
paths:
- .eggs/
key: "$CI_PROJECT_NAME"
stage: test
script:
- python setup.py test
- make -e test-$PYTHON_VERSION
test-3.5:
<<: *test_definition
image: "python:3.5"
test-3.4:
<<: *test_definition
image: "python:3.4"
variables:
PYTHON_VERSION: "3.5"
test-2.7:
<<: *test_definition
image: "python:2.7"
variables:
PYTHON_VERSION: "2.7"
.image: &image_definition
variables:
PYTHON_VERSION: "3.5"
before_script:
- docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
stage: push
script:
- docker build -f Dockerfile-3.5 . -t $CI_REGISTRY_IMAGE:$CI_BUILD_REF_SLUG-$PYTHON_VERSION
- docker push $CI_REGISTRY_IMAGE:$CI_BUILD_REF_SLUG-$PYTHON_VERSION
stage: images
- make -e push-$PYTHON_VERSION
only:
- tags
- triggers
image-3.5:
push-3.5:
<<: *image_definition
variables:
PYTHON_VERSION: "3.5"
image-2.7:
push-2.7:
<<: *image_definition
variables:
PYTHON_VERSION: "2.7"
image-latest:
stage: release
before_script:
- docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
script:
- docker build -f Dockerfile . -t $CI_REGISTRY_IMAGE
- docker tag $CI_REGISTRY_IMAGE $CI_REGISTRY_IMAGE:$CI_BUILD_REF_SLUG
- docker push $CI_REGISTRY_IMAGE
- docker push $CI_REGISTRY_IMAGE:$CI_BUILD_REF_SLUG
push-latest:
<<: *image_definition
variables:
PYTHON_VERSION: latest
only:
- master
- triggers
- triggers
push-github:
stage: deploy
script:
- make -e push-github
only:
- master
- triggers
deploy_pypi:
stage: deploy
script: # Configure the PyPI credentials, then push the package, and cleanup the creds.
- echo "[server-login]" >> ~/.pypirc
- echo "username=" ${PYPI_USER} >> ~/.pypirc
- echo "password=" ${PYPI_PASSWORD} >> ~/.pypirc
- make pip_upload
- echo "" > ~/.pypirc && rm ~/.pypirc # If the above fails, this won't run.
only:
- /^v?\d+\.\d+\.\d+([abc]\d*)?$/ # PEP-440 compliant version (tags)
except:
- branches
deploy:
stage: deploy
environment: test
script:
- make -e deploy
only:
- master
push-github:
stage: deploy
script:
- make -e push-github
only:
- master
- triggers
clean :
stage: clean
script:
- make -e clean
when: manual
cleanup_py:
stage: clean
when: always # this is important; run even if preceding stages failed.
script:
- rm -vf ~/.pypirc # we don't want to leave these around, but GitLab may clean up anyway.
- docker logout

5
.pre-commit-config.yaml Normal file
View File

@@ -0,0 +1,5 @@
- repo: git://github.com/pre-commit/pre-commit-hooks
sha: e626cd57090d8df0be21e4df0f4e55cc3511d6ab
hooks:
- id: flake8
- id: check-json

View File

@@ -1,8 +1,12 @@
sudo: required
services:
- docker
language: python
python:
- "2.7"
- "3.4"
- "3.5"
install: "pip install -r requirements.txt"
env:
- PYV=2.7
- PYV=3.5
# run nosetests - Tests
script: nosetests
script: make test-$PYV

View File

@@ -1 +0,0 @@
Dockerfile-3.5

View File

@@ -1,9 +0,0 @@
from python:2.7
WORKDIR /usr/src/app
ADD requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
ADD . /usr/src/app/
RUN pip install .
ENTRYPOINT ["python", "-m", "senpy", "-f", ".", "--host", "0.0.0.0"]

View File

@@ -1,9 +0,0 @@
from python:3.4
WORKDIR /usr/src/app
ADD requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
ADD . /usr/src/app/
RUN pip install .
ENTRYPOINT ["python", "-m", "senpy", "-f", ".", "--host", "0.0.0.0"]

View File

@@ -1,9 +0,0 @@
from python:3.5
WORKDIR /usr/src/app
ADD requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
ADD . /usr/src/app/
RUN pip install .
ENTRYPOINT ["python", "-m", "senpy", "-f", ".", "--host", "0.0.0.0"]

View File

@@ -1,33 +0,0 @@
from python:2.7
RUN apt-get update
RUN apt-get -y install git
RUN mkdir -p /senpy-plugins
RUN apt-get -y install python-numpy
RUN apt-get -y install python-scipy
RUN apt-get -y install python-sklearn
RUN apt-get -y install python-gevent
RUN apt-get -y install libopenblas-dev
RUN apt-get -y install gfortran
RUN apt-get -y install libxml2-dev libxslt1-dev python-dev
#RUN pip install --upgrade pip
ADD id_rsa /root/.ssh/id_rsa
RUN chmod 700 /root/.ssh/id_rsa
RUN echo "Host github.com\n\tStrictHostKeyChecking no\n" >> /root/.ssh/config
RUN git clone https://github.com/gsi-upm/senpy /usr/src/app/
RUN git clone git@github.com:gsi-upm/senpy-plugins-enterprise /senpy-plugins/enterprise
RUN git clone https://github.com/gsi-upm/senpy-plugins-community /senpy-plugins/community
RUN pip install /usr/src/app
RUN pip install --no-use-wheel -r /senpy-plugins/enterprise/requirements.txt
RUN python -m nltk.downloader stopwords
RUN python -m nltk.downloader punkt
RUN python -m nltk.downloader maxent_treebank_pos_tagger
RUN python -m nltk.downloader wordnet
WORKDIR /senpy-plugins
ENTRYPOINT ["python", "-m", "senpy", "-f", ".", "--host", "0.0.0.0"]

View File

@@ -1,9 +1,22 @@
from python:{{PYVERSION}}
WORKDIR /usr/src/app
ADD requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
ADD . /usr/src/app/
RUN pip install .
MAINTAINER J. Fernando Sánchez <jf.sanchez@upm.es>
ENTRYPOINT ["python", "-m", "senpy", "-f", ".", "--host", "0.0.0.0"]
RUN mkdir /cache/ /senpy-plugins /data/
VOLUME /data/
ENV PIP_CACHE_DIR=/cache/ SENPY_DATA=/data
ONBUILD COPY . /senpy-plugins/
ONBUILD RUN python -m senpy --only-install -f /senpy-plugins
ONBUILD WORKDIR /senpy-plugins/
WORKDIR /usr/src/app
COPY test-requirements.txt requirements.txt /usr/src/app/
RUN pip install --use-wheel -r test-requirements.txt -r requirements.txt
COPY . /usr/src/app/
RUN pip install --no-index --no-deps --editable .
ENTRYPOINT ["python", "-m", "senpy", "-f", "/senpy-plugins/", "--host", "0.0.0.0"]

150
Makefile
View File

@@ -1,19 +1,44 @@
PYVERSIONS=3.5 3.4 2.7
PYMAIN=$(firstword $(PYVERSIONS))
NAME=senpy
REPO=gsiupm
VERSION=$(shell cat $(NAME)/VERSION)
TARNAME=$(NAME)-$(subst -,.,$(VERSION)).tar.gz
IMAGENAME=$(REPO)/$(NAME):$(VERSION)
TEST_COMMAND=gitlab-runner exec docker --cache-dir=/tmp/gitlabrunner --docker-volumes /tmp/gitlabrunner:/tmp/gitlabrunner --env CI_PROJECT_NAME=$(NAME)
VERSION=$(shell git describe --tags --dirty 2>/dev/null)
GITHUB_REPO=git@github.com:gsi-upm/senpy.git
IMAGENAME=gsiupm/senpy
IMAGEWTAG=$(IMAGENAME):$(VERSION)
PYVERSIONS=3.5 2.7
PYMAIN=$(firstword $(PYVERSIONS))
DEVPORT=5000
TARNAME=$(NAME)-$(VERSION).tar.gz
action="test-${PYMAIN}"
GITHUB_REPO=git@github.com:gsi-upm/senpy.git
KUBE_CA_PEM_FILE=""
KUBE_URL=""
KUBE_TOKEN=""
KUBE_NAMESPACE=$(NAME)
KUBECTL=docker run --rm -v $(KUBE_CA_PEM_FILE):/tmp/ca.pem -v $$PWD:/tmp/cwd/ -i lachlanevenson/k8s-kubectl --server="$(KUBE_URL)" --token="$(KUBE_TOKEN)" --certificate-authority="/tmp/ca.pem" -n $(KUBE_NAMESPACE)
CI_REGISTRY=docker.io
CI_REGISTRY_USER=gitlab
CI_BUILD_TOKEN=""
CI_COMMIT_REF_NAME=master
all: build run
.FORCE:
version: .FORCE
@echo $(VERSION) > $(NAME)/VERSION
@echo $(VERSION)
yapf:
yapf -i -r senpy
yapf -i -r $(NAME)
yapf -i -r tests
dev:
init:
pip install --user pre-commit
pre-commit install
@@ -28,56 +53,95 @@ quick_build: $(addprefix build-, $(PYMAIN))
build: $(addprefix build-, $(PYVERSIONS))
build-%: Dockerfile-%
docker build -t '$(IMAGENAME)-python$*' -f Dockerfile-$* .;
build-%: version Dockerfile-%
docker build -t '$(IMAGEWTAG)-python$*' --cache-from $(IMAGENAME):python$* -f Dockerfile-$* .;
quick_test: $(addprefix test-,$(PYMAIN))
test: $(addprefix test-,$(PYVERSIONS))
dev-%:
@docker start $(NAME)-dev$* || (\
$(MAKE) build-$*; \
docker run -d -w /usr/src/app/ -p $(DEVPORT):5000 -v $$PWD:/usr/src/app --entrypoint=/bin/bash -ti --name $(NAME)-dev$* '$(IMAGEWTAG)-python$*'; \
)\
debug-%:
(docker start $(NAME)-debug && docker attach $(NAME)-debug) || docker run -w /usr/src/app/ -v $$PWD:/usr/src/app --entrypoint=/bin/bash -ti --name $(NAME)-debug '$(IMAGENAME)-python$*'
docker exec -ti $(NAME)-dev$* bash
debug: debug-$(PYMAIN)
dev: dev-$(PYMAIN)
test-all: $(addprefix test-,$(PYVERSIONS))
test-%:
$(TEST_COMMAND) test-$*
docker run --rm --entrypoint /usr/local/bin/python -w /usr/src/app $(IMAGENAME):python$* setup.py test
dist/$(TARNAME):
docker run --rm -ti -v $$PWD:/usr/src/app/ -w /usr/src/app/ python:$(PYMAIN) python setup.py sdist;
test: test-$(PYMAIN)
dist/$(TARNAME): version
python setup.py sdist;
sdist: dist/$(TARNAME)
pip_test-%: sdist
docker run --rm -v $$PWD/dist:/dist/ -ti python:$* pip install /dist/$(TARNAME);
docker run --rm -v $$PWD/dist:/dist/ python:$* pip install /dist/$(TARNAME);
pip_test: $(addprefix pip_test-,$(PYVERSIONS))
upload-%: test-%
docker push '$(IMAGENAME)-python$*'
upload: test $(addprefix upload-,$(PYVERSIONS))
docker tag '$(IMAGENAME)-python$(PYMAIN)' '$(IMAGENAME)'
docker tag '$(IMAGENAME)-python$(PYMAIN)' '$(REPO)/$(NAME)'
docker push '$(IMAGENAME)'
docker push '$(REPO)/$(NAME)'
clean:
@docker ps -a | awk '/$(REPO)\/$(NAME)/{ split($$2, vers, "-"); if(vers[1] != "${VERSION}"){ print $$1;}}' | xargs docker rm 2>/dev/null|| true
@docker images | awk '/$(REPO)\/$(NAME)/{ split($$2, vers, "-"); if(vers[1] != "${VERSION}"){ print $$1":"$$2;}}' | xargs docker rmi 2>/dev/null|| true
@docker rmi $(NAME)-debug 2>/dev/null || true
upload_git:
git commit -a
git tag ${VERSION}
git push --tags origin master
pip_upload:
pip_upload: pip_test
python setup.py sdist upload ;
pip_test: $(addprefix pip_test-,$(PYVERSIONS))
clean:
@docker ps -a | grep $(IMAGENAME) | awk '{ split($$2, vers, "-"); if(vers[0] != "${VERSION}"){ print $$1;}}' | xargs docker rm -v 2>/dev/null|| true
@docker images | grep $(IMAGENAME) | awk '{ split($$2, vers, "-"); if(vers[0] != "${VERSION}"){ print $$1":"$$2;}}' | xargs docker rmi 2>/dev/null|| true
@docker stop $(addprefix $(NAME)-dev,$(PYVERSIONS)) 2>/dev/null || true
@docker rm $(addprefix $(NAME)-dev,$(PYVERSIONS)) 2>/dev/null || true
run: build
docker run --rm -p 5000:5000 -ti '$(IMAGENAME)-python$(PYMAIN)'
git_commit:
git commit -a
.PHONY: test test-% build-% build test pip_test run yapf dev
git_tag:
git tag ${VERSION}
git_push:
git push --tags origin master
run-%: build-%
docker run --rm -p $(DEVPORT):5000 -ti '$(IMAGEWTAG)-python$(PYMAIN)' --default-plugins
run: run-$(PYMAIN)
push-latest: $(addprefix push-latest-,$(PYVERSIONS))
docker tag '$(IMAGEWTAG)-python$(PYMAIN)' '$(IMAGEWTAG)'
docker tag '$(IMAGEWTAG)-python$(PYMAIN)' '$(IMAGENAME)'
docker push '$(IMAGENAME):latest'
docker push '$(IMAGEWTAG)'
push-latest-%: build-%
docker tag $(IMAGENAME):$(VERSION)-python$* $(IMAGENAME):python$*
docker push $(IMAGENAME):$(VERSION)-python$*
docker push $(IMAGENAME):python$*
push-%: build-%
docker push $(IMAGENAME):$(VERSION)-python$*
push: $(addprefix push-,$(PYVERSIONS))
docker tag '$(IMAGEWTAG)-python$(PYMAIN)' '$(IMAGEWTAG)'
docker push $(IMAGENAME):$(VERSION)
push-github:
$(eval KEY_FILE := $(shell mktemp))
@echo "$$GITHUB_DEPLOY_KEY" > $(KEY_FILE)
@git remote rm github-deploy || true
git remote add github-deploy $(GITHUB_REPO)
@GIT_SSH_COMMAND="ssh -i $(KEY_FILE)" git fetch github-deploy $(CI_COMMIT_REF_NAME) || true
@GIT_SSH_COMMAND="ssh -i $(KEY_FILE)" git push github-deploy $(CI_COMMIT_REF_NAME)
rm $(KEY_FILE)
ci:
gitlab-runner exec docker --docker-volumes /var/run/docker.sock:/var/run/docker.sock --env CI_PROJECT_NAME=$(NAME) ${action}
deploy:
@$(KUBECTL) delete secret $(CI_REGISTRY) || true
@$(KUBECTL) create secret docker-registry $(CI_REGISTRY) --docker-server=$(CI_REGISTRY) --docker-username=$(CI_REGISTRY_USER) --docker-email=$(CI_REGISTRY_USER) --docker-password=$(CI_BUILD_TOKEN)
@$(KUBECTL) apply -f /tmp/cwd/k8s/
.PHONY: test test-% test-all build-% build test pip_test run yapf push-main push-% dev ci version .FORCE deploy

View File

@@ -23,7 +23,7 @@ Through PIP
.. code:: bash
pip install --user senpy
pip install -U --user senpy
Alternatively, you can use the development version:
@@ -38,9 +38,56 @@ If you want to install senpy globally, use sudo instead of the ``--user`` flag.
Docker Image
************
Build the image or use the pre-built one: ``docker run -ti -p 5000:5000 gsiupm/senpy --host 0.0.0.0 --default-plugins``.
Build the image or use the pre-built one: ``docker run -ti -p 5000:5000 gsiupm/senpy --default-plugins``.
To add custom plugins, add a volume and tell senpy where to find the plugins: ``docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --host 0.0.0.0 --default-plugins -f /plugins``
To add custom plugins, add a volume and tell senpy where to find the plugins: ``docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --default-plugins -f /plugins``
Developing
----------
Developing/debugging
********************
This command will run the senpy container using the latest image available, mounting your current folder so you get your latest code:
.. code:: bash
# Python 3.5
make dev
# Python 2.7
make dev-2.7
Building a docker image
***********************
.. code:: bash
# Python 3.5
make build-3.5
# Python 2.7
make build-2.7
Testing
*******
.. code:: bash
make test
Running
*******
This command will run the senpy server listening on localhost:5000
.. code:: bash
# Python 3.5
make run-3.5
# Python 2.7
make run-2.7
Usage
-----
@@ -49,12 +96,14 @@ However, the easiest and recommended way is to just use the command-line tool to
.. code:: bash
senpy
or, alternatively:
.. code:: bash
python -m senpy

317
docs/SenpyClientUse.ipynb Normal file
View File

@@ -0,0 +1,317 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:05:31.465571Z",
"start_time": "2017-04-10T19:05:31.458282+02:00"
},
"deletable": true,
"editable": true
},
"source": [
"# Client"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"source": [
"The built-in senpy client allows you to query any Senpy endpoint. We will illustrate how to use it with the public demo endpoint, and then show you how to spin up your own endpoint using docker."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Demo Endpoint\n",
"-------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To start using senpy, simply create a new Client and point it to your endpoint. In this case, the latest version of Senpy at GSI."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:12.827640Z",
"start_time": "2017-04-10T19:29:12.818617+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"from senpy.client import Client\n",
"\n",
"c = Client('http://latest.senpy.cluster.gsi.dit.upm.es/api')\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Now, let's use that client analyse some queries:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:14.011657Z",
"start_time": "2017-04-10T19:29:13.701808+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"r = c.analyse('I like sugar!!', algorithm='sentiment140')\n",
"r"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:08:19.616754Z",
"start_time": "2017-04-10T19:08:19.610767+02:00"
},
"deletable": true,
"editable": true
},
"source": [
"As you can see, that gave us the full JSON result. A more concise way to print it would be:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:14.854213Z",
"start_time": "2017-04-10T19:29:14.842068+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"for entry in r.entries:\n",
" print('{} -> {}'.format(entry['text'], entry['sentiments'][0]['marl:hasPolarity']))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"We can also obtain a list of available plugins with the client:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:16.245198Z",
"start_time": "2017-04-10T19:29:16.056545+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"c.plugins()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Or, more concisely:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:17.663275Z",
"start_time": "2017-04-10T19:29:17.484623+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"c.plugins().keys()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Local Endpoint\n",
"--------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To run your own instance of senpy, just create a docker container with the latest Senpy image. Using `--default-plugins` you will get some extra plugins to start playing with the API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:20.637539Z",
"start_time": "2017-04-10T19:29:19.938322+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"!docker run -ti --name 'SenpyEndpoint' -d -p 6000:5000 gsiupm/senpy:0.8.6 --host 0.0.0.0 --default-plugins"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To use this endpoint:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:21.263976Z",
"start_time": "2017-04-10T19:29:21.260595+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"c_local = Client('http://127.0.0.1:6000/api')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"That's all! After you are done with your analysis, stop the docker container:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:33.226686Z",
"start_time": "2017-04-10T19:29:22.392121+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"!docker stop SenpyEndpoint\n",
"!docker rm SenpyEndpoint"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "68px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 4,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}

BIN
docs/_static/header.png vendored Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 208 KiB

11
docs/about.rst Normal file
View File

@@ -0,0 +1,11 @@
About
--------
If you use Senpy in your research, please cite `Senpy: A Pragmatic Linked Sentiment Analysis Framework <http://gsi.dit.upm.es/index.php/es/investigacion/publicaciones?view=publication&task=show&id=417>`__ (`BibTex <http://gsi.dit.upm.es/index.php/es/investigacion/publicaciones?controller=publications&task=export&format=bibtex&id=417>`__):
.. code-block:: text
Sánchez-Rada, J. F., Iglesias, C. A., Corcuera, I., & Araque, Ó. (2016, October).
Senpy: A Pragmatic Linked Sentiment Analysis Framework.
In Data Science and Advanced Analytics (DSAA),
2016 IEEE International Conference on (pp. 735-742). IEEE.

View File

@@ -1,5 +1,5 @@
NIF API
=======
-------
.. http:get:: /api
Basic endpoint for sentiment/emotion analysis.
@@ -22,38 +22,32 @@ NIF API
Content-Type: text/javascript
{
"@context": [
"http://127.0.0.1/static/context.jsonld",
],
"analysis": [
{
"@id": "SentimentAnalysisExample",
"@type": "marl:SentimentAnalysis",
"dc:language": "en",
"marl:maxPolarityValue": 10.0,
"marl:minPolarityValue": 0.0
}
],
"domain": "wndomains:electronics",
"entries": [
{
"opinions": [
{
"prov:generatedBy": "SentimentAnalysisExample",
"marl:polarityValue": 7.8,
"marl:hasPolarity": "marl:Positive",
"marl:describesObject": "http://www.gsi.dit.upm.es",
}
],
"nif:isString": "I love GSI",
"strings": [
{
"nif:anchorOf": "GSI",
"nif:taIdentRef": "http://www.gsi.dit.upm.es"
}
]
}
]
"@context":"http://127.0.0.1/api/contexts/Results.jsonld",
"@id":"_:Results_11241245.22",
"@type":"results"
"analysis": [
"plugins/sentiment-140_0.1"
],
"entries": [
{
"@id": "_:Entry_11241245.22"
"@type":"entry",
"emotions": [],
"entities": [],
"sentiments": [
{
"@id": "Sentiment0",
"@type": "sentiment",
"marl:hasPolarity": "marl:Negative",
"marl:polarityValue": 0,
"prefix": ""
}
],
"suggestions": [],
"text": "This text makes me sad.\nwhilst this text makes me happy and surprised at the same time.\nI cannot believe it!",
"topics": []
}
]
}
:query i input: No default. Depends on informat and intype
@@ -62,6 +56,7 @@ NIF API
:query o outformat: one of `turtle` (default), `text`, `json-ld`
:query p prefix: prefix for the URIs
:query algo algorithm: algorithm/plugin to use for the analysis. For a list of options, see :http:get:`/api/plugins`. If not provided, the default plugin will be used (:http:get:`/api/plugins/default`).
:query algo emotionModel: desired emotion model in the results. If the requested algorithm does not use that emotion model, there are conversion plugins specifically for this. If none of the plugins match, an error will be returned, which includes the results *as is*.
:reqheader Accept: the response content type depends on
:mailheader:`Accept` header
@@ -69,6 +64,7 @@ NIF API
header of request
:statuscode 200: no error
:statuscode 404: service not found
:statuscode 400: error while processing the request
.. http:post:: /api
@@ -90,56 +86,59 @@ NIF API
.. sourcecode:: http
{
"@context": {
...
},
"sentiment140": {
"name": "sentiment140",
"is_activated": true,
"version": "0.1",
"extra_params": {
"@id": "extra_params_sentiment140_0.1",
"language": {
"required": false,
"@id": "lang_sentiment140",
"options": [
"es",
"en",
"auto"
],
"aliases": [
"language",
"l"
]
}
},
"@id": "sentiment140_0.1"
},
"rand": {
"name": "rand",
"is_activated": true,
"version": "0.1",
"extra_params": {
"@id": "extra_params_rand_0.1",
"language": {
"required": false,
"@id": "lang_rand",
"options": [
"es",
"en",
"auto"
],
"aliases": [
"language",
"l"
]
}
},
"@id": "rand_0.1"
}
}
{
"@id": "plugins/sentiment-140_0.1",
"@type": "sentimentPlugin",
"author": "@balkian",
"description": "Sentiment classifier using rule-based classification for English and Spanish. This plugin uses sentiment140 data to perform classification. For more information: http://help.sentiment140.com/for-students/",
"extra_params": {
"language": {
"@id": "lang_sentiment140",
"aliases": [
"language",
"l"
],
"options": [
"es",
"en",
"auto"
],
"required": false
}
},
"is_activated": true,
"maxPolarityValue": 1.0,
"minPolarityValue": 0.0,
"module": "sentiment-140",
"name": "sentiment-140",
"requirements": {},
"version": "0.1"
},
{
"@id": "plugins/ExamplePlugin_0.1",
"@type": "sentimentPlugin",
"author": "@balkian",
"custom_attribute": "42",
"description": "I am just an example",
"extra_params": {
"parameter": {
"@id": "parameter",
"aliases": [
"parameter",
"param"
],
"default": 42,
"required": true
}
},
"is_activated": true,
"maxPolarityValue": 1.0,
"minPolarityValue": 0.0,
"module": "example",
"name": "ExamplePlugin",
"requirements": "noop",
"version": "0.1"
}
.. http:get:: /api/plugins/<pluginname>
@@ -148,7 +147,7 @@ NIF API
.. sourcecode:: http
GET /api/plugins/rand HTTP/1.1
GET /api/plugins/rand/ HTTP/1.1
Host: localhost
Accept: application/json, text/javascript
@@ -158,51 +157,61 @@ NIF API
.. sourcecode:: http
{
"@id": "rand_0.1",
"extra_params": {
"@id": "extra_params_rand_0.1",
"language": {
"@id": "lang_rand",
"aliases": [
"language",
"l"
],
"options": [
"es",
"en",
"auto"
],
"required": false
}
},
"is_activated": true,
"name": "rand",
"version": "0.1"
"@context": "http://127.0.0.1/api/contexts/ExamplePlugin.jsonld",
"@id": "plugins/ExamplePlugin_0.1",
"@type": "sentimentPlugin",
"author": "@balkian",
"custom_attribute": "42",
"description": "I am just an example",
"extra_params": {
"parameter": {
"@id": "parameter",
"aliases": [
"parameter",
"param"
],
"default": 42,
"required": true
}
},
"is_activated": true,
"maxPolarityValue": 1.0,
"minPolarityValue": 0.0,
"module": "example",
"name": "ExamplePlugin",
"requirements": "noop",
"version": "0.1"
}
.. http:get:: /api/plugins/default
Return the information about the default plugin.
.. http:get:: /api/plugins/<pluginname>/{de}activate
{De}activate a plugin.
**Example request**:
.. sourcecode:: http
GET /api/plugins/rand/deactivate HTTP/1.1
Host: localhost
Accept: application/json, text/javascript
**Example response**:
.. sourcecode:: http
{
"@context": {},
"message": "Ok"
}

7
docs/apischema.rst Normal file
View File

@@ -0,0 +1,7 @@
API and Examples
################
.. toctree::
vocabularies.rst
api.rst
examples.rst

View File

@@ -0,0 +1,78 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
"me:SAnalysis1",
"me:SgAnalysis1",
"me:EmotionAnalysis1",
"me:NER1",
{
"@type": "analysis",
"@id": "wrong"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}

9
docs/commandline.rst Normal file
View File

@@ -0,0 +1,9 @@
Command line
============
This video shows how to analyse text directly on the command line using the senpy tool.
.. image:: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk.png
:width: 100%
:target: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk
:alt: CLI demo

View File

@@ -37,6 +37,7 @@ extensions = [
'sphinx.ext.todo',
'sphinxcontrib.httpdomain',
'sphinx.ext.coverage',
'sphinx.ext.autosectionlabel',
]
# Add any paths that contain templates here, relative to this directory.
@@ -54,20 +55,21 @@ master_doc = 'index'
# General information about the project.
project = u'Senpy'
copyright = u'2016, J. Fernando Sánchez'
description = u'A framework for sentiment and emotion analysis services'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
with open('../senpy/VERSION') as f:
version = f.read().strip()
# with open('../senpy/VERSION') as f:
# version = f.read().strip()
# The full version, including alpha/beta/rc tags.
release = version
# release = version
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None
language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
@@ -104,14 +106,14 @@ pygments_style = 'sphinx'
#keep_warnings = False
html_theme = 'alabaster'
# -- Options for HTML output ----------------------------------------------
if not on_rtd: # only import and set the theme if we're building docs locally
import sphinx_rtd_theme
html_theme = 'sphinx_rtd_theme'
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
# if not on_rtd: # only import and set the theme if we're building docs locally
# import sphinx_rtd_theme
# html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
else:
html_theme = 'default'
# else:
# html_theme = 'default'
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
@@ -119,7 +121,13 @@ else:
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
html_theme_options = {
'logo': 'header.png',
'github_user': 'gsi-upm',
'github_repo': 'senpy',
'github_banner': True,
}
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
@@ -159,7 +167,13 @@ html_static_path = ['_static']
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
html_sidebars = {
'**': [
'about.html',
'navigation.html',
'searchbox.html',
]
}
# Additional templates that should be rendered to pages, maps page names to
# template names.

116
docs/conversion.rst Normal file
View File

@@ -0,0 +1,116 @@
Conversion
----------
Senpy includes experimental support for emotion/sentiment conversion plugins.
Use
===
Consider the original query: http://127.0.0.1:5000/api/?i=hello&algo=emoRand
The requested plugin (emoRand) returns emotions using Ekman's model (or big6 in EmotionML):
.. code:: json
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emoRand_0.1"
}
To get these emotions in VAD space (FSRE dimensions in EmotionML), we'd do this:
http://127.0.0.1:5000/api/?i=hello&algo=emoRand&emotionModel=emoml:fsre-dimensions
This call, provided there is a valid conversion plugin from Ekman's to VAD, would return something like this:
.. code:: json
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emoRand_0.1"
}, {
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"A": 7.22,
"D": 6.28,
"V": 8.6
},
"prov:wasGeneratedBy": "plugins/Ekman2VAD_0.1"
}
That is called a *full* response, as it simply adds the converted emotion alongside.
It is also possible to get the original emotion nested within the new converted emotion, using the `conversion=nested` parameter:
.. code:: json
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emoRand_0.1"
"onyx:wasDerivedFrom": {
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"A": 7.22,
"D": 6.28,
"V": 8.6
},
"prov:wasGeneratedBy": "plugins/Ekman2VAD_0.1"
}
}
Lastly, `conversion=filtered` would only return the converted emotions.
Developing a conversion plugin
================================
Conversion plugins are discovered by the server just like any other plugin.
The difference is the slightly different API, and the need to specify the `source` and `target` of the conversion.
For instance, an emotion conversion plugin needs the following:
.. code:: yaml
---
onyx:doesConversion:
- onyx:conversionFrom: emoml:big6
onyx:conversionTo: emoml:fsre-dimensions
- onyx:conversionFrom: emoml:fsre-dimensions
onyx:conversionTo: emoml:big6
.. code:: python
class MyConversion(EmotionConversionPlugin):
def convert(self, emotionSet, fromModel, toModel, params):
pass

View File

@@ -1,7 +1,8 @@
Demo
----
There is a demo available on http://senpy.demos.gsi.dit.upm.es/, where you can a serie of different plugins. You can use them in the playground or make a directly requests to the service.
There is a demo available on http://senpy.cluster.gsi.dit.upm.es/, where you can test a serie of different plugins.
You can use the playground (a web interface) or make HTTP requests to the service API.
.. image:: senpy-playground.png
:height: 400px
@@ -12,64 +13,4 @@ There is a demo available on http://senpy.demos.gsi.dit.upm.es/, where you can a
Plugins Demo
============
The next plugins are available at the demo:
* emoTextAnew extracts the VAD (valence-arousal-dominance) of a sentence by matching words from the ANEW dictionary.
* emoTextWordnetAffect based on the hierarchy of WordnetAffect to calculate the emotion of the sentence.
* vaderSentiment utilizes the software from vaderSentiment to calculate the sentiment of a sentence.
* sentiText is a software developed during the TASS 2015 competition, it has been adapted for English and Spanish.
emoTextANEW plugin
******************
This plugin is going to used the ANEW lexicon dictionary to calculate de VAD (valence-arousal-dominance) of the sentence and the determinate which emotion is closer to this value.
Each emotion has a centroid, which it has been approximated using the formula described in this article:
http://www.aclweb.org/anthology/W10-0208
The plugin is going to look for the words in the sentence that appear in the ANEW dictionary and calculate the average VAD score for the sentence. Once this score is calculated, it is going to seek the emotion that is closest to this value.
emoTextWAF plugin
*****************
This plugin uses WordNet-Affect (http://wndomains.fbk.eu/wnaffect.html) to calculate the percentage of each emotion. The emotions that are going to be used are: anger, fear, disgust, joy and sadness. It is has been used a emotion mapping enlarge the emotions:
* anger : general-dislike
* fear : negative-fear
* disgust : shame
* joy : gratitude, affective, enthusiasm, love, joy, liking
* sadness : ingrattitude, daze, humlity, compassion, despair, anxiety, sadness
sentiText plugin
****************
This plugin is based in the classifier developed for the TASS 2015 competition. It has been developed for Spanish and English. The different phases that has this plugin when it is activated:
* Train both classifiers (English and Spanish).
* Initialize resources (dictionaries,stopwords,etc.).
* Extract bag of words,lemmas and chars.
Once the plugin is activated, the features that are going to be extracted for the classifiers are:
* Matches with the bag of words extracted from the train corpus.
* Sentiment score of the sentences extracted from the dictionaries (lexicons and emoticons).
* Identify negations and intensifiers in the sentences.
* Complementary features such as exclamation and interrogation marks, eloganted and caps words, hashtags, etc.
The plugin has a preprocessor, which is focues on Twitter corpora, that is going to be used for cleaning the text to simplify the feature extraction.
There is more information avaliable in the next article.
Aspect based Sentiment Analysis of Spanish Tweets, Oscar Araque and Ignacio Corcuera-Platas and Constantino Román-Gómez and Carlos A. Iglesias and J. Fernando Sánchez-Rada. http://gsi.dit.upm.es/es/investigacion/publicaciones?view=publication&task=show&id=37
vaderSentiment plugin
*********************
For developing this plugin, it has been used the module vaderSentiment, which is described in the paper: VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text C.J. Hutto and Eric Gilbert Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
If you use this plugin in your research, please cite the above paper
For more information about the functionality, check the official repository
https://github.com/cjhutto/vaderSentiment
The source code and description of the plugins used in the demo is available here: https://lab.cluster.gsi.dit.upm.es/senpy/senpy-plugins-community/.

78
docs/examples.rst Normal file
View File

@@ -0,0 +1,78 @@
Examples
------
All the examples in this page use the :download:`the main schema <_static/schemas/definitions.json>`.
Simple NIF annotation
.....................
Description
,,,,,,,,,,,
This example covers the basic example in the NIF documentation: `<http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>`_.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-basic.json
:language: json-ld
Sentiment Analysis
.....................
Description
,,,,,,,,,,,
This annotation corresponds to the sentiment analysis of an input. The example shows the sentiment represented according to Marl format.
The sentiments detected are contained in the Sentiments array with their related part of the text.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-sentiment.json
:emphasize-lines: 5-10,25-33
:language: json-ld
Suggestion Mining
.................
Description
,,,,,,,,,,,
The suggestions schema represented below shows the suggestions detected in the text. Within it, we can find the NIF fields highlighted that corresponds to the text of the detected suggestion.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-suggestion.json
:emphasize-lines: 5-8,22-27
:language: json-ld
Emotion Analysis
................
Description
,,,,,,,,,,,
This annotation represents the emotion analysis of an input to Senpy. The emotions are contained in the emotions section with the text that refers to following Onyx format and the emotion model defined beforehand.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-emotion.json
:language: json-ld
:emphasize-lines: 5-8,25-37
Named Entity Recognition
........................
Description
,,,,,,,,,,,
The Named Entity Recognition is represented as follows. In this particular case, it can be seen within the entities array the entities recognised. For the example input, Microsoft and Windows Phone are the ones detected.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-ner.json
:emphasize-lines: 5-8,19-34
:language: json-ld
Complete example
................
Description
,,,,,,,,,,,
This example covers all of the above cases, integrating all the annotations in the same document.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-complete.json
:language: json-ld

View File

@@ -0,0 +1,74 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
"me:SAnalysis1",
"me:SgAnalysis1",
"me:EmotionAnalysis1",
"me:NER1"
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}

View File

@@ -53,7 +53,8 @@
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program"
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [

View File

@@ -24,7 +24,8 @@
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program"
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [

View File

@@ -1,19 +1,35 @@
.. Senpy documentation master file, created by
sphinx-quickstart on Tue Feb 24 08:57:32 2015.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to Senpy's documentation!
=================================
.. image:: https://readthedocs.org/projects/senpy/badge/?version=latest
:target: http://senpy.readthedocs.io/en/latest/
.. image:: https://badge.fury.io/py/senpy.svg
:target: https://badge.fury.io/py/senpy
.. image:: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/badges/master/build.svg
:target: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/commits/master
.. image:: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/badges/master/coverage.svg
:target: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/commits/master
.. image:: https://img.shields.io/pypi/l/requests.svg
:target: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/
Contents:
Senpy is a framework for sentiment and emotion analysis services.
Services built with senpy are interchangeable and easy to use because they share a common :doc:`apischema`.
It also simplifies service development.
.. image:: senpy-architecture.png
:width: 100%
:align: center
.. toctree::
:caption: Learn more about senpy:
:maxdepth: 2
senpy
installation
usage
api
schema
plugins
demo
:maxdepth: 2
usage
apischema
plugins
conversion
about

View File

@@ -1,6 +1,16 @@
Installation
------------
The stable version can be installed in three ways.
The stable version can be used in two ways: as a system/user library through pip, or as a docker image.
The docker image is the recommended way because it is self-contained and isolated from the system, which means:
* Downloading and using it is just one command
* All dependencies are included
* It is OS-independent (MacOS, Windows, GNU/Linux)
* Several versions may coexist in the same machine without additional virtual environments
Additionally, you may create your own docker image with your custom plugins, ready to be used by others.
Through PIP
***********
@@ -22,6 +32,41 @@ If you want to install senpy globally, use sudo instead of the ``--user`` flag.
Docker Image
************
Build the image or use the pre-built one: ``docker run -ti -p 5000:5000 gsiupm/senpy --host 0.0.0.0 --default-plugins``.
Build the image or use the pre-built one:
To add custom plugins, add a volume and tell senpy where to find the plugins: ``docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --host 0.0.0.0 --default-plugins -f /plugins``
.. code:: bash
docker run -ti -p 5000:5000 gsiupm/senpy --host 0.0.0.0 --default-plugins
To add custom plugins, use a docker volume:
.. code:: bash
docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --host 0.0.0.0 --default-plugins -f /plugins
Python 2
........
There is a Senpy version for python2 too:
.. code:: bash
docker run -ti -p 5000:5000 gsiupm/senpy:python2.7 --host 0.0.0.0 --default-plugins
Alias
.....
If you are using the docker approach regularly, it is advisable to use a script or an alias to simplify your executions:
.. code:: bash
alias senpy='docker run --rm -ti -p 5000:5000 -v $PWD:/senpy-plugins gsiupm/senpy --default-plugins'
Now, you may run senpy from any folder in your computer like so:
.. code:: bash
senpy --version

View File

@@ -1,47 +1,254 @@
Developing new plugins
----------------------
Each plugin represents a different analysis process.There are two types of files that are needed by senpy for loading a plugin:
This document describes how to develop a new analysis plugin. For an example of conversion plugins, see :doc:`conversion`.
Plugins Interface
=======
- Definition file, has the ".senpy" extension.
- Code file, is a python file.
A more step-by-step tutorial with slides is available `here <https://lab.cluster.gsi.dit.upm.es/senpy/senpy-tutorial>`__
Plugins Definitions
===================
.. contents:: :local:
The definition file can be written in JSON or YAML, where the data representation consists on attribute-value pairs.
The principal attributes are:
What is a plugin?
=================
* name: plugin name used in senpy to call the plugin.
* module: indicates the module that will be loaded
A plugin is a program that, given a text, will add annotations to it.
In practice, a plugin consists of at least two files:
.. code:: python
- Definition file: a `.senpy` file that describes the plugin (e.g. what input parameters it accepts, what emotion model it uses).
- Python module: the actual code that will add annotations to each input.
This separation allows us to deploy plugins that use the same code but employ different parameters.
For instance, one could use the same classifier and processing in several plugins, but train with different datasets.
This scenario is particularly useful for evaluation purposes.
The only limitation is that the name of each plugin needs to be unique.
Plugin Definition files
=======================
The definition file contains all the attributes of the plugin, and can be written in YAML or JSON.
When the server is launched, it will recursively search for definition files in the plugin folder (the current folder, by default).
The most important attributes are:
* **name**: unique name that senpy will use internally to identify the plugin.
* **module**: indicates the module that contains the plugin code, which will be automatically loaded by senpy.
* **version**
* extra_params: to add parameters to the senpy API when this plugin is requested. Those parameters may be required, and have aliased names. For instance:
.. code:: yaml
extra_params:
hello_param:
aliases: # required
- hello_param
- hello
required: true
default: Hi you
values:
- Hi you
- Hello y'all
- Howdy
Parameter validation will fail if a required parameter without a default has not been provided, or if the definition includes a set of values and the provided one does not match one of them.
A complete example:
.. code:: yaml
name: <Name of the plugin>
module: <Python file>
version: 0.1
And the json equivalent:
.. code:: json
{
"name" : "senpyPlugin",
"module" : "{python code file}"
"name": "<Name of the plugin>",
"module": "<Python file>",
"version": "0.1"
}
.. code:: python
name: senpyPlugin
module: {python code file}
Plugins Code
=================
============
The basic methods in a plugin are:
* __init__
* activate: used to load memory-hungry resources
* deactivate: used to free up resources
* analyse: called in every user requests. It takes in the parameters supplied by a user and should return a senpy Response.
* analyse_entry: called in every user requests. It takes two parameters: ``Entry``, the entry object, and ``params``, the parameters supplied by the user. It should yield one or more ``Entry`` objects.
Plugins are loaded asynchronously, so don't worry if the activate method takes too long. The plugin will be marked as activated once it is finished executing the method.
Entries
=======
Entries are objects that can be annotated.
By default, entries are `NIF contexts <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>`_ represented in JSON-LD format.
Annotations are added to the object like this:
.. code:: python
entry = Entry()
entry.vocabulary__annotationName = 'myvalue'
entry['vocabulary:annotationName'] = 'myvalue'
entry['annotationNameURI'] = 'myvalue'
Where vocabulary is one of the prefixes defined in the default senpy context, and annotationURI is a full URI.
The value may be any valid JSON-LD dictionary.
For simplicity, senpy includes a series of models by default in the ``senpy.models`` module.
Example plugin
==============
In this section, we will implement a basic sentiment analysis plugin.
To determine the polarity of each entry, the plugin will compare the length of the string to a threshold.
This threshold will be included in the definition file.
The definition file would look like this:
.. code:: yaml
name: helloworld
module: helloworld
version: 0.0
threshold: 10
description: Hello World
Now, in a file named ``helloworld.py``:
.. code:: python
#!/bin/env python
#helloworld.py
from senpy.plugins import AnalysisPlugin
from senpy.models import Sentiment
class HelloWorld(AnalysisPlugin):
def analyse_entry(entry, params):
'''Basically do nothing with each entry'''
sentiment = Sentiment()
if len(entry.text) < self.threshold:
sentiment['marl:hasPolarity'] = 'marl:Positive'
else:
sentiment['marl:hasPolarity'] = 'marl:Negative'
entry.sentiments.append(sentiment)
yield entry
The complete code of the example plugin is available `here <https://lab.cluster.gsi.dit.upm.es/senpy/plugin-prueba>`__.
Loading data and files
======================
Most plugins will need access to files (dictionaries, lexicons, etc.).
It is good practice to specify the paths of these files in the plugin configuration, so the same code can be reused with different resources.
.. code:: yaml
name: dictworld
module: dictworld
dictionary_path: <PATH OF THE FILE>
The path can be either absolute, or relative.
From absolute paths
???????????????????
Absolute paths (such as ``/data/dictionary.csv`` are straightfoward:
.. code:: python
with open(os.path.join(self.dictionary_path) as f:
...
From relative paths
???????????????????
Since plugins are loading dynamically, relative paths will refer to the current working directory.
Instead, what you usually want is to load files *relative to the plugin source folder*, like so:
::
.
..
plugin.senpy
plugin.py
dictionary.csv
For this, we need to first get the path of your source folder first, like so:
.. code:: python
import os
root = os.path.realpath(__file__)
with open(os.path.join(root, self.dictionary_path) as f:
...
Docker image
============
Add the following dockerfile to your project to generate a docker image with your plugin:
.. code:: dockerfile
FROM gsiupm/senpy:0.8.8
This will copy your source folder to the image, and install all dependencies.
Now, to build an image:
.. code:: shell
docker build . -t gsiupm/exampleplugin
And you can run it with:
.. code:: shell
docker run -p 5000:5000 gsiupm/exampleplugin
If the plugin non-source files (:ref:`loading data and files`), the recommended way is to use absolute paths.
Data can then be mounted in the container or added to the image.
The former is recommended for open source plugins with licensed resources, whereas the latter is the most convenient and can be used for private images.
Mounting data:
.. code:: bash
docker run -v $PWD/data:/data gsiupm/exampleplugin
Adding data to the image:
.. code:: dockerfile
FROM gsiupm/senpy:0.8.8
COPY data /
F.A.Q.
======
What annotations can I use?
???????????????????????????
You can add almost any annotation to an entry.
The most common use cases are covered in the :doc:`apischema`.
Why does the analyse function yield instead of return?
??????????????????????????????????????????????????????
This is so that plugins may add new entries to the response or filter some of them.
For instance, a `context detection` plugin may add a new entry for each context in the original entry.
On the other hand, a conversion plugin may leave out those entries that do not contain relevant information.
If I'm using a classifier, where should I train it?
???????????????????????????????????????????????????
@@ -49,9 +256,9 @@ Training a classifier can be time time consuming. To avoid running the training
.. code:: python
from senpy.plugins import ShelfMixin, SenpyPlugin
from senpy.plugins import ShelfMixin, AnalysisPlugin
class MyPlugin(ShelfMixin, SenpyPlugin):
class MyPlugin(ShelfMixin, AnalysisPlugin):
def train(self):
''' Code to train the classifier
'''
@@ -68,27 +275,31 @@ Training a classifier can be time time consuming. To avoid running the training
def deactivate(self):
self.close()
You can speficy a 'shelf_file' in your .senpy file. By default the ShelfMixin creates a file based on the plugin name and stores it in that plugin's folder.
You can specify a 'shelf_file' in your .senpy file. By default the ShelfMixin creates a file based on the plugin name and stores it in that plugin's folder.
I want to implement my service as a plugin, How i can do it?
????????????????????????????????????????????????????????????
Shelves may get corrupted if the plugin exists unexpectedly.
A corrupt shelf prevents the plugin from loading.
If you do not care about the pickle, you can force your plugin to remove the corrupted file and load anyway, set the 'force_shelf' to True in your .senpy file.
This example ilustrate how to implement the Sentiment140 service as a plugin in senpy
How can I turn an external service into a plugin?
?????????????????????????????????????????????????
This example ilustrate how to implement a plugin that accesses the Sentiment140 service.
.. code:: python
class Sentiment140Plugin(SentimentPlugin):
def analyse(self, **params):
def analyse_entry(self, entry, params):
text = entry.text
lang = params.get("language", "auto")
res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
json.dumps({"language": lang,
"data": [{"text": params["input"]}]
"data": [{"text": text}]
}
)
)
p = params.get("prefix", None)
response = Results(prefix=p)
polarity_value = self.maxPolarityValue*int(res.json()["data"][0]
["polarity"]) * 0.25
polarity = "marl:Neutral"
@@ -98,40 +309,39 @@ This example ilustrate how to implement the Sentiment140 service as a plugin in
elif polarity_value < neutral_value:
polarity = "marl:Negative"
entry = Entry(id="Entry0",
nif__isString=params["input"])
sentiment = Sentiment(id="Sentiment0",
prefix=p,
marl__hasPolarity=polarity,
marl__polarityValue=polarity_value)
sentiment.prov__wasGeneratedBy = self.id
entry.sentiments = []
entry.sentiments.append(sentiment)
entry.language = lang
response.entries.append(entry)
return response
yield entry
Where can I define extra parameters to be introduced in the request to my plugin?
?????????????????????????????????????????????????????????????????????????????????
Can my plugin require additional parameters from the user?
??????????????????????????????????????????????????????????
You can add these parameters in the definition file under the attribute "extra_params" : "{param_name}". The name of the parameter has new attributes-value pairs. The basic attributes are:
You can add extra parameters in the definition file under the attribute ``extra_params``.
It takes a dictionary, where the keys are the name of the argument/parameter, and the value has the following fields:
* aliases: the different names which can be used in the request to use the parameter.
* required: this option is a boolean and indicates if the parameters is binding in operation plugin.
* options: the different values of the paremeter.
* default: the default value of the parameter, this is useful in case the paremeter is required and you want to have a default value.
* required: if set to true, users need to provide this parameter unless a default is set.
* options: the different acceptable values of the parameter (i.e. an enum). If set, the value provided must match one of the options.
* default: the default value of the parameter, if none is provided in the request.
.. code:: python
"extra_params": {
"language": {
"aliases": ["language", "l"],
"required": true,
"options": ["es","en"],
"default": "es"
}
}
extra_params
language:
aliases:
- language
- lang
- l
required: true,
options:
- es
- en
default: es
This example shows how to introduce a parameter associated with language.
The extraction of this paremeter is used in the analyse method of the Plugin interface.
@@ -143,9 +353,9 @@ The extraction of this paremeter is used in the analyse method of the Plugin int
Where can I set up variables for using them in my plugin?
?????????????????????????????????????????????????????????
You can add these variables in the definition file with the extracture of attribute-value pair.
You can add these variables in the definition file with the structure of attribute-value pairs.
Once you have added your variables, the next step is to extract them into the plugin. The plugin's __init__ method has a parameter called `info` where you can extract the values of the variables. This info parameter has the structure of a python dictionary.
Every field added to the definition file is available to the plugin instance.
Can I activate a DEBUG mode for my plugin?
???????????????????????????????????????????
@@ -154,7 +364,14 @@ You can activate the DEBUG mode by the command-line tool using the option -d.
.. code:: bash
python -m senpy -d
senpy -d
Additionally, with the ``--pdb`` option you will be dropped into a pdb post mortem shell if an exception is raised.
.. code:: bash
senpy --pdb
Where can I find more code examples?
????????????????????????????????????

View File

@@ -1 +1,2 @@
sphinxcontrib-httpdomain>=1.4
nbsphinx

View File

@@ -1,74 +0,0 @@
Schema Examples
===============
All the examples in this page use the :download:`the main schema <_static/schemas/definitions.json>`.
Simple NIF annotation
---------------------
Description
...........
This example covers the basic example in the NIF documentation: `<http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>`_.
Representation
..............
.. literalinclude:: examples/example-basic.json
:language: json-ld
Sentiment Analysis
---------------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-sentiment.json
:emphasize-lines: 5-10,25-33
:language: json-ld
Suggestion Mining
-----------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-suggestion.json
:emphasize-lines: 5-8,22-27
:language: json-ld
Emotion Analysis
----------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-emotion.json
:language: json-ld
:emphasize-lines: 5-8,25-37
Named Entity Recognition
------------------------
Description
...........
Representation
..............
.. literalinclude:: examples/example-ner.json
:emphasize-lines: 5-8,19-34
:language: json-ld
Complete example
----------------
Description
...........
This example covers all of the above cases, integrating all the annotations in the same document.
Representation
..............
.. literalinclude:: examples/example-complete.json
:language: json-ld

Binary file not shown.

Before

Width:  |  Height:  |  Size: 65 KiB

After

Width:  |  Height:  |  Size: 122 KiB

BIN
docs/senpy-framework.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 79 KiB

After

Width:  |  Height:  |  Size: 49 KiB

View File

@@ -1,20 +1,38 @@
What is Senpy?
--------------
Senpy is an open source reference implementation of a linked data model for sentiment and emotion analysis services based on the vocabularies NIF, Marl and Onyx.
Web services can get really complex: data validation, user interaction, formatting, logging., etc.
The figure below summarizes the typical features in an analysis service.
Senpy implements all the common blocks, so developers can focus on what really matters: great analysis algorithms that solve real problems.
The overall goal of the reference implementation Senpy is easing the adoption of the proposed linked data model for sentiment and emotion analysis services, so that services from different providers become interoperable. With this aim, the design of the reference implementation has focused on its extensibility and reusability.
.. image:: senpy-framework.png
:width: 60%
:align: center
A modular approach allows organizations to replace individual components with custom ones developed in-house. Furthermore, organizations can benefit from reusing prepackages modules that provide advanced functionalities, such as algorithms for sentiment and emotion analysis, linked data publication or emotion and sentiment mapping between different providers.
Specifications
==============
Senpy for end users
===================
The model used in Senpy is based on the following specifications:
All services built using senpy share a common interface.
This allows users to use them (almost) interchangeably.
Senpy comes with a :ref:`built-in client`.
* Marl, a vocabulary designed to annotate and describe subjetive opinions expressed on the web or in information systems.
* Onyx, which is built one the same principles as Marl to annotate and describe emotions, and provides interoperability with Emotion Markup Language.
* NIF 2.0, which defines a semantic format and APO for improving interoperability among natural language processing services
Senpy for service developers
============================
Senpy is a framework that turns your sentiment or emotion analysis algorithm into a full blown semantic service.
Senpy takes care of:
* Interfacing with the user: parameter validation, error handling.
* Formatting: JSON-LD, Turtle/n-triples input and output, or simple text input
* Linked Data: senpy results are semantically annotated, using a series of well established vocabularies, and sane default URIs.
* User interface: a web UI where users can explore your service and test different settings
* A client to interact with the service. Currently only available in Python.
Sharing your sentiment analysis with the world has never been easier!
Check out the :doc:`plugins` if you have developed an analysis algorithm (e.g. sentiment analysis) and you want to publish it as a service.
Architecture
============
@@ -29,7 +47,5 @@ Senpy proposes a modular and dynamic architecture that allows:
The framework consists of two main modules: Senpy core, which is the building block of the service, and Senpy plugins, which consist of the analysis algorithm. The next figure depicts a simplified version of the processes involved in an analysis with the Senpy framework.
.. image:: senpy-architecture.png
:height: 400px
:width: 800px
:scale: 100 %
:width: 100%
:align: center

58
docs/server.rst Normal file
View File

@@ -0,0 +1,58 @@
Server
======
The senpy server is launched via the `senpy` command:
.. code:: text
usage: senpy [-h] [--level logging_level] [--debug] [--default-plugins]
[--host HOST] [--port PORT] [--plugins-folder PLUGINS_FOLDER]
[--only-install]
Run a Senpy server
optional arguments:
-h, --help show this help message and exit
--level logging_level, -l logging_level
Logging level
--debug, -d Run the application in debug mode
--default-plugins Load the default plugins
--host HOST Use 0.0.0.0 to accept requests from any host.
--port PORT, -p PORT Port to listen on.
--plugins-folder PLUGINS_FOLDER, -f PLUGINS_FOLDER
Where to look for plugins.
--only-install, -i Do not run a server, only install plugin dependencies
When launched, the server will recursively look for plugins in the specified plugins folder (the current working directory by default).
For every plugin found, it will download its dependencies, and try to activate it.
The default server includes a playground and an endpoint with all plugins found.
Let's run senpy with the default plugins:
.. code:: bash
senpy -f . --default-plugins
Now go to `http://localhost:5000 <http://localhost:5000>`_, you should be greeted by the senpy playground:
.. image:: senpy-playground.png
:width: 100%
:alt: Playground
The playground is a user-friendly way to test your plugins, but you can always use the service directly: `http://localhost:5000/api?input=hello <http://localhost:5000/api?input=hello>`_.
By default, senpy will listen only on the `127.0.0.1` address.
That means you can only access the API from your (or localhost).
You can listen on a different address using the `--host` flag (e.g., 0.0.0.0).
The default port is 5000.
You can change it with the `--port` flag.
For instance, to accept connections on port 6000 on any interface:
.. code:: bash
senpy --host 0.0.0.0 --port 6000
For more options, see the `--help` page.

View File

@@ -1,75 +1,15 @@
Usage
-----
The easiest and recommended way is to just use the command-line tool to load your plugins and launch the server.
First of all, you need to install the package.
See :doc:`installation` for instructions.
Once installed, the `senpy` command should be available.
.. code:: bash
.. toctree::
:maxdepth: 1
senpy
Or, alternatively:
.. code:: bash
python -m senpy
server
SenpyClientUse
commandline
This will create a server with any modules found in the current path.
Useful command-line options
===========================
In case you want to load modules, which are located in different folders under the root folder, use the next option.
.. code:: bash
python -m senpy -f .
The default port used by senpy is 5000, but you can change it using the option `--port`.
.. code:: bash
python -m senpy --port 8080
Also, the host can be changed where senpy is deployed. The default value is `127.0.0.1`.
.. code:: bash
python -m senpy --host 0.0.0.0
For more options, see the `--help` page.
Alternatively, you can use the modules included in senpy to build your own application.
Senpy server
============
Once the server is launched, there is a basic endpoint in the server, which provides a playground to use the plugins that have been loaded.
In case you want to know the different endpoints of the server, there is more information available in the NIF API section_.
Video example
=============
This video shows how to use senpy through command-line tool.
https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk
Request example in python
=========================
This example shows how to make a request to a plugin.
.. code:: python
import requests
import json
r = requests.get('http://127.0.0.1:5000/api/?algo=rand&i=Testing')
response = r.content.decode('utf-8')
response_json = json.loads(response)
.. _section: http://senpy.readthedocs.org/en/latest/api.html

8
docs/vocabularies.rst Normal file
View File

@@ -0,0 +1,8 @@
Vocabularies and model
======================
The model used in Senpy is based on the following vocabularies:
* Marl, a vocabulary designed to annotate and describe subjetive opinions expressed on the web or in information systems.
* Onyx, which is built one the same principles as Marl to annotate and describe emotions, and provides interoperability with Emotion Markup Language.
* NIF 2.0, which defines a semantic format and APO for improving interoperability among natural language processing services

7
k8s/README.md Normal file
View File

@@ -0,0 +1,7 @@
Deploy senpy to a kubernetes cluster.
Usage:
```
kubectl apply -f . -n senpy
```

26
k8s/senpy-deployment.yaml Normal file
View File

@@ -0,0 +1,26 @@
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: senpy-latest
spec:
replicas: 1
template:
metadata:
labels:
role: senpy-latest
app: test
spec:
containers:
- name: senpy-latest
image: gsiupm/senpy:latest
imagePullPolicy: Always
args:
- "--default-plugins"
resources:
limits:
memory: "512Mi"
cpu: "1000m"
ports:
- name: web
containerPort: 5000

14
k8s/senpy-ingress.yaml Normal file
View File

@@ -0,0 +1,14 @@
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: senpy-ingress
spec:
rules:
- host: latest.senpy.cluster.gsi.dit.upm.es
http:
paths:
- path: /
backend:
serviceName: senpy-latest
servicePort: 5000

12
k8s/senpy-svc.yaml Normal file
View File

@@ -0,0 +1,12 @@
---
apiVersion: v1
kind: Service
metadata:
name: senpy-latest
spec:
type: ClusterIP
ports:
- port: 5000
protocol: TCP
selector:
role: senpy-latest

View File

@@ -1,12 +1,11 @@
Flask>=0.10.1
gunicorn>=19.0.0
requests>=2.4.1
GitPython>=0.3.2.RC1
gevent>=1.1rc4
tornado>=4.4.3
PyLD>=0.6.5
six
future
jsonschema
jsonref
PyYAML
semver
rdflib
rdflib-jsonld

View File

@@ -1 +0,0 @@
0.7.1

View File

@@ -17,24 +17,12 @@
"""
Sentiment analysis server in Python
"""
from __future__ import print_function
from .version import __version__
try:
import semver
__version_info__ = semver.parse_version_info(__version__)
import logging
if __version_info__.prerelease:
import logging
logger = logging.getLogger(__name__)
msg = 'WARNING: You are using a pre-release version of {} ({})'.format(
__name__, __version__)
if len(logging.root.handlers) > 0:
logger.info(msg)
else:
import sys
print(msg, file=sys.stderr)
except ImportError:
print('semver not installed. Not doing version checking')
logger = logging.getLogger(__name__)
logger.info('Using senpy version: {}'.format(__version__))
__all__ = ['api', 'blueprints', 'cli', 'extensions', 'models', 'plugins']

View File

@@ -22,15 +22,12 @@ the server.
from flask import Flask
from senpy.extensions import Senpy
from gevent.wsgi import WSGIServer
from gevent.monkey import patch_all
import logging
import os
import argparse
import senpy
patch_all(thread=False)
SERVER_PORT = os.environ.get("PORT", 5000)
@@ -57,7 +54,7 @@ def main():
parser.add_argument(
'--host',
type=str,
default="127.0.0.1",
default="0.0.0.0",
help='Use 0.0.0.0 to accept requests from any host.')
parser.add_argument(
'--port',
@@ -76,9 +73,22 @@ def main():
'-i',
action='store_true',
default=False,
help='Do not run a server, only install plugin dependencies'
)
help='Do not run a server, only install plugin dependencies')
parser.add_argument(
'--threaded',
action='store_false',
default=True,
help='Run a threaded server')
parser.add_argument(
'--version',
'-v',
action='store_true',
default=False,
help='Output the senpy version and exit')
args = parser.parse_args()
if args.version:
print('Senpy version {}'.format(senpy.__version__))
exit(1)
logging.basicConfig()
rl = logging.getLogger()
rl.setLevel(getattr(logging, args.level))
@@ -89,15 +99,13 @@ def main():
sp.install_deps()
return
sp.activate_all()
http_server = WSGIServer((args.host, args.port), app)
try:
print('Senpy version {}'.format(senpy.__version__))
print('Server running on port %s:%d. Ctrl+C to quit' % (args.host,
args.port))
http_server.serve_forever()
except KeyboardInterrupt:
print('Bye!')
http_server.stop()
print('Senpy version {}'.format(senpy.__version__))
print('Server running on port %s:%d. Ctrl+C to quit' % (args.host,
args.port))
app.run(args.host,
args.port,
threaded=args.threaded,
debug=app.debug)
sp.deactivate_all()

View File

@@ -7,6 +7,38 @@ API_PARAMS = {
"algorithm": {
"aliases": ["algorithm", "a", "algo"],
"required": False,
},
"outformat": {
"@id": "outformat",
"aliases": ["outformat", "o"],
"default": "json-ld",
"required": True,
"options": ["json-ld", "turtle"],
},
"expanded-jsonld": {
"@id": "expanded-jsonld",
"aliases": ["expanded", "expanded-jsonld"],
"required": True,
"default": 0
},
"emotionModel": {
"@id": "emotionModel",
"aliases": ["emotionModel", "emoModel"],
"required": False
},
"plugin_type": {
"@id": "pluginType",
"description": 'What kind of plugins to list',
"aliases": ["pluginType", "plugin_type"],
"required": True,
"default": "analysisPlugin"
},
"conversion": {
"@id": "conversion",
"description": "How to show the elements that have (not) been converted",
"required": True,
"options": ["filtered", "nested", "full"],
"default": "full"
}
}
@@ -38,7 +70,7 @@ NIF_PARAMS = {
"aliases": ["f", "informat"],
"required": False,
"default": "text",
"options": ["turtle", "text"],
"options": ["turtle", "text", "json-ld"],
},
"intype": {
"@id": "intype",
@@ -47,13 +79,6 @@ NIF_PARAMS = {
"default": "direct",
"options": ["direct", "url", "file"],
},
"outformat": {
"@id": "outformat",
"aliases": ["outformat", "o"],
"default": "json-ld",
"required": False,
"options": ["json-ld"],
},
"language": {
"@id": "language",
"aliases": ["language", "l"],
@@ -76,12 +101,12 @@ NIF_PARAMS = {
def parse_params(indict, spec=NIF_PARAMS):
outdict = {}
logger.debug("Parsing: {}\n{}".format(indict, spec))
outdict = indict.copy()
wrong_params = {}
for param, options in iteritems(spec):
if param[0] != "@": # Exclude json-ld properties
logger.debug("Param: %s - Options: %s", param, options)
for alias in options["aliases"]:
for alias in options.get("aliases", []):
if alias in indict:
outdict[param] = indict[alias]
if param not in outdict:
@@ -95,8 +120,9 @@ def parse_params(indict, spec=NIF_PARAMS):
outdict[param] not in spec[param]["options"]:
wrong_params[param] = spec[param]
if wrong_params:
logger.debug("Error parsing: %s", wrong_params)
message = Error(
status=404,
status=400,
message="Missing or invalid parameters",
parameters=outdict,
errors={param: error

View File

@@ -17,18 +17,21 @@
"""
Blueprints for Senpy
"""
from flask import (Blueprint, request, current_app,
render_template, url_for, jsonify)
from flask import (Blueprint, request, current_app, render_template, url_for,
jsonify)
from .models import Error, Response, Plugins, read_schema
from .api import WEB_PARAMS, parse_params
from .api import WEB_PARAMS, API_PARAMS, parse_params
from .version import __version__
from functools import wraps
import logging
import json
logger = logging.getLogger(__name__)
api_blueprint = Blueprint("api", __name__)
demo_blueprint = Blueprint("demo", __name__)
ns_blueprint = Blueprint("ns", __name__)
def get_params(req):
@@ -43,12 +46,21 @@ def get_params(req):
@demo_blueprint.route('/')
def index():
return render_template("index.html")
return render_template("index.html", version=__version__)
@api_blueprint.route('/contexts/<entity>.jsonld')
def context(entity="context"):
return jsonify({"@context": Response.context})
context = Response._context
context['@vocab'] = url_for('ns.index', _external=True)
return jsonify({"@context": context})
@ns_blueprint.route('/') # noqa: F811
def index():
context = Response._context
context['@vocab'] = url_for('.ns', _external=True)
return jsonify({"@context": context})
@api_blueprint.route('/schemas/<schema>')
@@ -62,26 +74,42 @@ def schema(schema="definitions"):
def basic_api(f):
@wraps(f)
def decorated_function(*args, **kwargs):
print('Getting request:')
print(request)
raw_params = get_params(request)
web_params = parse_params(raw_params, spec=WEB_PARAMS)
headers = {'X-ORIGINAL-PARAMS': json.dumps(raw_params)}
# Get defaults
web_params = parse_params({}, spec=WEB_PARAMS)
api_params = parse_params({}, spec=API_PARAMS)
if hasattr(request, 'params'):
request.params.update(raw_params)
else:
request.params = raw_params
outformat = 'json-ld'
try:
print('Getting request:')
print(request)
web_params = parse_params(raw_params, spec=WEB_PARAMS)
api_params = parse_params(raw_params, spec=API_PARAMS)
if hasattr(request, 'params'):
request.params.update(api_params)
else:
request.params = api_params
response = f(*args, **kwargs)
except Error as ex:
response = ex
in_headers = web_params["inHeaders"] != "0"
headers = {'X-ORIGINAL-PARAMS': raw_params}
logger.error(ex)
if current_app.debug:
raise
in_headers = web_params['inHeaders'] != "0"
expanded = api_params['expanded-jsonld']
outformat = api_params['outformat']
return response.flask(
in_headers=in_headers,
headers=headers,
context_uri=url_for(
'api.context', entity=type(response).__name__, _external=True))
prefix=url_for('.api', _external=True),
context_uri=url_for('api.context',
entity=type(response).__name__,
_external=True),
outformat=outformat,
expanded=expanded)
return decorated_function
@@ -97,27 +125,22 @@ def api():
@basic_api
def plugins():
sp = current_app.senpy
dic = Plugins(plugins=list(sp.plugins.values()))
ptype = request.params.get('plugin_type')
plugins = sp.filter_plugins(plugin_type=ptype)
dic = Plugins(plugins=list(plugins.values()))
return dic
@api_blueprint.route('/plugins/<plugin>/', methods=['POST', 'GET'])
@api_blueprint.route('/plugins/<plugin>/<action>', methods=['POST', 'GET'])
@basic_api
def plugin(plugin=None, action="list"):
def plugin(plugin=None):
sp = current_app.senpy
if plugin == 'default' and sp.default_plugin:
response = sp.default_plugin
plugin = response.name
elif plugin in sp.plugins:
response = sp.plugins[plugin]
return sp.default_plugin
plugins = sp.filter_plugins(
id='plugins/{}'.format(plugin)) or sp.filter_plugins(name=plugin)
if plugins:
response = list(plugins.values())[0]
else:
return Error(message="Plugin not found", status=404)
if action == "list":
return response
method = "{}_plugin".format(action)
if (hasattr(sp, method)):
getattr(sp, method)(plugin)
return Response(message="Ok")
else:
return Error(message="action '{}' not allowed".format(action))
return response

View File

@@ -1,27 +1,27 @@
import requests
import logging
from . import models
from .plugins import default_plugin_type
logger = logging.getLogger(__name__)
class Client(object):
def __init__(self, endpoint):
self.endpoint = endpoint
def analyse(self, input, method='GET', **kwargs):
return self.request('/', method=method, input=input, **kwargs)
def plugins(self, ptype=default_plugin_type):
resp = self.request(path='/plugins', plugin_type=ptype).plugins
return {p.name: p for p in resp}
def request(self, path=None, method='GET', **params):
url = '{}{}'.format(self.endpoint, path)
response = requests.request(method=method,
url=url,
params=params)
response = requests.request(method=method, url=url, params=params)
try:
resp = models.from_dict(response.json())
resp.validate(resp)
return resp
except Exception as ex:
logger.error(('There seems to be a problem with the response:\n'
'\tURL: {url}\n'
@@ -30,16 +30,12 @@ class Client(object):
'#### Response:\n'
'\tCode: {code}'
'\tContent: {content}'
'\n').format(error=ex,
url=url,
code=response.status_code,
content=response.content))
'\n').format(
error=ex,
url=url,
code=response.status_code,
content=response.content))
raise ex
if __name__ == '__main__':
c = Client('http://senpy.cluster.gsi.dit.upm.es/api/')
resp = c.analyse('hello')
# print(resp)
print(resp.entries)
resp.validate()
if isinstance(resp, models.Error):
raise resp
return resp

View File

@@ -1,47 +1,59 @@
"""
Main class for Senpy.
It orchestrates plugin (de)activation and analysis.
"""
from future import standard_library
standard_library.install_aliases()
from .plugins import SentimentPlugin
from .models import Error
from .blueprints import api_blueprint, demo_blueprint
from . import plugins
from .plugins import SenpyPlugin
from .models import Error, Entry, Results, from_string
from .blueprints import api_blueprint, demo_blueprint, ns_blueprint
from .api import API_PARAMS, NIF_PARAMS, parse_params
from git import Repo, InvalidGitRepositoryError
from threading import Thread
import os
import copy
import fnmatch
import inspect
import sys
import imp
import importlib
import logging
import traceback
import yaml
import pip
import subprocess
logger = logging.getLogger(__name__)
def log_subprocess_output(process):
for line in iter(process.stdout.readline, b''):
logger.info('%r', line)
for line in iter(process.stderr.readline, b''):
logger.error('%r', line)
class Senpy(object):
""" Default Senpy extension for Flask """
def __init__(self,
app=None,
plugin_folder="plugins",
plugin_folder=".",
default_plugins=False):
self.app = app
self._search_folders = set()
self._plugin_list = []
self._outdated = True
self._default = None
self.add_folder(plugin_folder)
if default_plugins:
base_folder = os.path.join(os.path.dirname(__file__), "plugins")
self.add_folder(base_folder)
self.add_folder('plugins', from_root=True)
else:
# Add only conversion plugins
self.add_folder(os.path.join('plugins', 'conversion'),
from_root=True)
if app is not None:
self.init_app(app)
@@ -60,9 +72,12 @@ class Senpy(object):
else:
app.teardown_request(self.teardown)
app.register_blueprint(api_blueprint, url_prefix="/api")
app.register_blueprint(ns_blueprint, url_prefix="/ns")
app.register_blueprint(demo_blueprint, url_prefix="/")
def add_folder(self, folder):
def add_folder(self, folder, from_root=False):
if from_root:
folder = os.path.join(os.path.dirname(__file__), folder)
logger.debug("Adding folder: %s", folder)
if os.path.isdir(folder):
self._search_folders.add(folder)
@@ -70,59 +85,177 @@ class Senpy(object):
else:
logger.debug("Not a folder: %s", folder)
def analyse(self, **params):
algo = None
logger.debug("analysing with params: {}".format(params))
api_params = parse_params(params, spec=API_PARAMS)
if "algorithm" in api_params and api_params["algorithm"]:
algo = api_params["algorithm"]
elif self.plugins:
algo = self.default_plugin and self.default_plugin.name
if not algo:
def _find_plugins(self, params):
if not self.analysis_plugins:
raise Error(
status=404,
message=("No plugins found."
" Please install one.").format(algo))
if algo not in self.plugins:
logger.debug(("The algorithm '{}' is not valid\n"
"Valid algorithms: {}").format(algo,
self.plugins.keys()))
" Please install one."))
api_params = parse_params(params, spec=API_PARAMS)
algos = None
if "algorithm" in api_params and api_params["algorithm"]:
algos = api_params["algorithm"].split(',')
elif self.default_plugin:
algos = [self.default_plugin.name, ]
else:
raise Error(
status=404,
message="The algorithm '{}' is not valid".format(algo))
message="No default plugin found, and None provided")
if not self.plugins[algo].is_activated:
logger.debug("Plugin not activated: {}".format(algo))
raise Error(
status=400,
message=("The algorithm '{}'"
" is not activated yet").format(algo))
plug = self.plugins[algo]
plugins = list()
for algo in algos:
if algo not in self.plugins:
logger.debug(("The algorithm '{}' is not valid\n"
"Valid algorithms: {}").format(algo,
self.plugins.keys()))
raise Error(
status=404,
message="The algorithm '{}' is not valid".format(algo))
if not self.plugins[algo].is_activated:
logger.debug("Plugin not activated: {}".format(algo))
raise Error(
status=400,
message=("The algorithm '{}'"
" is not activated yet").format(algo))
plugins.append(self.plugins[algo])
return plugins
def _get_params(self, params, plugin=None):
nif_params = parse_params(params, spec=NIF_PARAMS)
extra_params = plug.get('extra_params', {})
specific_params = parse_params(params, spec=extra_params)
nif_params.update(specific_params)
try:
resp = plug.analyse(**nif_params)
resp.analysis.append(plug)
logger.debug("Returning analysis result: {}".format(resp))
except Exception as ex:
resp = Error(message=str(ex), status=500)
if plugin:
extra_params = plugin.get('extra_params', {})
specific_params = parse_params(params, spec=extra_params)
nif_params.update(specific_params)
return nif_params
def _get_entries(self, params):
if params['informat'] == 'text':
results = Results()
entry = Entry(text=params['input'])
results.entries.append(entry)
elif params['informat'] == 'json-ld':
results = from_string(params['input'], cls=Results)
else:
raise NotImplemented('Informat {} is not implemented'.format(params['informat']))
return results
def _process_entries(self, entries, plugins, nif_params):
if not plugins:
for i in entries:
yield i
return
plugin = plugins[0]
specific_params = self._get_params(nif_params, plugin)
results = plugin.analyse_entries(entries, specific_params)
for i in self._process_entries(results, plugins[1:], nif_params):
yield i
def _process_response(self, resp, plugins, nif_params):
entries = resp.entries
resp.entries = []
for plug in plugins:
resp.analysis.append(plug.id)
for i in self._process_entries(entries, plugins, nif_params):
resp.entries.append(i)
return resp
def analyse(self, **api_params):
"""
Main method that analyses a request, either from CLI or HTTP.
It uses a dictionary of parameters, provided by the user.
"""
logger.debug("analysing with params: {}".format(api_params))
plugins = self._find_plugins(api_params)
nif_params = self._get_params(api_params)
resp = self._get_entries(nif_params)
if 'with_parameters' in api_params:
resp.parameters = nif_params
try:
resp = self._process_response(resp, plugins, nif_params)
self.convert_emotions(resp, plugins, nif_params)
logger.debug("Returning analysis result: {}".format(resp))
except (Error, Exception) as ex:
if not isinstance(ex, Error):
ex = Error(message=str(ex), status=500)
logger.exception('Error returning analysis result')
raise ex
return resp
def _conversion_candidates(self, fromModel, toModel):
candidates = self.filter_plugins(plugin_type='emotionConversionPlugin')
for name, candidate in candidates.items():
for pair in candidate.onyx__doesConversion:
logging.debug(pair)
if pair['onyx:conversionFrom'] == fromModel \
and pair['onyx:conversionTo'] == toModel:
# logging.debug('Found candidate: {}'.format(candidate))
yield candidate
def convert_emotions(self, resp, plugins, params):
"""
Conversion of all emotions in a response **in place**.
In addition to converting from one model to another, it has
to include the conversion plugin to the analysis list.
Needless to say, this is far from an elegant solution, but it works.
@todo refactor and clean up
"""
toModel = params.get('emotionModel', None)
if not toModel:
return
logger.debug('Asked for model: {}'.format(toModel))
output = params.get('conversion', None)
candidates = {}
for plugin in plugins:
try:
fromModel = plugin.get('onyx:usesEmotionModel', None)
candidates[plugin.id] = next(self._conversion_candidates(fromModel, toModel))
logger.debug('Analysis plugin {} uses model: {}'.format(plugin.id, fromModel))
except StopIteration:
e = Error(('No conversion plugin found for: '
'{} -> {}'.format(fromModel, toModel)))
e.original_response = resp
e.parameters = params
raise e
newentries = []
for i in resp.entries:
if output == "full":
newemotions = copy.deepcopy(i.emotions)
else:
newemotions = []
for j in i.emotions:
plugname = j['prov:wasGeneratedBy']
candidate = candidates[plugname]
resp.analysis.append(candidate.id)
for k in candidate.convert(j, fromModel, toModel, params):
k.prov__wasGeneratedBy = candidate.id
if output == 'nested':
k.prov__wasDerivedFrom = j
newemotions.append(k)
i.emotions = newemotions
newentries.append(i)
resp.entries = newentries
resp.analysis = list(set(resp.analysis))
@property
def default_plugin(self):
candidates = self.filter_plugins(is_activated=True)
if len(candidates) > 0:
candidate = list(candidates.values())[0]
logger.debug("Default: {}".format(candidate.name))
return candidate
else:
return None
candidate = self._default
if not candidate:
candidates = self.filter_plugins(plugin_type='analysisPlugin',
is_activated=True)
if len(candidates) > 0:
candidate = list(candidates.values())[0]
logger.debug("Default: {}".format(candidate))
return candidate
def parameters(self, algo):
return getattr(
self.plugins.get(algo) or self.default_plugin, "extra_params", {})
@default_plugin.setter
def default_plugin(self, value):
if isinstance(value, SenpyPlugin):
self._default = value
else:
self._default = self.plugins[value]
def activate_all(self, sync=False):
ps = []
@@ -164,11 +297,13 @@ class Senpy(object):
plugin.name, ex, traceback.format_exc())
logger.error(msg)
raise Error(msg)
if sync:
if sync or 'async' in plugin and not plugin.async:
act()
else:
th = Thread(target=act)
th.start()
return th
def deactivate_plugin(self, plugin_name, sync=False):
try:
@@ -184,45 +319,49 @@ class Senpy(object):
plugin.deactivate()
logger.info("Plugin deactivated: {}".format(plugin.name))
except Exception as ex:
logger.error("Error deactivating plugin {}: {}".format(
plugin.name, ex))
logger.error(
"Error deactivating plugin {}: {}".format(plugin.name, ex))
logger.error("Trace: {}".format(traceback.format_exc()))
if sync:
if sync or 'async' in plugin and not plugin.async:
deact()
else:
th = Thread(target=deact)
th.start()
def reload_plugin(self, name):
logger.debug("Reloading {}".format(name))
plugin = self.plugins[name]
try:
del self.plugins[name]
nplug = self._load_plugin(plugin.module, plugin.path)
self.plugins[nplug.name] = nplug
except Exception as ex:
logger.error('Error reloading {}: {}'.format(name, ex))
self.plugins[name] = plugin
return th
@classmethod
def validate_info(cls, info):
return all(x in info for x in ('name', 'module', 'version'))
return all(x in info for x in ('name', 'module', 'description', 'version'))
def install_deps(self):
for i in self.plugins.values():
self._install_deps(i._info)
self._install_deps(i)
@classmethod
def _install_deps(cls, info=None):
requirements = info.get('requirements', [])
if requirements:
pip_args = []
pip_args = ['pip']
pip_args.append('install')
pip_args.append('--use-wheel')
for req in requirements:
pip_args.append(req)
logger.info('Installing requirements: ' + str(requirements))
pip.main(pip_args)
process = subprocess.Popen(pip_args,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
log_subprocess_output(process)
exitcode = process.wait()
if exitcode != 0:
raise Error("Dependencies not properly installed")
@classmethod
def _load_module(cls, name, root):
sys.path.append(root)
tmp = importlib.import_module(name)
sys.path.remove(root)
return tmp
@classmethod
def _load_plugin_from_info(cls, info, root):
@@ -231,34 +370,21 @@ class Senpy(object):
return None, None
module = info["module"]
name = info["name"]
sys.path.append(root)
(fp, pathname, desc) = imp.find_module(module, [root, ])
try:
cls._install_deps(info)
tmp = imp.load_module(module, fp, pathname, desc)
sys.path.remove(root)
candidate = None
for _, obj in inspect.getmembers(tmp):
if inspect.isclass(obj) and inspect.getmodule(obj) == tmp:
logger.debug(("Found plugin class:"
" {}@{}").format(obj, inspect.getmodule(
obj)))
candidate = obj
break
if not candidate:
logger.debug("No valid plugin for: {}".format(module))
return
module = candidate(info=info)
repo_path = root
module._repo = Repo(repo_path)
except InvalidGitRepositoryError:
logger.debug("The plugin {} is not in a Git repository".format(
module))
module._repo = None
except Exception as ex:
logger.error("Exception importing {}: {}".format(module, ex))
logger.error("Trace: {}".format(traceback.format_exc()))
return None, None
cls._install_deps(info)
tmp = cls._load_module(module, root)
candidate = None
for _, obj in inspect.getmembers(tmp):
if inspect.isclass(obj) and inspect.getmodule(obj) == tmp:
logger.debug(("Found plugin class:"
" {}@{}").format(obj, inspect.getmodule(obj)))
candidate = obj
break
if not candidate:
logger.debug("No valid plugin for: {}".format(module))
return
module = candidate(info=info)
return name, module
@classmethod
@@ -276,7 +402,7 @@ class Senpy(object):
for root, dirnames, filenames in os.walk(search_folder):
for filename in fnmatch.filter(filenames, '*.senpy'):
name, plugin = self._load_plugin(root, filename)
if plugin and name not in self._plugin_list:
if plugin and name:
plugins[name] = plugin
self._outdated = False
@@ -293,23 +419,9 @@ class Senpy(object):
return self._plugin_list
def filter_plugins(self, **kwargs):
""" Filter plugins by different criteria """
return plugins.pfilter(self.plugins, **kwargs)
def matches(plug):
res = all(getattr(plug, k, None) == v for (k, v) in kwargs.items())
logger.debug("matching {} with {}: {}".format(plug.name, kwargs,
res))
return res
if not kwargs:
return self.plugins
else:
return {n: p for n, p in self.plugins.items() if matches(p)}
def sentiment_plugins(self):
""" Return only the sentiment plugins """
return {
p: plugin
for p, plugin in self.plugins.items()
if isinstance(plugin, SentimentPlugin)
}
@property
def analysis_plugins(self):
""" Return only the analysis plugins """
return self.filter_plugins(plugin_type='analysisPlugin')

View File

@@ -16,6 +16,9 @@ import jsonref
import jsonschema
from flask import Response as FlaskResponse
from pyld import jsonld
from rdflib import Graph
import logging
@@ -72,31 +75,60 @@ base_context = Context.load(CONTEXT_PATH)
class SenpyMixin(object):
context = base_context["@context"]
_context = base_context["@context"]
def flask(self, in_headers=True, headers=None, **kwargs):
def flask(self,
in_headers=True,
headers=None,
outformat='json-ld',
**kwargs):
"""
Return the values and error to be used in flask.
So far, it returns a fixed context. We should store/generate different
contexts if the plugin adds more aliases.
"""
headers = headers or {}
kwargs["with_context"] = True
js = self.jsonld(**kwargs)
if in_headers:
url = js["@context"]
del js["@context"]
kwargs["with_context"] = not in_headers
content, mimetype = self.serialize(format=outformat,
with_mime=True,
**kwargs)
if outformat == 'json-ld' and in_headers:
headers.update({
"Link": ('<%s>;'
'rel="http://www.w3.org/ns/json-ld#context";'
' type="application/ld+json"' % url)
"Link":
('<%s>;'
'rel="http://www.w3.org/ns/json-ld#context";'
' type="application/ld+json"' % kwargs.get('context_uri'))
})
return FlaskResponse(
json.dumps(
js, indent=2, sort_keys=True),
response=content,
status=getattr(self, "status", 200),
headers=headers,
mimetype="application/json")
mimetype=mimetype)
def serialize(self, format='json-ld', with_mime=False, **kwargs):
js = self.jsonld(**kwargs)
if format == 'json-ld':
content = json.dumps(js, indent=2, sort_keys=True)
mimetype = "application/json"
elif format in ['turtle', ]:
logger.debug(js)
content = json.dumps(js, indent=2, sort_keys=True)
g = Graph().parse(
data=content,
format='json-ld',
base=kwargs.get('prefix'),
context=self._context)
logger.debug(
'Parsing with prefix: {}'.format(kwargs.get('prefix')))
content = g.serialize(format='turtle').decode('utf-8')
mimetype = 'text/{}'.format(format)
else:
raise Error('Unknown outformat: {}'.format(format))
if with_mime:
return content, mimetype
else:
return content
def serializable(self):
def ser_or_down(item):
@@ -115,28 +147,30 @@ class SenpyMixin(object):
return ser_or_down(self._plain_dict())
def jsonld(self, with_context=True, context_uri=None):
def jsonld(self,
with_context=True,
context_uri=None,
prefix=None,
expanded=False):
ser = self.serializable()
if with_context:
context = []
if context_uri:
context = context_uri
else:
context = self.context.copy()
if hasattr(self, 'prefix'):
# This sets @base for the document, which will be used in
# all relative URIs. For example, if a uri is "Example" and
# prefix =s "http://example.com", the absolute URI after
# expanding with JSON-LD will be "http://example.com/Example"
prefix_context = {"@base": self.prefix}
if isinstance(context, list):
context.append(prefix_context)
else:
context = [context, prefix_context]
ser["@context"] = context
return ser
result = jsonld.compact(
ser,
self._context,
options={
'base': prefix,
'expandContext': self._context,
'senpy': prefix
})
if context_uri:
result['@context'] = context_uri
if expanded:
result = jsonld.expand(
result, options={'base': prefix,
'expandContext': self._context})
if not with_context:
del result['@context']
return result
def to_JSON(self, *args, **kwargs):
js = json.dumps(self.jsonld(*args, **kwargs), indent=4, sort_keys=True)
@@ -161,13 +195,14 @@ class BaseModel(SenpyMixin, dict):
if 'id' in kwargs:
self.id = kwargs.pop('id')
elif kwargs.pop('_auto_id', True):
self.id = '_:{}_{}'.format(
type(self).__name__, time.time())
self.id = '_:{}_{}'.format(type(self).__name__, time.time())
temp = dict(*args, **kwargs)
for obj in [self.schema, ] + self.schema.get('allOf', []):
for obj in [
self.schema,
] + self.schema.get('allOf', []):
for k, v in obj.get('properties', {}).items():
if 'default' in v:
if 'default' in v and k not in temp:
temp[k] = copy.deepcopy(v['default'])
for i in temp:
@@ -175,14 +210,11 @@ class BaseModel(SenpyMixin, dict):
if nk != i:
temp[nk] = temp[i]
del temp[i]
if 'context' in temp:
context = temp['context']
del temp['context']
self.__dict__['context'] = Context.load(context)
try:
temp['@type'] = getattr(self, '@type')
except AttributeError:
logger.warn('Creating an instance of an unknown model')
super(BaseModel, self).__init__(temp)
def _get_key(self, key):
@@ -205,7 +237,10 @@ class BaseModel(SenpyMixin, dict):
self.__setitem__(self._get_key(key), value)
def __delattr__(self, key):
self.__delitem__(self._get_key(key))
try:
object.__delattr__(self, key)
except AttributeError:
self.__delitem__(self._get_key(key))
def _plain_dict(self):
d = {k: v for (k, v) in self.items() if k[0] != "_"}
@@ -221,23 +256,48 @@ def register(rsubclass, rtype=None):
_subtypes[rtype or rsubclass.__name__] = rsubclass
def from_dict(indict):
target = indict.get('@type', None)
if target and target in _subtypes:
cls = _subtypes[target]
else:
cls = BaseModel
return cls(**indict)
def from_dict(indict, cls=None):
if not cls:
target = indict.get('@type', None)
try:
if target and target in _subtypes:
cls = _subtypes[target]
else:
cls = BaseModel
except Exception:
cls = BaseModel
outdict = dict()
for k, v in indict.items():
if k == '@context':
pass
elif isinstance(v, dict):
v = from_dict(indict[k])
elif isinstance(v, list):
for ix, v2 in enumerate(v):
if isinstance(v2, dict):
v[ix] = from_dict(v2)
outdict[k] = v
return cls(**outdict)
def from_string(string, **kwargs):
return from_dict(json.loads(string), **kwargs)
def from_json(injson):
indict = json.loads(injson)
return from_dict(indict)
def from_schema(name, schema_file=None, base_classes=None):
base_classes = base_classes or []
base_classes.append(BaseModel)
schema_file = schema_file or '{}.json'.format(name)
class_name = '{}{}'.format(i[0].upper(), i[1:])
class_name = '{}{}'.format(name[0].upper(), name[1:])
newclass = type(class_name, tuple(base_classes), {})
setattr(newclass, '@type', name)
setattr(newclass, 'schema', read_schema(schema_file))
setattr(newclass, 'class_name', class_name)
register(newclass, name)
return newclass
@@ -248,33 +308,44 @@ def _add_from_schema(*args, **kwargs):
del generatedClass
for i in ['response',
'results',
'entry',
'sentiment',
'analysis',
'emotionSet',
'emotion',
'emotionModel',
'suggestion',
'plugin',
'emotionPlugin',
'sentimentPlugin',
'plugins']:
for i in [
'analysis',
'emotion',
'emotionConversion',
'emotionConversionPlugin',
'emotionAnalysis',
'emotionModel',
'emotionPlugin',
'emotionSet',
'entry',
'plugin',
'plugins',
'response',
'results',
'sentiment',
'sentimentPlugin',
'suggestion',
]:
_add_from_schema(i)
_ErrorModel = from_schema('error')
class Error(SenpyMixin, BaseException):
def __init__(self,
message,
*args,
**kwargs):
class Error(SenpyMixin, Exception):
def __init__(self, message, *args, **kwargs):
super(Error, self).__init__(self, message, message)
self._error = _ErrorModel(message=message, *args, **kwargs)
self.message = message
def __getitem__(self, key):
return self._error[key]
def __setitem__(self, key, value):
self._error[key] = value
def __delitem__(self, key):
del self._error[key]
def __getattr__(self, key):
if key != '_error' and hasattr(self._error, key):
return getattr(self._error, key)
@@ -289,5 +360,8 @@ class Error(SenpyMixin, BaseException):
def __delattr__(self, key):
delattr(self._error, key)
def __str__(self):
return str(self.to_JSON(with_context=False))
register(Error, 'error')

View File

@@ -1,88 +0,0 @@
from future import standard_library
standard_library.install_aliases()
import inspect
import os.path
import pickle
import logging
import tempfile
from . import models
logger = logging.getLogger(__name__)
class SenpyPlugin(models.Plugin):
def __init__(self, info=None):
if not info:
raise models.Error(message=("You need to provide configuration"
"information for the plugin."))
logger.debug("Initialising {}".format(info))
super(SenpyPlugin, self).__init__(info)
self.id = '{}_{}'.format(self.name, self.version)
self._info = info
self.is_activated = False
def get_folder(self):
return os.path.dirname(inspect.getfile(self.__class__))
def analyse(self, *args, **kwargs):
logger.debug("Analysing with: {} {}".format(self.name, self.version))
pass
def activate(self):
pass
def deactivate(self):
pass
def __del__(self):
''' Destructor, to make sure all the resources are freed '''
self.deactivate()
class SentimentPlugin(SenpyPlugin, models.SentimentPlugin):
def __init__(self, info, *args, **kwargs):
super(SentimentPlugin, self).__init__(info, *args, **kwargs)
self.minPolarityValue = float(info.get("minPolarityValue", 0))
self.maxPolarityValue = float(info.get("maxPolarityValue", 1))
self["@type"] = "marl:SentimentAnalysis"
class EmotionPlugin(SentimentPlugin, models.EmotionPlugin):
def __init__(self, info, *args, **kwargs):
self.minEmotionValue = float(info.get("minEmotionValue", 0))
self.maxEmotionValue = float(info.get("maxEmotionValue", 0))
self["@type"] = "onyx:EmotionAnalysis"
class ShelfMixin(object):
@property
def sh(self):
if not hasattr(self, '_sh') or self._sh is None:
self.__dict__['_sh'] = {}
if os.path.isfile(self.shelf_file):
self.__dict__['_sh'] = pickle.load(open(self.shelf_file, 'rb'))
return self._sh
@sh.deleter
def sh(self):
if os.path.isfile(self.shelf_file):
os.remove(self.shelf_file)
del self.__dict__['_sh']
self.save()
@property
def shelf_file(self):
if not hasattr(self, '_shelf_file') or not self._shelf_file:
if hasattr(self, '_info') and 'shelf_file' in self._info:
self.__dict__['_shelf_file'] = self._info['shelf_file']
else:
self._shelf_file = os.path.join(tempfile.gettempdir(),
self.name + '.p')
return self._shelf_file
def save(self):
logger.debug('saving pickle')
if hasattr(self, '_sh') and self._sh is not None:
with open(self.shelf_file, 'wb') as f:
pickle.dump(self._sh, f)

162
senpy/plugins/__init__.py Normal file
View File

@@ -0,0 +1,162 @@
from future import standard_library
standard_library.install_aliases()
import inspect
import os.path
import os
import pickle
import logging
import tempfile
import copy
from .. import models
from ..api import API_PARAMS
logger = logging.getLogger(__name__)
class Plugin(models.Plugin):
def __init__(self, info=None):
"""
Provides a canonical name for plugins and serves as base for other
kinds of plugins.
"""
if not info:
raise models.Error(message=("You need to provide configuration"
"information for the plugin."))
logger.debug("Initialising {}".format(info))
id = 'plugins/{}_{}'.format(info['name'], info['version'])
super(Plugin, self).__init__(id=id, **info)
self.is_activated = False
def get_folder(self):
return os.path.dirname(inspect.getfile(self.__class__))
def activate(self):
pass
def deactivate(self):
pass
SenpyPlugin = Plugin
class AnalysisPlugin(Plugin):
def analyse(self, *args, **kwargs):
raise NotImplemented(
'Your method should implement either analyse or analyse_entry')
def analyse_entry(self, entry, parameters):
""" An implemented plugin should override this method.
This base method is here to adapt old style plugins which only
implement the *analyse* function.
Note that this method may yield an annotated entry or a list of
entries (e.g. in a tokenizer)
"""
text = entry['text']
params = copy.copy(parameters)
params['input'] = text
results = self.analyse(**params)
for i in results.entries:
yield i
def analyse_entries(self, entries, parameters):
for entry in entries:
logger.debug('Analysing entry with plugin {}: {}'.format(self, entry))
for result in self.analyse_entry(entry, parameters):
yield result
class ConversionPlugin(Plugin):
pass
class SentimentPlugin(models.SentimentPlugin, AnalysisPlugin):
def __init__(self, info, *args, **kwargs):
super(SentimentPlugin, self).__init__(info, *args, **kwargs)
self.minPolarityValue = float(info.get("minPolarityValue", 0))
self.maxPolarityValue = float(info.get("maxPolarityValue", 1))
class EmotionPlugin(models.EmotionPlugin, AnalysisPlugin):
def __init__(self, info, *args, **kwargs):
super(EmotionPlugin, self).__init__(info, *args, **kwargs)
self.minEmotionValue = float(info.get("minEmotionValue", -1))
self.maxEmotionValue = float(info.get("maxEmotionValue", 1))
class EmotionConversionPlugin(models.EmotionConversionPlugin, ConversionPlugin):
pass
class ShelfMixin(object):
@property
def sh(self):
if not hasattr(self, '_sh') or self._sh is None:
self.__dict__['_sh'] = {}
if os.path.isfile(self.shelf_file):
try:
self.__dict__['_sh'] = pickle.load(open(self.shelf_file, 'rb'))
except (IndexError, EOFError, pickle.UnpicklingError):
logger.warning('{} has a corrupted shelf file!'.format(self.id))
if not self.get('force_shelf', False):
raise
return self._sh
@sh.deleter
def sh(self):
if os.path.isfile(self.shelf_file):
os.remove(self.shelf_file)
del self.__dict__['_sh']
self.save()
@property
def shelf_file(self):
if 'shelf_file' not in self or not self['shelf_file']:
sd = os.environ.get('SENPY_DATA', tempfile.gettempdir())
self.shelf_file = os.path.join(sd, self.name + '.p')
return self['shelf_file']
def save(self):
logger.debug('saving pickle')
if hasattr(self, '_sh') and self._sh is not None:
with open(self.shelf_file, 'wb') as f:
pickle.dump(self._sh, f)
default_plugin_type = API_PARAMS['plugin_type']['default']
def pfilter(plugins, **kwargs):
""" Filter plugins by different criteria """
if isinstance(plugins, models.Plugins):
plugins = plugins.plugins
elif isinstance(plugins, dict):
plugins = plugins.values()
ptype = kwargs.pop('plugin_type', default_plugin_type)
logger.debug('#' * 100)
logger.debug('ptype {}'.format(ptype))
if ptype:
try:
ptype = ptype[0].upper() + ptype[1:]
pclass = globals()[ptype]
logger.debug('Class: {}'.format(pclass))
candidates = filter(lambda x: isinstance(x, pclass),
plugins)
except KeyError:
raise models.Error('{} is not a valid type'.format(ptype))
else:
candidates = plugins
logger.debug(candidates)
def matches(plug):
res = all(getattr(plug, k, None) == v for (k, v) in kwargs.items())
logger.debug(
"matching {} with {}: {}".format(plug.name, kwargs, res))
return res
if kwargs:
candidates = filter(matches, candidates)
return {p.name: p for p in candidates}

View File

View File

@@ -0,0 +1,102 @@
from senpy.plugins import EmotionConversionPlugin
from senpy.models import EmotionSet, Emotion, Error
import logging
logger = logging.getLogger(__name__)
class CentroidConversion(EmotionConversionPlugin):
def __init__(self, info):
if 'centroids' not in info:
raise Error('Centroid conversion plugins should provide '
'the centroids in their senpy file')
if 'onyx:doesConversion' not in info:
if 'centroids_direction' not in info:
raise Error('Please, provide centroids direction')
cf, ct = info['centroids_direction']
info['onyx:doesConversion'] = [{
'onyx:conversionFrom': cf,
'onyx:conversionTo': ct
}, {
'onyx:conversionFrom': ct,
'onyx:conversionTo': cf
}]
if 'aliases' in info:
aliases = info['aliases']
ncentroids = {}
for k1, v1 in info['centroids'].items():
nv1 = {}
for k2, v2 in v1.items():
nv1[aliases.get(k2, k2)] = v2
ncentroids[aliases.get(k1, k1)] = nv1
info['centroids'] = ncentroids
super(CentroidConversion, self).__init__(info)
self.dimensions = set()
for c in self.centroids.values():
self.dimensions.update(c.keys())
self.neutralPoints = self.get("neutralPoints", dict())
if not self.neutralPoints:
for i in self.dimensions:
self.neutralPoints[i] = self.get("neutralValue", 0)
def _forward_conversion(self, original):
"""Sum the VAD value of all categories found weighted by intensity.
Intensities are scaled by onyx:maxIntensityValue if it is present, else maxIntensityValue
is assumed to be one. Emotion entries that do not have onxy:hasEmotionIntensity specified
are assumed to have maxIntensityValue. Emotion entries that do not have
onyx:hasEmotionCategory specified are ignored."""
res = Emotion()
maxIntensity = float(original.get("onyx:maxIntensityValue", 1))
for e in original.onyx__hasEmotion:
category = e.get("onyx:hasEmotionCategory", None)
if not category:
continue
intensity = e.get("onyx:hasEmotionIntensity", maxIntensity) / maxIntensity
if not intensity:
continue
centroid = self.centroids.get(category, None)
if centroid:
for dim, value in centroid.items():
neutral = self.neutralPoints[dim]
if dim not in res:
res[dim] = 0
res[dim] += (value - neutral) * intensity + neutral
return res
def _backwards_conversion(self, original):
"""Find the closest category"""
centroids = self.centroids
neutralPoints = self.neutralPoints
dimensions = self.dimensions
def distance_k(centroid, original, k):
# k component of the distance between the value and a given centroid
return (centroid.get(k, neutralPoints[k]) - original.get(k, neutralPoints[k]))**2
def distance(centroid):
return sum(distance_k(centroid, original, k) for k in dimensions)
emotion = min(centroids, key=lambda x: distance(centroids[x]))
result = Emotion(onyx__hasEmotionCategory=emotion)
result.onyx__algorithmConfidence = distance(centroids[emotion])
return result
def convert(self, emotionSet, fromModel, toModel, params):
cf, ct = self.centroids_direction
logger.debug(
'{}\n{}\n{}\n{}'.format(emotionSet, fromModel, toModel, params))
e = EmotionSet()
if fromModel == cf and toModel == ct:
e.onyx__hasEmotion.append(self._forward_conversion(emotionSet))
elif fromModel == ct and toModel == cf:
for i in emotionSet.onyx__hasEmotion:
e.onyx__hasEmotion.append(self._backwards_conversion(i))
else:
raise Error('EMOTION MODEL NOT KNOWN')
yield e

View File

@@ -0,0 +1,39 @@
---
name: Ekman2FSRE
module: senpy.plugins.conversion.emotion.centroids
description: Plugin to convert emotion sets from Ekman to VAD
version: 0.1
# No need to specify onyx:doesConversion because centroids.py adds it automatically from centroids_direction
centroids:
anger:
A: 6.95
D: 5.1
V: 2.7
disgust:
A: 5.3
D: 8.05
V: 2.7
fear:
A: 6.5
D: 3.6
V: 3.2
happiness:
A: 7.22
D: 6.28
V: 8.6
sadness:
A: 5.21
D: 2.82
V: 2.21
centroids_direction:
- emoml:big6
- emoml:fsre-dimensions
aliases: # These are aliases for any key in the centroid, to avoid repeating a long name several times
A: emoml:arousal
V: emoml:valence
D: emoml:dominance
anger: emoml:big6anger
disgust: emoml:big6disgust
fear: emoml:big6fear
happiness: emoml:big6happiness
sadness: emoml:big6sadness

View File

@@ -0,0 +1,44 @@
---
name: Ekman2PAD
module: senpy.plugins.conversion.emotion.centroids
description: Plugin to convert emotion sets from Ekman to VAD
version: 0.1
# No need to specify onyx:doesConversion because centroids.py adds it automatically from centroids_direction
origin:
# Point in VAD space with no emotion (aka Neutral)
A: 5.0
D: 5.0
V: 5.0
centroids:
anger:
A: 6.95
D: 5.1
V: 2.7
disgust:
A: 5.3
D: 8.05
V: 2.7
fear:
A: 6.5
D: 3.6
V: 3.2
happiness:
A: 7.22
D: 6.28
V: 8.6
sadness:
A: 5.21
D: 2.82
V: 2.21
centroids_direction:
- emoml:big6
- emoml:pad
aliases: # These are aliases for any key in the centroid, to avoid repeating a long name several times
A: emoml:arousal
V: emoml:valence
D: emoml:dominance
anger: emoml:big6anger
disgust: emoml:big6disgust
fear: emoml:big6fear
happiness: emoml:big6happiness
sadness: emoml:big6sadness

View File

@@ -0,0 +1,18 @@
import random
from senpy.plugins import EmotionPlugin
from senpy.models import EmotionSet, Emotion
class RmoRandPlugin(EmotionPlugin):
def analyse_entry(self, entry, params):
category = "emoml:big6happiness"
number = max(-1, min(1, random.gauss(0, 0.5)))
if number > 0:
category = "emoml:big6anger"
emotionSet = EmotionSet()
emotion = Emotion({"onyx:hasEmotionCategory": category})
emotionSet.onyx__hasEmotion.append(emotion)
emotionSet.prov__wasGeneratedBy = self.id
entry.emotions.append(emotionSet)
yield entry

View File

@@ -0,0 +1,9 @@
---
name: emoRand
module: emoRand
description: A sample plugin that returns a random emotion annotation
author: "@balkian"
version: '0.1'
url: "https://github.com/gsi-upm/senpy-plugins-community"
requirements: {}
onyx:usesEmotionModel: "emoml:big6"

View File

@@ -1,29 +1,24 @@
import random
from senpy.plugins import SentimentPlugin
from senpy.models import Results, Sentiment, Entry
from senpy.models import Sentiment
class Sentiment140Plugin(SentimentPlugin):
def analyse(self, **params):
class RandPlugin(SentimentPlugin):
def analyse_entry(self, entry, params):
lang = params.get("language", "auto")
response = Results()
polarity_value = max(-1, min(1, random.gauss(0.2, 0.2)))
polarity = "marl:Neutral"
if polarity_value > 0:
polarity = "marl:Positive"
elif polarity_value < 0:
polarity = "marl:Negative"
entry = Entry({"id": ":Entry0", "nif:isString": params["input"]})
sentiment = Sentiment({
"id": ":Sentiment0",
"marl:hasPolarity": polarity,
"marl:polarityValue": polarity_value
})
sentiment["prov:wasGeneratedBy"] = self.id
entry.sentiments = []
entry.sentiments.append(sentiment)
entry.language = lang
response.entries.append(entry)
return response
yield entry

View File

@@ -0,0 +1,10 @@
---
name: rand
module: rand
description: A sample plugin that returns a random sentiment annotation
author: "@balkian"
version: '0.1'
url: "https://github.com/gsi-upm/senpy-plugins-community"
requirements: {}
marl:maxPolarityValue: '1'
marl:minPolarityValue: "-1"

View File

@@ -1,18 +0,0 @@
{
"name": "rand",
"module": "rand",
"description": "What my plugin broadly does",
"author": "@balkian",
"version": "0.1",
"extra_params": {
"language": {
"@id": "lang_rand",
"aliases": ["language", "l"],
"required": false,
"options": ["es", "en", "auto"]
}
},
"requirements": {},
"marl:maxPolarityValue": "1",
"marl:minPolarityValue": "-1"
}

View File

@@ -2,24 +2,22 @@ import requests
import json
from senpy.plugins import SentimentPlugin
from senpy.models import Results, Sentiment, Entry
from senpy.models import Sentiment
class Sentiment140Plugin(SentimentPlugin):
def analyse(self, **params):
def analyse_entry(self, entry, params):
lang = params.get("language", "auto")
res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
json.dumps({
"language": lang,
"data": [{
"text": params["input"]
"text": entry.text
}]
}))
p = params.get("prefix", None)
response = Results(prefix=p)
polarity_value = self.maxPolarityValue * int(res.json()["data"][0][
"polarity"]) * 0.25
polarity_value = self.maxPolarityValue * int(
res.json()["data"][0]["polarity"]) * 0.25
polarity = "marl:Neutral"
neutral_value = self.maxPolarityValue / 2.0
if polarity_value > neutral_value:
@@ -27,9 +25,7 @@ class Sentiment140Plugin(SentimentPlugin):
elif polarity_value < neutral_value:
polarity = "marl:Negative"
entry = Entry(id="Entry0", nif__isString=params["input"])
sentiment = Sentiment(
id="Sentiment0",
prefix=p,
marl__hasPolarity=polarity,
marl__polarityValue=polarity_value)
@@ -37,5 +33,4 @@ class Sentiment140Plugin(SentimentPlugin):
entry.sentiments = []
entry.sentiments.append(sentiment)
entry.language = lang
response.entries.append(entry)
return response
yield entry

View File

@@ -0,0 +1,21 @@
---
name: sentiment140
module: sentiment140
description: "Connects to the sentiment140 free API: http://sentiment140.com"
author: "@balkian"
version: '0.2'
url: "https://github.com/gsi-upm/senpy-plugins-community"
extra_params:
language:
"@id": lang_sentiment140
aliases:
- language
- l
required: false
options:
- es
- en
- auto
requirements: {}
maxPolarityValue: 1
minPolarityValue: 0

View File

@@ -1,18 +0,0 @@
{
"name": "sentiment140",
"module": "sentiment140",
"description": "What my plugin broadly does",
"author": "@balkian",
"version": "0.1",
"extra_params": {
"language": {
"@id": "lang_sentiment140",
"aliases": ["language", "l"],
"required": false,
"options": ["es", "en", "auto"]
}
},
"requirements": {},
"maxPolarityValue": "1",
"minPolarityValue": "0"
}

View File

@@ -0,0 +1,30 @@
from senpy.plugins import AnalysisPlugin
from senpy.models import Entry
from nltk.tokenize.punkt import PunktSentenceTokenizer
from nltk.tokenize.simple import LineTokenizer
import nltk
class SplitPlugin(AnalysisPlugin):
def activate(self):
nltk.download('punkt')
def analyse_entry(self, entry, params):
chunker_type = params.get("delimiter", "sentence")
original_id = entry.id
original_text = entry.get("text", None)
if chunker_type == "sentence":
tokenizer = PunktSentenceTokenizer()
chars = tokenizer.span_tokenize(original_text)
for i, sentence in enumerate(tokenizer.tokenize(original_text)):
e = Entry()
e.text = sentence
e.id = original_id + "#char={},{}".format(chars[i][0], chars[i][1])
yield e
if chunker_type == "paragraph":
tokenizer = LineTokenizer()
chars = tokenizer.span_tokenize(original_text)
for i, paragraph in enumerate(tokenizer.tokenize(original_text)):
e = Entry()
e.text = paragraph
chars = [char for char in chars]
e.id = original_id + "#char={},{}".format(chars[i][0], chars[i][1])
yield e

View File

@@ -0,0 +1,18 @@
---
name: split
module: split
description: A sample plugin that chunks input text
author: "@militarpancho"
version: '0.1'
url: "https://github.com/gsi-upm/senpy"
requirements: {nltk}
extra_params:
delimiter:
aliases:
- type
- t
required: false
default: sentence
options:
- sentence
- paragraph

View File

@@ -1,7 +0,0 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Senpy analysis",
"allOf": [{
"$ref": "atom.json"
}]
}

View File

@@ -6,34 +6,57 @@
"prov": "http://www.w3.org/ns/prov#",
"nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
"marl": "http://www.gsi.dit.upm.es/ontologies/marl/ns#",
"onyx": "http://www.gsi.dit.upm.es/ontologies/onyx#",
"wnaffect": "http://www.gsi.dit.upm.es/ontologies/wnaffect#",
"onyx": "http://www.gsi.dit.upm.es/ontologies/onyx/ns#",
"wna": "http://www.gsi.dit.upm.es/ontologies/wnaffect/ns#",
"emoml": "http://www.gsi.dit.upm.es/ontologies/onyx/vocabularies/emotionml/ns#",
"xsd": "http://www.w3.org/2001/XMLSchema#",
"topics": {
"@id": "dc:subject"
"@id": "dc:subject"
},
"entities": {
"@id": "me:hasEntities"
"@id": "me:hasEntities"
},
"suggestions": {
"@id": "me:hasSuggestions",
"@container": "@set"
"@id": "me:hasSuggestions",
"@container": "@set"
},
"emotions": {
"@id": "onyx:hasEmotionSet",
"@container": "@set"
"@id": "onyx:hasEmotionSet",
"@container": "@set"
},
"sentiments": {
"@id": "marl:hasOpinion",
"@container": "@set"
"@id": "marl:hasOpinion",
"@container": "@set"
},
"entries": {
"@id": "prov:used",
"@container": "@set"
"@id": "prov:used",
"@container": "@set"
},
"analysis": {
"@id": "prov:wasGeneratedBy"
"@id": "AnalysisInvolved",
"@type": "@id",
"@container": "@set"
},
"options": {
"@container": "@set"
},
"plugins": {
"@container": "@set"
},
"prov:wasGeneratedBy": {
"@type": "@id"
},
"onyx:usesEmotionModel": {
"@type": "@id"
},
"onyx:hasEmotionCategory": {
"@type": "@id"
},
"onyx:conversionFrom": {
"@type": "@id"
},
"onyx:conversionTo": {
"@type": "@id"
}
}
}

View File

@@ -6,13 +6,14 @@
{"$ref": "analysis.json"},
{"properties":
{
"onyx:hasEmotionModel": {
"onyx:usesEmotionModel": {
"anyOf": [
{"type": "string"},
{"$ref": "emotionModel.json"}
]
}
},
"required": ["onyx:hasEmotionModel"]
"required": ["onyx:hasEmotionModel",
"@type"]
}]
}

View File

@@ -0,0 +1,12 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"properties": {
"onyx:conversionFrom": {
"$ref": "emotionModel.json"
},
"onyx:conversionTo": {
"$ref": "emotionModel.json"
}
},
"required": ["onyx:conversionFrom", "onyx:conversionTo"]
}

View File

@@ -0,0 +1,19 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"allOf": [
{
"$ref": "plugin.json"
},
{
"properties": {
"onyx:doesConversion": {
"type": "array",
"items": {
"$ref": "emotionConversion.json"
}
}
}
}
]
}

View File

@@ -1,7 +1,7 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"$allOf": [
"allOf": [
{
"$ref": "plugin.json"
},

View File

@@ -11,23 +11,28 @@
},
"sentiments": {
"type": "array",
"items": {"$ref": "sentiment.json" }
"items": {"$ref": "sentiment.json" },
"default": []
},
"emotions": {
"type": "array",
"items": {"$ref": "emotionSet.json" }
"items": {"$ref": "emotionSet.json" },
"default": []
},
"entities": {
"type": "array",
"items": {"$ref": "entity.json" }
"items": {"$ref": "entity.json" },
"default": []
},
"topics": {
"type": "array",
"items": {"$ref": "topic.json" }
"items": {"$ref": "topic.json" },
"default": []
},
"suggestions": {
"type": "array",
"items": {"$ref": "suggestion.json" }
"items": {"$ref": "suggestion.json" },
"default": []
}
},
"required": ["@id", "nif:isString"]

View File

@@ -2,7 +2,7 @@
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Base schema for all Senpy objects",
"type": "object",
"$allOf": [
"allOf": [
{"$ref": "atom.json"},
{
"properties": {
@@ -10,14 +10,14 @@
"type": "string"
},
"errors": {
"type": "list",
"type": "array",
"items": {"type": "object"}
},
"code": {
"type": "int"
},
"required": ["message"]
}
"status": {
"type": "number"
}
},
"required": ["message"]
}
]
}

View File

@@ -6,11 +6,10 @@
"properties": {
"plugins": {
"type": "array",
"default": [],
"items": {
"$ref": "plugin.json"
}
},
"@type": {
}
}
}

View File

@@ -18,10 +18,16 @@
"type": "string"
},
"analysis": {
"type": "array",
"default": [],
"type": "array",
"items": {
"$ref": "analysis.json"
"anyOf": [
{
"$ref": "analysis.json"
},{
"type": "string"
}
]
}
},
"entries": {

View File

@@ -1,7 +1,7 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"$allOf": [
"allOf": [
{
"$ref": "plugin.json"
},

View File

@@ -1,4 +1,5 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object"
"type": "object",
"required": ["@id", "prov:wasGeneratedBy"]
}

View File

@@ -8,7 +8,7 @@ body {
}
#inputswrapper {
min-height:100%;
background: white;
/* background: white; */
position:relative;
min-width: 800px;
height: 100%;
@@ -50,25 +50,16 @@ body {
#form {
width: 100%;
}
#results {
.results {
overflow: auto;
padding: 20px;
background: lightgray;
-moz-border-radius: 20px;
-webkit-border-radius: 20px;
-khtml-border-radius: 20px;
border-radius: 20px;
/* padding: 20px; */
background: white;
/* -moz-border-radius: 20px; */
/* -webkit-border-radius: 20px; */
/* -khtml-border-radius: 20px; */
/* border-radius: 20px; */
}
#jsonraw {
overflow: auto;
padding: 20px;
background: lightgray;
-moz-border-radius: 20px;
-webkit-border-radius: 20px;
-khtml-border-radius: 20px;
border-radius: 20px;
}
#input_request {
margin-top: 5px;
display:block;
@@ -156,3 +147,8 @@ textarea{
#header {
font-family: 'Architects Daughter', cursive;
}
#results-div {
/* background: white; */
display: none;
}

View File

@@ -32,39 +32,48 @@ $(document).ready(function() {
var availablePlugins = document.getElementById('availablePlugins');
plugins = response.plugins;
for (r in plugins){
if (plugins[r]["name"]){
if (plugins[r]["name"] == defaultPlugin["name"]){
if (plugins[r]["is_activated"]){
html+= "<option value=\""+plugins[r]["name"]+"\" selected=\"selected\">"+plugins[r]["name"]+"</option>"
plugin = plugins[r]
if (plugin["name"]){
if (plugin["name"] == defaultPlugin["name"]){
if (plugin["is_activated"]){
html+= "<option value=\""+plugin["name"]+"\" selected=\"selected\">"+plugin["name"]+"</option>"
}else{
html+= "<option value=\""+plugins[r]["name"]+"\" selected=\"selected\" disabled=\"disabled\">"+plugins[r]["name"]+"</option>"
html+= "<option value=\""+plugin["name"]+"\" selected=\"selected\" disabled=\"disabled\">"+plugin["name"]+"</option>"
}
}
else{
if (plugins[r]["is_activated"]){
html+= "<option value=\""+plugins[r]["name"]+"\">"+plugins[r]["name"]+"</option>"
if (plugin["is_activated"]){
html+= "<option value=\""+plugin["name"]+"\">"+plugin["name"]+"</option>"
}
else{
html+= "<option value=\""+plugins[r]["name"]+"\" disabled=\"disabled\">"+plugins[r]["name"]+"</option>"
html+= "<option value=\""+plugin["name"]+"\" disabled=\"disabled\">"+plugin["name"]+"</option>"
}
}
}
if (plugins[r]["extra_params"]){
plugins_params[plugins[r]["name"]]={};
for (param in plugins[r]["extra_params"]){
if (typeof plugins[r]["extra_params"][param] !="string"){
if (plugin["extra_params"]){
plugins_params[plugin["name"]]={};
for (param in plugin["extra_params"]){
if (typeof plugin["extra_params"][param] !="string"){
var params = new Array();
var alias = plugins[r]["extra_params"][param]["aliases"][0];
var alias = plugin["extra_params"][param]["aliases"][0];
params[alias]=new Array();
for (option in plugins[r]["extra_params"][param]["options"]){
params[alias].push(plugins[r]["extra_params"][param]["options"][option])
for (option in plugin["extra_params"][param]["options"]){
params[alias].push(plugin["extra_params"][param]["options"][option])
}
plugins_params[plugins[r]["name"]][alias] = (params[alias])
plugins_params[plugin["name"]][alias] = (params[alias])
}
}
}
var pluginList = document.createElement('li');
pluginList.innerHTML = "<a href=https://github.com/gsi-upm/senpy-plugins-community>" + plugins[r]["name"] + "</a>" + ": " + plugins[r]["description"]
newHtml = ""
if(plugin.url) {
newHtml= "<a href="+plugin.url+">" + plugin.name + "</a>";
}else {
newHtml= plugin["name"];
}
newHtml += ": " + replaceURLWithHTMLLinks(plugin.description);
pluginList.innerHTML = newHtml;
availablePlugins.appendChild(pluginList)
}
document.getElementById('plugins').innerHTML = html;
@@ -96,6 +105,10 @@ function change_params(){
function load_JSON(){
url = "/api";
var container = document.getElementById('results');
var rawcontainer = document.getElementById("jsonraw");
rawcontainer.innerHTML = '';
container.innerHTML = '';
var plugin = document.getElementById("plugins").options[document.getElementById("plugins").selectedIndex].value;
var input = encodeURIComponent(document.getElementById("input").value);
url += "?algo="+plugin+"&i="+input
@@ -108,18 +121,14 @@ function load_JSON(){
}
}
var response = JSON.parse($.ajax({type: "GET", url: url , async: false}).responseText);
var container = document.getElementById('results');
var options = {
mode: 'view'
};
try {
container.removeChild(container.firstChild);
}
catch(err) {
}
var editor = new JSONEditor(container, options, response);
document.getElementById("jsonraw").innerHTML = replaceURLWithHTMLLinks(JSON.stringify(response, undefined, 2))
editor.expandAll();
rawcontainer.innerHTML = replaceURLWithHTMLLinks(JSON.stringify(response, undefined, 2))
document.getElementById("input_request").innerHTML = "<a href='"+url+"'>"+url+"</a>"
document.getElementById("results-div").style.display = 'block';
}

View File

@@ -2,7 +2,7 @@
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>Playground</title>
<title>Playground {{version}}</title>
</head>
<script src="static/js/jquery-2.1.1.min.js" ></script>
@@ -25,49 +25,68 @@
<h3 id="header-title">
<a href="https://github.com/gsi-upm/senpy" target="_blank">
<img id="header-logo" class="imsg-responsive" src="static/img/header.png"/></a> Playground
</h3>
<h4>v{{ version}}</h4>
</div>
<ul class="nav nav-tabs" role="tablist">
<li role="presentation" class="active"><a class="active" href="#about">About</a></li>
<li role="presentation"><a class="active" href="#test">Test it</a></li>
<li role="presentation" ><a class="active" href="#about">About</a></li>
<li role="presentation"class="active"><a class="active" href="#test">Test it</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane active" id="about">
<div class="tab-pane" id="about">
<div class="row">
<div class="col-lg-6 ">
<div class="well">
<h2>Test Senpy</h2>
<div>
<p class="text-center">
<a class="btn btn-lg btn-primary" href="#test" role="button">Test it »</a>
<div class="col-lg-6">
<h2>About Senpy</h2>
<p>Senpy is a framework to build semantic sentiment and emotion analysis services. It does so by using a mix of web and semantic technologies, such as JSON-LD, RDFlib and Flask.</p>
<p>Senpy makes it easy to develop and publish your own analysis algorithms (plugins in senpy terms).
</p>
<p>
This website is the senpy Playground, which allows you to test the instance of senpy in this server. It provides a user-friendly interface to the functions exposed by the senpy API.
</p>
<p>
Once you get comfortable with the parameters and results, you are encouraged to issue your own requests to the API endpoint. You can find examples of API URL's when you try out a plugin with the "Analyse!" button on the "Test it" tab.
</p>
<p>
These are some of the things you can do with the API:
<ul>
<li>List all available plugins: <a href="/api/plugins">/api/plugins</a></li>
<li>Get information about the default plugin: <a href="/api/plugins/default">/api/plugins/default</a></li>
<li>Download the JSON-LD context used: <a href="/api/contexts/Results.jsonld">/api/contexts/Results.jsonld</a></li>
</ul>
</p>
</div>
</div>
<div class="col-lg-6 ">
<div class="panel panel-default">
<div class="panel-heading"><i class="fa fa-sign-in"></i> Follow us on <a href="http://www.github.com/gsi-upm/senpy">GitHub</a></div>
</div>
<div class="panel panel-default">
<div class="panel-heading"><i class="fa fa-child"></i> Enjoy.</div>
<div class="panel-heading">
Available Plugins
</div>
<div class="panel-body"><ul id=availablePlugins></ul></div>
</div>
</div>
<div class="col-lg-6 ">
<div class="well">
<h2>Available Plugins</h2>
<div>
<span><ul id=availablePlugins></ul></span>
</div>
<a href="http://senpy.readthedocs.io">
<div class="panel panel-default">
<div class="panel-heading"><i class="fa fa-book"></i> If you are new to senpy, you might want to read senpy's documentation</div>
</div>
</a>
<a href="http://www.github.com/gsi-upm/senpy">
<div class="panel panel-default">
<div class="panel-heading"><i class="fa fa-sign-in"></i> Feel free to follow us on GitHub</div>
</div>
</a>
<div class="panel panel-default">
<div class="panel-heading"><i class="fa fa-child"></i> Enjoy.</div>
</div>
</div>
</div>
</div>
<div class="tab-pane" id="test">
<div class="tab-pane active" id="test">
<div class="well">
<form id="form" onsubmit="return getPlugins();" accept-charset="utf-8">
<div id="inputswrapper">
@@ -81,30 +100,32 @@ I cannot believe it!</textarea></div>
<div id ="params">
</div>
</br>
<a id="preview" class="btn btn-lg btn-primary" href="#" onclick="load_JSON()">Analyse!</a>
<a id="preview" class="btn btn-lg btn-primary" onclick="load_JSON()">Analyse!</a>
<!--<button id="visualise" name="type" type="button">Visualise!</button>-->
</div>
</form>
<span id="input_request"></span>
</div>
<span id="input_request"></span>
<div id="results-div">
<ul class="nav nav-tabs" role="tablist">
<li role="presentation" class="active"><a class="active" href="#viewer">Viewer</a></li>
<li role="presentation"><a class="active" href="#raw">Raw</a></li>
</ul>
<div class="tab-content">
<div class="tab-content" id="results-container">
<div class="tab-pane active" id="viewer">
<div id="content">
<pre id="results"></pre>
</div>
<div id="content">
<pre id="results" class="results"></pre>
</div>
</div>
<div class="tab-pane" id="raw">
<div id="content">
<pre id="jsonraw"></pre>
</div>
<div id="content">
<pre id="jsonraw" class="results"></pre>
</div>
</div>
</div>
</div>
</div>
</div>
<a href="http://www.gsi.dit.upm.es" target="_blank"><img class="center-block" src="static/img/gsi.png"/> </a>

View File

@@ -1,4 +1,19 @@
import os
import logging
with open(os.path.join(os.path.dirname(__file__), 'VERSION')) as f:
__version__ = f.read().strip()
logger = logging.getLogger(__name__)
ROOT = os.path.dirname(__file__)
DEFAULT_FILE = os.path.join(ROOT, 'VERSION')
def read_version(versionfile=DEFAULT_FILE):
try:
with open(versionfile) as f:
return f.read().strip()
except IOError: # pragma: no cover
logger.error('Running an unknown version of senpy. Be careful!.')
return '0.0'
__version__ = read_version()

View File

@@ -7,3 +7,11 @@ test=pytest
# finishing the imports. flake8 thinks that we're doing the imports too late,
# but it's actually ok
ignore = E402
max-line-length = 100
[bdist_wheel]
universal=1
[tool:pytest]
addopts = --cov=senpy --cov-report term-missing
[coverage:report]
omit = senpy/__main__.py

View File

@@ -1,7 +1,8 @@
import pip
from setuptools import setup
from pip.req import parse_requirements
# parse_requirements() returns generator of pip.req.InstallRequirement objects
from pip.req import parse_requirements
from senpy import __version__
try:
install_reqs = parse_requirements(
@@ -12,12 +13,9 @@ except AttributeError:
install_reqs = parse_requirements("requirements.txt")
test_reqs = parse_requirements("test-requirements.txt")
# reqs is a list of requirement
# e.g. ['django==1.5.1', 'mezzanine==1.4.6']
install_reqs = [str(ir.req) for ir in install_reqs]
test_reqs = [str(ir.req) for ir in test_reqs]
from senpy import __version__
setup(
name='senpy',

View File

@@ -1,2 +1,3 @@
pytest
mock
pytest-cov
pytest

View File

@@ -1,7 +0,0 @@
from senpy.plugins import SentimentPlugin
from senpy.models import Results
class DummyPlugin(SentimentPlugin):
def analyse(self, *args, **kwargs):
return Results()

View File

@@ -1,7 +0,0 @@
{
"name": "Dummy",
"module": "dummy",
"description": "I am dummy",
"author": "@balkian",
"version": "0.1"
}

View File

@@ -0,0 +1,23 @@
from senpy.plugins import AnalysisPlugin
import multiprocessing
def _train(process_number):
return process_number
class AsyncPlugin(AnalysisPlugin):
def _do_async(self, num_processes):
pool = multiprocessing.Pool(processes=num_processes)
values = pool.map(_train, range(num_processes))
return values
def activate(self):
self.value = self._do_async(4)
def analyse_entry(self, entry, params):
values = self._do_async(2)
entry.async_values = values
yield entry

View File

@@ -0,0 +1,8 @@
---
name: Async
module: asyncplugin
description: I am async
author: "@balkian"
version: '0.1'
async: true
extra_params: {}

View File

@@ -0,0 +1,8 @@
from senpy.plugins import SentimentPlugin
class DummyPlugin(SentimentPlugin):
def analyse_entry(self, entry, params):
entry.text = entry.text[::-1]
entry.reversed = entry.get('reversed', 0) + 1
yield entry

View File

@@ -0,0 +1,15 @@
{
"name": "Dummy",
"module": "dummy",
"description": "I am dummy",
"author": "@balkian",
"version": "0.1",
"extra_params": {
"example": {
"@id": "example_parameter",
"aliases": ["example", "ex"],
"required": false,
"default": 0
}
}
}

View File

@@ -0,0 +1,14 @@
{
"name": "DummyRequired",
"module": "dummy",
"description": "I am dummy",
"author": "@balkian",
"version": "0.1",
"extra_params": {
"example": {
"@id": "example_parameter",
"aliases": ["example", "ex"],
"required": true
}
}
}

View File

@@ -0,0 +1,11 @@
from senpy.plugins import AnalysisPlugin
from time import sleep
class SleepPlugin(AnalysisPlugin):
def activate(self, *args, **kwargs):
sleep(self.timeout)
def analyse_entry(self, entry, params):
sleep(float(params.get("timeout", self.timeout)))
yield entry

View File

@@ -1,12 +0,0 @@
from senpy.plugins import SenpyPlugin
from senpy.models import Results
from time import sleep
class SleepPlugin(SenpyPlugin):
def activate(self, *args, **kwargs):
sleep(self.timeout)
def analyse(self, *args, **kwargs):
sleep(float(kwargs.get("timeout", self.timeout)))
return Results()

68
tests/test_api.py Normal file
View File

@@ -0,0 +1,68 @@
import logging
logger = logging.getLogger(__name__)
from unittest import TestCase
from senpy.api import parse_params, API_PARAMS, NIF_PARAMS, WEB_PARAMS
from senpy.models import Error
class APITest(TestCase):
def test_api_params(self):
"""The API should not define any required parameters without a default"""
parse_params({}, spec=API_PARAMS)
def test_web_params(self):
"""The WEB should not define any required parameters without a default"""
parse_params({}, spec=WEB_PARAMS)
def test_basic(self):
a = {}
try:
parse_params(a, spec=NIF_PARAMS)
raise AssertionError()
except Error:
pass
a = {'input': 'hello'}
p = parse_params(a, spec=NIF_PARAMS)
assert 'input' in p
b = {'i': 'hello'}
p = parse_params(b, spec=NIF_PARAMS)
assert 'input' in p
def test_plugin(self):
query = {}
plug_params = {
'hello': {
'aliases': ['hello', 'hiya'],
'required': True
}
}
try:
parse_params(query, spec=plug_params)
raise AssertionError()
except Error:
pass
query['hello'] = 'world'
p = parse_params(query, spec=plug_params)
assert 'hello' in p
assert p['hello'] == 'world'
del query['hello']
query['hiya'] = 'dlrow'
p = parse_params(query, spec=plug_params)
assert 'hello' in p
assert 'hiya' in p
assert p['hello'] == 'dlrow'
def test_default(self):
spec = {
'hello': {
'required': True,
'default': 1
}
}
p = parse_params({}, spec=spec)
assert 'hello' in p
assert p['hello'] == 1

View File

@@ -1,11 +1,10 @@
import os
import logging
import json
from senpy.extensions import Senpy
from senpy import models
from flask import Flask
from unittest import TestCase
from gevent import sleep
from itertools import product
@@ -14,18 +13,21 @@ def check_dict(indic, template):
def parse_resp(resp):
return json.loads(resp.data.decode('utf-8'))
return models.from_json(resp.data.decode('utf-8'))
class BlueprintsTest(TestCase):
def setUp(self):
self.app = Flask("test_extensions")
self.app.debug = False
self.client = self.app.test_client()
self.senpy = Senpy()
self.senpy.init_app(self.app)
self.dir = os.path.join(os.path.dirname(__file__), "..")
self.senpy.add_folder(self.dir)
self.senpy.activate_plugin("Dummy", sync=True)
self.senpy.activate_plugin("DummyRequired", sync=True)
self.senpy.default_plugin = 'Dummy'
def assertCode(self, resp, code):
self.assertEqual(resp.status_code, code)
@@ -35,12 +37,12 @@ class BlueprintsTest(TestCase):
Calling with no arguments should ask the user for more arguments
"""
resp = self.client.get("/api/")
self.assertCode(resp, 404)
self.assertCode(resp, 400)
js = parse_resp(resp)
logging.debug(js)
assert js["status"] == 404
assert js["status"] == 400
atleast = {
"status": 404,
"status": 400,
"message": "Missing or invalid parameters",
}
assert check_dict(js, atleast)
@@ -57,6 +59,39 @@ class BlueprintsTest(TestCase):
assert "@context" in js
assert "entries" in js
def test_analysis_extra(self):
"""
Extra params that have a default should
"""
resp = self.client.get("/api/?i=My aloha mohame&algo=Dummy")
self.assertCode(resp, 200)
js = parse_resp(resp)
logging.debug("Got response: %s", js)
assert "@context" in js
assert "entries" in js
def test_analysis_extra_required(self):
"""
Extra params that have a required argument that does not
have a default should raise an error.
"""
resp = self.client.get("/api/?i=My aloha mohame&algo=DummyRequired")
self.assertCode(resp, 400)
js = parse_resp(resp)
logging.debug("Got response: %s", js)
assert isinstance(js, models.Error)
def test_error(self):
"""
The dummy plugin returns an empty response,\
it should contain the context
"""
resp = self.client.get("/api/?i=My aloha mohame&algo=DOESNOTEXIST")
self.assertCode(resp, 404)
js = parse_resp(resp)
logging.debug("Got response: %s", js)
assert isinstance(js, models.Error)
def test_list(self):
""" List the plugins """
resp = self.client.get("/api/plugins/")
@@ -92,26 +127,7 @@ class BlueprintsTest(TestCase):
js = parse_resp(resp)
logging.debug(js)
assert "@id" in js
assert js["@id"] == "Dummy_0.1"
def test_activate(self):
""" Activate and deactivate one plugin """
resp = self.client.get("/api/plugins/Dummy/deactivate")
self.assertCode(resp, 200)
sleep(0.5)
resp = self.client.get("/api/plugins/Dummy/")
self.assertCode(resp, 200)
js = parse_resp(resp)
assert "is_activated" in js
assert not js["is_activated"]
resp = self.client.get("/api/plugins/Dummy/activate")
self.assertCode(resp, 200)
sleep(0.5)
resp = self.client.get("/api/plugins/Dummy/")
self.assertCode(resp, 200)
js = parse_resp(resp)
assert "is_activated" in js
assert js["is_activated"]
assert js["@id"] == "plugins/Dummy_0.1"
def test_default(self):
""" Show only one plugin"""
@@ -120,12 +136,7 @@ class BlueprintsTest(TestCase):
js = parse_resp(resp)
logging.debug(js)
assert "@id" in js
assert js["@id"] == "Dummy_0.1"
resp = self.client.get("/api/plugins/Dummy/deactivate")
self.assertCode(resp, 200)
sleep(0.5)
resp = self.client.get("/api/plugins/default/")
self.assertCode(resp, 404)
assert js["@id"] == "plugins/Dummy_0.1"
def test_context(self):
resp = self.client.get("/api/contexts/context.jsonld")

View File

@@ -4,17 +4,21 @@ try:
except ImportError:
from mock import patch
import json
from senpy.client import Client
from senpy.models import Results, Error
from senpy.models import Results, Plugins, Error
from senpy.plugins import AnalysisPlugin, default_plugin_type
class Call(dict):
def __init__(self, obj):
self.obj = obj.jsonld()
self.obj = obj.serialize()
self.status_code = 200
self.content = self.json()
def json(self):
return self.obj
return json.loads(self.obj)
class ModelsTest(TestCase):
@@ -29,14 +33,33 @@ class ModelsTest(TestCase):
with patch('requests.request', return_value=success) as patched:
resp = client.analyse('hello')
assert isinstance(resp, Results)
patched.assert_called_with(url=endpoint + '/',
method='GET',
params={'input': 'hello'})
patched.assert_called_with(
url=endpoint + '/', method='GET', params={'input': 'hello'})
error = Call(Error('Nothing'))
with patch('requests.request', return_value=error) as patched:
resp = client.analyse(input='hello', algorithm='NONEXISTENT')
assert isinstance(resp, Error)
patched.assert_called_with(url=endpoint + '/',
method='GET',
params={'input': 'hello',
'algorithm': 'NONEXISTENT'})
try:
client.analyse(input='hello', algorithm='NONEXISTENT')
raise Exception('Exceptions should be raised. This is not golang')
except Error:
pass
patched.assert_called_with(
url=endpoint + '/',
method='GET',
params={'input': 'hello',
'algorithm': 'NONEXISTENT'})
def test_plugins(self):
endpoint = 'http://dummy/'
client = Client(endpoint)
plugins = Plugins()
p1 = AnalysisPlugin({'name': 'AnalysisP1', 'version': 0, 'description': 'No'})
plugins.plugins = [p1, ]
success = Call(plugins)
with patch('requests.request', return_value=success) as patched:
response = client.plugins()
assert isinstance(response, dict)
assert len(response) == 1
assert "AnalysisP1" in response
patched.assert_called_with(
url=endpoint + '/plugins', method='GET',
params={'plugin_type': default_plugin_type})

Some files were not shown because too many files have changed in this diff Show More