1
0
mirror of https://github.com/gsi-upm/senpy synced 2025-10-21 18:58:25 +00:00

Compare commits

..

223 Commits
0.2 ... chunker

Author SHA1 Message Date
militarpancho
83e2d415a1 Change name to split, according to issue #37 2017-06-13 19:44:40 +02:00
militarpancho
f8ca595bc9 Added chunker plugin to tokenize texts 2017-06-13 14:00:40 +02:00
J. Fernando Sánchez
312e7f7f12 Avoid python temporary files in pip tests 2017-06-12 21:50:51 +02:00
J. Fernando Sánchez
c555b9547e Non-interactive pip test 2017-06-12 21:27:02 +02:00
J. Fernando Sánchez
991ade8f4d Make sdist non-interactive non-tty 2017-06-12 21:20:07 +02:00
J. Fernando Sánchez
1104e816cb Push pip for tags without a preceding v 2017-06-12 21:06:34 +02:00
J. Fernando Sánchez
c19d03b41d Added SSH access to github fetch 2017-06-12 20:47:46 +02:00
J. Fernando Sánchez
42c9068991 Add pull policy to k8s deployment
* Add git fetch to (try to) fix github push from gitlab
2017-06-12 20:43:39 +02:00
J. Fernando Sánchez
96843827bd Removed __main__ from test coverage reports 2017-06-12 20:29:29 +02:00
J. Fernando Sánchez
d76e4618fe Removed python 3.4 from travis versions 2017-06-12 20:18:56 +02:00
J. Fernando Sánchez
c9bc485535 Merge branch '36-estimate-vad' 2017-06-12 20:10:21 +02:00
J. Fernando Sánchez
6d7575bbcd Merge branch '35-timeout-and-blocking-requests' 2017-06-12 19:57:28 +02:00
J. Fernando Sánchez
852bcc72ba Better centroid conversion
Also added **simple** tests for backward and forward conversion.
In future versions we should add thorough tests.

Should close gsi-upm/senpy#31
2017-06-12 19:52:00 +02:00
J. Fernando Sánchez
bf5ed1bd7d Merge remote-tracking branch 'drevicko/patch-6' 2017-06-12 18:14:15 +02:00
J. Fernando Sánchez
00da75153a Change conversion to Euclidean distance
* Added neutral point (if present)

Closes !gsi-upm/senpy#37 (Ian's)
2017-06-12 18:09:58 +02:00
J. Fernando Sánchez
fa082e11e7 Use flask's server by default
Using this server in production is discouraged, but to implement a
proper asynchronous server with tornado/gevent every blocking call would
have to be converted to a non-blocking call.

Failing to do so causes deadlocks like senpy/senpy#35

For now, it is easier to just use the default server.
2017-06-12 17:29:01 +02:00
J. Fernando Sánchez
6331d31b18 Merge branch '34-document-plugin-repo-creation' into 24-improve-docs
Closes #34
Closes #24
2017-06-12 12:53:24 +02:00
J. Fernando Sánchez
8ee324f566 Clearer docs 2017-06-12 09:31:42 +02:00
J. Fernando Sánchez
188c33332a Removed nbsphinx
It requires pandoc, which cannot be installed with pip.

We can either link to the nbfile or convert the file
manually/automatically:

```
nbconvert SenpyClientUse.ipynb --to rst
```
2017-06-12 09:31:42 +02:00
militarpancho
955e17eb2a Added travis, readthedocs and pypi badges 2017-06-12 09:31:42 +02:00
militarpancho
3e0f55dcff Improve docs. (Badges missing) 2017-06-12 09:31:38 +02:00
J. Fernando Sánchez
2ea01aef42 Fixed deployment IMAGENAME 2017-06-02 20:10:06 +02:00
J. Fernando Sánchez
147fd4a333 Fixed IMAGENAME 2017-06-02 20:02:27 +02:00
J. Fernando Sánchez
e31bca7016 Push to dockerhub instead of private registry 2017-06-02 19:42:22 +02:00
J. Fernando Sánchez
7956d54c35 K8s deployment with limits 2017-06-02 19:17:27 +02:00
militarpancho
5bab9a6a02 #34. Fixed some errors from plugins examples 2017-06-02 17:43:18 +02:00
militarpancho
69ac95bb08 Added example plugin in docs. #34 2017-06-02 17:39:27 +02:00
drevicko
6b843a4384 fixes typo in code 2017-05-29 12:15:35 +01:00
drevicko
65d6e47513 Implements Fernando's suggestion in #31
I've added a neutral point definition (in the converters senpy file) as used in pull request #29
2017-05-29 12:13:21 +01:00
drevicko
8d56a0b630 fixes #31
I've used euclidean metric instead of taxicab as I feel it makes more sense (taxicab has bizzare unintuitive effects for points far from the centroids).
2017-05-29 12:06:44 +01:00
drevicko
e7ac6e66b0 update _forward_conversion docstring + minor edits 2017-05-29 11:50:14 +01:00
J. Fernando Sánchez
0f8d1dff69 Fixed image repository 2017-05-19 19:59:45 +02:00
J. Fernando Sánchez
236183593c Hidden variables 2017-05-19 19:36:25 +02:00
J. Fernando Sánchez
7637498517 Added push to github 2017-05-19 19:33:00 +02:00
J. Fernando Sánchez
8c70433312 Added push to github 2017-05-19 18:54:57 +02:00
J. Fernando Sánchez
ce83fb3981 Added k8s deployment 2017-05-19 16:52:10 +02:00
J. Fernando Sánchez
28f29d159a Merge branch 'gh-34-broken-shelf' into 0.8.x 2017-05-17 17:39:14 +02:00
J. Fernando Sánchez
c803f60fd4 Merge branch 'drevicko/provide-analyse-traceback-in-log' into 0.8.x 2017-05-17 17:38:41 +02:00
J. Fernando Sánchez
12eae16e37 Merge branch 'drevicko-patch-5' into 0.8.x 2017-05-17 17:35:50 +02:00
J. Fernando Sánchez
f3372c27b6 Merge branch '32-update-module-dev-environment' into 0.8.x 2017-05-17 17:34:21 +02:00
Ian Wood
b6de72a143 chaged exception logging to 'exception()' in analysie() 2017-05-17 17:10:05 +02:00
J. Fernando Sánchez
0f89b92457 Fixed pickling error in py2.7 2017-05-17 16:51:01 +02:00
J. Fernando Sánchez
ea91e3e4a4 Add an option to force the load of shelf plugins
Closes gsi-upm/senpy#34
2017-05-17 16:30:01 +02:00
Ian Wood
f76b777b9f don't fail if shelf pickle file broken 2017-05-16 15:09:46 +01:00
drevicko
dcc965ea63 removed superfluous 'neutral' centroid
Neutral is included as an 'origin' field. This is partly because emoml has no vocab for "Neutral" in dimensional models.
2017-05-08 14:34:28 +01:00
drevicko
400f647b7b removed unneccessary defaultdict import 2017-05-08 14:32:53 +01:00
Ian Wood
ec1a2ff5f9 added 'origin' to VAD representation, incorporated into weighed sum for Cat->VAD conversion 2017-05-08 14:28:51 +01:00
J. Fernando Sánchez
e112dd55ce Pip install editable
Closes senpy/senpy#32
2017-05-05 17:27:12 +02:00
J. Fernando Sánchez
60ef304108 Analysis set as a python list
Closes senpy/senpy#31
2017-05-05 17:05:17 +02:00
Ian Wood
1a9dd07f7e Merge branch 'master' 0.8.7 into patch-6 2017-05-05 15:02:15 +01:00
Ian Wood
b80b0c7947 used more specific exception specifier (KeyError) 2017-04-11 11:25:50 +01:00
Ian Wood
1ca6ec52fd fixed weighted average, no explicit treatment of 'neutral' 2017-04-11 11:12:02 +01:00
J. Fernando Sánchez
7927cf1587 add option to show senpy version number
Merge branch 'patch-5' of https://github.com/drevicko/senpy into drevicko-patch-5
2017-04-10 21:03:17 +02:00
J. Fernando Sánchez
13cefbedfb Clean dev containers in makefile 2017-04-10 20:38:12 +02:00
J. Fernando Sánchez
4ba9535d56 Merge branches into 0.8.x
'25-validation-errors'
'27-add-method-to-get-list-of-plugins'
'28-fix-multiprocessing-issues'
2017-04-10 20:31:26 +02:00
J. Fernando Sánchez
e582ef07d4 Fix multiprocessing tests in python2.7
Closes #28 for python 2.

Apparently, process pools are not contexts in python 2.7.
On the other hand, in py2 you cannot pickle instance methods, so
you have to implement Pool tasks as independent functions.
2017-04-10 20:17:38 +02:00
J. Fernando Sánchez
ef40bdb545 Replace gevent with tornado
Closes #28

Added:

* Async test (still missing one that includes the IOLoop)
* Async plugin under tests. To manually try async functionalities:
```
senpy -f tests/
```
2017-04-10 18:16:45 +02:00
J. Fernando Sánchez
e0b4c76238 Add plugin method to client
Closes #28
2017-04-10 18:07:34 +02:00
J. Fernando Sánchez
14c86ec38c Set plugin list as a @set and fixed test case
It turns out setting "plugins" as a @list in the context causes the
"plugins" property to expand to its full name.
Removing the type causes a regression of #17, which I initially missed
because the test in #17 was wrong.

Closes #26
2017-04-10 17:24:39 +02:00
J. Fernando Sánchez
d3d05b3218 Fixed expansion of "plugins"
Closes #26

There was no need to add @list, and it was causing JSON-LD to expand the
URI of 'plugins'
2017-04-07 16:24:28 +02:00
J. Fernando Sánchez
eababcadb0 Analysis as strings or objects in results
Closes #25
2017-04-07 16:13:57 +02:00
drevicko
7efece0224 add option to show senpy version number 2017-04-07 10:04:51 +01:00
drevicko
53138e6942 Estimate VAD by weighted average
Does a weighted average of centroids.

If intensity sums to zero for a category, a 'neutral' category is used or 0 if it's not present. I'm not 100% sure this is the best approach, and the name of the "neutral" category perhaps should use some convention?

Note that if there are no categories present, then no VAD (or other dimensional) estimate is returned. It may be better to use the neutral centroid if it's present in this case also.
2017-04-04 15:37:07 +01:00
J. Fernando Sánchez
1302b0b93c Fixed pip tests (added version) 2017-04-04 11:42:18 +02:00
J. Fernando Sánchez
ad1092690b Merge branch '20-improve-docs' into 0.8.x 2017-04-04 11:26:33 +02:00
J. Fernando Sánchez
e35e810ede Rephrase info on demo plugins
Closes #20
2017-04-04 11:26:05 +02:00
militarpancho
d5ddcb8d3f Change repository url 2017-04-04 11:21:08 +02:00
militarpancho
54c0c9c437 demo doc changed 2017-04-04 11:14:51 +02:00
J. Fernando Sánchez
6e970d01f2 Merge branch '21-ascii-cant-encode' into 0.8.x 2017-04-04 11:12:39 +02:00
J. Fernando Sánchez
1d0a54ecd2 Merge branch '22-pip-screws-with-logging-config' into 0.8.x 2017-04-04 11:12:23 +02:00
J. Fernando Sánchez
800d4a9c2c Fixed typos in Ian's patch 2017-04-04 11:11:51 +02:00
drevicko
035ef98b7e removed broken "/api" link
In index.html, there is a suggestion to try out the api with a link to "/api". Clicking that link results in a json error report - not ideal. 
Instead, I added text suggesting that a use can find example api url's after clickgin "Analyse!".
2017-04-04 11:07:32 +02:00
J. Fernando Sánchez
d7e115d7c2 Encode HEADERS
Closes #21
2017-04-03 19:23:18 +02:00
J. Fernando Sánchez
548cb4c9ba Doc changes
* Alabaster theme
* Restructured
* Simplified introduction
* Reference to entries/models
* Fixed examples
2017-04-03 18:20:09 +02:00
J. Fernando Sánchez
7e5b55ff9c Run pip with Popen
Closes #22
2017-03-30 17:38:17 +02:00
militarpancho
8b2c3e8d40 Update readthedocs. Mainly Api and What is senpy section 2017-03-28 12:34:39 +02:00
J. Fernando Sánchez
0c8f98d466 Pre-0.8.6
* Improved debugging (back to using Flask's built-in mechanisms)
* Recursive model loading from json
* Added DEVPORT to Makefile
* Accept json-ld input. Closes #16
* Improved Exception handling in client
* Modified default plugin selection to only include analysis plugins
* More tests
2017-03-14 19:59:06 +01:00
J. Fernando Sánchez
cc298742ec Merge branch '17-...' into 0.8.x 2017-03-14 13:20:20 +01:00
J. Fernando Sánchez
250052fb99 Options as a set in the JSON-LD context
Closes #18
2017-03-14 13:17:47 +01:00
J. Fernando Sánchez
603e086606 Fix list of plugins
Closes #17
2017-03-14 13:05:52 +01:00
J. Fernando Sánchez
a8614bab0c Accept plugin pipelines
Closes #15
2017-03-13 21:08:21 +01:00
J. Fernando Sánchez
70ca74b03c Added instructions for developers 2017-03-09 00:04:02 +01:00
J. Fernando Sánchez
c9e6d78183 Fixed alises, added PAD and FSRE
Closes #13
2017-03-08 23:23:40 +01:00
J. Fernando Sánchez
1a582c0843 Filter conversion plugins
Closes #12

* Shows only analysis plugins by default on /api/plugins
* Adds a plugin_type parameter to get other types of plugins
* default_plugin chosen from analysis plugins
2017-03-08 22:54:57 +01:00
drevicko
0394bcd69c add make version to readme for pip install
pip install needs the VERSION file - `make version` will create that file

I also added the -U flag to pip install to force install (this is important if the user is playing with the code or trying out different older versions, as pip will not install if it thinks the git repo represents a version already installed or older than the one installed)
2017-03-02 11:08:02 +00:00
J. Fernando Sánchez
cbeb3adbdb Added fallback version '0.0'
Installing depends on the VERSION file, so it raies an error if it is
installed in some other way.

ReadTheDocs installs the package so it can generate code docs.
This commit adds a default version 0.0
2017-03-01 18:53:54 +01:00
J. Fernando Sánchez
efb305173e Removed future from __init__
Since __init__ is imported by setup.py, future may not be installed yet.

Other options would be:

* Read VERSION -> and that code has to be duplicated in setup.py and
  senpy (to avoid the import, once again)
* Eval version.py
* Do without versioning :)
2017-03-01 18:28:20 +01:00
J. Fernando Sánchez
2288b04c92 Remove iteritems for py2/3 compatibility 2017-03-01 18:14:44 +01:00
J. Fernando Sánchez
7899cb4d33 Fixed docker upload
Doing docker push without a tag makes the client upload **ALL** the
images it has for that repo.
2017-03-01 17:59:35 +01:00
J. Fernando Sánchez
62ddca79ac Fixed conversion docs 2017-03-01 17:56:17 +01:00
J. Fernando Sánchez
99403b3443 Fix for async
Should fix #11
2017-03-01 12:25:07 +01:00
J. Fernando Sánchez
a0ff528a4b Improved docs and client
* Client now raises an exception on error
* Added conversion to the documentation
2017-02-28 19:38:01 +01:00
J. Fernando Sánchez
97bd245dfc Changed data directory 2017-02-28 18:31:43 +01:00
J. Fernando Sánchez
d8b59d06a4 Converted Ekman2VAD to centroids
* Changed the way modules are imported -> we can now use dotted
  notation (e.g. senpy.plugins.conversion.centroids)
* Refactored ekman2vad's plugin -> generic centroids
* Added some basic tests
2017-02-28 05:28:55 +01:00
J. Fernando Sánchez
453b9f3257 Fixed bugs in Ekman2VAD 2017-02-28 04:01:05 +01:00
J. Fernando Sánchez
5fb858f5fc Fixed error when installing dependencies 2017-02-28 02:24:49 +01:00
J. Fernando Sánchez
bd984a1437 Fix 5 2017-02-27 21:22:10 +01:00
J. Fernando Sánchez
e741b565a1 Fix 4 2017-02-27 20:44:27 +01:00
J. Fernando Sánchez
668a803d89 Will anything break this time? We shall see 2017-02-27 20:38:55 +01:00
J. Fernando Sánchez
9daae8dda7 Please, please, please let it pass!
Am I a complete moron?
2017-02-27 20:22:55 +01:00
J. Fernando Sánchez
c72094b94b Fixed IMAGE names in GL CI 2017-02-27 20:08:10 +01:00
J. Fernando Sánchez
15d456d048 Testing docker in travis 2017-02-27 19:51:53 +01:00
J. Fernando Sánchez
fef06d4333 Fixed image creation issue with GL CI 2017-02-27 19:37:53 +01:00
J. Fernando Sánchez
454aa61fba Fixed CI problem 2017-02-27 19:31:52 +01:00
J. Fernando Sánchez
ba2e18125c Deployment changes
* Docker all the things!
* Make all the things!
* Fixed version.sh
2017-02-27 19:16:43 +01:00
J. Fernando Sánchez
9f6a6f5ecd Loads of changes!
* Added conversion plugins (API might change!)
* Added conversion to the analysis pipeline
* Changed behaviour of --default-plugins (it adds conversion plugins regardless)
* Added emotionModel [sic] and emotionConversion models

//TODO add conversion tests
//TODO add conversion to docs
2017-02-27 12:01:19 +01:00
J. Fernando Sánchez
3cea7534ef New versioning
Use git to automatically fetch the version
2017-02-17 16:21:44 +01:00
J. Fernando Sánchez
7eaf303124 Added coverage tests 2017-02-17 11:24:57 +01:00
J. Fernando Sánchez
b4ca5f4a7c Several fixes and changes
* Added interactive debugging
* Better exception logging
* More tests for errors
* Added ONBUILD to dockerfile
  Now creating new images based on senpy's is as easy as:
  ```from senpy:<version>```. This will automatically mount the code to
  /senpy-plugins and install all dependencies
* Added /data as a VOLUME
* Added `--use-wheel` to pip install both on the image and in the
  auto-install function.
* Closes #9

Break compatibilitity:

* Removed ability to (de)activate plugins through the web
2017-02-17 09:56:53 +01:00
J. Fernando Sánchez
3311af2167 Bumped to v0.7.1 2017-02-13 20:43:27 +01:00
J. Fernando Sánchez
a4694dff2c Merge branch 'gitlabci' 2017-02-13 20:42:04 +01:00
J. Fernando Sánchez
6cb669cdb1 Added docker auth to docker push job 2017-02-13 20:36:12 +01:00
J. Fernando Sánchez
506feec13d Fixed docker push 2017-02-13 20:24:10 +01:00
J. Fernando Sánchez
2e3a6b7c84 TAGNAME->SLUG and cache in .eggs 2017-02-13 20:07:20 +01:00
J. Fernando Sánchez
7cc8b562f4 Moved before_script to images 2017-02-13 19:43:52 +01:00
J. Fernando Sánchez
528bbcac35 Added gitlab-ci docker build jobs 2017-02-13 19:41:18 +01:00
J. Fernando Sánchez
068241fb72 CI_REGISTRY_NAME 2017-02-13 18:34:35 +01:00
J. Fernando Sánchez
39d86a2050 Configured runner to mount socket 2017-02-13 18:29:38 +01:00
J. Fernando Sánchez
5371c83ab0 speeding up testing of the CI pipeline 2017-02-13 17:35:00 +01:00
J. Fernando Sánchez
673992dbe8 Docker dind service made global 2017-02-13 17:16:38 +01:00
J. Fernando Sánchez
eb3a42c247 Updated gitlabci 2017-02-13 12:30:44 +01:00
J. Fernando Sánchez
20357d2a0d Added gitlab CI 2017-02-13 12:04:29 +01:00
J. Fernando Sánchez
e9d7980e42 Merge branch 'jsonplay' into 'master'
Jsonplay

Closes #8

See merge request !9
2017-02-09 14:03:19 +00:00
J. Fernando Sánchez
908090f634 Released v0.7
Bug-fixes and improvements:
* Closes #5
* Closes #1
* Adds Client (beta)
* Added several schemas
* Lighter string representation -> should avoid delays in the analysis
  with plugins that have 'heavy' attributes

Backwards-incompatible changes:
* Context in headers by default
* All schemas include a "@type" argument that is used for autodetection
  in the client

... And possibly many more, this is still <1.0
2017-02-08 21:55:17 +01:00
militarpancho
cb963dc438 Playground improved. This closes #8 2017-02-06 14:08:13 +01:00
militarpancho
477cb18db1 Added tabs to choose view for the response. #8 2017-02-03 14:33:14 +01:00
J. Fernando Sánchez
fbf0384985 Replaced gevent with threading
* Replaced gevent (testing)
* Trying the slim python image (1/3 of previous size)
2017-02-02 16:35:58 +01:00
militarpancho
7a2c016cc6 added jsoneditor javascript plugin in relation with issue #8 2017-02-02 14:31:37 +01:00
J. Fernando Sánchez
b072121e20 Added Model string representation
This should help with performance issues with models that have large
private variables.
2017-02-02 05:01:40 +01:00
J. Fernando Sánchez
ceed9b97d0 Entries should be a set instead of lists
This allows for better framing when two entries have the same @id
2017-01-10 17:01:28 +01:00
J. Fernando Sánchez
2dbdb58b06 Fixed bug with sdist's name convention 2017-01-10 16:59:28 +01:00
J. Fernando Sánchez
db30257373 Flake8, Semver, Pre-commit
* Added pre-commit: http://pre-commit.com
* Fixed flake8 errors
* Added flake8 pre-commit hooks
* Added pre-commit to Makefile
* Changed VERSION numbering
* Changed versioning to match PEP-0440
2017-01-10 16:25:01 +01:00
J. Fernando Sánchez
7fd69cc690 YAPFed 2017-01-10 10:19:32 +01:00
J. Fernando Sánchez
b543a4614e Improved schema validation
* Added debug Dockerfile/Makefile
* Validation of examples in docs
2017-01-10 10:02:14 +01:00
J. Fernando Sánchez
bc1f9e4cf5 Split definitions into individual files 2016-12-26 18:49:53 +01:00
J. Fernando Sánchez
d72a995fa9 New shelf location and better shelf tests 2016-12-26 17:45:17 +01:00
J. Fernando Sánchez
40b67503ce Updated links in README 2016-12-19 13:17:56 +01:00
J. Fernando Sánchez
8624562f02 Dockerfiles not ignored anymore 2016-12-14 17:06:50 +01:00
J. Fernando Sánchez
4dee623ef9 Better makefile 2016-12-14 14:38:58 +01:00
J. Fernando Sánchez
2e7530d9bc Bumped to 0.6.1 2016-09-21 20:29:16 +02:00
J. Fernando Sánchez
07b5dd3823 Added option to install dependencies in CLI 2016-09-21 20:27:30 +02:00
J. Fernando Sánchez
0d511ad3c3 Bumped to 0.6.0
* Downloads pip requirements
* Modified Makefile
2016-09-21 19:00:20 +02:00
J. Fernando Sánchez
7205a0e7b2 Moved to gsiupm
* Updated sphinx docs to include schemas and version
* Added docker push to makefile
2016-09-21 10:10:49 +02:00
J. Fernando Sánchez
fff38bf825 Added 3.4 to travis 2016-09-20 20:58:37 +02:00
J. Fernando Sánchez
5d5de0bc50 Makefile for automated testing (no more drone) 2016-09-20 20:55:59 +02:00
J. Fernando Sánchez
0454fb1afe Updated installation instructions 2016-07-13 16:08:02 +02:00
J. Fernando Sánchez
5e36c71fa7 Merge branch 'master' of github.com:gsi-upm/senpy 2016-05-06 11:54:32 +02:00
J. Fernando Sánchez
c8e742f96e Bumped 0.5.6 2016-05-06 11:50:46 +02:00
J. Fernando Sánchez
1e7ae13700 Substituted json with yaml 2016-05-06 11:46:20 +02:00
NachoCP
bf30c04a52 Updated readthedocs 2016-03-31 11:56:32 +02:00
J. Fernando Sánchez
16ce767d09 v0.5.5 Added CLI 2016-03-02 16:48:48 +01:00
J. Fernando Sánchez
39761e0922 Fix for heroku 2016-03-02 08:02:09 +01:00
J. Fernando Sánchez
03eb38c12d Added CLI and refactored argument parsing 2016-03-02 05:07:48 +01:00
J. Fernando Sánchez
a50f026701 Fixed Playground 2016-02-22 15:51:54 +01:00
J. Fernando Sánchez
b8339e397b Improved request handling
Also:
 * Shelve -> Pickle to avoid weird db problems
 * Serving schemas and contexts
2016-02-21 19:36:24 +01:00
J. Fernando Sánchez
407d17b2b9 __version__ in the module itself 2016-02-21 11:08:00 +01:00
J. Fernando Sánchez
56fef9e835 --amend 2016-02-21 03:00:11 +01:00
J. Fernando Sánchez
14a3e4103b Prefix handling and bug fixes 2016-02-21 02:53:39 +01:00
J. Fernando Sánchez
48d7d1d02e Improved plugins API and loading
Also:

* added drone-ci integration: tests for py2.7 and py3
2016-02-20 18:19:52 +01:00
J. Fernando Sánchez
14c9f61864 Python 3 compatible
There are also some slight changes to the JSON schemas and the use of
JSON-LD.
2016-02-20 16:17:37 +01:00
J. Fernando Sánchez
a79df7a3da Closer to py3 2016-02-20 16:07:47 +01:00
J. Fernando Sánchez
ecc2a8312a Still not functional 2016-02-20 16:07:47 +01:00
J. Fernando Sánchez
aafd6a0938 Merge branch 'nacho' 2016-02-19 20:00:21 +01:00
NachoCP
b88d6c53e0 Update Dockerfile 2016-02-10 11:25:20 +01:00
NachoCP
4f2aee5681 DockerFile updated to deploy plugins 2016-02-03 13:43:31 +01:00
NachoCP
a5c27bcaba Test Changed 2016-02-02 11:34:30 +01:00
NachoCP
cefd6331e0 Test Changed 2016-02-02 11:31:23 +01:00
NachoCP
c2bb93e86c Test Changed 2016-02-02 11:27:56 +01:00
NachoCP
091104bc7d Test Changed 2016-02-02 11:13:17 +01:00
NachoCP
81cbe5ea54 Test Changed 2016-02-02 11:09:56 +01:00
NachoCP
ab2c1f73e4 Test Changed 2016-02-02 11:03:47 +01:00
NachoCP
6a84af1c5a Test Changed 2016-02-02 10:44:13 +01:00
NachoCP
5983493b78 Test Changed 2016-02-02 10:39:59 +01:00
NachoCP
61deabe13e DockerFile 2015-12-22 16:04:39 +01:00
J. Fernando Sánchez
bb1b4d3266 Added analyse to the docs 2015-12-17 19:48:28 +01:00
NachoCP
703fb68b27 Merge branch 'master' of github.com:gsi-upm/senpy into nacho 2015-12-14 09:27:31 +01:00
J. Fernando Sánchez
6fe68e3c40 Fixes #3 2015-12-11 14:53:30 +01:00
NachoCP
7b9f8a8bef Merge branch 'master' of github.com:gsi-upm/senpy into nacho 2015-12-11 09:49:35 +01:00
J. Fernando Sánchez
82496dc8e4 Trying to fix an issue with ShelfPlugin 2015-12-10 16:13:22 +01:00
J. Fernando Sánchez
f74ee668b6 Cleaning 2015-11-27 15:05:59 +01:00
NachoCP
d304dec2f7 Update Playground 2015-11-17 11:42:16 +01:00
J. Fernando Sánchez
45838e7e98 Wording :) 2015-11-05 19:20:13 +01:00
J. Fernando Sánchez
ff002c818a Bumped to 0.4.10 2015-11-05 19:18:38 +01:00
J. Fernando Sánchez
79d6b6f67f Added plugins FAQ to the docs 2015-11-05 19:11:35 +01:00
J. Fernando Sánchez
b8993f7d64 Added shelve mixin 2015-11-05 18:50:37 +01:00
J. Fernando Sánchez
bd2e0f0d5c Added traceback to plugin activation 2015-11-05 18:48:07 +01:00
J. Fernando Sánchez
7de5b41340 Started readthedocs and improved README 2015-10-28 21:25:56 +01:00
J. Fernando Sánchez
a63e9209fd Added gitignore 2015-10-28 15:05:51 +01:00
J. Fernando Sánchez
b0eb2e0628 Fixed error with Sentiment140 2015-10-08 19:39:12 +02:00
J. Fernando Sánchez
60415f8217 Added Dockerfile and instructions 2015-10-08 19:26:02 +02:00
J. Fernando Sánchez
724eac38d8 Bumped to 0.4.8 2015-10-08 19:21:24 +02:00
J. Fernando Sánchez
8fa372de15 Added new launch option to the README 2015-10-08 18:58:21 +02:00
J. Fernando Sánchez
a1ffe04a30 More sensible exceptions when importing 2015-10-08 18:23:37 +02:00
J. Fernando Sánchez
74b0cf868e Added console script 2015-10-08 18:20:16 +02:00
J. Fernando Sánchez
50e8e2730b Added default plugins to app.py 2015-10-06 14:39:42 +02:00
J. Fernando Sánchez
b484b453e0 Added indentation and default plugins
* setup.py:
2015-09-29 11:14:28 +02:00
J. Fernando Sánchez
7c2e0ddec7 Added plugins by default and monkey patching
Fixes #2
2015-06-18 17:53:15 +02:00
J. Fernando Sánchez
384aba4654 Sleep plugin sleeps on request too 2015-04-26 21:08:36 +02:00
J. Fernando Sánchez
a857dd3042 Typo in README 2015-04-26 20:47:29 +02:00
J. Fernando Sánchez
b1b672f66d Quick note about using and installing 2015-02-24 18:04:34 +01:00
J. Fernando Sánchez
09d9143a82 Added docs 2015-02-24 09:03:36 +01:00
J. Fernando Sánchez
c1a6b57ac5 Better readme, fixed app.py 2015-02-24 08:43:00 +01:00
J. Fernando Sánchez
6b78b7ccc7 Fixed requirements 2015-02-24 07:54:58 +01:00
J. Fernando Sánchez
f0b1cfcba6 Forcing Travis 2015-02-24 07:50:14 +01:00
J. Fernando Sánchez
4bcd046016 Added TravisCI 2015-02-24 07:44:29 +01:00
J. Fernando Sánchez
ae09f609c2 Improved message when no plugins are found 2015-02-24 07:37:51 +01:00
J. Fernando Sánchez
d1006bbc92 PEP8+Better JSON-LD support
* The API has also changed, there are new parameters to send the
context as part of the headers.
* Improved tests
* PEP8 compliance (despite the line about gevent)
2015-02-24 07:15:25 +01:00
J. Fernando Sánchez
d58137e8f9 Changed to GSI-UPM 2015-02-23 20:22:58 +01:00
J. Fernando Sánchez
79c83e34a3 Added random plugin and other features 2015-02-23 02:13:31 +01:00
J. Fernando Sánchez
37a098109f Module script and improvement in JSON-LD 2014-12-02 13:31:15 +01:00
J. Fernando Sánchez
ff14925056 Improved plugins, better tests, gevent
Moved from Yapsy again (it is not flexible enough), now we use a
custom solution.
The activation and deactivation of plugins is asynchronous, so
that plugins that take a long time don't interfere with the rest.
2014-12-01 18:27:20 +01:00
J. Fernando Sánchez
10f4782ad7 Better NIF compliance 2014-12-01 09:38:23 +01:00
J. Fernando Sánchez
4351f76b60 Removed unnecessary contexts 2014-11-27 17:43:19 +01:00
J. Fernando Sánchez
86f45f8147 JSON-LD contexts and prefixes 2014-11-27 17:39:36 +01:00
J. Fernando Sánchez
2834967026 Better jsonld support 2014-11-27 11:27:05 +01:00
J. Fernando Sánchez
2f7a8d7267 Fixed setup.py and pip 2014-11-20 20:54:57 +01:00
J. Fernando Sánchez
2b68838514 PEP8 compliance 2014-11-20 19:29:49 +01:00
J. Fernando Sánchez
eaf65f0c6b First tests 2014-11-07 19:12:21 +01:00
J. Fernando Sánchez
a5e79bead3 Version 0.2.5 - Pypi 2014-11-05 19:21:17 +01:00
J. Fernando Sánchez
54492472ba Files for deployment 2014-11-05 19:17:27 +01:00
J. Fernando Sánchez
ff8d12074b Improved plugins (reload, imp) 2014-11-04 21:31:41 +01:00
J. Fernando Sánchez
bdf1992775 Fixed plugins 2014-10-17 19:14:49 +02:00
J. Fernando Sánchez
e06fc2e671 V 0.2.2 - Better plugins 2014-10-17 19:06:19 +02:00
183 changed files with 60580 additions and 400 deletions

9
.drone.yml Normal file
View File

@@ -0,0 +1,9 @@
build:
image: python:$$PYTHON_VERSION
commands:
- python setup.py test
matrix:
PYTHON_VERSION:
- 2.7
- 3.4

10
.gitignore vendored Normal file
View File

@@ -0,0 +1,10 @@
*.pyc
.*
*egg-info
dist
build
README.html
__pycache__
VERSION
Dockerfile-*
Dockerfile

106
.gitlab-ci.yml Normal file
View File

@@ -0,0 +1,106 @@
# Uncomment if you want to use docker-in-docker
# image: gsiupm/dockermake:latest
# services:
# - docker:dind
# When using dind, it's wise to use the overlayfs driver for
# improved performance.
stages:
- test
- push
- deploy
- clean
before_script:
- docker login -u $HUB_USER -p $HUB_PASSWORD
.test: &test_definition
stage: test
script:
- make -e test-$PYTHON_VERSION
test-3.5:
<<: *test_definition
variables:
PYTHON_VERSION: "3.5"
test-2.7:
<<: *test_definition
variables:
PYTHON_VERSION: "2.7"
.image: &image_definition
stage: push
script:
- make -e push-$PYTHON_VERSION
only:
- tags
- triggers
push-3.5:
<<: *image_definition
variables:
PYTHON_VERSION: "3.5"
push-2.7:
<<: *image_definition
variables:
PYTHON_VERSION: "2.7"
push-latest:
<<: *image_definition
variables:
PYTHON_VERSION: latest
only:
- master
- triggers
push-github:
stage: deploy
script:
- make -e push-github
only:
- master
- triggers
deploy_pypi:
stage: deploy
script: # Configure the PyPI credentials, then push the package, and cleanup the creds.
- echo "[server-login]" >> ~/.pypirc
- echo "username=" ${PYPI_USER} >> ~/.pypirc
- echo "password=" ${PYPI_PASSWORD} >> ~/.pypirc
- make pip_upload
- echo "" > ~/.pypirc && rm ~/.pypirc # If the above fails, this won't run.
only:
- /^v?\d+\.\d+\.\d+([abc]\d*)?$/ # PEP-440 compliant version (tags)
except:
- branches
deploy:
stage: deploy
environment: test
script:
- make -e deploy
only:
- master
push-github:
stage: deploy
script:
- make -e push-github
only:
- master
- triggers
clean :
stage: clean
script:
- make -e clean
when: manual
cleanup_py:
stage: clean
when: always # this is important; run even if preceding stages failed.
script:
- rm -vf ~/.pypirc # we don't want to leave these around, but GitLab may clean up anyway.
- docker logout

5
.pre-commit-config.yaml Normal file
View File

@@ -0,0 +1,5 @@
- repo: git://github.com/pre-commit/pre-commit-hooks
sha: e626cd57090d8df0be21e4df0f4e55cc3511d6ab
hooks:
- id: flake8
- id: check-json

12
.travis.yml Normal file
View File

@@ -0,0 +1,12 @@
sudo: required
services:
- docker
language: python
env:
- PYV=2.7
- PYV=3.5
# run nosetests - Tests
script: make test-$PYV

22
Dockerfile.template Normal file
View File

@@ -0,0 +1,22 @@
from python:{{PYVERSION}}
MAINTAINER J. Fernando Sánchez <jf.sanchez@upm.es>
RUN mkdir /cache/ /senpy-plugins /data/
VOLUME /data/
ENV PIP_CACHE_DIR=/cache/ SENPY_DATA=/data
ONBUILD COPY . /senpy-plugins/
ONBUILD RUN python -m senpy --only-install -f /senpy-plugins
ONBUILD WORKDIR /senpy-plugins/
WORKDIR /usr/src/app
COPY test-requirements.txt requirements.txt /usr/src/app/
RUN pip install --use-wheel -r test-requirements.txt -r requirements.txt
COPY . /usr/src/app/
RUN pip install --no-index --no-deps --editable .
ENTRYPOINT ["python", "-m", "senpy", "-f", "/senpy-plugins/", "--host", "0.0.0.0"]

9
MANIFEST.in Normal file
View File

@@ -0,0 +1,9 @@
include requirements.txt
include test-requirements.txt
include README.rst
include senpy/VERSION
graft senpy/plugins
graft senpy/schemas
graft senpy/templates
graft senpy/static
graft img

147
Makefile Normal file
View File

@@ -0,0 +1,147 @@
NAME=senpy
VERSION=$(shell git describe --tags --dirty 2>/dev/null)
GITHUB_REPO=git@github.com:gsi-upm/senpy.git
IMAGENAME=gsiupm/senpy
IMAGEWTAG=$(IMAGENAME):$(VERSION)
PYVERSIONS=3.5 2.7
PYMAIN=$(firstword $(PYVERSIONS))
DEVPORT=5000
TARNAME=$(NAME)-$(VERSION).tar.gz
action="test-${PYMAIN}"
GITHUB_REPO=git@github.com:gsi-upm/senpy.git
KUBE_CA_PEM_FILE=""
KUBE_URL=""
KUBE_TOKEN=""
KUBE_NAMESPACE=$(NAME)
KUBECTL=docker run --rm -v $(KUBE_CA_PEM_FILE):/tmp/ca.pem -v $$PWD:/tmp/cwd/ -i lachlanevenson/k8s-kubectl --server="$(KUBE_URL)" --token="$(KUBE_TOKEN)" --certificate-authority="/tmp/ca.pem" -n $(KUBE_NAMESPACE)
CI_REGISTRY=docker.io
CI_REGISTRY_USER=gitlab
CI_BUILD_TOKEN=""
CI_COMMIT_REF_NAME=master
all: build run
.FORCE:
version: .FORCE
@echo $(VERSION) > $(NAME)/VERSION
@echo $(VERSION)
yapf:
yapf -i -r $(NAME)
yapf -i -r tests
init:
pip install --user pre-commit
pre-commit install
dockerfiles: $(addprefix Dockerfile-,$(PYVERSIONS))
@unlink Dockerfile >/dev/null
ln -s Dockerfile-$(PYMAIN) Dockerfile
Dockerfile-%: Dockerfile.template
sed "s/{{PYVERSION}}/$*/" Dockerfile.template > Dockerfile-$*
quick_build: $(addprefix build-, $(PYMAIN))
build: $(addprefix build-, $(PYVERSIONS))
build-%: version Dockerfile-%
docker build -t '$(IMAGEWTAG)-python$*' --cache-from $(IMAGENAME):python$* -f Dockerfile-$* .;
quick_test: $(addprefix test-,$(PYMAIN))
dev-%:
@docker start $(NAME)-dev$* || (\
$(MAKE) build-$*; \
docker run -d -w /usr/src/app/ -p $(DEVPORT):5000 -v $$PWD:/usr/src/app --entrypoint=/bin/bash -ti --name $(NAME)-dev$* '$(IMAGEWTAG)-python$*'; \
)\
docker exec -ti $(NAME)-dev$* bash
dev: dev-$(PYMAIN)
test-all: $(addprefix test-,$(PYVERSIONS))
test-%:
docker run --rm --entrypoint /usr/local/bin/python -w /usr/src/app $(IMAGENAME):python$* setup.py test
test: test-$(PYMAIN)
dist/$(TARNAME): version
python setup.py sdist;
sdist: dist/$(TARNAME)
pip_test-%: sdist
docker run --rm -v $$PWD/dist:/dist/ python:$* pip install /dist/$(TARNAME);
pip_test: $(addprefix pip_test-,$(PYVERSIONS))
pip_upload: pip_test
python setup.py sdist upload ;
clean:
@docker ps -a | grep $(IMAGENAME) | awk '{ split($$2, vers, "-"); if(vers[0] != "${VERSION}"){ print $$1;}}' | xargs docker rm -v 2>/dev/null|| true
@docker images | grep $(IMAGENAME) | awk '{ split($$2, vers, "-"); if(vers[0] != "${VERSION}"){ print $$1":"$$2;}}' | xargs docker rmi 2>/dev/null|| true
@docker stop $(addprefix $(NAME)-dev,$(PYVERSIONS)) 2>/dev/null || true
@docker rm $(addprefix $(NAME)-dev,$(PYVERSIONS)) 2>/dev/null || true
git_commit:
git commit -a
git_tag:
git tag ${VERSION}
git_push:
git push --tags origin master
run-%: build-%
docker run --rm -p $(DEVPORT):5000 -ti '$(IMAGEWTAG)-python$(PYMAIN)' --default-plugins
run: run-$(PYMAIN)
push-latest: $(addprefix push-latest-,$(PYVERSIONS))
docker tag '$(IMAGEWTAG)-python$(PYMAIN)' '$(IMAGEWTAG)'
docker tag '$(IMAGEWTAG)-python$(PYMAIN)' '$(IMAGENAME)'
docker push '$(IMAGENAME):latest'
docker push '$(IMAGEWTAG)'
push-latest-%: build-%
docker tag $(IMAGENAME):$(VERSION)-python$* $(IMAGENAME):python$*
docker push $(IMAGENAME):$(VERSION)-python$*
docker push $(IMAGENAME):python$*
push-%: build-%
docker push $(IMAGENAME):$(VERSION)-python$*
push: $(addprefix push-,$(PYVERSIONS))
docker tag '$(IMAGEWTAG)-python$(PYMAIN)' '$(IMAGEWTAG)'
docker push $(IMAGENAME):$(VERSION)
push-github:
$(eval KEY_FILE := $(shell mktemp))
@echo "$$GITHUB_DEPLOY_KEY" > $(KEY_FILE)
@git remote rm github-deploy || true
git remote add github-deploy $(GITHUB_REPO)
@GIT_SSH_COMMAND="ssh -i $(KEY_FILE)" git fetch github-deploy $(CI_COMMIT_REF_NAME) || true
@GIT_SSH_COMMAND="ssh -i $(KEY_FILE)" git push github-deploy $(CI_COMMIT_REF_NAME)
rm $(KEY_FILE)
ci:
gitlab-runner exec docker --docker-volumes /var/run/docker.sock:/var/run/docker.sock --env CI_PROJECT_NAME=$(NAME) ${action}
deploy:
@$(KUBECTL) delete secret $(CI_REGISTRY) || true
@$(KUBECTL) create secret docker-registry $(CI_REGISTRY) --docker-server=$(CI_REGISTRY) --docker-username=$(CI_REGISTRY_USER) --docker-email=$(CI_REGISTRY_USER) --docker-password=$(CI_BUILD_TOKEN)
@$(KUBECTL) apply -f /tmp/cwd/k8s/
.PHONY: test test-% test-all build-% build test pip_test run yapf push-main push-% dev ci version .FORCE deploy

View File

@@ -1 +1 @@
web: gunicorn app:app --log-file=-
web: python -m senpy --host 0.0.0.0 --port $PORT --default-plugins

View File

@@ -1,19 +0,0 @@
![GSI Logo](logo.png)
[Senpy](http://senpy.herokuapp.com)
=========================================
Example endpoint that yields results compatible with the EUROSENTIMENT format and exposes the NIF API.
It can be used as a template to adapt existing services to EUROSENTIMENT or to create new services.
[DEMO on Heroku](http://eurosentiment-endpoint.herokuapp.com)
This endpoint serves as bootcampt for any developer wishing to build applications that use the EUROSENTIMENT services.
Acknowledgement
---------------
EUROSENTIMENT PROJECT
Grant Agreement no: 296277
Starting date: 01/09/2012
Project duration: 24 months
![Eurosentiment Logo](logo_grande.png)
![FP7 logo](logo_fp7.gif)

140
README.rst Normal file
View File

@@ -0,0 +1,140 @@
.. image:: img/header.png
:height: 6em
:target: http://demos.gsi.dit.upm.es/senpy
.. image:: https://travis-ci.org/gsi-upm/senpy.svg?branch=master
:target: https://travis-ci.org/gsi-upm/senpy
Senpy lets you create sentiment analysis web services easily, fast and using a well known API.
As a bonus, senpy services use semantic vocabularies (e.g. `NIF <http://persistence.uni-leipzig.org/nlp2rdf/>`_, `Marl <http://www.gsi.dit.upm.es/ontologies/marl>`_, `Onyx <http://www.gsi.dit.upm.es/ontologies/onyx>`_) and formats (turtle, JSON-LD, xml-rdf).
Have you ever wanted to turn your sentiment analysis algorithms into a service?
With senpy, now you can.
It provides all the tools so you just have to worry about improving your algorithms:
`See it in action. <http://senpy.cluster.gsi.dit.upm.es/>`_
Installation
------------
The stable version can be installed in three ways.
Through PIP
***********
.. code:: bash
pip install -U --user senpy
Alternatively, you can use the development version:
.. code:: bash
git clone http://github.com/gsi-upm/senpy
cd senpy
pip install --user .
If you want to install senpy globally, use sudo instead of the ``--user`` flag.
Docker Image
************
Build the image or use the pre-built one: ``docker run -ti -p 5000:5000 gsiupm/senpy --default-plugins``.
To add custom plugins, add a volume and tell senpy where to find the plugins: ``docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --default-plugins -f /plugins``
Developing
----------
Developing/debugging
********************
This command will run the senpy container using the latest image available, mounting your current folder so you get your latest code:
.. code:: bash
# Python 3.5
make dev
# Python 2.7
make dev-2.7
Building a docker image
***********************
.. code:: bash
# Python 3.5
make build-3.5
# Python 2.7
make build-2.7
Testing
*******
.. code:: bash
make test
Running
*******
This command will run the senpy server listening on localhost:5000
.. code:: bash
# Python 3.5
make run-3.5
# Python 2.7
make run-2.7
Usage
-----
However, the easiest and recommended way is to just use the command-line tool to load your plugins and launch the server.
.. code:: bash
senpy
or, alternatively:
.. code:: bash
python -m senpy
This will create a server with any modules found in the current path.
For more options, see the `--help` page.
Alternatively, you can use the modules included in senpy to build your own application.
Deploying on Heroku
-------------------
Use a free heroku instance to share your service with the world.
Just use the example Procfile in this repository, or build your own.
`DEMO on heroku <http://senpy.herokuapp.com>`_
For more information, check out the `documentation <http://senpy.readthedocs.org>`_.
------------------------------------------------------------------------------------
Acknowledgement
---------------
This development has been partially funded by the European Union through the MixedEmotions Project (project number H2020 655632), as part of the `RIA ICT 15 Big data and Open Data Innovation and take-up` programme.
.. image:: img/me.png
:target: http://mixedemotions-project.eu
:height: 100px
:alt: MixedEmotions Logo
.. image:: img/eu-flag.jpg
:height: 100px
:target: http://ec.europa.eu/research/participants/portal/desktop/en/opportunities/index.html

33
app.py
View File

@@ -1,33 +0,0 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes
# DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
'''
Simple Sentiment Analysis server for EUROSENTIMENT
This class shows how to use the nif_server module to create custom services.
'''
import config
from flask import Flask
from senpy import Senpy
app = Flask(__name__)
sp = Senpy()
sp.init_app(app)
if __name__ == '__main__':
app.debug = config.DEBUG
app.run()

1
dev-requirements.txt Normal file
View File

@@ -0,0 +1 @@
mock

1
docs/.gitignore vendored Normal file
View File

@@ -0,0 +1 @@
_build

177
docs/Makefile Normal file
View File

@@ -0,0 +1,177 @@
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
BUILDDIR = _build
# User-friendly check for sphinx-build
ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
endif
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " dirhtml to make HTML files named index.html in directories"
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " latexpdf to make LaTeX files and run them through pdflatex"
@echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
@echo " text to make text files"
@echo " man to make manual pages"
@echo " texinfo to make Texinfo files"
@echo " info to make Texinfo files and run them through makeinfo"
@echo " gettext to make PO message catalogs"
@echo " changes to make an overview of all changed/added/deprecated items"
@echo " xml to make Docutils-native XML files"
@echo " pseudoxml to make pseudoxml-XML files for display purposes"
@echo " linkcheck to check all external links for integrity"
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
clean:
rm -rf $(BUILDDIR)/*
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
singlehtml:
$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
@echo
@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
pickle:
$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
@echo
@echo "Build finished; now you can process the pickle files."
json:
$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
@echo
@echo "Build finished; now you can process the JSON files."
htmlhelp:
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in $(BUILDDIR)/htmlhelp."
qthelp:
$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
@echo
@echo "Build finished; now you can run "qcollectiongenerator" with the" \
".qhcp project file in $(BUILDDIR)/qthelp, like this:"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/Senpy.qhcp"
@echo "To view the help file:"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/Senpy.qhc"
devhelp:
$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
@echo
@echo "Build finished."
@echo "To view the help file:"
@echo "# mkdir -p $$HOME/.local/share/devhelp/Senpy"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/Senpy"
@echo "# devhelp"
epub:
$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
@echo
@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
latex:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo
@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
@echo "Run \`make' in that directory to run these through (pdf)latex" \
"(use \`make latexpdf' here to do that automatically)."
latexpdf:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through pdflatex..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
latexpdfja:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through platex and dvipdfmx..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
text:
$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
@echo
@echo "Build finished. The text files are in $(BUILDDIR)/text."
man:
$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
@echo
@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
texinfo:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo
@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
@echo "Run \`make' in that directory to run these through makeinfo" \
"(use \`make info' here to do that automatically)."
info:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo "Running Texinfo files through makeinfo..."
make -C $(BUILDDIR)/texinfo info
@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
gettext:
$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
@echo
@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
changes:
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
@echo
@echo "The overview file is in $(BUILDDIR)/changes."
linkcheck:
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in $(BUILDDIR)/linkcheck/output.txt."
doctest:
$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
@echo "Testing of doctests in the sources finished, look at the " \
"results in $(BUILDDIR)/doctest/output.txt."
xml:
$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
@echo
@echo "Build finished. The XML files are in $(BUILDDIR)/xml."
pseudoxml:
$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
@echo
@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."

317
docs/SenpyClientUse.ipynb Normal file
View File

@@ -0,0 +1,317 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:05:31.465571Z",
"start_time": "2017-04-10T19:05:31.458282+02:00"
},
"deletable": true,
"editable": true
},
"source": [
"# Client"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"source": [
"The built-in senpy client allows you to query any Senpy endpoint. We will illustrate how to use it with the public demo endpoint, and then show you how to spin up your own endpoint using docker."
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Demo Endpoint\n",
"-------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To start using senpy, simply create a new Client and point it to your endpoint. In this case, the latest version of Senpy at GSI."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:12.827640Z",
"start_time": "2017-04-10T19:29:12.818617+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"from senpy.client import Client\n",
"\n",
"c = Client('http://latest.senpy.cluster.gsi.dit.upm.es/api')\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Now, let's use that client analyse some queries:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:14.011657Z",
"start_time": "2017-04-10T19:29:13.701808+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"r = c.analyse('I like sugar!!', algorithm='sentiment140')\n",
"r"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:08:19.616754Z",
"start_time": "2017-04-10T19:08:19.610767+02:00"
},
"deletable": true,
"editable": true
},
"source": [
"As you can see, that gave us the full JSON result. A more concise way to print it would be:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:14.854213Z",
"start_time": "2017-04-10T19:29:14.842068+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"for entry in r.entries:\n",
" print('{} -> {}'.format(entry['text'], entry['sentiments'][0]['marl:hasPolarity']))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"We can also obtain a list of available plugins with the client:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:16.245198Z",
"start_time": "2017-04-10T19:29:16.056545+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"c.plugins()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Or, more concisely:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:17.663275Z",
"start_time": "2017-04-10T19:29:17.484623+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"c.plugins().keys()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"Local Endpoint\n",
"--------------"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To run your own instance of senpy, just create a docker container with the latest Senpy image. Using `--default-plugins` you will get some extra plugins to start playing with the API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:20.637539Z",
"start_time": "2017-04-10T19:29:19.938322+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"!docker run -ti --name 'SenpyEndpoint' -d -p 6000:5000 gsiupm/senpy:0.8.6 --host 0.0.0.0 --default-plugins"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To use this endpoint:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:21.263976Z",
"start_time": "2017-04-10T19:29:21.260595+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"c_local = Client('http://127.0.0.1:6000/api')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"That's all! After you are done with your analysis, stop the docker container:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2017-04-10T17:29:33.226686Z",
"start_time": "2017-04-10T19:29:22.392121+02:00"
},
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"!docker stop SenpyEndpoint\n",
"!docker rm SenpyEndpoint"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "68px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 4,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}

BIN
docs/_static/header.png vendored Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 208 KiB

1
docs/_static/schemas vendored Symbolic link
View File

@@ -0,0 +1 @@
../../senpy/schemas/

11
docs/about.rst Normal file
View File

@@ -0,0 +1,11 @@
About
--------
If you use Senpy in your research, please cite `Senpy: A Pragmatic Linked Sentiment Analysis Framework <http://gsi.dit.upm.es/index.php/es/investigacion/publicaciones?view=publication&task=show&id=417>`__ (`BibTex <http://gsi.dit.upm.es/index.php/es/investigacion/publicaciones?controller=publications&task=export&format=bibtex&id=417>`__):
.. code-block:: text
Sánchez-Rada, J. F., Iglesias, C. A., Corcuera, I., & Araque, Ó. (2016, October).
Senpy: A Pragmatic Linked Sentiment Analysis Framework.
In Data Science and Advanced Analytics (DSAA),
2016 IEEE International Conference on (pp. 735-742). IEEE.

217
docs/api.rst Normal file
View File

@@ -0,0 +1,217 @@
NIF API
-------
.. http:get:: /api
Basic endpoint for sentiment/emotion analysis.
**Example request**:
.. sourcecode:: http
GET /api?input=I%20love%20GSI HTTP/1.1
Host: localhost
Accept: application/json, text/javascript
**Example response**:
.. sourcecode:: http
HTTP/1.1 200 OK
Vary: Accept
Content-Type: text/javascript
{
"@context":"http://127.0.0.1/api/contexts/Results.jsonld",
"@id":"_:Results_11241245.22",
"@type":"results"
"analysis": [
"plugins/sentiment-140_0.1"
],
"entries": [
{
"@id": "_:Entry_11241245.22"
"@type":"entry",
"emotions": [],
"entities": [],
"sentiments": [
{
"@id": "Sentiment0",
"@type": "sentiment",
"marl:hasPolarity": "marl:Negative",
"marl:polarityValue": 0,
"prefix": ""
}
],
"suggestions": [],
"text": "This text makes me sad.\nwhilst this text makes me happy and surprised at the same time.\nI cannot believe it!",
"topics": []
}
]
}
:query i input: No default. Depends on informat and intype
:query f informat: one of `turtle` (default), `text`, `json-ld`
:query t intype: one of `direct` (default), `url`
:query o outformat: one of `turtle` (default), `text`, `json-ld`
:query p prefix: prefix for the URIs
:query algo algorithm: algorithm/plugin to use for the analysis. For a list of options, see :http:get:`/api/plugins`. If not provided, the default plugin will be used (:http:get:`/api/plugins/default`).
:query algo emotionModel: desired emotion model in the results. If the requested algorithm does not use that emotion model, there are conversion plugins specifically for this. If none of the plugins match, an error will be returned, which includes the results *as is*.
:reqheader Accept: the response content type depends on
:mailheader:`Accept` header
:resheader Content-Type: this depends on :mailheader:`Accept`
header of request
:statuscode 200: no error
:statuscode 404: service not found
:statuscode 400: error while processing the request
.. http:post:: /api
The same as :http:get:`/api`.
.. http:get:: /api/plugins
Returns a list of installed plugins.
**Example request**:
.. sourcecode:: http
GET /api/plugins HTTP/1.1
Host: localhost
Accept: application/json, text/javascript
**Example response**:
.. sourcecode:: http
{
"@id": "plugins/sentiment-140_0.1",
"@type": "sentimentPlugin",
"author": "@balkian",
"description": "Sentiment classifier using rule-based classification for English and Spanish. This plugin uses sentiment140 data to perform classification. For more information: http://help.sentiment140.com/for-students/",
"extra_params": {
"language": {
"@id": "lang_sentiment140",
"aliases": [
"language",
"l"
],
"options": [
"es",
"en",
"auto"
],
"required": false
}
},
"is_activated": true,
"maxPolarityValue": 1.0,
"minPolarityValue": 0.0,
"module": "sentiment-140",
"name": "sentiment-140",
"requirements": {},
"version": "0.1"
},
{
"@id": "plugins/ExamplePlugin_0.1",
"@type": "sentimentPlugin",
"author": "@balkian",
"custom_attribute": "42",
"description": "I am just an example",
"extra_params": {
"parameter": {
"@id": "parameter",
"aliases": [
"parameter",
"param"
],
"default": 42,
"required": true
}
},
"is_activated": true,
"maxPolarityValue": 1.0,
"minPolarityValue": 0.0,
"module": "example",
"name": "ExamplePlugin",
"requirements": "noop",
"version": "0.1"
}
.. http:get:: /api/plugins/<pluginname>
Returns the information of a specific plugin.
**Example request**:
.. sourcecode:: http
GET /api/plugins/rand/ HTTP/1.1
Host: localhost
Accept: application/json, text/javascript
**Example response**:
.. sourcecode:: http
{
"@context": "http://127.0.0.1/api/contexts/ExamplePlugin.jsonld",
"@id": "plugins/ExamplePlugin_0.1",
"@type": "sentimentPlugin",
"author": "@balkian",
"custom_attribute": "42",
"description": "I am just an example",
"extra_params": {
"parameter": {
"@id": "parameter",
"aliases": [
"parameter",
"param"
],
"default": 42,
"required": true
}
},
"is_activated": true,
"maxPolarityValue": 1.0,
"minPolarityValue": 0.0,
"module": "example",
"name": "ExamplePlugin",
"requirements": "noop",
"version": "0.1"
}
.. http:get:: /api/plugins/default
Return the information about the default plugin.

7
docs/apischema.rst Normal file
View File

@@ -0,0 +1,7 @@
API and Examples
################
.. toctree::
vocabularies.rst
api.rst
examples.rst

View File

@@ -0,0 +1,4 @@
{
"plugins": [
]
}

View File

@@ -0,0 +1,78 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
"me:SAnalysis1",
"me:SgAnalysis1",
"me:EmotionAnalysis1",
"me:NER1",
{
"@type": "analysis",
"@id": "wrong"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}

View File

@@ -0,0 +1,18 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "http://example.com#NIFExample",
"@type": "results",
"analysis": [
],
"entries": [
{
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:beginIndex": 0,
"nif:endIndex": 40,
"nif:isString": "My favourite actress is Natalie Portman"
}
]
}

9
docs/commandline.rst Normal file
View File

@@ -0,0 +1,9 @@
Command line
============
This video shows how to analyse text directly on the command line using the senpy tool.
.. image:: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk.png
:width: 100%
:target: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk
:alt: CLI demo

288
docs/conf.py Normal file
View File

@@ -0,0 +1,288 @@
# -*- coding: utf-8 -*-
# flake8: noqa
#
# Senpy documentation build configuration file, created by
# sphinx-quickstart on Tue Feb 24 08:57:32 2015.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
import sys
import os
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))
# -- General configuration ------------------------------------------------
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'
# If your documentation needs a minimal Sphinx version, state it here.
#needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.doctest',
'sphinx.ext.todo',
'sphinxcontrib.httpdomain',
'sphinx.ext.coverage',
'sphinx.ext.autosectionlabel',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = u'Senpy'
copyright = u'2016, J. Fernando Sánchez'
description = u'A framework for sentiment and emotion analysis services'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
# with open('../senpy/VERSION') as f:
# version = f.read().strip()
# The full version, including alpha/beta/rc tags.
# release = version
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build']
# The reST default role (used for this markup: `text`) to use for all
# documents.
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []
# If true, keep warnings as "system message" paragraphs in the built documents.
#keep_warnings = False
html_theme = 'alabaster'
# -- Options for HTML output ----------------------------------------------
# if not on_rtd: # only import and set the theme if we're building docs locally
# import sphinx_rtd_theme
# html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
# else:
# html_theme = 'default'
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
html_theme_options = {
'logo': 'header.png',
'github_user': 'gsi-upm',
'github_repo': 'senpy',
'github_banner': True,
}
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
#html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
#html_extra_path = []
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
html_sidebars = {
'**': [
'about.html',
'navigation.html',
'searchbox.html',
]
}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
#html_domain_indices = True
# If false, no index is generated.
#html_use_index = True
# If true, the index is split into individual pages for each letter.
#html_split_index = False
# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = 'Senpydoc'
# -- Options for LaTeX output ---------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#'preamble': '',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
('index', 'Senpy.tex', u'Senpy Documentation',
u'J. Fernando Sánchez', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# If true, show page references after internal links.
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
#latex_show_urls = False
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_domain_indices = True
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', 'senpy', u'Senpy Documentation',
[u'J. Fernando Sánchez'], 1)
]
# If true, show URL addresses after external links.
#man_show_urls = False
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
('index', 'Senpy', u'Senpy Documentation',
u'J. Fernando Sánchez', 'Senpy', 'One line description of project.',
'Miscellaneous'),
]
# Documents to append as an appendix to all manuals.
#texinfo_appendices = []
# If false, no module index is generated.
#texinfo_domain_indices = True
# How to display URL addresses: 'footnote', 'no', or 'inline'.
#texinfo_show_urls = 'footnote'
# If true, do not generate a @detailmenu in the "Top" node's menu.
#texinfo_no_detailmenu = False

116
docs/conversion.rst Normal file
View File

@@ -0,0 +1,116 @@
Conversion
----------
Senpy includes experimental support for emotion/sentiment conversion plugins.
Use
===
Consider the original query: http://127.0.0.1:5000/api/?i=hello&algo=emoRand
The requested plugin (emoRand) returns emotions using Ekman's model (or big6 in EmotionML):
.. code:: json
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emoRand_0.1"
}
To get these emotions in VAD space (FSRE dimensions in EmotionML), we'd do this:
http://127.0.0.1:5000/api/?i=hello&algo=emoRand&emotionModel=emoml:fsre-dimensions
This call, provided there is a valid conversion plugin from Ekman's to VAD, would return something like this:
.. code:: json
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emoRand_0.1"
}, {
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"A": 7.22,
"D": 6.28,
"V": 8.6
},
"prov:wasGeneratedBy": "plugins/Ekman2VAD_0.1"
}
That is called a *full* response, as it simply adds the converted emotion alongside.
It is also possible to get the original emotion nested within the new converted emotion, using the `conversion=nested` parameter:
.. code:: json
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emoRand_0.1"
"onyx:wasDerivedFrom": {
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"A": 7.22,
"D": 6.28,
"V": 8.6
},
"prov:wasGeneratedBy": "plugins/Ekman2VAD_0.1"
}
}
Lastly, `conversion=filtered` would only return the converted emotions.
Developing a conversion plugin
================================
Conversion plugins are discovered by the server just like any other plugin.
The difference is the slightly different API, and the need to specify the `source` and `target` of the conversion.
For instance, an emotion conversion plugin needs the following:
.. code:: yaml
---
onyx:doesConversion:
- onyx:conversionFrom: emoml:big6
onyx:conversionTo: emoml:fsre-dimensions
- onyx:conversionFrom: emoml:fsre-dimensions
onyx:conversionTo: emoml:big6
.. code:: python
class MyConversion(EmotionConversionPlugin):
def convert(self, emotionSet, fromModel, toModel, params):
pass

16
docs/demo.rst Normal file
View File

@@ -0,0 +1,16 @@
Demo
----
There is a demo available on http://senpy.cluster.gsi.dit.upm.es/, where you can test a serie of different plugins.
You can use the playground (a web interface) or make HTTP requests to the service API.
.. image:: senpy-playground.png
:height: 400px
:width: 800px
:scale: 100 %
:align: center
Plugins Demo
============
The source code and description of the plugins used in the demo is available here: https://lab.cluster.gsi.dit.upm.es/senpy/senpy-plugins-community/.

78
docs/examples.rst Normal file
View File

@@ -0,0 +1,78 @@
Examples
------
All the examples in this page use the :download:`the main schema <_static/schemas/definitions.json>`.
Simple NIF annotation
.....................
Description
,,,,,,,,,,,
This example covers the basic example in the NIF documentation: `<http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>`_.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-basic.json
:language: json-ld
Sentiment Analysis
.....................
Description
,,,,,,,,,,,
This annotation corresponds to the sentiment analysis of an input. The example shows the sentiment represented according to Marl format.
The sentiments detected are contained in the Sentiments array with their related part of the text.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-sentiment.json
:emphasize-lines: 5-10,25-33
:language: json-ld
Suggestion Mining
.................
Description
,,,,,,,,,,,
The suggestions schema represented below shows the suggestions detected in the text. Within it, we can find the NIF fields highlighted that corresponds to the text of the detected suggestion.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-suggestion.json
:emphasize-lines: 5-8,22-27
:language: json-ld
Emotion Analysis
................
Description
,,,,,,,,,,,
This annotation represents the emotion analysis of an input to Senpy. The emotions are contained in the emotions section with the text that refers to following Onyx format and the emotion model defined beforehand.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-emotion.json
:language: json-ld
:emphasize-lines: 5-8,25-37
Named Entity Recognition
........................
Description
,,,,,,,,,,,
The Named Entity Recognition is represented as follows. In this particular case, it can be seen within the entities array the entities recognised. For the example input, Microsoft and Windows Phone are the ones detected.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-ner.json
:emphasize-lines: 5-8,19-34
:language: json-ld
Complete example
................
Description
,,,,,,,,,,,
This example covers all of the above cases, integrating all the annotations in the same document.
Representation
,,,,,,,,,,,,,,
.. literalinclude:: examples/results/example-complete.json
:language: json-ld

View File

@@ -0,0 +1,5 @@
{
"@type": "plugins",
"plugins": [
]
}

View File

@@ -0,0 +1,74 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
"me:SAnalysis1",
"me:SgAnalysis1",
"me:EmotionAnalysis1",
"me:NER1"
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}

View File

@@ -0,0 +1,19 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "http://example.com#NIFExample",
"@type": "results",
"analysis": [
],
"entries": [
{
"@id": "http://example.org#char=0,40",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:beginIndex": 0,
"nif:endIndex": 40,
"nif:isString": "My favourite actress is Natalie Portman"
}
]
}

View File

@@ -0,0 +1,88 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
{
"@id": "me:SAnalysis1",
"@type": "marl:SentimentAnalysis",
"marl:maxPolarityValue": 1,
"marl:minPolarityValue": 0
},
{
"@id": "me:SgAnalysis1",
"@type": "me:SuggestionAnalysis"
},
{
"@id": "me:EmotionAnalysis1",
"@type": "me:EmotionAnalysis"
},
{
"@id": "me:NER1",
"@type": "me:NER"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}

View File

@@ -0,0 +1,42 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
{
"@id": "me:EmotionAnalysis1",
"@type": "onyx:EmotionAnalysis"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
],
"suggestions": [
],
"sentiments": [
],
"emotions": [
{
"@id": "http://micro.blog/status1#char=0,109",
"nif:anchorOf": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"prov:wasGeneratedBy": "me:EmotionAnalysis1",
"onyx:hasEmotion": [
{
"onyx:hasEmotionCategory": "wna:liking"
},
{
"onyx:hasEmotionCategory": "wna:excitement"
}
]
}
]
}
]
}

View File

@@ -0,0 +1,45 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
{
"@id": "me:NER1",
"@type": "me:NERAnalysis"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
{
"@id": "http://micro.blog/status1#char=5,13",
"nif:beginIndex": 5,
"nif:endIndex": 13,
"nif:anchorOf": "Microsoft",
"me:references": "http://dbpedia.org/page/Microsoft",
"prov:wasGeneratedBy": "me:NER1"
},
{
"@id": "http://micro.blog/status1#char=25,37",
"nif:beginIndex": 25,
"nif:endIndex": 37,
"nif:anchorOf": "Windows Phone",
"me:references": "http://dbpedia.org/page/Windows_Phone",
"prov:wasGeneratedBy": "me:NER1"
}
],
"suggestions": [
],
"sentiments": [
],
"emotionSets": [
]
}
]
}

View File

@@ -0,0 +1,46 @@
{
"@context": [
"http://mixedemotions-project.eu/ns/context.jsonld",
{
"emovoc": "http://www.gsi.dit.upm.es/ontologies/onyx/vocabularies/emotionml/ns#"
}
],
"@id": "me:Result1",
"@type": "results",
"analysis": [
{
"@id": "me:HesamsAnalysis",
"@type": "onyx:EmotionAnalysis",
"onyx:usesEmotionModel": "emovoc:pad-dimensions"
}
],
"entries": [
{
"@id": "Entry1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "This is a test string",
"entities": [
],
"suggestions": [
],
"sentiments": [
],
"emotions": [
{
"@id": "Entry1#char=0,21",
"nif:anchorOf": "This is a test string",
"prov:wasGeneratedBy": "me:HesamAnalysis",
"onyx:hasEmotion": [
{
"emovoc:pleasure": 0.5,
"emovoc:arousal": 0.7
}
]
}
]
}
]
}

View File

@@ -0,0 +1,40 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
{
"@id": "me:SAnalysis1",
"@type": "marl:SentimentAnalysis",
"marl:maxPolarityValue": 1,
"marl:minPolarityValue": 0
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
],
"suggestions": [
],
"sentiments": [
{
"@id": "http://micro.blog/status1#char=80,97",
"nif:beginIndex": 80,
"nif:endIndex": 97,
"nif:anchorOf": "You'll be awesome.",
"marl:hasPolarity": "marl:Positive",
"marl:polarityValue": 0.9,
"prov:wasGeneratedBy": "me:SAnalysis1"
}
],
"emotionSets": [
]
}
]
}

View File

@@ -0,0 +1,37 @@
{
"@context": "http://mixedemotions-project.eu/ns/context.jsonld",
"@id": "me:Result1",
"@type": "results",
"analysis": [
{
"@id": "me:SgAnalysis1",
"@type": "me:SuggestionAnalysis"
}
],
"entries": [
{
"@id": "http://micro.blog/status1",
"@type": [
"nif:RFC5147String",
"nif:Context"
],
"prov:wasGeneratedBy": "me:SAnalysis1",
"nif:isString": "Dear Microsoft, put your Windows Phone on your newest #open technology program. You'll be awesome. #opensource",
"entities": [
],
"suggestions": [
{
"@id": "http://micro.blog/status1#char=16,77",
"nif:beginIndex": 16,
"nif:endIndex": 77,
"nif:anchorOf": "put your Windows Phone on your newest #open technology program",
"prov:wasGeneratedBy": "me:SgAnalysis1"
}
],
"sentiments": [
],
"emotionSets": [
]
}
]
}

35
docs/index.rst Normal file
View File

@@ -0,0 +1,35 @@
Welcome to Senpy's documentation!
=================================
.. image:: https://readthedocs.org/projects/senpy/badge/?version=latest
:target: http://senpy.readthedocs.io/en/latest/
.. image:: https://badge.fury.io/py/senpy.svg
:target: https://badge.fury.io/py/senpy
.. image:: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/badges/master/build.svg
:target: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/commits/master
.. image:: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/badges/master/coverage.svg
:target: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/commits/master
.. image:: https://img.shields.io/pypi/l/requests.svg
:target: https://lab.cluster.gsi.dit.upm.es/senpy/senpy/
Senpy is a framework for sentiment and emotion analysis services.
Services built with senpy are interchangeable and easy to use because they share a common :doc:`apischema`.
It also simplifies service development.
.. image:: senpy-architecture.png
:width: 100%
:align: center
.. toctree::
:caption: Learn more about senpy:
:maxdepth: 2
senpy
installation
demo
usage
apischema
plugins
conversion
about

72
docs/installation.rst Normal file
View File

@@ -0,0 +1,72 @@
Installation
------------
The stable version can be used in two ways: as a system/user library through pip, or as a docker image.
The docker image is the recommended way because it is self-contained and isolated from the system, which means:
* Downloading and using it is just one command
* All dependencies are included
* It is OS-independent (MacOS, Windows, GNU/Linux)
* Several versions may coexist in the same machine without additional virtual environments
Additionally, you may create your own docker image with your custom plugins, ready to be used by others.
Through PIP
***********
.. code:: bash
pip install --user senpy
Alternatively, you can use the development version:
.. code:: bash
git clone git@github.com:gsi-upm/senpy
cd senpy
pip install --user .
If you want to install senpy globally, use sudo instead of the ``--user`` flag.
Docker Image
************
Build the image or use the pre-built one:
.. code:: bash
docker run -ti -p 5000:5000 gsiupm/senpy --host 0.0.0.0 --default-plugins
To add custom plugins, use a docker volume:
.. code:: bash
docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --host 0.0.0.0 --default-plugins -f /plugins
Python 2
........
There is a Senpy version for python2 too:
.. code:: bash
docker run -ti -p 5000:5000 gsiupm/senpy:python2.7 --host 0.0.0.0 --default-plugins
Alias
.....
If you are using the docker approach regularly, it is advisable to use a script or an alias to simplify your executions:
.. code:: bash
alias senpy='docker run --rm -ti -p 5000:5000 -v $PWD:/senpy-plugins gsiupm/senpy --default-plugins'
Now, you may run senpy from any folder in your computer like so:
.. code:: bash
senpy --version

379
docs/plugins.rst Normal file
View File

@@ -0,0 +1,379 @@
Developing new plugins
----------------------
This document describes how to develop a new analysis plugin. For an example of conversion plugins, see :doc:`conversion`.
A more step-by-step tutorial with slides is available `here <https://lab.cluster.gsi.dit.upm.es/senpy/senpy-tutorial>`__
.. contents:: :local:
What is a plugin?
=================
A plugin is a program that, given a text, will add annotations to it.
In practice, a plugin consists of at least two files:
- Definition file: a `.senpy` file that describes the plugin (e.g. what input parameters it accepts, what emotion model it uses).
- Python module: the actual code that will add annotations to each input.
This separation allows us to deploy plugins that use the same code but employ different parameters.
For instance, one could use the same classifier and processing in several plugins, but train with different datasets.
This scenario is particularly useful for evaluation purposes.
The only limitation is that the name of each plugin needs to be unique.
Plugin Definition files
=======================
The definition file contains all the attributes of the plugin, and can be written in YAML or JSON.
When the server is launched, it will recursively search for definition files in the plugin folder (the current folder, by default).
The most important attributes are:
* **name**: unique name that senpy will use internally to identify the plugin.
* **module**: indicates the module that contains the plugin code, which will be automatically loaded by senpy.
* **version**
* extra_params: to add parameters to the senpy API when this plugin is requested. Those parameters may be required, and have aliased names. For instance:
.. code:: yaml
extra_params:
hello_param:
aliases: # required
- hello_param
- hello
required: true
default: Hi you
values:
- Hi you
- Hello y'all
- Howdy
Parameter validation will fail if a required parameter without a default has not been provided, or if the definition includes a set of values and the provided one does not match one of them.
A complete example:
.. code:: yaml
name: <Name of the plugin>
module: <Python file>
version: 0.1
And the json equivalent:
.. code:: json
{
"name": "<Name of the plugin>",
"module": "<Python file>",
"version": "0.1"
}
Plugins Code
============
The basic methods in a plugin are:
* __init__
* activate: used to load memory-hungry resources
* deactivate: used to free up resources
* analyse_entry: called in every user requests. It takes two parameters: ``Entry``, the entry object, and ``params``, the parameters supplied by the user. It should yield one or more ``Entry`` objects.
Plugins are loaded asynchronously, so don't worry if the activate method takes too long. The plugin will be marked as activated once it is finished executing the method.
Entries
=======
Entries are objects that can be annotated.
By default, entries are `NIF contexts <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html>`_ represented in JSON-LD format.
Annotations are added to the object like this:
.. code:: python
entry = Entry()
entry.vocabulary__annotationName = 'myvalue'
entry['vocabulary:annotationName'] = 'myvalue'
entry['annotationNameURI'] = 'myvalue'
Where vocabulary is one of the prefixes defined in the default senpy context, and annotationURI is a full URI.
The value may be any valid JSON-LD dictionary.
For simplicity, senpy includes a series of models by default in the ``senpy.models`` module.
Example plugin
==============
In this section, we will implement a basic sentiment analysis plugin.
To determine the polarity of each entry, the plugin will compare the length of the string to a threshold.
This threshold will be included in the definition file.
The definition file would look like this:
.. code:: yaml
name: helloworld
module: helloworld
version: 0.0
threshold: 10
description: Hello World
Now, in a file named ``helloworld.py``:
.. code:: python
#!/bin/env python
#helloworld.py
from senpy.plugins import AnalysisPlugin
from senpy.models import Sentiment
class HelloWorld(AnalysisPlugin):
def analyse_entry(entry, params):
'''Basically do nothing with each entry'''
sentiment = Sentiment()
if len(entry.text) < self.threshold:
sentiment['marl:hasPolarity'] = 'marl:Positive'
else:
sentiment['marl:hasPolarity'] = 'marl:Negative'
entry.sentiments.append(sentiment)
yield entry
The complete code of the example plugin is available `here <https://lab.cluster.gsi.dit.upm.es/senpy/plugin-prueba>`__.
Loading data and files
======================
Most plugins will need access to files (dictionaries, lexicons, etc.).
It is good practice to specify the paths of these files in the plugin configuration, so the same code can be reused with different resources.
.. code:: yaml
name: dictworld
module: dictworld
dictionary_path: <PATH OF THE FILE>
The path can be either absolute, or relative.
From absolute paths
???????????????????
Absolute paths (such as ``/data/dictionary.csv`` are straightfoward:
.. code:: python
with open(os.path.join(self.dictionary_path) as f:
...
From relative paths
???????????????????
Since plugins are loading dynamically, relative paths will refer to the current working directory.
Instead, what you usually want is to load files *relative to the plugin source folder*, like so:
::
.
..
plugin.senpy
plugin.py
dictionary.csv
For this, we need to first get the path of your source folder first, like so:
.. code:: python
import os
root = os.path.realpath(__file__)
with open(os.path.join(root, self.dictionary_path) as f:
...
Docker image
============
Add the following dockerfile to your project to generate a docker image with your plugin:
.. code:: dockerfile
FROM gsiupm/senpy:0.8.8
This will copy your source folder to the image, and install all dependencies.
Now, to build an image:
.. code:: shell
docker build . -t gsiupm/exampleplugin
And you can run it with:
.. code:: shell
docker run -p 5000:5000 gsiupm/exampleplugin
If the plugin non-source files (:ref:`loading data and files`), the recommended way is to use absolute paths.
Data can then be mounted in the container or added to the image.
The former is recommended for open source plugins with licensed resources, whereas the latter is the most convenient and can be used for private images.
Mounting data:
.. code:: bash
docker run -v $PWD/data:/data gsiupm/exampleplugin
Adding data to the image:
.. code:: dockerfile
FROM gsiupm/senpy:0.8.8
COPY data /
F.A.Q.
======
What annotations can I use?
???????????????????????????
You can add almost any annotation to an entry.
The most common use cases are covered in the :doc:`apischema`.
Why does the analyse function yield instead of return?
??????????????????????????????????????????????????????
This is so that plugins may add new entries to the response or filter some of them.
For instance, a `context detection` plugin may add a new entry for each context in the original entry.
On the other hand, a conversion plugin may leave out those entries that do not contain relevant information.
If I'm using a classifier, where should I train it?
???????????????????????????????????????????????????
Training a classifier can be time time consuming. To avoid running the training unnecessarily, you can use ShelfMixin to store the classifier. For instance:
.. code:: python
from senpy.plugins import ShelfMixin, AnalysisPlugin
class MyPlugin(ShelfMixin, AnalysisPlugin):
def train(self):
''' Code to train the classifier
'''
# Here goes the code
# ...
return classifier
def activate(self):
if 'classifier' not in self.sh:
classifier = self.train()
self.sh['classifier'] = classifier
self.classifier = self.sh['classifier']
def deactivate(self):
self.close()
You can specify a 'shelf_file' in your .senpy file. By default the ShelfMixin creates a file based on the plugin name and stores it in that plugin's folder.
Shelves may get corrupted if the plugin exists unexpectedly.
A corrupt shelf prevents the plugin from loading.
If you do not care about the pickle, you can force your plugin to remove the corrupted file and load anyway, set the 'force_shelf' to True in your .senpy file.
How can I turn an external service into a plugin?
?????????????????????????????????????????????????
This example ilustrate how to implement a plugin that accesses the Sentiment140 service.
.. code:: python
class Sentiment140Plugin(SentimentPlugin):
def analyse_entry(self, entry, params):
text = entry.text
lang = params.get("language", "auto")
res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
json.dumps({"language": lang,
"data": [{"text": text}]
}
)
)
p = params.get("prefix", None)
polarity_value = self.maxPolarityValue*int(res.json()["data"][0]
["polarity"]) * 0.25
polarity = "marl:Neutral"
neutral_value = self.maxPolarityValue / 2.0
if polarity_value > neutral_value:
polarity = "marl:Positive"
elif polarity_value < neutral_value:
polarity = "marl:Negative"
sentiment = Sentiment(id="Sentiment0",
prefix=p,
marl__hasPolarity=polarity,
marl__polarityValue=polarity_value)
sentiment.prov__wasGeneratedBy = self.id
entry.sentiments.append(sentiment)
yield entry
Can my plugin require additional parameters from the user?
??????????????????????????????????????????????????????????
You can add extra parameters in the definition file under the attribute ``extra_params``.
It takes a dictionary, where the keys are the name of the argument/parameter, and the value has the following fields:
* aliases: the different names which can be used in the request to use the parameter.
* required: if set to true, users need to provide this parameter unless a default is set.
* options: the different acceptable values of the parameter (i.e. an enum). If set, the value provided must match one of the options.
* default: the default value of the parameter, if none is provided in the request.
.. code:: python
extra_params
language:
aliases:
- language
- lang
- l
required: true,
options:
- es
- en
default: es
This example shows how to introduce a parameter associated with language.
The extraction of this paremeter is used in the analyse method of the Plugin interface.
.. code:: python
lang = params.get("language")
Where can I set up variables for using them in my plugin?
?????????????????????????????????????????????????????????
You can add these variables in the definition file with the structure of attribute-value pairs.
Every field added to the definition file is available to the plugin instance.
Can I activate a DEBUG mode for my plugin?
???????????????????????????????????????????
You can activate the DEBUG mode by the command-line tool using the option -d.
.. code:: bash
senpy -d
Additionally, with the ``--pdb`` option you will be dropped into a pdb post mortem shell if an exception is raised.
.. code:: bash
senpy --pdb
Where can I find more code examples?
????????????????????????????????????
See: `<http://github.com/gsi-upm/senpy-plugins-community>`_.

2
docs/requirements.txt Normal file
View File

@@ -0,0 +1,2 @@
sphinxcontrib-httpdomain>=1.4
nbsphinx

BIN
docs/senpy-architecture.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 122 KiB

BIN
docs/senpy-framework.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

BIN
docs/senpy-playground.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 49 KiB

51
docs/senpy.rst Normal file
View File

@@ -0,0 +1,51 @@
What is Senpy?
--------------
Web services can get really complex: data validation, user interaction, formatting, logging., etc.
The figure below summarizes the typical features in an analysis service.
Senpy implements all the common blocks, so developers can focus on what really matters: great analysis algorithms that solve real problems.
.. image:: senpy-framework.png
:width: 60%
:align: center
Senpy for end users
===================
All services built using senpy share a common interface.
This allows users to use them (almost) interchangeably.
Senpy comes with a :ref:`built-in client`.
Senpy for service developers
============================
Senpy is a framework that turns your sentiment or emotion analysis algorithm into a full blown semantic service.
Senpy takes care of:
* Interfacing with the user: parameter validation, error handling.
* Formatting: JSON-LD, Turtle/n-triples input and output, or simple text input
* Linked Data: senpy results are semantically annotated, using a series of well established vocabularies, and sane default URIs.
* User interface: a web UI where users can explore your service and test different settings
* A client to interact with the service. Currently only available in Python.
Sharing your sentiment analysis with the world has never been easier!
Check out the :doc:`plugins` if you have developed an analysis algorithm (e.g. sentiment analysis) and you want to publish it as a service.
Architecture
============
The main component of a sentiment analysis service is the algorithm itself. However, for the algorithm to work, it needs to get the appropriate parameters from the user, format the results according to the defined API, interact with the user whn errors occur or more information is needed, etc.
Senpy proposes a modular and dynamic architecture that allows:
* Implementing different algorithms in a extensible way, yet offering a common interface.
* Offering common services that facilitate development, so developers can focus on implementing new and better algorithms.
The framework consists of two main modules: Senpy core, which is the building block of the service, and Senpy plugins, which consist of the analysis algorithm. The next figure depicts a simplified version of the processes involved in an analysis with the Senpy framework.
.. image:: senpy-architecture.png
:width: 100%
:align: center

58
docs/server.rst Normal file
View File

@@ -0,0 +1,58 @@
Server
======
The senpy server is launched via the `senpy` command:
.. code:: text
usage: senpy [-h] [--level logging_level] [--debug] [--default-plugins]
[--host HOST] [--port PORT] [--plugins-folder PLUGINS_FOLDER]
[--only-install]
Run a Senpy server
optional arguments:
-h, --help show this help message and exit
--level logging_level, -l logging_level
Logging level
--debug, -d Run the application in debug mode
--default-plugins Load the default plugins
--host HOST Use 0.0.0.0 to accept requests from any host.
--port PORT, -p PORT Port to listen on.
--plugins-folder PLUGINS_FOLDER, -f PLUGINS_FOLDER
Where to look for plugins.
--only-install, -i Do not run a server, only install plugin dependencies
When launched, the server will recursively look for plugins in the specified plugins folder (the current working directory by default).
For every plugin found, it will download its dependencies, and try to activate it.
The default server includes a playground and an endpoint with all plugins found.
Let's run senpy with the default plugins:
.. code:: bash
senpy -f . --default-plugins
Now go to `http://localhost:5000 <http://localhost:5000>`_, you should be greeted by the senpy playground:
.. image:: senpy-playground.png
:width: 100%
:alt: Playground
The playground is a user-friendly way to test your plugins, but you can always use the service directly: `http://localhost:5000/api?input=hello <http://localhost:5000/api?input=hello>`_.
By default, senpy will listen only on the `127.0.0.1` address.
That means you can only access the API from your (or localhost).
You can listen on a different address using the `--host` flag (e.g., 0.0.0.0).
The default port is 5000.
You can change it with the `--port` flag.
For instance, to accept connections on port 6000 on any interface:
.. code:: bash
senpy --host 0.0.0.0 --port 6000
For more options, see the `--help` page.

15
docs/usage.rst Normal file
View File

@@ -0,0 +1,15 @@
Usage
-----
First of all, you need to install the package.
See :doc:`installation` for instructions.
Once installed, the `senpy` command should be available.
.. toctree::
:maxdepth: 1
server
SenpyClientUse
commandline

8
docs/vocabularies.rst Normal file
View File

@@ -0,0 +1,8 @@
Vocabularies and model
======================
The model used in Senpy is based on the following vocabularies:
* Marl, a vocabulary designed to annotate and describe subjetive opinions expressed on the web or in information systems.
* Onyx, which is built one the same principles as Marl to annotate and describe emotions, and provides interoperability with Emotion Markup Language.
* NIF 2.0, which defines a semantic format and APO for improving interoperability among natural language processing services

BIN
img/eu-flag.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.6 KiB

828
img/final-logo.svg Normal file

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 81 KiB

BIN
img/gsi.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.8 KiB

BIN
img/header.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 208 KiB

View File

Before

Width:  |  Height:  |  Size: 8.0 KiB

After

Width:  |  Height:  |  Size: 8.0 KiB

2728
img/logo.svg Normal file

File diff suppressed because it is too large Load Diff

After

Width:  |  Height:  |  Size: 180 KiB

BIN
img/logo_grande.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 42 KiB

BIN
img/me.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 25 KiB

7
k8s/README.md Normal file
View File

@@ -0,0 +1,7 @@
Deploy senpy to a kubernetes cluster.
Usage:
```
kubectl apply -f . -n senpy
```

26
k8s/senpy-deployment.yaml Normal file
View File

@@ -0,0 +1,26 @@
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: senpy-latest
spec:
replicas: 1
template:
metadata:
labels:
role: senpy-latest
app: test
spec:
containers:
- name: senpy-latest
image: gsiupm/senpy:latest
imagePullPolicy: Always
args:
- "--default-plugins"
resources:
limits:
memory: "512Mi"
cpu: "1000m"
ports:
- name: web
containerPort: 5000

14
k8s/senpy-ingress.yaml Normal file
View File

@@ -0,0 +1,14 @@
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: senpy-ingress
spec:
rules:
- host: latest.senpy.cluster.gsi.dit.upm.es
http:
paths:
- path: /
backend:
serviceName: senpy-latest
servicePort: 5000

12
k8s/senpy-svc.yaml Normal file
View File

@@ -0,0 +1,12 @@
---
apiVersion: v1
kind: Service
metadata:
name: senpy-latest
spec:
type: ClusterIP
ports:
- port: 5000
protocol: TCP
selector:
role: senpy-latest

View File

@@ -1,5 +1,11 @@
Flask==0.10.1
gunicorn==19.0.0
requests==2.4.1
Flask-Plugins==1.4
GitPython==0.3.2.RC1
Flask>=0.10.1
requests>=2.4.1
tornado>=4.4.3
PyLD>=0.6.5
six
future
jsonschema
jsonref
PyYAML
rdflib
rdflib-jsonld

View File

@@ -14,17 +14,15 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
'''
"""
Sentiment analysis server in Python
'''
"""
from .version import __version__
from blueprints import nif_blueprint
from extensions import Senpy
import logging
logger = logging.getLogger(__name__)
if __name__ == '__main__':
from flask import Flask
app = Flask(__name__)
app.register_blueprint(nif_server)
app.debug = config.DEBUG
app.run()
logger.info('Using senpy version: {}'.format(__version__))
__all__ = ['api', 'blueprints', 'cli', 'extensions', 'models', 'plugins']

View File

@@ -1,7 +1,113 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes
# DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Senpy is a modular sentiment analysis server. This script runs an instance of
the server.
"""
from flask import Flask
from extensions import Senpy
app = Flask(__name__)
app.debug = True
sp = Senpy()
sp.init_app(app)
app.run()
from senpy.extensions import Senpy
import logging
import os
import argparse
import senpy
SERVER_PORT = os.environ.get("PORT", 5000)
def main():
parser = argparse.ArgumentParser(description='Run a Senpy server')
parser.add_argument(
'--level',
'-l',
metavar='logging_level',
type=str,
default="INFO",
help='Logging level')
parser.add_argument(
'--debug',
'-d',
action='store_true',
default=False,
help='Run the application in debug mode')
parser.add_argument(
'--default-plugins',
action='store_true',
default=False,
help='Load the default plugins')
parser.add_argument(
'--host',
type=str,
default="0.0.0.0",
help='Use 0.0.0.0 to accept requests from any host.')
parser.add_argument(
'--port',
'-p',
type=int,
default=SERVER_PORT,
help='Port to listen on.')
parser.add_argument(
'--plugins-folder',
'-f',
type=str,
default='plugins',
help='Where to look for plugins.')
parser.add_argument(
'--only-install',
'-i',
action='store_true',
default=False,
help='Do not run a server, only install plugin dependencies')
parser.add_argument(
'--threaded',
action='store_false',
default=True,
help='Run a threaded server')
parser.add_argument(
'--version',
'-v',
action='store_true',
default=False,
help='Output the senpy version and exit')
args = parser.parse_args()
if args.version:
print('Senpy version {}'.format(senpy.__version__))
exit(1)
logging.basicConfig()
rl = logging.getLogger()
rl.setLevel(getattr(logging, args.level))
app = Flask(__name__)
app.debug = args.debug
sp = Senpy(app, args.plugins_folder, default_plugins=args.default_plugins)
if args.only_install:
sp.install_deps()
return
sp.activate_all()
print('Senpy version {}'.format(senpy.__version__))
print('Server running on port %s:%d. Ctrl+C to quit' % (args.host,
args.port))
app.run(args.host,
args.port,
threaded=args.threaded,
debug=app.debug)
sp.deactivate_all()
if __name__ == '__main__':
main()

131
senpy/api.py Normal file
View File

@@ -0,0 +1,131 @@
from future.utils import iteritems
from .models import Error
import logging
logger = logging.getLogger(__name__)
API_PARAMS = {
"algorithm": {
"aliases": ["algorithm", "a", "algo"],
"required": False,
},
"outformat": {
"@id": "outformat",
"aliases": ["outformat", "o"],
"default": "json-ld",
"required": True,
"options": ["json-ld", "turtle"],
},
"expanded-jsonld": {
"@id": "expanded-jsonld",
"aliases": ["expanded", "expanded-jsonld"],
"required": True,
"default": 0
},
"emotionModel": {
"@id": "emotionModel",
"aliases": ["emotionModel", "emoModel"],
"required": False
},
"plugin_type": {
"@id": "pluginType",
"description": 'What kind of plugins to list',
"aliases": ["pluginType", "plugin_type"],
"required": True,
"default": "analysisPlugin"
},
"conversion": {
"@id": "conversion",
"description": "How to show the elements that have (not) been converted",
"required": True,
"options": ["filtered", "nested", "full"],
"default": "full"
}
}
WEB_PARAMS = {
"inHeaders": {
"aliases": ["inHeaders", "headers"],
"required": True,
"default": "0"
},
}
CLI_PARAMS = {
"plugin_folder": {
"aliases": ["plugin_folder", "folder"],
"required": True,
"default": "."
},
}
NIF_PARAMS = {
"input": {
"@id": "input",
"aliases": ["i", "input"],
"required": True,
"help": "Input text"
},
"informat": {
"@id": "informat",
"aliases": ["f", "informat"],
"required": False,
"default": "text",
"options": ["turtle", "text", "json-ld"],
},
"intype": {
"@id": "intype",
"aliases": ["intype", "t"],
"required": False,
"default": "direct",
"options": ["direct", "url", "file"],
},
"language": {
"@id": "language",
"aliases": ["language", "l"],
"required": False,
},
"prefix": {
"@id": "prefix",
"aliases": ["prefix", "p"],
"required": True,
"default": "",
},
"urischeme": {
"@id": "urischeme",
"aliases": ["urischeme", "u"],
"required": False,
"default": "RFC5147String",
"options": "RFC5147String"
},
}
def parse_params(indict, spec=NIF_PARAMS):
logger.debug("Parsing: {}\n{}".format(indict, spec))
outdict = indict.copy()
wrong_params = {}
for param, options in iteritems(spec):
if param[0] != "@": # Exclude json-ld properties
for alias in options.get("aliases", []):
if alias in indict:
outdict[param] = indict[alias]
if param not in outdict:
if options.get("required", False) and "default" not in options:
wrong_params[param] = spec[param]
else:
if "default" in options:
outdict[param] = options["default"]
else:
if "options" in spec[param] and \
outdict[param] not in spec[param]["options"]:
wrong_params[param] = spec[param]
if wrong_params:
logger.debug("Error parsing: %s", wrong_params)
message = Error(
status=400,
message="Missing or invalid parameters",
parameters=outdict,
errors={param: error
for param, error in iteritems(wrong_params)})
raise message
return outdict

View File

@@ -1,147 +1,146 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes
# DIT, UPM
# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes
# DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
'''
Simple Sentiment Analysis server
'''
from flask import Blueprint, render_template, request, jsonify, current_app
"""
Blueprints for Senpy
"""
from flask import (Blueprint, request, current_app, render_template, url_for,
jsonify)
from .models import Error, Response, Plugins, read_schema
from .api import WEB_PARAMS, API_PARAMS, parse_params
from .version import __version__
from functools import wraps
import logging
import json
nif_blueprint = Blueprint("NIF Sentiment Analysis Server", __name__)
logger = logging.getLogger(__name__)
PARAMS = {"input": {"aliases": ["i", "input"],
"required": True,
"help": "Input text"
},
"informat": {"aliases": ["f", "informat"],
"required": False,
"default": "text",
"options": ["turtle", "text"],
},
"intype": {"aliases": ["intype", "t"],
"required": False,
"default": "direct",
"options": ["direct", "url", "file"],
},
"outformat": {"aliases": ["outformat", "o"],
"default": "json-ld",
"required": False,
"options": ["json-ld"],
},
"algorithm": {"aliases": ["algorithm", "a", "algo"],
"required": False,
},
"language": {"aliases": ["language", "l"],
"required": False,
"options": ["es", "en"],
},
"urischeme": {"aliases": ["urischeme", "u"],
"required": False,
"default": "RFC5147String",
"options": "RFC5147String"
},
}
api_blueprint = Blueprint("api", __name__)
demo_blueprint = Blueprint("demo", __name__)
ns_blueprint = Blueprint("ns", __name__)
def get_params(req):
indict = None
if req.method == 'POST':
indict = req.form
indict = req.form.to_dict(flat=True)
elif req.method == 'GET':
indict = req.args
indict = req.args.to_dict(flat=True)
else:
raise ValueError("Invalid data")
raise Error(message="Invalid data")
return indict
outdict = {}
wrongParams = {}
for param, options in PARAMS.iteritems():
for alias in options["aliases"]:
if alias in indict:
outdict[param] = indict[alias]
if param not in outdict:
if options.get("required", False):
wrongParams[param] = PARAMS[param]
@demo_blueprint.route('/')
def index():
return render_template("index.html", version=__version__)
@api_blueprint.route('/contexts/<entity>.jsonld')
def context(entity="context"):
context = Response._context
context['@vocab'] = url_for('ns.index', _external=True)
return jsonify({"@context": context})
@ns_blueprint.route('/') # noqa: F811
def index():
context = Response._context
context['@vocab'] = url_for('.ns', _external=True)
return jsonify({"@context": context})
@api_blueprint.route('/schemas/<schema>')
def schema(schema="definitions"):
try:
return jsonify(read_schema(schema))
except Exception: # Should be FileNotFoundError, but it's missing from py2
return Error(message="Schema not found", status=404).flask()
def basic_api(f):
@wraps(f)
def decorated_function(*args, **kwargs):
raw_params = get_params(request)
headers = {'X-ORIGINAL-PARAMS': json.dumps(raw_params)}
# Get defaults
web_params = parse_params({}, spec=WEB_PARAMS)
api_params = parse_params({}, spec=API_PARAMS)
outformat = 'json-ld'
try:
print('Getting request:')
print(request)
web_params = parse_params(raw_params, spec=WEB_PARAMS)
api_params = parse_params(raw_params, spec=API_PARAMS)
if hasattr(request, 'params'):
request.params.update(api_params)
else:
if "default" in options:
outdict[param] = options["default"]
else:
if "options" in PARAMS[param] and \
outdict[param] not in PARAMS[param]["options"]:
wrongParams[param] = PARAMS[param]
if wrongParams:
message = {"status": "failed", "message": "Missing or invalid parameters"}
message["parameters"] = outdict
message["errors"] = {param:error for param, error in wrongParams.iteritems()}
raise ValueError(json.dumps(message))
return outdict
request.params = api_params
response = f(*args, **kwargs)
except Error as ex:
response = ex
logger.error(ex)
if current_app.debug:
raise
def basic_analysis(params):
response = {"@context": ["http://demos.gsi.dit.upm.es/eurosentiment/static/context.jsonld",
{
"@base": "{}#".format(request.url.encode('utf-8'))
}
],
"analysis": [{
"@type": "marl:SentimentAnalysis"
}],
"entries": []
}
if "language" in params:
response["language"] = params["language"]
for idx, sentence in enumerate(params["input"].split(".")):
response["entries"].append({
"@id": "Sentence{}".format(idx),
"nif:isString": sentence
})
in_headers = web_params['inHeaders'] != "0"
expanded = api_params['expanded-jsonld']
outformat = api_params['outformat']
return response.flask(
in_headers=in_headers,
headers=headers,
prefix=url_for('.api', _external=True),
context_uri=url_for('api.context',
entity=type(response).__name__,
_external=True),
outformat=outformat,
expanded=expanded)
return decorated_function
@api_blueprint.route('/', methods=['POST', 'GET'])
@basic_api
def api():
response = current_app.senpy.analyse(**request.params)
return response
@nif_blueprint.route('/', methods=['POST', 'GET'])
def home(entries=None):
try:
params = get_params(request)
except ValueError as ex:
return ex.message
response = current_app.senpy.analyse(**params)
return jsonify(response)
@nif_blueprint.route('/plugins/', methods=['POST', 'GET'])
@nif_blueprint.route('/plugins/<plugin>', methods=['POST', 'GET'])
@nif_blueprint.route('/plugins/<plugin>/<action>', methods=['POST', 'GET'])
def plugins(plugin=None, action="list"):
print current_app.senpy.plugins.keys()
if plugin:
plugs = {plugin:current_app.senpy.plugins[plugin]}
else:
plugs = current_app.senpy.plugins
if action == "list":
dic = {plug:plugs[plug].enabled for plug in plugs}
return jsonify(dic)
elif action == "disable":
plugs[plugin].enabled = False
return "Ok"
elif action == "enable":
plugs[plugin].enabled = True
return "Ok"
else:
return "action '{}' not allowed".format(action), 404
@api_blueprint.route('/plugins/', methods=['POST', 'GET'])
@basic_api
def plugins():
sp = current_app.senpy
ptype = request.params.get('plugin_type')
plugins = sp.filter_plugins(plugin_type=ptype)
dic = Plugins(plugins=list(plugins.values()))
return dic
if __name__ == '__main__':
import config
from flask import Flask
app = Flask(__name__)
app.register_blueprint(nif_blueprint)
app.debug = config.DEBUG
app.run()
@api_blueprint.route('/plugins/<plugin>/', methods=['POST', 'GET'])
@basic_api
def plugin(plugin=None):
sp = current_app.senpy
if plugin == 'default' and sp.default_plugin:
return sp.default_plugin
plugins = sp.filter_plugins(
id='plugins/{}'.format(plugin)) or sp.filter_plugins(name=plugin)
if plugins:
response = list(plugins.values())[0]
else:
return Error(message="Plugin not found", status=404)
return response

52
senpy/cli.py Normal file
View File

@@ -0,0 +1,52 @@
import sys
from .models import Error
from .api import parse_params, CLI_PARAMS
from .extensions import Senpy
def argv_to_dict(argv):
'''Turns parameters in the form of '--key value' into a dict {'key': 'value'}
'''
cli_dict = {}
for i in range(len(argv)):
if argv[i][0] == '-':
key = argv[i].strip('-')
value = argv[i + 1] if len(argv) > i + 1 else None
if value and value[0] == '-':
cli_dict[key] = ""
else:
cli_dict[key] = value
return cli_dict
def parse_cli(argv):
cli_dict = argv_to_dict(argv)
cli_params = parse_params(cli_dict, spec=CLI_PARAMS)
return cli_params, cli_dict
def main_function(argv):
'''This is the method for unit testing
'''
cli_params, cli_dict = parse_cli(argv)
plugin_folder = cli_params['plugin_folder']
sp = Senpy(default_plugins=False, plugin_folder=plugin_folder)
sp.activate_all(sync=True)
res = sp.analyse(**cli_dict)
return res
def main():
'''This method is the entrypoint for the CLI (as configured un setup.py)
'''
try:
res = main_function(sys.argv[1:])
print(res.to_JSON())
except Error as err:
print(err.to_JSON())
sys.exit(2)
if __name__ == '__main__':
main()

41
senpy/client.py Normal file
View File

@@ -0,0 +1,41 @@
import requests
import logging
from . import models
from .plugins import default_plugin_type
logger = logging.getLogger(__name__)
class Client(object):
def __init__(self, endpoint):
self.endpoint = endpoint
def analyse(self, input, method='GET', **kwargs):
return self.request('/', method=method, input=input, **kwargs)
def plugins(self, ptype=default_plugin_type):
resp = self.request(path='/plugins', plugin_type=ptype).plugins
return {p.name: p for p in resp}
def request(self, path=None, method='GET', **params):
url = '{}{}'.format(self.endpoint, path)
response = requests.request(method=method, url=url, params=params)
try:
resp = models.from_dict(response.json())
except Exception as ex:
logger.error(('There seems to be a problem with the response:\n'
'\tURL: {url}\n'
'\tError: {error}\n'
'\t\n'
'#### Response:\n'
'\tCode: {code}'
'\tContent: {content}'
'\n').format(
error=ex,
url=url,
code=response.status_code,
content=response.content))
raise ex
if isinstance(resp, models.Error):
raise resp
return resp

View File

@@ -1,107 +1,427 @@
"""
Main class for Senpy.
It orchestrates plugin (de)activation and analysis.
"""
from future import standard_library
standard_library.install_aliases()
from . import plugins
from .plugins import SenpyPlugin
from .models import Error, Entry, Results, from_string
from .blueprints import api_blueprint, demo_blueprint, ns_blueprint
from .api import API_PARAMS, NIF_PARAMS, parse_params
from threading import Thread
import os
import copy
import fnmatch
import inspect
import sys
import importlib
import logging
import traceback
import yaml
import subprocess
from flask import current_app
logger = logging.getLogger(__name__)
try:
from flask import _app_ctx_stack as stack
except ImportError:
from flask import _request_ctx_stack as stack
from blueprints import nif_blueprint
from git import Repo, InvalidGitRepositoryError
def log_subprocess_output(process):
for line in iter(process.stdout.readline, b''):
logger.info('%r', line)
for line in iter(process.stderr.readline, b''):
logger.error('%r', line)
class Senpy(object):
""" Default Senpy extension for Flask """
def __init__(self, app=None, plugin_folder="plugins"):
def __init__(self,
app=None,
plugin_folder=".",
default_plugins=False):
self.app = app
base_folder = os.path.join(os.path.dirname(__file__), "plugins")
self._search_folders = set()
self._plugin_list = []
self._outdated = True
self._default = None
self.search_folders = (folder for folder in (base_folder, plugin_folder)
if folder and os.path.isdir(folder))
self.add_folder(plugin_folder)
if default_plugins:
self.add_folder('plugins', from_root=True)
else:
# Add only conversion plugins
self.add_folder(os.path.join('plugins', 'conversion'),
from_root=True)
if app is not None:
self.init_app(app)
"""
def init_app(self, app):
""" Initialise a flask app to add plugins to its context """
"""
Note: I'm not particularly fond of adding self.app and app.senpy, but
I can't think of a better way to do it.
"""
def init_app(self, app, plugin_folder="plugins"):
"""
app.senpy = self
#app.config.setdefault('SQLITE3_DATABASE', ':memory:')
# Use the newstyle teardown_appcontext if it's available,
# otherwise fall back to the request context
if hasattr(app, 'teardown_appcontext'):
app.teardown_appcontext(self.teardown)
else:
app.teardown_request(self.teardown)
app.register_blueprint(nif_blueprint)
app.register_blueprint(api_blueprint, url_prefix="/api")
app.register_blueprint(ns_blueprint, url_prefix="/ns")
app.register_blueprint(demo_blueprint, url_prefix="/")
def analyse(self, **params):
if "algorithm" in params:
algo = params["algorithm"]
if algo in self.plugins and self.plugins[algo].enabled:
return self.plugins[algo].plugin.analyse(**params)
return {"status": 500, "message": "No valid algorithm"}
def add_folder(self, folder, from_root=False):
if from_root:
folder = os.path.join(os.path.dirname(__file__), folder)
logger.debug("Adding folder: %s", folder)
if os.path.isdir(folder):
self._search_folders.add(folder)
self._outdated = True
else:
logger.debug("Not a folder: %s", folder)
def _find_plugins(self, params):
if not self.analysis_plugins:
raise Error(
status=404,
message=("No plugins found."
" Please install one."))
api_params = parse_params(params, spec=API_PARAMS)
algos = None
if "algorithm" in api_params and api_params["algorithm"]:
algos = api_params["algorithm"].split(',')
elif self.default_plugin:
algos = [self.default_plugin.name, ]
else:
raise Error(
status=404,
message="No default plugin found, and None provided")
def _load_plugin(self, plugin, search_folder, enabled=True):
sys.path.append(search_folder)
tmp = importlib.import_module(plugin)
sys.path.remove(search_folder)
tmp.path = search_folder
plugins = list()
for algo in algos:
if algo not in self.plugins:
logger.debug(("The algorithm '{}' is not valid\n"
"Valid algorithms: {}").format(algo,
self.plugins.keys()))
raise Error(
status=404,
message="The algorithm '{}' is not valid".format(algo))
if not self.plugins[algo].is_activated:
logger.debug("Plugin not activated: {}".format(algo))
raise Error(
status=400,
message=("The algorithm '{}'"
" is not activated yet").format(algo))
plugins.append(self.plugins[algo])
return plugins
def _get_params(self, params, plugin=None):
nif_params = parse_params(params, spec=NIF_PARAMS)
if plugin:
extra_params = plugin.get('extra_params', {})
specific_params = parse_params(params, spec=extra_params)
nif_params.update(specific_params)
return nif_params
def _get_entries(self, params):
if params['informat'] == 'text':
results = Results()
entry = Entry(text=params['input'])
results.entries.append(entry)
elif params['informat'] == 'json-ld':
results = from_string(params['input'], cls=Results)
else:
raise NotImplemented('Informat {} is not implemented'.format(params['informat']))
return results
def _process_entries(self, entries, plugins, nif_params):
if not plugins:
for i in entries:
yield i
return
plugin = plugins[0]
specific_params = self._get_params(nif_params, plugin)
results = plugin.analyse_entries(entries, specific_params)
for i in self._process_entries(results, plugins[1:], nif_params):
yield i
def _process_response(self, resp, plugins, nif_params):
entries = resp.entries
resp.entries = []
for plug in plugins:
resp.analysis.append(plug.id)
for i in self._process_entries(entries, plugins, nif_params):
resp.entries.append(i)
return resp
def analyse(self, **api_params):
"""
Main method that analyses a request, either from CLI or HTTP.
It uses a dictionary of parameters, provided by the user.
"""
logger.debug("analysing with params: {}".format(api_params))
plugins = self._find_plugins(api_params)
nif_params = self._get_params(api_params)
resp = self._get_entries(nif_params)
if 'with_parameters' in api_params:
resp.parameters = nif_params
try:
repo_path = os.path.join(search_folder, plugin)
tmp.repo = Repo(repo_path)
except InvalidGitRepositoryError:
tmp.repo = None
if not hasattr(tmp, "enabled"):
tmp.enabled = enabled
resp = self._process_response(resp, plugins, nif_params)
self.convert_emotions(resp, plugins, nif_params)
logger.debug("Returning analysis result: {}".format(resp))
except (Error, Exception) as ex:
if not isinstance(ex, Error):
ex = Error(message=str(ex), status=500)
logger.exception('Error returning analysis result')
raise ex
return resp
def _conversion_candidates(self, fromModel, toModel):
candidates = self.filter_plugins(plugin_type='emotionConversionPlugin')
for name, candidate in candidates.items():
for pair in candidate.onyx__doesConversion:
logging.debug(pair)
if pair['onyx:conversionFrom'] == fromModel \
and pair['onyx:conversionTo'] == toModel:
# logging.debug('Found candidate: {}'.format(candidate))
yield candidate
def convert_emotions(self, resp, plugins, params):
"""
Conversion of all emotions in a response **in place**.
In addition to converting from one model to another, it has
to include the conversion plugin to the analysis list.
Needless to say, this is far from an elegant solution, but it works.
@todo refactor and clean up
"""
toModel = params.get('emotionModel', None)
if not toModel:
return
logger.debug('Asked for model: {}'.format(toModel))
output = params.get('conversion', None)
candidates = {}
for plugin in plugins:
try:
fromModel = plugin.get('onyx:usesEmotionModel', None)
candidates[plugin.id] = next(self._conversion_candidates(fromModel, toModel))
logger.debug('Analysis plugin {} uses model: {}'.format(plugin.id, fromModel))
except StopIteration:
e = Error(('No conversion plugin found for: '
'{} -> {}'.format(fromModel, toModel)))
e.original_response = resp
e.parameters = params
raise e
newentries = []
for i in resp.entries:
if output == "full":
newemotions = copy.deepcopy(i.emotions)
else:
newemotions = []
for j in i.emotions:
plugname = j['prov:wasGeneratedBy']
candidate = candidates[plugname]
resp.analysis.append(candidate.id)
for k in candidate.convert(j, fromModel, toModel, params):
k.prov__wasGeneratedBy = candidate.id
if output == 'nested':
k.prov__wasDerivedFrom = j
newemotions.append(k)
i.emotions = newemotions
newentries.append(i)
resp.entries = newentries
resp.analysis = list(set(resp.analysis))
@property
def default_plugin(self):
candidate = self._default
if not candidate:
candidates = self.filter_plugins(plugin_type='analysisPlugin',
is_activated=True)
if len(candidates) > 0:
candidate = list(candidates.values())[0]
logger.debug("Default: {}".format(candidate))
return candidate
@default_plugin.setter
def default_plugin(self, value):
if isinstance(value, SenpyPlugin):
self._default = value
else:
self._default = self.plugins[value]
def activate_all(self, sync=False):
ps = []
for plug in self.plugins.keys():
ps.append(self.activate_plugin(plug, sync=sync))
return ps
def deactivate_all(self, sync=False):
ps = []
for plug in self.plugins.keys():
ps.append(self.deactivate_plugin(plug, sync=sync))
return ps
def _set_active_plugin(self, plugin_name, active=True, *args, **kwargs):
''' We're using a variable in the plugin itself to activate/deactive plugins.\
Note that plugins may activate themselves by setting this variable.
'''
self.plugins[plugin_name].is_activated = active
def activate_plugin(self, plugin_name, sync=False):
try:
plugin = self.plugins[plugin_name]
except KeyError:
raise Error(
message="Plugin not found: {}".format(plugin_name), status=404)
logger.info("Activating plugin: {}".format(plugin.name))
def act():
success = False
try:
plugin.activate()
msg = "Plugin activated: {}".format(plugin.name)
logger.info(msg)
success = True
self._set_active_plugin(plugin_name, success)
except Exception as ex:
msg = "Error activating plugin {} - {} : \n\t{}".format(
plugin.name, ex, traceback.format_exc())
logger.error(msg)
raise Error(msg)
if sync or 'async' in plugin and not plugin.async:
act()
else:
th = Thread(target=act)
th.start()
return th
def deactivate_plugin(self, plugin_name, sync=False):
try:
plugin = self.plugins[plugin_name]
except KeyError:
raise Error(
message="Plugin not found: {}".format(plugin_name), status=404)
self._set_active_plugin(plugin_name, False)
def deact():
try:
plugin.deactivate()
logger.info("Plugin deactivated: {}".format(plugin.name))
except Exception as ex:
logger.error(
"Error deactivating plugin {}: {}".format(plugin.name, ex))
logger.error("Trace: {}".format(traceback.format_exc()))
if sync or 'async' in plugin and not plugin.async:
deact()
else:
th = Thread(target=deact)
th.start()
return th
@classmethod
def validate_info(cls, info):
return all(x in info for x in ('name', 'module', 'description', 'version'))
def install_deps(self):
for i in self.plugins.values():
self._install_deps(i)
@classmethod
def _install_deps(cls, info=None):
requirements = info.get('requirements', [])
if requirements:
pip_args = ['pip']
pip_args.append('install')
pip_args.append('--use-wheel')
for req in requirements:
pip_args.append(req)
logger.info('Installing requirements: ' + str(requirements))
process = subprocess.Popen(pip_args,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
log_subprocess_output(process)
exitcode = process.wait()
if exitcode != 0:
raise Error("Dependencies not properly installed")
@classmethod
def _load_module(cls, name, root):
sys.path.append(root)
tmp = importlib.import_module(name)
sys.path.remove(root)
return tmp
@classmethod
def _load_plugin_from_info(cls, info, root):
if not cls.validate_info(info):
logger.warn('The module info is not valid.\n\t{}'.format(info))
return None, None
module = info["module"]
name = info["name"]
cls._install_deps(info)
tmp = cls._load_module(module, root)
candidate = None
for _, obj in inspect.getmembers(tmp):
if inspect.isclass(obj) and inspect.getmodule(obj) == tmp:
logger.debug(("Found plugin class:"
" {}@{}").format(obj, inspect.getmodule(obj)))
candidate = obj
break
if not candidate:
logger.debug("No valid plugin for: {}".format(module))
return
module = candidate(info=info)
return name, module
@classmethod
def _load_plugin(cls, root, filename):
fpath = os.path.join(root, filename)
logger.debug("Loading plugin: {}".format(fpath))
with open(fpath, 'r') as f:
info = yaml.load(f)
logger.debug("Info: {}".format(info))
return cls._load_plugin_from_info(info, root)
def _load_plugins(self):
#print(sys.path)
#print(search_folder)
plugins = {}
for search_folder in self.search_folders:
for item in os.listdir(search_folder):
if os.path.isdir(os.path.join(search_folder, item)) \
and os.path.exists(
os.path.join(search_folder, item, "__init__.py")):
plugins[item] = self._load_plugin(item, search_folder)
for search_folder in self._search_folders:
for root, dirnames, filenames in os.walk(search_folder):
for filename in fnmatch.filter(filenames, '*.senpy'):
name, plugin = self._load_plugin(root, filename)
if plugin and name:
plugins[name] = plugin
self._outdated = False
return plugins
def teardown(self, exception):
pass
def enable_all(self):
for plugin in self.plugins:
self.enable_plugin(plugin)
def enable_plugin(self, item):
self.plugins[item].enabled = True
def disable_plugin(self, item):
self.plugins[item].enabled = False
@property
def plugins(self):
ctx = stack.top
if ctx is not None:
if not hasattr(self, '_plugins'):
self._plugins = self._load_plugins()
print("Already plugins")
return self._plugins
""" Return the plugins registered for a given application. """
if self._outdated:
self._plugin_list = self._load_plugins()
return self._plugin_list
if __name__ == '__main__':
from flask import Flask
app = Flask(__name__)
sp = Senpy()
sp.init_app(app)
with app.app_context():
sp._load_plugins()
def filter_plugins(self, **kwargs):
return plugins.pfilter(self.plugins, **kwargs)
@property
def analysis_plugins(self):
""" Return only the analysis plugins """
return self.filter_plugins(plugin_type='analysisPlugin')

367
senpy/models.py Normal file
View File

@@ -0,0 +1,367 @@
'''
Senpy Models.
This implementation should mirror the JSON schema definition.
For compatibility with Py3 and for easier debugging, this new version drops
introspection and adds all arguments to the models.
'''
from __future__ import print_function
from six import string_types
import time
import copy
import json
import os
import jsonref
import jsonschema
from flask import Response as FlaskResponse
from pyld import jsonld
from rdflib import Graph
import logging
logger = logging.getLogger(__name__)
DEFINITIONS_FILE = 'definitions.json'
CONTEXT_PATH = os.path.join(
os.path.dirname(os.path.realpath(__file__)), 'schemas', 'context.jsonld')
def get_schema_path(schema_file, absolute=False):
if absolute:
return os.path.realpath(schema_file)
else:
return os.path.join(
os.path.dirname(os.path.realpath(__file__)), 'schemas',
schema_file)
def read_schema(schema_file, absolute=False):
schema_path = get_schema_path(schema_file, absolute)
schema_uri = 'file://{}'.format(schema_path)
with open(schema_path) as f:
return jsonref.load(f, base_uri=schema_uri)
base_schema = read_schema(DEFINITIONS_FILE)
class Context(dict):
@staticmethod
def load(context):
logging.debug('Loading context: {}'.format(context))
if not context:
return context
elif isinstance(context, list):
contexts = []
for c in context:
contexts.append(Context.load(c))
return contexts
elif isinstance(context, dict):
return Context(context)
elif isinstance(context, string_types):
try:
with open(context) as f:
return Context(json.loads(f.read()))
except IOError:
return context
else:
raise AttributeError('Please, provide a valid context')
base_context = Context.load(CONTEXT_PATH)
class SenpyMixin(object):
_context = base_context["@context"]
def flask(self,
in_headers=True,
headers=None,
outformat='json-ld',
**kwargs):
"""
Return the values and error to be used in flask.
So far, it returns a fixed context. We should store/generate different
contexts if the plugin adds more aliases.
"""
headers = headers or {}
kwargs["with_context"] = not in_headers
content, mimetype = self.serialize(format=outformat,
with_mime=True,
**kwargs)
if outformat == 'json-ld' and in_headers:
headers.update({
"Link":
('<%s>;'
'rel="http://www.w3.org/ns/json-ld#context";'
' type="application/ld+json"' % kwargs.get('context_uri'))
})
return FlaskResponse(
response=content,
status=getattr(self, "status", 200),
headers=headers,
mimetype=mimetype)
def serialize(self, format='json-ld', with_mime=False, **kwargs):
js = self.jsonld(**kwargs)
if format == 'json-ld':
content = json.dumps(js, indent=2, sort_keys=True)
mimetype = "application/json"
elif format in ['turtle', ]:
logger.debug(js)
content = json.dumps(js, indent=2, sort_keys=True)
g = Graph().parse(
data=content,
format='json-ld',
base=kwargs.get('prefix'),
context=self._context)
logger.debug(
'Parsing with prefix: {}'.format(kwargs.get('prefix')))
content = g.serialize(format='turtle').decode('utf-8')
mimetype = 'text/{}'.format(format)
else:
raise Error('Unknown outformat: {}'.format(format))
if with_mime:
return content, mimetype
else:
return content
def serializable(self):
def ser_or_down(item):
if hasattr(item, 'serializable'):
return item.serializable()
elif isinstance(item, dict):
temp = dict()
for kp in item:
vp = item[kp]
temp[kp] = ser_or_down(vp)
return temp
elif isinstance(item, list):
return list(ser_or_down(i) for i in item)
else:
return item
return ser_or_down(self._plain_dict())
def jsonld(self,
with_context=True,
context_uri=None,
prefix=None,
expanded=False):
ser = self.serializable()
result = jsonld.compact(
ser,
self._context,
options={
'base': prefix,
'expandContext': self._context,
'senpy': prefix
})
if context_uri:
result['@context'] = context_uri
if expanded:
result = jsonld.expand(
result, options={'base': prefix,
'expandContext': self._context})
if not with_context:
del result['@context']
return result
def to_JSON(self, *args, **kwargs):
js = json.dumps(self.jsonld(*args, **kwargs), indent=4, sort_keys=True)
return js
def validate(self, obj=None):
if not obj:
obj = self
if hasattr(obj, "jsonld"):
obj = obj.jsonld()
jsonschema.validate(obj, self.schema)
def __str__(self):
return str(self.to_JSON())
class BaseModel(SenpyMixin, dict):
schema = base_schema
def __init__(self, *args, **kwargs):
if 'id' in kwargs:
self.id = kwargs.pop('id')
elif kwargs.pop('_auto_id', True):
self.id = '_:{}_{}'.format(type(self).__name__, time.time())
temp = dict(*args, **kwargs)
for obj in [
self.schema,
] + self.schema.get('allOf', []):
for k, v in obj.get('properties', {}).items():
if 'default' in v and k not in temp:
temp[k] = copy.deepcopy(v['default'])
for i in temp:
nk = self._get_key(i)
if nk != i:
temp[nk] = temp[i]
del temp[i]
try:
temp['@type'] = getattr(self, '@type')
except AttributeError:
logger.warn('Creating an instance of an unknown model')
super(BaseModel, self).__init__(temp)
def _get_key(self, key):
key = key.replace("__", ":", 1)
return key
def __setitem__(self, key, value):
dict.__setitem__(self, key, value)
def __delitem__(self, key):
dict.__delitem__(self, key)
def __getattr__(self, key):
try:
return self.__getitem__(self._get_key(key))
except KeyError:
raise AttributeError(key)
def __setattr__(self, key, value):
self.__setitem__(self._get_key(key), value)
def __delattr__(self, key):
try:
object.__delattr__(self, key)
except AttributeError:
self.__delitem__(self._get_key(key))
def _plain_dict(self):
d = {k: v for (k, v) in self.items() if k[0] != "_"}
if 'id' in d:
d["@id"] = d.pop('id')
return d
_subtypes = {}
def register(rsubclass, rtype=None):
_subtypes[rtype or rsubclass.__name__] = rsubclass
def from_dict(indict, cls=None):
if not cls:
target = indict.get('@type', None)
try:
if target and target in _subtypes:
cls = _subtypes[target]
else:
cls = BaseModel
except Exception:
cls = BaseModel
outdict = dict()
for k, v in indict.items():
if k == '@context':
pass
elif isinstance(v, dict):
v = from_dict(indict[k])
elif isinstance(v, list):
for ix, v2 in enumerate(v):
if isinstance(v2, dict):
v[ix] = from_dict(v2)
outdict[k] = v
return cls(**outdict)
def from_string(string, **kwargs):
return from_dict(json.loads(string), **kwargs)
def from_json(injson):
indict = json.loads(injson)
return from_dict(indict)
def from_schema(name, schema_file=None, base_classes=None):
base_classes = base_classes or []
base_classes.append(BaseModel)
schema_file = schema_file or '{}.json'.format(name)
class_name = '{}{}'.format(name[0].upper(), name[1:])
newclass = type(class_name, tuple(base_classes), {})
setattr(newclass, '@type', name)
setattr(newclass, 'schema', read_schema(schema_file))
setattr(newclass, 'class_name', class_name)
register(newclass, name)
return newclass
def _add_from_schema(*args, **kwargs):
generatedClass = from_schema(*args, **kwargs)
globals()[generatedClass.__name__] = generatedClass
del generatedClass
for i in [
'analysis',
'emotion',
'emotionConversion',
'emotionConversionPlugin',
'emotionAnalysis',
'emotionModel',
'emotionPlugin',
'emotionSet',
'entry',
'plugin',
'plugins',
'response',
'results',
'sentiment',
'sentimentPlugin',
'suggestion',
]:
_add_from_schema(i)
_ErrorModel = from_schema('error')
class Error(SenpyMixin, Exception):
def __init__(self, message, *args, **kwargs):
super(Error, self).__init__(self, message, message)
self._error = _ErrorModel(message=message, *args, **kwargs)
self.message = message
def __getitem__(self, key):
return self._error[key]
def __setitem__(self, key, value):
self._error[key] = value
def __delitem__(self, key):
del self._error[key]
def __getattr__(self, key):
if key != '_error' and hasattr(self._error, key):
return getattr(self._error, key)
raise AttributeError(key)
def __setattr__(self, key, value):
if key != '_error':
return setattr(self._error, key, value)
else:
super(Error, self).__setattr__(key, value)
def __delattr__(self, key):
delattr(self._error, key)
def __str__(self):
return str(self.to_JSON(with_context=False))
register(Error, 'error')

View File

@@ -1,34 +0,0 @@
class SenpyPlugin(object):
def __init__(self, name=None, version=None, params=None):
self.name = name
self.version = version
self.params = params or []
def analyse(self, *args, **kwargs):
pass
def activate(self):
pass
def deactivate(self):
pass
class SentimentPlugin(SenpyPlugin):
def __init__(self,
minPolarity=0,
maxPolarity=1,
**kwargs):
super(SentimentPlugin, self).__init__(**kwargs)
self.minPolarity = minPolarity
self.maxPolarity = maxPolarity
class EmotionPlugin(SenpyPlugin):
def __init__(self,
minEmotionValue=0,
maxEmotionValue=1,
emotionCategory=None,
**kwargs):
super(EmotionPlugin, self).__init__(**kwargs)
self.minEmotionValue = minEmotionValue
self.maxEmotionValue = maxEmotionValue
self.emotionCategory = emotionCategory

View File

@@ -0,0 +1,162 @@
from future import standard_library
standard_library.install_aliases()
import inspect
import os.path
import os
import pickle
import logging
import tempfile
import copy
from .. import models
from ..api import API_PARAMS
logger = logging.getLogger(__name__)
class Plugin(models.Plugin):
def __init__(self, info=None):
"""
Provides a canonical name for plugins and serves as base for other
kinds of plugins.
"""
if not info:
raise models.Error(message=("You need to provide configuration"
"information for the plugin."))
logger.debug("Initialising {}".format(info))
id = 'plugins/{}_{}'.format(info['name'], info['version'])
super(Plugin, self).__init__(id=id, **info)
self.is_activated = False
def get_folder(self):
return os.path.dirname(inspect.getfile(self.__class__))
def activate(self):
pass
def deactivate(self):
pass
SenpyPlugin = Plugin
class AnalysisPlugin(Plugin):
def analyse(self, *args, **kwargs):
raise NotImplemented(
'Your method should implement either analyse or analyse_entry')
def analyse_entry(self, entry, parameters):
""" An implemented plugin should override this method.
This base method is here to adapt old style plugins which only
implement the *analyse* function.
Note that this method may yield an annotated entry or a list of
entries (e.g. in a tokenizer)
"""
text = entry['text']
params = copy.copy(parameters)
params['input'] = text
results = self.analyse(**params)
for i in results.entries:
yield i
def analyse_entries(self, entries, parameters):
for entry in entries:
logger.debug('Analysing entry with plugin {}: {}'.format(self, entry))
for result in self.analyse_entry(entry, parameters):
yield result
class ConversionPlugin(Plugin):
pass
class SentimentPlugin(models.SentimentPlugin, AnalysisPlugin):
def __init__(self, info, *args, **kwargs):
super(SentimentPlugin, self).__init__(info, *args, **kwargs)
self.minPolarityValue = float(info.get("minPolarityValue", 0))
self.maxPolarityValue = float(info.get("maxPolarityValue", 1))
class EmotionPlugin(models.EmotionPlugin, AnalysisPlugin):
def __init__(self, info, *args, **kwargs):
super(EmotionPlugin, self).__init__(info, *args, **kwargs)
self.minEmotionValue = float(info.get("minEmotionValue", -1))
self.maxEmotionValue = float(info.get("maxEmotionValue", 1))
class EmotionConversionPlugin(models.EmotionConversionPlugin, ConversionPlugin):
pass
class ShelfMixin(object):
@property
def sh(self):
if not hasattr(self, '_sh') or self._sh is None:
self.__dict__['_sh'] = {}
if os.path.isfile(self.shelf_file):
try:
self.__dict__['_sh'] = pickle.load(open(self.shelf_file, 'rb'))
except (IndexError, EOFError, pickle.UnpicklingError):
logger.warning('{} has a corrupted shelf file!'.format(self.id))
if not self.get('force_shelf', False):
raise
return self._sh
@sh.deleter
def sh(self):
if os.path.isfile(self.shelf_file):
os.remove(self.shelf_file)
del self.__dict__['_sh']
self.save()
@property
def shelf_file(self):
if 'shelf_file' not in self or not self['shelf_file']:
sd = os.environ.get('SENPY_DATA', tempfile.gettempdir())
self.shelf_file = os.path.join(sd, self.name + '.p')
return self['shelf_file']
def save(self):
logger.debug('saving pickle')
if hasattr(self, '_sh') and self._sh is not None:
with open(self.shelf_file, 'wb') as f:
pickle.dump(self._sh, f)
default_plugin_type = API_PARAMS['plugin_type']['default']
def pfilter(plugins, **kwargs):
""" Filter plugins by different criteria """
if isinstance(plugins, models.Plugins):
plugins = plugins.plugins
elif isinstance(plugins, dict):
plugins = plugins.values()
ptype = kwargs.pop('plugin_type', default_plugin_type)
logger.debug('#' * 100)
logger.debug('ptype {}'.format(ptype))
if ptype:
try:
ptype = ptype[0].upper() + ptype[1:]
pclass = globals()[ptype]
logger.debug('Class: {}'.format(pclass))
candidates = filter(lambda x: isinstance(x, pclass),
plugins)
except KeyError:
raise models.Error('{} is not a valid type'.format(ptype))
else:
candidates = plugins
logger.debug(candidates)
def matches(plug):
res = all(getattr(plug, k, None) == v for (k, v) in kwargs.items())
logger.debug(
"matching {} with {}: {}".format(plug.name, kwargs, res))
return res
if kwargs:
candidates = filter(matches, candidates)
return {p.name: p for p in candidates}

View File

View File

@@ -0,0 +1,102 @@
from senpy.plugins import EmotionConversionPlugin
from senpy.models import EmotionSet, Emotion, Error
import logging
logger = logging.getLogger(__name__)
class CentroidConversion(EmotionConversionPlugin):
def __init__(self, info):
if 'centroids' not in info:
raise Error('Centroid conversion plugins should provide '
'the centroids in their senpy file')
if 'onyx:doesConversion' not in info:
if 'centroids_direction' not in info:
raise Error('Please, provide centroids direction')
cf, ct = info['centroids_direction']
info['onyx:doesConversion'] = [{
'onyx:conversionFrom': cf,
'onyx:conversionTo': ct
}, {
'onyx:conversionFrom': ct,
'onyx:conversionTo': cf
}]
if 'aliases' in info:
aliases = info['aliases']
ncentroids = {}
for k1, v1 in info['centroids'].items():
nv1 = {}
for k2, v2 in v1.items():
nv1[aliases.get(k2, k2)] = v2
ncentroids[aliases.get(k1, k1)] = nv1
info['centroids'] = ncentroids
super(CentroidConversion, self).__init__(info)
self.dimensions = set()
for c in self.centroids.values():
self.dimensions.update(c.keys())
self.neutralPoints = self.get("neutralPoints", dict())
if not self.neutralPoints:
for i in self.dimensions:
self.neutralPoints[i] = self.get("neutralValue", 0)
def _forward_conversion(self, original):
"""Sum the VAD value of all categories found weighted by intensity.
Intensities are scaled by onyx:maxIntensityValue if it is present, else maxIntensityValue
is assumed to be one. Emotion entries that do not have onxy:hasEmotionIntensity specified
are assumed to have maxIntensityValue. Emotion entries that do not have
onyx:hasEmotionCategory specified are ignored."""
res = Emotion()
maxIntensity = float(original.get("onyx:maxIntensityValue", 1))
for e in original.onyx__hasEmotion:
category = e.get("onyx:hasEmotionCategory", None)
if not category:
continue
intensity = e.get("onyx:hasEmotionIntensity", maxIntensity) / maxIntensity
if not intensity:
continue
centroid = self.centroids.get(category, None)
if centroid:
for dim, value in centroid.items():
neutral = self.neutralPoints[dim]
if dim not in res:
res[dim] = 0
res[dim] += (value - neutral) * intensity + neutral
return res
def _backwards_conversion(self, original):
"""Find the closest category"""
centroids = self.centroids
neutralPoints = self.neutralPoints
dimensions = self.dimensions
def distance_k(centroid, original, k):
# k component of the distance between the value and a given centroid
return (centroid.get(k, neutralPoints[k]) - original.get(k, neutralPoints[k]))**2
def distance(centroid):
return sum(distance_k(centroid, original, k) for k in dimensions)
emotion = min(centroids, key=lambda x: distance(centroids[x]))
result = Emotion(onyx__hasEmotionCategory=emotion)
result.onyx__algorithmConfidence = distance(centroids[emotion])
return result
def convert(self, emotionSet, fromModel, toModel, params):
cf, ct = self.centroids_direction
logger.debug(
'{}\n{}\n{}\n{}'.format(emotionSet, fromModel, toModel, params))
e = EmotionSet()
if fromModel == cf and toModel == ct:
e.onyx__hasEmotion.append(self._forward_conversion(emotionSet))
elif fromModel == ct and toModel == cf:
for i in emotionSet.onyx__hasEmotion:
e.onyx__hasEmotion.append(self._backwards_conversion(i))
else:
raise Error('EMOTION MODEL NOT KNOWN')
yield e

View File

@@ -0,0 +1,39 @@
---
name: Ekman2FSRE
module: senpy.plugins.conversion.emotion.centroids
description: Plugin to convert emotion sets from Ekman to VAD
version: 0.1
# No need to specify onyx:doesConversion because centroids.py adds it automatically from centroids_direction
centroids:
anger:
A: 6.95
D: 5.1
V: 2.7
disgust:
A: 5.3
D: 8.05
V: 2.7
fear:
A: 6.5
D: 3.6
V: 3.2
happiness:
A: 7.22
D: 6.28
V: 8.6
sadness:
A: 5.21
D: 2.82
V: 2.21
centroids_direction:
- emoml:big6
- emoml:fsre-dimensions
aliases: # These are aliases for any key in the centroid, to avoid repeating a long name several times
A: emoml:arousal
V: emoml:valence
D: emoml:dominance
anger: emoml:big6anger
disgust: emoml:big6disgust
fear: emoml:big6fear
happiness: emoml:big6happiness
sadness: emoml:big6sadness

View File

@@ -0,0 +1,44 @@
---
name: Ekman2PAD
module: senpy.plugins.conversion.emotion.centroids
description: Plugin to convert emotion sets from Ekman to VAD
version: 0.1
# No need to specify onyx:doesConversion because centroids.py adds it automatically from centroids_direction
origin:
# Point in VAD space with no emotion (aka Neutral)
A: 5.0
D: 5.0
V: 5.0
centroids:
anger:
A: 6.95
D: 5.1
V: 2.7
disgust:
A: 5.3
D: 8.05
V: 2.7
fear:
A: 6.5
D: 3.6
V: 3.2
happiness:
A: 7.22
D: 6.28
V: 8.6
sadness:
A: 5.21
D: 2.82
V: 2.21
centroids_direction:
- emoml:big6
- emoml:pad
aliases: # These are aliases for any key in the centroid, to avoid repeating a long name several times
A: emoml:arousal
V: emoml:valence
D: emoml:dominance
anger: emoml:big6anger
disgust: emoml:big6disgust
fear: emoml:big6fear
happiness: emoml:big6happiness
sadness: emoml:big6sadness

View File

@@ -0,0 +1,18 @@
import random
from senpy.plugins import EmotionPlugin
from senpy.models import EmotionSet, Emotion
class RmoRandPlugin(EmotionPlugin):
def analyse_entry(self, entry, params):
category = "emoml:big6happiness"
number = max(-1, min(1, random.gauss(0, 0.5)))
if number > 0:
category = "emoml:big6anger"
emotionSet = EmotionSet()
emotion = Emotion({"onyx:hasEmotionCategory": category})
emotionSet.onyx__hasEmotion.append(emotion)
emotionSet.prov__wasGeneratedBy = self.id
entry.emotions.append(emotionSet)
yield entry

View File

@@ -0,0 +1,9 @@
---
name: emoRand
module: emoRand
description: A sample plugin that returns a random emotion annotation
author: "@balkian"
version: '0.1'
url: "https://github.com/gsi-upm/senpy-plugins-community"
requirements: {}
onyx:usesEmotionModel: "emoml:big6"

View File

@@ -0,0 +1,24 @@
import random
from senpy.plugins import SentimentPlugin
from senpy.models import Sentiment
class RandPlugin(SentimentPlugin):
def analyse_entry(self, entry, params):
lang = params.get("language", "auto")
polarity_value = max(-1, min(1, random.gauss(0.2, 0.2)))
polarity = "marl:Neutral"
if polarity_value > 0:
polarity = "marl:Positive"
elif polarity_value < 0:
polarity = "marl:Negative"
sentiment = Sentiment({
"marl:hasPolarity": polarity,
"marl:polarityValue": polarity_value
})
sentiment["prov:wasGeneratedBy"] = self.id
entry.sentiments.append(sentiment)
entry.language = lang
yield entry

View File

@@ -0,0 +1,10 @@
---
name: rand
module: rand
description: A sample plugin that returns a random sentiment annotation
author: "@balkian"
version: '0.1'
url: "https://github.com/gsi-upm/senpy-plugins-community"
requirements: {}
marl:maxPolarityValue: '1'
marl:minPolarityValue: "-1"

View File

@@ -0,0 +1,36 @@
import requests
import json
from senpy.plugins import SentimentPlugin
from senpy.models import Sentiment
class Sentiment140Plugin(SentimentPlugin):
def analyse_entry(self, entry, params):
lang = params.get("language", "auto")
res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
json.dumps({
"language": lang,
"data": [{
"text": entry.text
}]
}))
p = params.get("prefix", None)
polarity_value = self.maxPolarityValue * int(
res.json()["data"][0]["polarity"]) * 0.25
polarity = "marl:Neutral"
neutral_value = self.maxPolarityValue / 2.0
if polarity_value > neutral_value:
polarity = "marl:Positive"
elif polarity_value < neutral_value:
polarity = "marl:Negative"
sentiment = Sentiment(
prefix=p,
marl__hasPolarity=polarity,
marl__polarityValue=polarity_value)
sentiment.prov__wasGeneratedBy = self.id
entry.sentiments = []
entry.sentiments.append(sentiment)
entry.language = lang
yield entry

View File

@@ -0,0 +1,21 @@
---
name: sentiment140
module: sentiment140
description: "Connects to the sentiment140 free API: http://sentiment140.com"
author: "@balkian"
version: '0.2'
url: "https://github.com/gsi-upm/senpy-plugins-community"
extra_params:
language:
"@id": lang_sentiment140
aliases:
- language
- l
required: false
options:
- es
- en
- auto
requirements: {}
maxPolarityValue: 1
minPolarityValue: 0

View File

@@ -1,48 +0,0 @@
import requests
import json
import sys
print(sys.path)
from senpy.plugin import SentimentPlugin
class Sentiment140Plugin(SentimentPlugin):
def __init__(self, **kwargs):
super(Sentiment140Plugin, self).__init__(name="Sentiment140",
version="1.0",
**kwargs)
def analyse(self, **params):
res = requests.post("http://www.sentiment140.com/api/bulkClassifyJson",
json.dumps({
"language": "auto",
"data": [{"text": params["input"]}]}
))
response = {"analysis": [{}], "entries": []}
response["analysis"][0].update({ "marl:algorithm": "SimpleAlgorithm",
"marl:minPolarityValue": 0,
"marl:maxPolarityValue": 100})
polarityValue = int(res.json()["data"][0]["polarity"]) * 25
polarity = "marl:Neutral"
if polarityValue > 50:
polarity = "marl:Positive"
elif polarityValue < 50:
polarity = "marl:Negative"
response["entries"] = [
{
"isString": params["input"],
"opinions": [{
"marl:polarityValue": polarityValue,
"marl:hasPolarity": polarity
}]
}
]
return response
plugin = Sentiment140Plugin()

View File

@@ -1,41 +0,0 @@
'''
SENTIMENT140
=============
* http://www.sentiment140.com/api/bulkClassifyJson
* Method: POST
* Parameters: JSON Object (that is copied to the result)
* text
* query
* language
* topic
* Example response:
```json
{"data": [{"text": "I love Titanic.", "id":1234, "polarity": 4},
{"text": "I hate Titanic.", "id":4567, "polarity": 0}]}
```
'''
import requests
import json
ENDPOINT_URI = "http://www.sentiment140.com/api/bulkClassifyJson"
def analyse(texts):
parameters = {"data": []}
if isinstance(texts, list):
for text in texts:
parameters["data"].append({"text": text})
else:
parameters["data"].append({"text": texts})
res = requests.post(ENDPOINT_URI, json.dumps(parameters))
res.json()
return res.json()
def test():
print analyse("I love Titanic")
print analyse(["I love Titanic", "I hate Titanic"])
if __name__ == "__main__":
test()

View File

@@ -0,0 +1,30 @@
from senpy.plugins import AnalysisPlugin
from senpy.models import Entry
from nltk.tokenize.punkt import PunktSentenceTokenizer
from nltk.tokenize.simple import LineTokenizer
import nltk
class SplitPlugin(AnalysisPlugin):
def activate(self):
nltk.download('punkt')
def analyse_entry(self, entry, params):
chunker_type = params.get("delimiter", "sentence")
original_id = entry.id
original_text = entry.get("text", None)
if chunker_type == "sentence":
tokenizer = PunktSentenceTokenizer()
chars = tokenizer.span_tokenize(original_text)
for i, sentence in enumerate(tokenizer.tokenize(original_text)):
e = Entry()
e.text = sentence
e.id = original_id + "#char={},{}".format(chars[i][0], chars[i][1])
yield e
if chunker_type == "paragraph":
tokenizer = LineTokenizer()
chars = tokenizer.span_tokenize(original_text)
for i, paragraph in enumerate(tokenizer.tokenize(original_text)):
e = Entry()
e.text = paragraph
chars = [char for char in chars]
e.id = original_id + "#char={},{}".format(chars[i][0], chars[i][1])
yield e

View File

@@ -0,0 +1,18 @@
---
name: split
module: split
description: A sample plugin that chunks input text
author: "@militarpancho"
version: '0.1'
url: "https://github.com/gsi-upm/senpy"
requirements: {nltk}
extra_params:
delimiter:
aliases:
- type
- t
required: false
default: sentence
options:
- sentence
- paragraph

View File

@@ -0,0 +1,15 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Senpy analysis",
"type": "object",
"properties": {
"@id": {
"type": "string"
},
"@type": {
"type": "string",
"description": "Type of the analysis. e.g. marl:SentimentAnalysis"
}
},
"required": ["@id", "@type"]
}

15
senpy/schemas/atom.json Normal file
View File

@@ -0,0 +1,15 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Base schema for all Senpy objects",
"type": "object",
"properties": {
"@id": {
"type": "string"
},
"@type": {
"type": "string",
"description": "Type of the atom. e.g., 'onyx:EmotionAnalysis', 'nif:Entry'"
}
},
"required": ["@id", "@type"]
}

View File

@@ -0,0 +1,5 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "JSON-LD Context",
"type": ["array", "string", "object"]
}

View File

@@ -0,0 +1,62 @@
{
"@context": {
"@vocab": "http://www.gsi.dit.upm.es/ontologies/senpy#",
"dc": "http://dublincore.org/2012/06/14/dcelements#",
"me": "http://www.mixedemotions-project.eu/ns/model#",
"prov": "http://www.w3.org/ns/prov#",
"nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
"marl": "http://www.gsi.dit.upm.es/ontologies/marl/ns#",
"onyx": "http://www.gsi.dit.upm.es/ontologies/onyx/ns#",
"wna": "http://www.gsi.dit.upm.es/ontologies/wnaffect/ns#",
"emoml": "http://www.gsi.dit.upm.es/ontologies/onyx/vocabularies/emotionml/ns#",
"xsd": "http://www.w3.org/2001/XMLSchema#",
"topics": {
"@id": "dc:subject"
},
"entities": {
"@id": "me:hasEntities"
},
"suggestions": {
"@id": "me:hasSuggestions",
"@container": "@set"
},
"emotions": {
"@id": "onyx:hasEmotionSet",
"@container": "@set"
},
"sentiments": {
"@id": "marl:hasOpinion",
"@container": "@set"
},
"entries": {
"@id": "prov:used",
"@container": "@set"
},
"analysis": {
"@id": "AnalysisInvolved",
"@type": "@id",
"@container": "@set"
},
"options": {
"@container": "@set"
},
"plugins": {
"@container": "@set"
},
"prov:wasGeneratedBy": {
"@type": "@id"
},
"onyx:usesEmotionModel": {
"@type": "@id"
},
"onyx:hasEmotionCategory": {
"@type": "@id"
},
"onyx:conversionFrom": {
"@type": "@id"
},
"onyx:conversionTo": {
"@type": "@id"
}
}
}

View File

@@ -0,0 +1,45 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"Results": {
"$ref": "results.json"
},
"Context": {
"$ref": "context.json"
},
"Analysis": {
"$ref": "analysis.json"
},
"Entry": {
"$ref": "entry.json"
},
"Sentiment": {
"$ref": "sentiment.json"
},
"EmotionSet": {
"$ref": "emotionSet.json"
},
"Emotion": {
"$ref": "emotion.json"
},
"EmotionModel": {
"$ref": "emotionModel.json"
},
"Entity": {
"$ref": "entity.json"
},
"Topic": {
"$ref": "topic.json"
},
"Suggestion": {
"$ref": "suggestion.json"
},
"Plugins": {
"$ref": "plugin.json"
},
"Plugin": {
"$ref": "plugin.json"
},
"Response": {
"$ref": "response.json"
}
}

View File

@@ -0,0 +1,9 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"properties": {
"name": {"type": "string"},
"maxValue": {"type": "number"},
"minValue": {"type": "number"}
},
"required": ["name", "maxValue", "minValue"]
}

View File

@@ -0,0 +1,4 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object"
}

View File

@@ -0,0 +1,19 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Senpy Emotion analysis",
"type": "object",
"allOf": [
{"$ref": "analysis.json"},
{"properties":
{
"onyx:usesEmotionModel": {
"anyOf": [
{"type": "string"},
{"$ref": "emotionModel.json"}
]
}
},
"required": ["onyx:hasEmotionModel",
"@type"]
}]
}

View File

@@ -0,0 +1,12 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"properties": {
"onyx:conversionFrom": {
"$ref": "emotionModel.json"
},
"onyx:conversionTo": {
"$ref": "emotionModel.json"
}
},
"required": ["onyx:conversionFrom", "onyx:conversionTo"]
}

View File

@@ -0,0 +1,19 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"allOf": [
{
"$ref": "plugin.json"
},
{
"properties": {
"onyx:doesConversion": {
"type": "array",
"items": {
"$ref": "emotionConversion.json"
}
}
}
}
]
}

View File

@@ -0,0 +1,27 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"properties": {
"@id": {"type": "string"},
"nif:beginIndex": {"type": "integer"},
"nif:endIndex": {"type": "integer"},
"nif:anchorOf": {
"description": "Piece of context that contains the Sentiment",
"type": "string"
},
"onyx:hasDimension": {
"type": "array",
"items": {
"$ref": "dimensions.json"
},
"uniqueItems": true
},
"onyx:hasEmotionCategory": {
"type": "array",
"items": {
"$ref": "emotion.json"
},
"default": []
}
},
"required": ["@id", "onyx:hasEmotion"]
}

View File

@@ -0,0 +1,19 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"allOf": [
{
"$ref": "plugin.json"
},
{
"properties": {
"onyx:usesEmotionModel": {
"type": "array",
"items": {
"$ref": "emotionModel.json"
}
}
}
}
]
}

View File

@@ -0,0 +1,24 @@
{
"$schema": "http://json-schema.org/draft-04/schema#",
"properties": {
"@id": {"type": "string"},
"nif:beginIndex": {"type": "integer"},
"nif:endIndex": {"type": "integer"},
"nif:anchorOf": {
"description": "Piece of context that contains the Sentiment",
"type": "string"
},
"onyx:hasEmotion": {
"type": "array",
"items": {
"$ref": "emotion.json"
},
"default": []
},
"prov:wasGeneratedBy": {
"type": "string",
"description": "The ID of the analysis that generated this Emotion. The full object should be included in the \"analysis\" property of the root object"
}
},
"required": ["@id", "prov:wasGeneratedBy", "onyx:hasEmotion"]
}

Some files were not shown because too many files have changed in this diff Show More