mirror of https://github.com/gsi-upm/senpy synced 2025-11-28 18:28:16 +00:00

Compare commits


170 Commits

Author SHA1 Message Date
J. Fernando Sánchez
297e9e8106 readthedocs: remove pdf output 2023-09-27 11:22:56 +02:00
J. Fernando Sánchez
1eb8e432af add readthedocs config file 2023-09-27 11:18:31 +02:00
J. Fernando Sánchez
8236569818 k8s: add latest-senpy.gsi.upm.es 2023-09-27 11:04:34 +02:00
J. Fernando Sánchez
98d368dd9a k8s: fix volume mount 2023-09-27 11:01:15 +02:00
J. Fernando Sánchez
9747140b54 explicit KUBECONFIG in kubectl version 2023-09-26 20:25:37 +02:00
J. Fernando Sánchez
e915766449 ignore uninitialized plugin tests when strict=false 2023-09-26 19:55:41 +02:00
J. Fernando Sánchez
b33a70620b use default strict for extension tests 2023-09-26 19:47:23 +02:00
J. Fernando Sánchez
e324c730e2 use strict=false in blueprint tests 2023-09-26 19:41:19 +02:00
J. Fernando Sánchez
a5c135faac add noop to test-requirements 2023-09-26 19:38:22 +02:00
J. Fernando Sánchez
894942b3ab move nltk data volume 2023-09-26 19:33:50 +02:00
J. Fernando Sánchez
9bb980f6b4 make noop plugin optional 2023-09-26 19:31:50 +02:00
J. Fernando Sánchez
66371c1cd8 add pandas for testing 2023-09-26 19:02:11 +02:00
J. Fernando Sánchez
f3d4415ffb Modify dependencies to allow for 3.7 compatibility
Some dependencies are not available for python 3.7 anymore. Instead
of trying to support different versions of the libraries, we opt to
focus on the latest python version, and allow for CORE functionality
for earlier versions.
2023-09-26 18:52:04 +02:00
J. Fernando Sánchez
3f227986f3 relax pandas dependency 2023-09-26 18:17:41 +02:00
J. Fernando Sánchez
f2f28644a1 remove duplicated panda requirement 2023-09-26 18:15:56 +02:00
J. Fernando Sánchez
82a456705c remove duplicated requirements 2023-09-26 18:13:52 +02:00
J. Fernando Sánchez
3c35b4ac91 add requirements for community plugins 2023-09-26 18:10:16 +02:00
J. Fernando Sánchez
268d2a4848 adapt deployment 2023-09-26 17:57:36 +02:00
J. Fernando Sánchez
5330ae93fc remove senticnet: API is down 2023-09-23 00:25:16 +02:00
J. Fernando Sánchez
4f95fbcbd1 update to pass tests with community plugins 2023-09-22 23:28:19 +02:00
J. Fernando Sánchez
5b28b6d1b4 merge community plugins 2023-09-20 13:44:23 +02:00
J. Fernando Sánchez
e1d888ebd6 Add 'community-plugins/' from commit '4c73797246c6aff8d055abfef73d3f0d34b933a8'
git-subtree-dir: community-plugins
git-subtree-mainline: 7f712952be
git-subtree-split: 4c73797246
2023-09-20 13:32:30 +02:00
J. Fernando Sánchez
7f712952be version 1.0.6 2022-05-24 19:33:39 +02:00
J. Fernando Sánchez
07348de59a Add enable-cors to deployment 2022-05-24 14:44:46 +02:00
J. Fernando Sánchez
39123cea8a Fix typo latest docker image 2022-05-23 17:03:34 +02:00
J. Fernando Sánchez
8a6ef3852a Renamed senpy-deploymebt.yaml 2022-05-23 16:35:19 +02:00
J. Fernando Sánchez
99c8782a92 Add version to latest docker image 2022-05-23 16:29:10 +02:00
J. Fernando Sánchez
efa93f5456 Fix docker latest image 2022-05-23 16:03:30 +02:00
J. Fernando Sánchez
55e5ce3a66 Fix docker build and k8s svc 2022-05-23 15:19:46 +02:00
J. Fernando Sánchez
92b654b36c Update k8s files 2022-05-23 14:19:46 +02:00
J. Fernando Sánchez
7febb4d673 Update k8s files 2022-05-23 14:03:44 +02:00
J. Fernando Sánchez
7ca5057705 k8s deploy from raw KUBECONFIG 2022-05-23 13:09:29 +02:00
J. Fernando Sánchez
8489457370 Refine k8s deploy 2022-05-23 12:51:47 +02:00
J. Fernando Sánchez
e9266af924 Add k8s deployment 2022-05-23 12:45:40 +02:00
J. Fernando Sánchez
a7849bb029 Fix bug docker build /cache 2022-05-23 11:35:11 +02:00
J. Fernando Sánchez
bd083a0e55 Fix bug DOCKERHUB gitlab-ci 2022-05-23 11:20:15 +02:00
J. Fernando Sánchez
a390f51097 Fix bug gitlab-ci kaniko auth 2022-05-23 10:12:04 +02:00
J. Fernando Sánchez
adbcd7d196 Version 1.0.5 2022-05-23 09:52:22 +02:00
J. Fernando Sánchez
4b1eecd1c2 Version 1.0.4 2022-05-20 14:05:27 +02:00
J. Fernando Sánchez
c1e4e092a7 Version 1.0.3
Fix fonts mixed-content
Fixed deprecation error collections.MutableMapping (python 3.10)
2022-05-20 13:50:56 +02:00
J. Fernando Sánchez
a0abbede49 Version 1.0.2
Update RDFlib to 6.1.1 (removed rdflib-jsonld, as it is deprecated)
Bumped minimum python version: 3.7 (as a result of RDFLIB 6)
Added ProxyFix to run behind nginx (Added --no-proxy to run without the fix)
Replaced http media links to protocol-agnostic links in playground
Enable CORS (via --enable-cors)
Update old urls (replaced *.cluster.gsi.dit.upm.es with *.gsi.upm.es)
2022-05-20 13:27:31 +02:00
J. Fernando Sánchez
c5a2cf23cb typo docs/readme 2019-09-02 15:58:10 +02:00
J. Fernando Sánchez
49a183aeb6 typo readme 2019-09-02 15:43:35 +02:00
J. Fernando Sánchez
3088d9474a compatibility notice 2019-09-02 15:39:18 +02:00
J. Fernando Sánchez
4c73797246 update emotion-anew description 2019-09-02 14:07:13 +02:00
J. Fernando Sánchez
0f5bc514b7 add windows+mac tests in travis 2019-09-02 13:56:30 +02:00
J. Fernando Sánchez
228eb6321b update emotion-anew description 2019-09-02 12:03:37 +02:00
J. Fernando Sánchez
d575220712 update senpy version 2019-07-18 14:31:38 +02:00
J. Fernando Sánchez
7ae493b3f3 Minor fix setup and docs 2019-07-18 11:40:41 +02:00
J. Fernando Sánchez
435d107677 Add headers and minor fixes 2019-07-17 16:29:30 +02:00
J. Fernando Sánchez
bf2feb9839 Add senticnet plugin 2019-07-17 11:17:02 +02:00
J. Fernando Sánchez
5c98326acf Clean up emotion-anew 2019-07-10 13:09:48 +02:00
J. Fernando Sánchez
d961d8ac5b Fix URI emotion-anew 2019-07-10 13:07:55 +02:00
J. Fernando Sánchez
96ec10d791 Fix pipeline 2019-04-04 14:08:30 +02:00
J. Fernando Sánchez
6858a139ed Move Taiger to a separate repository 2019-04-04 13:10:27 +02:00
J. Fernando Sánchez
4f286057c9 Update to senpy 0.20 2019-04-04 12:56:46 +02:00
J. Fernando Sánchez
fa993c6e2a Add default plugins 2019-01-15 17:58:48 +01:00
J. Fernando Sánchez
238f76442c Add senpy.gsi.upm.es 2019-01-15 17:32:07 +01:00
J. Fernando Sánchez
a015ee81f7 Revert to not adding data folder to image 2019-01-11 16:58:28 +01:00
J. Fernando Sánchez
d665017154 Compose for taiger plugin 2019-01-11 12:10:17 +01:00
J. Fernando Sánchez
00832e2e1c Add data in image 2019-01-11 10:49:43 +01:00
J. Fernando Sánchez
4ecabadae9 remove unnecessary import 2019-01-09 19:31:51 +01:00
J. Fernando Sánchez
bb6f9ee367 tweaks for py2/py3 compatibility 2019-01-09 19:29:24 +01:00
Oscar Araque
80acb9307c Merge branch 'master' of ssh://lab.gsi.upm.es:2200/senpy/senpy-plugins-community 2019-01-09 17:23:57 +01:00
Oscar Araque
94394af20b depechemood updated 2019-01-09 17:19:22 +01:00
J. Fernando Sánchez
d5f9ef88b2 Add new taiger plugin 2019-01-09 16:18:12 +01:00
J. Fernando Sánchez
675a905ab4 Add depeche mood 2018-12-14 18:50:35 +01:00
J. Fernando Sánchez
4507449266 Bump senpy version to 0.11.4 2018-11-06 17:23:19 +01:00
J. Fernando Sánchez
2e91a83eb6 Bump senpy version to 0.11.3 2018-11-06 14:58:29 +01:00
J. Fernando Sánchez
d8c47220b1 Modify default TAIGER endpoint 2018-08-01 13:22:21 +02:00
J. Fernando Sánchez
0e4146ed8d Merge branch 'taiger' into 'master'
Taiger

Closes #12

See merge request senpy/senpy-plugins-community!1
2018-08-01 11:19:09 +00:00
J. Fernando Sánchez
6d3fc6f861 Taiger 2018-08-01 11:19:09 +00:00
J. Fernando Sánchez
666632a032 Update Makefile to avoid CI build errors 2018-07-24 17:57:23 +02:00
J. Fernando Sánchez
9dbe22b81f Adapt to new mocking of requests 2018-07-24 17:28:32 +02:00
J. Fernando Sánchez
9355d27e71 Bump senpy version to 0.10.9 2018-07-04 16:42:38 +02:00
J. Fernando Sánchez
0ed434ef0c Do not bind port in docker 2018-06-20 12:33:34 +02:00
J. Fernando Sánchez
dbc238989b Fix resources sentiment-basic 2018-06-20 12:29:01 +02:00
J. Fernando Sánchez
48ba936a7b Improved docs, docker-compose and dockerfile 2018-06-20 12:16:27 +02:00
J. Fernando Sánchez
bbe91e1924 Upgrade to senpy 0.10.7 2018-06-18 17:53:07 +02:00
J. Fernando Sánchez
c7091e6323 Force building before pushing 2018-06-18 17:53:01 +02:00
J. Fernando Sánchez
f11439d944 Unify data folders 2018-06-15 16:44:25 +02:00
J. Fernando Sánchez
b15a0d7dbe Fix problems with echo and newlines
printf is more portable
2018-06-15 11:39:19 +02:00
J. Fernando Sánchez
f92617d147 Change submodules to relative URIs 2018-06-15 10:34:36 +02:00
J. Fernando Sánchez
2a773d45aa Fix image name in tests 2018-06-15 10:29:18 +02:00
J. Fernando Sánchez
e4e1a74971 Build before testing! 2018-06-15 09:54:42 +02:00
J. Fernando Sánchez
1659285f0b Remove TTY from docker test 2018-06-15 09:52:42 +02:00
J. Fernando Sánchez
57016e1380 Add clean stage 2018-06-15 09:49:23 +02:00
J. Fernando Sánchez
54da48b548 Add CI/CD and k8s 2018-06-15 09:46:15 +02:00
J. Fernando Sánchez
62142482dc Updated makefiles from senpy-plugins-community 2018-06-15 09:22:46 +02:00
J. Fernando Sánchez
982baa04cf Add '.makefiles/' from commit 'a75ba6994d93ca027b6f3ba0b08b75dd60d3aa78'
git-subtree-dir: .makefiles
git-subtree-mainline: c52a894017
git-subtree-split: a75ba6994d
2018-06-14 19:54:41 +02:00
J. Fernando Sánchez
c52a894017 Merged into monorepo 2018-06-14 19:38:08 +02:00
J. Fernando Sánchez
e51b659030 Merge commit '7c959aace896e9d318497a417e0eec8f78b62314' as 'sentiment-basic' 2018-06-12 10:01:45 +02:00
J. Fernando Sánchez
2a4cc96905 Removed sentiment-basic submodule 2018-06-12 10:01:45 +02:00
J. Fernando Sánchez
7c959aace8 Squashed 'sentiment-basic/' content from commit beb8e31
git-subtree-dir: sentiment-basic
git-subtree-split: beb8e311619059a0c660411edef1cf95b3826c0a
2018-06-12 10:01:45 +02:00
J. Fernando Sánchez
15ac26428a Merge commit '98ec4817cff3abd06f961fbbdb5c860aeb887bca' as 'emotion-anew' 2018-06-12 10:01:45 +02:00
J. Fernando Sánchez
402b49f43f Removed emotion-anew submodule 2018-06-12 10:01:45 +02:00
J. Fernando Sánchez
98ec4817cf Squashed 'emotion-anew/' content from commit e8a3c83
git-subtree-dir: emotion-anew
git-subtree-split: e8a3c837e3543a5f5f19086e1fcaa34b22be639e
2018-06-12 10:01:45 +02:00
J. Fernando Sánchez
08c1b4ce79 Merge commit '23c6cdd58dd3071fe5f707d904afacde6bd1a870' as 'emotion-wnaffect' 2018-06-12 10:01:44 +02:00
J. Fernando Sánchez
50a0599597 Removed emotion-wnaffect submodule 2018-06-12 10:01:44 +02:00
J. Fernando Sánchez
23c6cdd58d Squashed 'emotion-wnaffect/' content from commit 74c40d7
git-subtree-dir: emotion-wnaffect
git-subtree-split: 74c40d7e97d54d3c3e30739a85cf9322c92d5a87
2018-06-12 10:01:44 +02:00
J. Fernando Sánchez
7825802341 Merge commit '4a0b6c1bf4ec7213ad2b5538eb737a27dc28faa8' as 'sentiment-vader' 2018-06-12 10:01:44 +02:00
J. Fernando Sánchez
4a0b6c1bf4 Squashed 'sentiment-vader/' content from commit ddb7432
git-subtree-dir: sentiment-vader
git-subtree-split: ddb7432d260fd2d8fca719f1b3ee46117019f475
2018-06-12 10:01:44 +02:00
J. Fernando Sánchez
cd73cd3fc6 Removed sentiment-vader submodule 2018-06-12 10:01:43 +02:00
J. Fernando Sánchez
704aba2ff0 Merge commit '1eec6ecbad039b946c0d7b690335f2bb4ea8f320' as 'sentiment-meaningCloud' 2018-06-12 10:01:43 +02:00
J. Fernando Sánchez
bf67422f2f Removed sentiment-meaningCloud submodule 2018-06-12 10:01:43 +02:00
J. Fernando Sánchez
1eec6ecbad Squashed 'sentiment-meaningCloud/' content from commit 2a5d212
git-subtree-dir: sentiment-meaningCloud
git-subtree-split: 2a5d212833fac38efe69b9d90588c1f0a27ff390
2018-06-12 10:01:43 +02:00
J. Fernando Sánchez
bec22e44a0 Removed enterprise/unnecessary modules 2018-06-12 10:01:34 +02:00
Manuel Garcia Amado
f3961378e0 Add submodule in README 2018-05-14 11:34:23 +02:00
Manuel Garcia Amado
fbde8a9462 Add plugins as submodules 2018-05-14 11:32:56 +02:00
Manuel Garcia Amado
582ae8a340 Adding tutorial to submodules 2018-02-08 11:19:58 +01:00
J. Fernando Sánchez
a75ba6994d Merge branch 'meaningcloud' into 'master'
Meaningcloud

See merge request docs/templates/makefiles!8
2017-10-05 13:26:12 +00:00
J. Fernando Sánchez
919c4a07a2 Update base.mk 2017-10-05 13:25:33 +00:00
J. Fernando Sánchez
42224e343c Updated makefiles from meaningcloud
Version was "unknown" due to a bug
2017-10-05 11:19:02 +02:00
militarpancho
f0c211c00a PYVERSION changed 2017-10-04 15:37:05 +02:00
J. Fernando Sánchez
24d85b18bb Merge branch 'meaningcloud' into 'master'
Updated makefiles from meaningcloud

See merge request docs/templates/makefiles!7
2017-10-03 16:25:49 +00:00
J. Fernando Sánchez
d150321741 Updated makefiles from meaningcloud
* Fixed some python+docker variables
* Improved defaults for docker image names
2017-10-03 18:24:30 +02:00
J. Fernando Sánchez
4f88009bd7 Merge branch 'senpy' into 'master'
Senpy

See merge request docs/templates/makefiles!6
2017-10-03 15:25:58 +00:00
J. Fernando Sánchez
1f0703d535 Fixed typo in .gitlab-ci 2017-10-03 17:19:14 +02:00
J. Fernando Sánchez
b20982cae1 Merge branch 'senpy' into 'master'
Senpy

See merge request docs/templates/makefiles!5
2017-10-03 15:16:01 +00:00
J. Fernando Sánchez
c23f7986b4 Trying to fix push to github 2017-10-03 16:39:09 +02:00
J. Fernando Sánchez
8fe7616bae Updated makefiles from senpy 2017-10-03 15:08:16 +02:00
J. Fernando Sánchez
1543f5550e Updated makefiles from senpy 2017-10-03 13:46:09 +02:00
J. Fernando Sánchez
f04cbeeddb Testing new k8s mk 2017-10-03 13:41:51 +02:00
militarpancho
b671ff51f9 Add support for py3 in emotion-wnaffect
Normalize polarity values in sentiment-basic and sentiment-140
2017-07-14 11:13:59 +02:00
militarpancho
dee007eacf Fixed bug in meaningCloud plugin. Now retrieves Neutral sentiment 2017-05-11 11:02:29 +02:00
J. Fernando Sánchez
1ef9dac86a Made ANEW paths absolute 2017-05-10 17:21:36 +02:00
militarpancho
18486aa3e0 Fixed vader tab error 2017-05-08 17:57:07 +02:00
militarpancho
86fdd8678a Add aggregate sentiment. This closes senpy/senpy-plugins-community#10 2017-05-08 13:20:47 +02:00
militarpancho
1b8a24c530 Fixed ANEW Readme 2017-05-05 11:02:30 +02:00
militarpancho
7f765d004f updated links in readme for obtain resources 2017-05-05 11:00:40 +02:00
militarpancho
b22ac843b6 Updated anew and wnaffect README explaining how to obtain resources 2017-05-05 10:57:16 +02:00
militarpancho
e88ca98438 Readme updated 2017-05-04 11:58:23 +02:00
militarpancho
65bb517fd2 Readme updated 2017-05-04 11:56:16 +02:00
militarpancho
23d15e9274 Added emotion-anew and sentiment-vader 2017-05-04 11:49:57 +02:00
militarpancho
23a5595d18 Added language information in emotion wn-affect 2017-04-28 13:26:00 +02:00
militarpancho
e341cc82fa Fixed sentiment-basic plugin that only retrieved Neutral sentiment. This closes senpy/senpy-plugins-community#6. Also added nltk download for some plugins. 2017-04-17 13:57:27 +02:00
militarpancho
85db4db01d added timeout message to finally fix #5 2017-03-09 16:20:16 +01:00
militarpancho
9f7a0e6907 Added timeout meaningCloud. This should fix #5 2017-03-09 14:21:22 +01:00
militarpancho
241f478a68 Added sufixes dictionary for wordnet lemmatizer. This close #4 2017-03-08 11:37:14 +01:00
militarpancho
5427b02a1a Regarding to #4. English bug with sentiment basic. 2017-03-07 13:58:46 +01:00
militarpancho
df2dc17ac0 This should fix #3 2017-03-03 12:03:55 +01:00
militarpancho
cc112c5ac5 Addapted plugins to senpy 0.8.2 2017-03-01 13:23:41 +01:00
militarpancho
65b8873092 changes ("cambios") 2017-02-28 14:27:18 +01:00
militarpancho
9ea177a780 Added shelfmixin to emotion-wnaffect. This closes #2 2017-02-07 14:02:29 +01:00
militarpancho
bb0b0fadc2 Added affect description 2017-02-06 14:05:41 +01:00
militarpancho
076813cb1a Added sentiment-140 2017-02-01 13:46:34 +01:00
militarpancho
51b4737c43 Change paths to wordnet files. Deleted loading from local path to load dictionaries from absolute paths 2017-01-31 11:22:28 +01:00
militarpancho
140ea9c159 Added some documentation and changes to .senpy files. 2017-01-26 14:40:31 +01:00
militarpancho
7250f56f17 Plugins names updated 2017-01-25 18:58:27 +01:00
militarpancho
b40ac19130 Documentation more improved meaningCloud 2017-01-25 11:43:57 +01:00
militarpancho
37b130abaa Documentation improved meaningCloud 2017-01-25 11:41:57 +01:00
militarpancho
99e13f32e5 meaningCloud documentation improved 2017-01-24 14:19:27 +01:00
militarpancho
71d976acb2 Added documentation to meaningCloud plugin 2017-01-24 13:03:38 +01:00
militarpancho
c57cfff0cc Added polarityValue to meaningCloud plugin 2017-01-19 17:16:42 +01:00
militarpancho
f9c4e4bd59 Fixed some bugs 2017-01-17 11:49:01 +01:00
militarpancho
7b583f504c Added apikey aliase to Affect plugin 2017-01-16 18:37:13 +01:00
militarpancho
6f3c08f8aa Fixed syntax error 2017-01-16 18:04:31 +01:00
militarpancho
67ff20440a Change module name 2017-01-16 17:44:57 +01:00
militarpancho
82e3062a6b Added meaningCloud to affect 2017-01-16 17:11:10 +01:00
militarpancho
864ca75b8f affect plugin added 2017-01-13 14:23:07 +01:00
militarpancho
ac5ac2d06b Readme fixed 2017-01-12 13:33:09 +01:00
militarpancho
1f5188c251 Readme to use plugin 2017-01-12 13:32:16 +01:00
militarpancho
90b55a4b27 Added meaningCloud plugin 2017-01-12 13:30:14 +01:00
J. Fernando Sánchez
aa628518ec Added Travis CI 2016-09-22 11:10:13 +02:00
J. Fernando Sánchez
5e8bc717a8 Added WordNet-Affect plugin and Makefile 2016-09-21 21:53:37 +02:00
NachoCP
0e9db7081c Compatibility with senpy 0.5 2016-02-24 17:41:22 +01:00
Oscar Araque
17976d85b1 Added SentiText plugin (for Spanish) 2015-10-30 17:58:37 +01:00
J. Fernando Sánchez
94d82238b8 Added entry to example plugin 2015-10-08 19:16:56 +02:00
J. Fernando Sánchez
ed22679e7c Example plugin and README file 2015-10-08 19:07:48 +02:00
J. Fernando Sánchez
6561201cc2 Initial commit 2015-10-08 18:47:41 +02:00
111 changed files with 13825 additions and 2867 deletions

.gitignore

@@ -8,3 +8,4 @@ __pycache__
 VERSION
 Dockerfile-*
 Dockerfile
+senpy_data

.gitlab-ci.yml

@@ -4,100 +4,130 @@
 # - docker:dind
 # When using dind, it's wise to use the overlayfs driver for
 # improved performance.
 
 stages:
   - test
-  - push
+  - publish
+  - test_image
   - deploy
-  - clean
 
-before_script:
-  - make -e login
+variables:
+  KUBENS: senpy
+  LATEST_IMAGE: "${HUB_REPO}:${CI_COMMIT_SHORT_SHA}"
+  SENPY_DATA: "/senpy-data/" # This is configured in the CI job
+  NLTK_DATA: "/senpy-data/nltk_data" # Store NLTK downloaded data
 
-.test: &test_definition
-  stage: test
-  script:
-    - make -e test-$PYTHON_VERSION
-  except:
-    - tags # Avoid unnecessary double testing
-
-test-3.6:
-  <<: *test_definition
-  variables:
-    PYTHON_VERSION: "3.6"
-
-test-3.7:
-  <<: *test_definition
-  allow_failure: true
-  variables:
-    PYTHON_VERSION: "3.7"
-
-push:
-  stage: push
-  script:
-    - make -e push
-  only:
-    - tags
-    - triggers
-    - fix-makefiles
+docker:
+  stage: publish
+  image:
+    name: gcr.io/kaniko-project/executor:debug
+    entrypoint: [""]
+  variables:
+    PYTHON_VERSION: "3.10"
+  tags:
+    - docker
+  script:
+    - echo $CI_COMMIT_TAG > senpy/VERSION
+    - sed "s/{{PYVERSION}}/$PYTHON_VERSION/" Dockerfile.template > Dockerfile
+    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"},\"https://index.docker.io/v1/\":{\"auth\":\"$HUB_AUTH\"}}}" > /kaniko/.docker/config.json
+    # The skip-tls-verify flag is there because our registry certificate is self signed
+    - /kaniko/executor --context $CI_PROJECT_DIR --skip-tls-verify --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_TAG --destination $HUB_REPO:$CI_COMMIT_TAG
+  only:
+    - tags
 
-push-latest:
-  stage: push
-  script:
-    - make -e push-latest
-  only:
-    - master
-    - triggers
-    - fix-makefiles
+docker-latest:
+  stage: publish
+  image:
+    name: gcr.io/kaniko-project/executor:debug
+    entrypoint: [""]
+  variables:
+    PYTHON_VERSION: "3.10"
+  tags:
+    - docker
+  script:
+    - echo git.${CI_COMMIT_SHORT_SHA} > senpy/VERSION
+    - sed "s/{{PYVERSION}}/$PYTHON_VERSION/" Dockerfile.template > Dockerfile
+    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"},\"https://index.docker.io/v1/\":{\"auth\":\"$HUB_AUTH\"}}}" > /kaniko/.docker/config.json
+    # The skip-tls-verify flag is there because our registry certificate is self signed
+    - /kaniko/executor --context $CI_PROJECT_DIR --skip-tls-verify --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $LATEST_IMAGE --destination "${HUB_REPO}:latest"
+  only:
+    refs:
+      - master
 
-push-github:
-  stage: deploy
-  script:
-    - make -e push-github
-  only:
-    - master
-    - triggers
-    - fix-makefiles
+testimage:
+  only:
+    - tags
+  tags:
+    - docker
+  stage: test_image
+  image: "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
+  script:
+    - python -m senpy --no-run --test
+
+testpy37:
+  tags:
+    - docker
+  variables:
+    SENPY_STRICT: "false"
+  image: python:3.7
+  stage: test
+  script:
+    - pip install -r requirements.txt -r test-requirements.txt
+    - python setup.py test
+
+testpy310:
+  tags:
+    - docker
+  variables:
+    SENPY_STRICT: "true"
+  image: python:3.10
+  stage: test
+  script:
+    - pip install -r requirements.txt -r test-requirements.txt -r extra-requirements.txt
+    - python setup.py test
+
+push_pypi:
+  only:
+    - tags
+  tags:
+    - docker
+  image: python:3.10
+  stage: publish
+  script:
+    - echo $CI_COMMIT_TAG > senpy/VERSION
+    - pip install twine
+    - python setup.py sdist bdist_wheel
+    - TWINE_PASSWORD=$PYPI_PASSWORD TWINE_USERNAME=$PYPI_USERNAME python -m twine upload dist/*
+
+check_pypi:
+  only:
+    - tags
+  tags:
+    - docker
+  image: python:3.10
+  stage: deploy
+  script:
+    - pip install senpy==$CI_COMMIT_TAG
+  # Allow PYPI to update its index before we try to install
+  when: delayed
+  start_in: 10 minutes
 
-deploy_pypi:
-  stage: deploy
-  script: # Configure the PyPI credentials, then push the package, and cleanup the creds.
-    - echo "[server-login]" >> ~/.pypirc
-    - echo "repository=https://upload.pypi.org/legacy/" >> ~/.pypirc
-    - echo "username=" ${PYPI_USER} >> ~/.pypirc
-    - echo "password=" ${PYPI_PASSWORD} >> ~/.pypirc
-    - make pip_upload
-    - echo "" > ~/.pypirc && rm ~/.pypirc # If the above fails, this won't run.
-  only:
-    - /^v?\d+\.\d+\.\d+([abc]\d*)?$/ # PEP-440 compliant version (tags)
-  except:
-    - branches
-
-deploy:
-  stage: deploy
-  environment: test
-  script:
-    - make -e deploy
-  only:
-    - master
-    - fix-makefiles
+latest-demo:
+  only:
+    refs:
+      - master
+  tags:
+    - docker
+  image: alpine/k8s:1.22.6
+  stage: deploy
+  environment: production
+  variables:
+    KUBECONFIG: "/kubeconfig"
+    # Same image as docker-latest
+    IMAGEWTAG: "${LATEST_IMAGE}"
+    KUBEAPP: "senpy"
+  script:
+    - echo "${KUBECONFIG_RAW}" > $KUBECONFIG
+    - kubectl --kubeconfig $KUBECONFIG version
+    - cd k8s/
+    - cat *.yaml *.tmpl 2>/dev/null | envsubst | kubectl --kubeconfig $KUBECONFIG apply --namespace ${KUBENS:-default} -f -
+    - kubectl --kubeconfig $KUBECONFIG get all,ing -l app=${KUBEAPP} --namespace=${KUBENS:-default}
 
-clean :
-  stage: clean
-  script:
-    - make -e clean
-  when: manual
-
-cleanup_py:
-  stage: clean
-  when: always # this is important; run even if preceding stages failed.
-  script:
-    - rm -vf ~/.pypirc # we don't want to leave these around, but GitLab may clean up anyway.
+push-github:
+  stage: deploy
+  script:
+    - make -e push-github
+  only:
+    - master
+    - triggers

.makefiles/README.md

@@ -2,7 +2,7 @@ These makefiles are recipes for several common tasks in different types of proje
 To add them to your project, simply do:
 ```
-git remote add makefiles ssh://git@lab.cluster.gsi.dit.upm.es:2200/docs/templates/makefiles.git
+git remote add makefiles ssh://git@lab.gsi.upm.es:2200/docs/templates/makefiles.git
 git subtree add --prefix=.makefiles/ makefiles master
 touch Makefile
 echo "include .makefiles/base.mk" >> Makefile
@@ -16,7 +16,7 @@ include .makefiles/python.mk
 ```
 You may need to set special variables like the name of your project or the python versions you're targetting.
-Take a look at each specific `.mk` file for more information, and the `Makefile` in the [senpy](https://lab.cluster.gsi.dit.upm.es/senpy/senpy) project for a real use case.
+Take a look at each specific `.mk` file for more information, and the `Makefile` in the [senpy](https://lab.gsi.upm.es/senpy/senpy) project for a real use case.
 If you update the makefiles from your repository, make sure to push the changes for review in upstream (this repository):

.makefiles/base.mk

@@ -1,5 +1,5 @@
 makefiles-remote:
-	git ls-remote --exit-code makefiles 2> /dev/null || git remote add makefiles ssh://git@lab.cluster.gsi.dit.upm.es:2200/docs/templates/makefiles.git
+	git ls-remote --exit-code makefiles 2> /dev/null || git remote add makefiles ssh://git@lab.gsi.upm.es:2200/docs/templates/makefiles.git
 
 makefiles-commit: makefiles-remote
 	git add -f .makefiles

.readthedocs.yaml (new file)

@@ -0,0 +1,22 @@
+# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
+version: 2
+
+# Set the OS, Python version and other tools you might need
+build:
+  os: ubuntu-22.04
+  tools:
+    python: "3.10"
+
+# Build documentation in the "docs/" directory with Sphinx
+sphinx:
+  configuration: docs/conf.py
+
+# formats:
+#   - pdf
+#   - epub
+
+python:
+  install:
+    - requirements: docs/requirements.txt

.travis.yml

@@ -1,15 +1,43 @@
 sudo: required
-services:
-  - docker
-
-language: python
-
-env:
-  - PYV=3.4
-  - PYV=3.5
-  - PYV=3.6
-  - PYV=3.7
-  # - PYV=3.3 # Apt fails in this docker image
-
-# run nosetests - Tests
-script: make test-$PYV
+matrix:
+  allow_failures:
+    # Windows is experimental in Travis.
+    # As of this writing, senpy installs but hangs on tests that use the flask test client (e.g. blueprints)
+    - os: windows
+  include:
+    - os: linux
+      language: python
+      python: 3.4
+      before_install:
+        - pip install --upgrade --force-reinstall pandas
+    - os: linux
+      language: python
+      python: 3.5
+    - os: linux
+      language: python
+      python: 3.6
+    - os: linux
+      language: python
+      python: 3.7
+    - os: osx
+      language: generic
+      addons:
+        homebrew:
+          # update: true
+          packages: python3
+      before_install:
+        - python3 -m pip install --upgrade virtualenv
+        - virtualenv -p python3 --system-site-packages "$HOME/venv"
+        - source "$HOME/venv/bin/activate"
+    - os: windows
+      language: bash
+      before_install:
+        - choco install -y python3
+        - python -m pip install --upgrade pip
+      env: PATH=/c/Python37:/c/Python37/Scripts:$PATH
+# command to run tests
+# 'python' points to Python 2.7 on macOS but points to Python 3.7 on Linux and Windows
+# 'python3' is a 'command not found' error on Windows but 'py' works on Windows only
+script:
+  - python3 setup.py test || python setup.py test

CHANGELOG.md

@@ -5,7 +5,29 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [Unreleased]
+### Added
+* The code of many senpy community plugins has been included by default. However, additional files (e.g., licensed data) and/or installing additional dependencies may be necessary for some plugins. Read each plugin's documentation for more information.
+* `--strict` flag, to fail and not start when a plugin fails to load or activate
+* `optional` attribute in plugins. Optional plugins may fail to load or activate, but the server will be started regardless, unless running in strict mode
+* Option in shelf plugins to ignore pickling errors
+### Removed
+* `--only-install`, `--only-test` and `--only-list` flags were removed in favor of `--no-run` + `--install`/`--test`/`--dependencies`
+### Changed
+* Data directory selection logic is slightly modified, and will choose one of the following (in this order): `data_folder` (argument), `$SENPY_DATA` or `$CWD`
+
+## [1.0.6]
+### Fixed
+* Plugins now get activated for testing
+
+## [1.0.1]
+### Added
+* License headers
+* Description for PyPI (setup.py)
+### Changed
+* The evaluation tab shows datasets inline, and a tooltip shows the number of instances
+* The docs should be clearer now
+
+## [1.0.0]
 ### Fixed
 * Restored hash changing function in `main.js`
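
The data-directory rule in the Changed entry above is a simple fallback chain. A minimal sketch of that order in Python (the `pick_data_folder` helper is hypothetical, not senpy's actual API):

    import os

    def pick_data_folder(data_folder=None):
        # Prefer the explicit argument, then $SENPY_DATA, then the current directory.
        if data_folder:
            return data_folder
        return os.environ.get("SENPY_DATA") or os.getcwd()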

Dockerfile.template

@@ -6,21 +6,20 @@ RUN apt-get update && apt-get install -y \
 	libblas-dev liblapack-dev liblapacke-dev gfortran \
 	&& rm -rf /var/lib/apt/lists/*
 
-RUN mkdir /cache/ /senpy-plugins /data/
+RUN mkdir -p /cache/ /senpy-plugins /data/
 
 VOLUME /data/
 
 ENV PIP_CACHE_DIR=/cache/ SENPY_DATA=/data
 
-ONBUILD COPY . /senpy-plugins/
-ONBUILD RUN python -m senpy --only-install -f /senpy-plugins
-ONBUILD WORKDIR /senpy-plugins/
-
 WORKDIR /usr/src/app
 COPY test-requirements.txt requirements.txt extra-requirements.txt /usr/src/app/
 RUN pip install --no-cache-dir -r test-requirements.txt -r requirements.txt -r extra-requirements.txt
 COPY . /usr/src/app/
 RUN pip install --no-cache-dir --no-index --no-deps --editable .
 
+ONBUILD COPY . /senpy-plugins/
+ONBUILD RUN python -m senpy -i --no-run -f /senpy-plugins
+ONBUILD WORKDIR /senpy-plugins/
+
 ENTRYPOINT ["python", "-m", "senpy", "-f", "/senpy-plugins/", "--host", "0.0.0.0"]

Makefile

@@ -5,7 +5,7 @@ IMAGENAME=gsiupm/senpy
 # The first version is the main one (used for quick builds)
 # See .makefiles/python.mk for more info
-PYVERSIONS=3.6 3.7
+PYVERSIONS ?= 3.10 3.7
 DEVPORT=5000

README.rst

@@ -2,23 +2,21 @@
     :width: 100%
     :target: http://senpy.gsi.upm.es
 
-.. image:: https://travis-ci.org/gsi-upm/senpy.svg?branch=master
-    :target: https://travis-ci.org/gsi-upm/senpy
-
-.. image:: https://lab.gsi.upm.es/senpy/senpy/badges/master/pipeline.svg
-    :target: https://lab.gsi.upm.es/senpy/senpy/commits/master
-
-.. image:: https://lab.gsi.upm.es/senpy/senpy/badges/master/coverage.svg
-    :target: https://lab.gsi.upm.es/senpy/senpy/commits/master
+.. image:: https://readthedocs.org/projects/senpy/badge/?version=latest
+    :target: http://senpy.readthedocs.io/en/latest/
+
+.. image:: https://badge.fury.io/py/senpy.svg
+    :target: https://badge.fury.io/py/senpy
+
+.. image:: https://travis-ci.org/gsi-upm/senpy.svg
+    :target: https://github.com/gsi-upm/senpy/senpy/tree/master
 
 .. image:: https://img.shields.io/pypi/l/requests.svg
     :target: https://lab.gsi.upm.es/senpy/senpy/
 
 Senpy lets you create sentiment analysis web services easily, fast and using a well known API.
-As a bonus, senpy services use semantic vocabularies (e.g. `NIF <http://persistence.uni-leipzig.org/nlp2rdf/>`_, `Marl <http://www.gsi.dit.upm.es/ontologies/marl>`_, `Onyx <http://www.gsi.dit.upm.es/ontologies/onyx>`_) and formats (turtle, JSON-LD, xml-rdf).
+As a bonus, Senpy services use semantic vocabularies (e.g. `NIF <http://persistence.uni-leipzig.org/nlp2rdf/>`_, `Marl <http://www.gsi.upm.es/ontologies/marl>`_, `Onyx <http://www.gsi.upm.es/ontologies/onyx>`_) and formats (turtle, JSON-LD, xml-rdf).
 
 Have you ever wanted to turn your sentiment analysis algorithms into a service?
-With senpy, now you can.
+With Senpy, now you can.
 It provides all the tools so you just have to worry about improving your algorithms:
 
 `See it in action. <http://senpy.gsi.upm.es/>`_
@@ -43,20 +41,36 @@ Alternatively, you can use the development version:
     cd senpy
     pip install --user .
 
-If you want to install senpy globally, use sudo instead of the ``--user`` flag.
+If you want to install Senpy globally, use sudo instead of the ``--user`` flag.
 
 Docker Image
 ************
 Build the image or use the pre-built one: ``docker run -ti -p 5000:5000 gsiupm/senpy``.
 
-To add custom plugins, add a volume and tell senpy where to find the plugins: ``docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy -f /plugins``
+To add custom plugins, add a volume and tell Senpy where to find the plugins: ``docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy -f /plugins``
+
+Compatibility
+-------------
+
+Senpy should run on any major operating system.
+Its code is pure Python, and the only limitations are imposed by its dependencies (e.g., nltk, pandas).
+
+Currently, the CI/CD pipeline tests the code on:
+
+* GNU/Linux with Python versions 3.7+ (3.10+ recommended for all plugins)
+* MacOS and homebrew's python3
+* Windows 10 and chocolatey's python3
+
+The latest PyPI package is verified to install on Ubuntu, Debian and Arch Linux.
+
+If you have trouble installing Senpy on your platform, see `Having problems?`_.
 
 Developing
 ----------
 
-Developing/debugging
-********************
+Running/debugging
+*****************
 This command will run the senpy container using the latest image available, mounting your current folder so you get your latest code:
 
 .. code:: bash
@@ -119,7 +133,7 @@ or, alternatively:
 This will create a server with any modules found in the current path.
 For more options, see the `--help` page.
 
-Alternatively, you can use the modules included in senpy to build your own application.
+Alternatively, you can use the modules included in Senpy to build your own application.
 
 Deploying on Heroku
 -------------------
@@ -127,9 +141,6 @@ Use a free heroku instance to share your service with the world.
 Just use the example Procfile in this repository, or build your own.
 
-`DEMO on heroku <http://senpy.herokuapp.com>`_
-
 For more information, check out the `documentation <http://senpy.readthedocs.org>`_.
 
 ------------------------------------------------------------------------------------
@@ -144,6 +155,17 @@ Instead, the maintainers will focus their efforts on keeping the codebase compat
 We apologize for the inconvenience.
 
+Having problems?
+----------------
+
+Please, file a new issue `on GitHub <https://github.com/gsi-upm/senpy/issues>`_ including enough details to reproduce the bug, including:
+
+* Operating system
+* Version of Senpy (or docker tag)
+* Installed libraries
+* Relevant logs
+* A simple code example
+
 Acknowledgement
 ---------------
 This development has been partially funded by the European Union through the MixedEmotions Project (project number H2020 655632), as part of the `RIA ICT 15 Big data and Open Data Innovation and take-up` programme.

File diff suppressed because it is too large.

docs/Evaluation.ipynb (new file)

@@ -0,0 +1,592 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Evaluating Services"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sentiment analysis plugins can also be evaluated on a series of pre-defined datasets.\n",
"This can be done in three ways: through the Web UI (playground), through the web API and programmatically.\n",
"\n",
"Regardless of the way you perform the evaluation, you will need to specify a plugin (service) that you want to evaluate, and a series of datasets on which it should be evaluated.\n",
"\n",
"to evaluate a plugin on a dataset, senpy use the plugin to predict the sentiment in each entry in the dataset.\n",
"These predictions are compared with the expected values to produce several metrics, such as: accuracy, precision and f1-score.\n",
"\n",
"**note**: the evaluation process might take long for plugins that use external services, such as `sentiment140`.\n",
"\n",
"**note**: plugins are assumed to be pre-trained and invariant. i.e., the prediction for an entry should "
]
},
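{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch, not part of the original notebook: the core of what an\n",
"# evaluation computes, assuming a plugin-like `predict` callable and a dataset\n",
"# of (text, label) pairs. Senpy's real evaluation is built on gsitk (see below).\n",
"def accuracy(predict, dataset):\n",
"    hits = sum(1 for text, label in dataset if predict(text) == label)\n",
"    return hits / len(dataset)"
]
},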
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Web UI (Playground)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The playground should contain a tab for Evaluation, where you can select any plugin that can be evaluated, and the set of datasets that you want to test the plugin on.\n",
"\n",
"For example, the image below shows the results of the `sentiment-vader` plugin on the `vader` and `sts` datasets:\n",
"\n",
"\n",
"![](eval_table.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Web API"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The api exposes an endpoint (`/evaluate`), which accents the plugin and the set of datasets on which it should be evaluated."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code is not necessary, but it will display the results better:"
]
},
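{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Reconstructed helper cell (assumed; the original cell is missing from this capture).\n",
"# `Code` renders a string with syntax highlighting, as used in the next cell.\n",
"from IPython.display import Code"
]
},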
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is a simple call using the requests library:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>.output_html .hll { background-color: #ffffcc }\n",
".output_html { background: #f8f8f8; }\n",
".output_html .c { color: #408080; font-style: italic } /* Comment */\n",
".output_html .err { border: 1px solid #FF0000 } /* Error */\n",
".output_html .k { color: #008000; font-weight: bold } /* Keyword */\n",
".output_html .o { color: #666666 } /* Operator */\n",
".output_html .ch { color: #408080; font-style: italic } /* Comment.Hashbang */\n",
".output_html .cm { color: #408080; font-style: italic } /* Comment.Multiline */\n",
".output_html .cp { color: #BC7A00 } /* Comment.Preproc */\n",
".output_html .cpf { color: #408080; font-style: italic } /* Comment.PreprocFile */\n",
".output_html .c1 { color: #408080; font-style: italic } /* Comment.Single */\n",
".output_html .cs { color: #408080; font-style: italic } /* Comment.Special */\n",
".output_html .gd { color: #A00000 } /* Generic.Deleted */\n",
".output_html .ge { font-style: italic } /* Generic.Emph */\n",
".output_html .gr { color: #FF0000 } /* Generic.Error */\n",
".output_html .gh { color: #000080; font-weight: bold } /* Generic.Heading */\n",
".output_html .gi { color: #00A000 } /* Generic.Inserted */\n",
".output_html .go { color: #888888 } /* Generic.Output */\n",
".output_html .gp { color: #000080; font-weight: bold } /* Generic.Prompt */\n",
".output_html .gs { font-weight: bold } /* Generic.Strong */\n",
".output_html .gu { color: #800080; font-weight: bold } /* Generic.Subheading */\n",
".output_html .gt { color: #0044DD } /* Generic.Traceback */\n",
".output_html .kc { color: #008000; font-weight: bold } /* Keyword.Constant */\n",
".output_html .kd { color: #008000; font-weight: bold } /* Keyword.Declaration */\n",
".output_html .kn { color: #008000; font-weight: bold } /* Keyword.Namespace */\n",
".output_html .kp { color: #008000 } /* Keyword.Pseudo */\n",
".output_html .kr { color: #008000; font-weight: bold } /* Keyword.Reserved */\n",
".output_html .kt { color: #B00040 } /* Keyword.Type */\n",
".output_html .m { color: #666666 } /* Literal.Number */\n",
".output_html .s { color: #BA2121 } /* Literal.String */\n",
".output_html .na { color: #7D9029 } /* Name.Attribute */\n",
".output_html .nb { color: #008000 } /* Name.Builtin */\n",
".output_html .nc { color: #0000FF; font-weight: bold } /* Name.Class */\n",
".output_html .no { color: #880000 } /* Name.Constant */\n",
".output_html .nd { color: #AA22FF } /* Name.Decorator */\n",
".output_html .ni { color: #999999; font-weight: bold } /* Name.Entity */\n",
".output_html .ne { color: #D2413A; font-weight: bold } /* Name.Exception */\n",
".output_html .nf { color: #0000FF } /* Name.Function */\n",
".output_html .nl { color: #A0A000 } /* Name.Label */\n",
".output_html .nn { color: #0000FF; font-weight: bold } /* Name.Namespace */\n",
".output_html .nt { color: #008000; font-weight: bold } /* Name.Tag */\n",
".output_html .nv { color: #19177C } /* Name.Variable */\n",
".output_html .ow { color: #AA22FF; font-weight: bold } /* Operator.Word */\n",
".output_html .w { color: #bbbbbb } /* Text.Whitespace */\n",
".output_html .mb { color: #666666 } /* Literal.Number.Bin */\n",
".output_html .mf { color: #666666 } /* Literal.Number.Float */\n",
".output_html .mh { color: #666666 } /* Literal.Number.Hex */\n",
".output_html .mi { color: #666666 } /* Literal.Number.Integer */\n",
".output_html .mo { color: #666666 } /* Literal.Number.Oct */\n",
".output_html .sa { color: #BA2121 } /* Literal.String.Affix */\n",
".output_html .sb { color: #BA2121 } /* Literal.String.Backtick */\n",
".output_html .sc { color: #BA2121 } /* Literal.String.Char */\n",
".output_html .dl { color: #BA2121 } /* Literal.String.Delimiter */\n",
".output_html .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */\n",
".output_html .s2 { color: #BA2121 } /* Literal.String.Double */\n",
".output_html .se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */\n",
".output_html .sh { color: #BA2121 } /* Literal.String.Heredoc */\n",
".output_html .si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */\n",
".output_html .sx { color: #008000 } /* Literal.String.Other */\n",
".output_html .sr { color: #BB6688 } /* Literal.String.Regex */\n",
".output_html .s1 { color: #BA2121 } /* Literal.String.Single */\n",
".output_html .ss { color: #19177C } /* Literal.String.Symbol */\n",
".output_html .bp { color: #008000 } /* Name.Builtin.Pseudo */\n",
".output_html .fm { color: #0000FF } /* Name.Function.Magic */\n",
".output_html .vc { color: #19177C } /* Name.Variable.Class */\n",
".output_html .vg { color: #19177C } /* Name.Variable.Global */\n",
".output_html .vi { color: #19177C } /* Name.Variable.Instance */\n",
".output_html .vm { color: #19177C } /* Name.Variable.Magic */\n",
".output_html .il { color: #666666 } /* Literal.Number.Integer.Long */</style><div class=\"highlight\"><pre><span></span><span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@context&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;http://senpy.gsi.upm.es/api/contexts/YXBpL2V2YWx1YXRlLz9hbGdvPXNlbnRpbWVudC12YWRlciZkYXRhc2V0PXZhZGVyJTJDc3RzJm91dGZvcm1hdD1qc29uLWxkIw%3D%3D&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;AggregatedEvaluation&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;senpy:evaluations&quot;</span><span class=\"p\">:</span> <span class=\"p\">[</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Evaluation&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;evaluates&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;endpoint:plugins/sentiment-vader_0.1.1__vader&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;evaluatesOn&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;vader&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;metrics&quot;</span><span class=\"p\">:</span> <span class=\"p\">[</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Accuracy&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.6907142857142857</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Precision_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.34535714285714286</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Recall_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.5</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.40853400929446554</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_weighted&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.5643605528396403</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_micro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.6907142857142857</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.40853400929446554</span>\n",
" <span class=\"p\">}</span>\n",
" <span class=\"p\">]</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Evaluation&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;evaluates&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;endpoint:plugins/sentiment-vader_0.1.1__sts&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;evaluatesOn&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;sts&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;metrics&quot;</span><span class=\"p\">:</span> <span class=\"p\">[</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Accuracy&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.3107177974434612</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Precision_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.1553588987217306</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;Recall_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.5</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.23705926481620407</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_weighted&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.14731706525451424</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_micro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.3107177974434612</span>\n",
" <span class=\"p\">},</span>\n",
" <span class=\"p\">{</span>\n",
" <span class=\"nt\">&quot;@type&quot;</span><span class=\"p\">:</span> <span class=\"s2\">&quot;F1_macro&quot;</span><span class=\"p\">,</span>\n",
" <span class=\"nt\">&quot;value&quot;</span><span class=\"p\">:</span> <span class=\"mf\">0.23705926481620407</span>\n",
" <span class=\"p\">}</span>\n",
" <span class=\"p\">]</span>\n",
" <span class=\"p\">}</span>\n",
" <span class=\"p\">]</span>\n",
"<span class=\"p\">}</span>\n",
"</pre></div>\n"
],
"text/latex": [
"\\begin{Verbatim}[commandchars=\\\\\\{\\}]\n",
"\\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@context\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}http://senpy.gsi.upm.es/api/contexts/YXBpL2V2YWx1YXRlLz9hbGdvPXNlbnRpbWVudC12YWRlciZkYXRhc2V0PXZhZGVyJTJDc3RzJm91dGZvcm1hdD1qc29uLWxkIw\\PYZpc{}3D\\PYZpc{}3D\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}AggregatedEvaluation\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}senpy:evaluations\\PYZdq{}}\\PY{p}{:} \\PY{p}{[}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Evaluation\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}evaluates\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}endpoint:plugins/sentiment\\PYZhy{}vader\\PYZus{}0.1.1\\PYZus{}\\PYZus{}vader\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}evaluatesOn\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}vader\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}metrics\\PYZdq{}}\\PY{p}{:} \\PY{p}{[}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Accuracy\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.6907142857142857}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Precision\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.34535714285714286}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Recall\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.5}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.40853400929446554}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}weighted\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.5643605528396403}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}micro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.6907142857142857}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.40853400929446554}\n",
" \\PY{p}{\\PYZcb{}}\n",
" \\PY{p}{]}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Evaluation\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}evaluates\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}endpoint:plugins/sentiment\\PYZhy{}vader\\PYZus{}0.1.1\\PYZus{}\\PYZus{}sts\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}evaluatesOn\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}sts\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}metrics\\PYZdq{}}\\PY{p}{:} \\PY{p}{[}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Accuracy\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.3107177974434612}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Precision\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.1553588987217306}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}Recall\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.5}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.23705926481620407}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}weighted\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.14731706525451424}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}micro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.3107177974434612}\n",
" \\PY{p}{\\PYZcb{}}\\PY{p}{,}\n",
" \\PY{p}{\\PYZob{}}\n",
" \\PY{n+nt}{\\PYZdq{}@type\\PYZdq{}}\\PY{p}{:} \\PY{l+s+s2}{\\PYZdq{}F1\\PYZus{}macro\\PYZdq{}}\\PY{p}{,}\n",
" \\PY{n+nt}{\\PYZdq{}value\\PYZdq{}}\\PY{p}{:} \\PY{l+m+mf}{0.23705926481620407}\n",
" \\PY{p}{\\PYZcb{}}\n",
" \\PY{p}{]}\n",
" \\PY{p}{\\PYZcb{}}\n",
" \\PY{p}{]}\n",
"\\PY{p}{\\PYZcb{}}\n",
"\\end{Verbatim}\n"
],
"text/plain": [
"{\n",
" \"@context\": \"http://senpy.gsi.upm.es/api/contexts/YXBpL2V2YWx1YXRlLz9hbGdvPXNlbnRpbWVudC12YWRlciZkYXRhc2V0PXZhZGVyJTJDc3RzJm91dGZvcm1hdD1qc29uLWxkIw%3D%3D\",\n",
" \"@type\": \"AggregatedEvaluation\",\n",
" \"senpy:evaluations\": [\n",
" {\n",
" \"@type\": \"Evaluation\",\n",
" \"evaluates\": \"endpoint:plugins/sentiment-vader_0.1.1__vader\",\n",
" \"evaluatesOn\": \"vader\",\n",
" \"metrics\": [\n",
" {\n",
" \"@type\": \"Accuracy\",\n",
" \"value\": 0.6907142857142857\n",
" },\n",
" {\n",
" \"@type\": \"Precision_macro\",\n",
" \"value\": 0.34535714285714286\n",
" },\n",
" {\n",
" \"@type\": \"Recall_macro\",\n",
" \"value\": 0.5\n",
" },\n",
" {\n",
" \"@type\": \"F1_macro\",\n",
" \"value\": 0.40853400929446554\n",
" },\n",
" {\n",
" \"@type\": \"F1_weighted\",\n",
" \"value\": 0.5643605528396403\n",
" },\n",
" {\n",
" \"@type\": \"F1_micro\",\n",
" \"value\": 0.6907142857142857\n",
" },\n",
" {\n",
" \"@type\": \"F1_macro\",\n",
" \"value\": 0.40853400929446554\n",
" }\n",
" ]\n",
" },\n",
" {\n",
" \"@type\": \"Evaluation\",\n",
" \"evaluates\": \"endpoint:plugins/sentiment-vader_0.1.1__sts\",\n",
" \"evaluatesOn\": \"sts\",\n",
" \"metrics\": [\n",
" {\n",
" \"@type\": \"Accuracy\",\n",
" \"value\": 0.3107177974434612\n",
" },\n",
" {\n",
" \"@type\": \"Precision_macro\",\n",
" \"value\": 0.1553588987217306\n",
" },\n",
" {\n",
" \"@type\": \"Recall_macro\",\n",
" \"value\": 0.5\n",
" },\n",
" {\n",
" \"@type\": \"F1_macro\",\n",
" \"value\": 0.23705926481620407\n",
" },\n",
" {\n",
" \"@type\": \"F1_weighted\",\n",
" \"value\": 0.14731706525451424\n",
" },\n",
" {\n",
" \"@type\": \"F1_micro\",\n",
" \"value\": 0.3107177974434612\n",
" },\n",
" {\n",
" \"@type\": \"F1_macro\",\n",
" \"value\": 0.23705926481620407\n",
" }\n",
" ]\n",
" }\n",
" ]\n",
"}"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import requests\n",
"from IPython.display import Code\n",
"\n",
"endpoint = 'http://senpy.gsi.upm.es/api'\n",
"res = requests.get(f'{endpoint}/evaluate',\n",
" params={\"algo\": \"sentiment-vader\",\n",
" \"dataset\": \"vader,sts\",\n",
" 'outformat': 'json-ld'\n",
" })\n",
"Code(res.text, language='json')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Programmatically (expert)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A third option is to evaluate plugins manually without launching the server.\n",
"\n",
"This option is particularly interesting for advanced users that want faster iterations and evaluation results, and for automation.\n",
"\n",
"We would first need an instance of a plugin.\n",
"In this example we will use the Sentiment140 plugin that is included in every senpy installation:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"from senpy.plugins.sentiment import sentiment140_plugin"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"s140 = sentiment140_plugin.Sentiment140()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, we need to know what datasets are available.\n",
"We can list all datasets and basic stats (e.g., number of instances and labels used) like this:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"vader {'instances': 4200, 'labels': [1, -1]}\n",
"sts {'instances': 4200, 'labels': [1, -1]}\n",
"imdb_unsup {'instances': 50000, 'labels': [1, -1]}\n",
"imdb {'instances': 50000, 'labels': [1, -1]}\n",
"sst {'instances': 11855, 'labels': [1, -1]}\n",
"multidomain {'instances': 38548, 'labels': [1, -1]}\n",
"sentiment140 {'instances': 1600000, 'labels': [1, -1]}\n",
"semeval07 {'instances': 'None', 'labels': [1, -1]}\n",
"semeval14 {'instances': 7838, 'labels': [1, -1]}\n",
"pl04 {'instances': 4000, 'labels': [1, -1]}\n",
"pl05 {'instances': 10662, 'labels': [1, -1]}\n",
"semeval13 {'instances': 6259, 'labels': [1, -1]}\n"
]
}
],
"source": [
"from senpy.gsitk_compat import datasets\n",
"for k, d in datasets.items():\n",
" print(k, d['stats'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we will evaluate our plugin in one of the smallest datasets, `sts`:"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"[{\n",
" \"@type\": \"Evaluation\",\n",
" \"evaluates\": \"endpoint:plugins/sentiment140_0.2\",\n",
" \"evaluatesOn\": \"sts\",\n",
" \"metrics\": [\n",
" {\n",
" \"@type\": \"Accuracy\",\n",
" \"value\": 0.872173058013766\n",
" },\n",
" {\n",
" \"@type\": \"Precision_macro\",\n",
" \"value\": 0.9035254323131467\n",
" },\n",
" {\n",
" \"@type\": \"Recall_macro\",\n",
" \"value\": 0.8021249029415483\n",
" },\n",
" {\n",
" \"@type\": \"F1_macro\",\n",
" \"value\": 0.8320673712021136\n",
" },\n",
" {\n",
" \"@type\": \"F1_weighted\",\n",
" \"value\": 0.8631351567604358\n",
" },\n",
" {\n",
" \"@type\": \"F1_micro\",\n",
" \"value\": 0.872173058013766\n",
" },\n",
" {\n",
" \"@type\": \"F1_macro\",\n",
" \"value\": 0.8320673712021136\n",
" }\n",
" ]\n",
" }]"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s140.evaluate(['sts', ])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "68px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 4,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}
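A hedged sketch (not part of the notebook above): flattening the JSON-LD evaluation response into one metrics dictionary per dataset. The endpoint, parameters and response keys ('senpy:evaluations', 'evaluatesOn', 'metrics', '@type', 'value') are taken from the cells above; the loop itself is only illustrative.

.. code:: python

    import requests

    endpoint = 'http://senpy.gsi.upm.es/api'
    res = requests.get(f'{endpoint}/evaluate',
                       params={'algo': 'sentiment-vader',
                               'dataset': 'vader,sts',
                               'outformat': 'json-ld'})
    res.raise_for_status()

    # One {metric: value} dict per evaluated dataset
    for ev in res.json()['senpy:evaluations']:
        metrics = {m['@type']: m['value'] for m in ev['metrics']}
        print(ev['evaluatesOn'], metrics)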


@@ -1,152 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Senpy in 1 minute\n",
"\n",
"This mini-tutorial only shows how to annotate with a service.\n",
"We will use the [demo server](http://senpy.gsi.upm.es), which runs some open source plugins for sentiment and emotion analysis."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Annotating with senpy is as simple as issuing an HTTP request to the API using your favourite tool.\n",
"This is just an example using curl:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"@context\": \"http://senpy.gsi.upm.es/api/contexts/YXBpL3NlbnRpbWVudDE0MD8j\",\r\n",
" \"@type\": \"Results\",\r\n",
" \"entries\": [\r\n",
" {\r\n",
" \"@id\": \"prefix:\",\r\n",
" \"@type\": \"Entry\",\r\n",
" \"marl:hasOpinion\": [\r\n",
" {\r\n",
" \"@type\": \"Sentiment\",\r\n",
" \"marl:hasPolarity\": \"marl:Positive\",\r\n",
" \"prov:wasGeneratedBy\": \"prefix:Analysis_1554389334.6431913\"\r\n",
" }\r\n",
" ],\r\n",
" \"nif:isString\": \"Senpy is awesome\",\r\n",
" \"onyx:hasEmotionSet\": []\r\n",
" }\r\n",
" ]\r\n",
"}"
]
}
],
"source": [
"!curl \"http://senpy.gsi.upm.es/api/sentiment140\" --data-urlencode \"input=Senpy is awesome\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Congratulations**, you've used your first senpy service!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is the equivalent using the `requests` library:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"@context\": \"http://senpy.gsi.upm.es/api/contexts/YXBpL3NlbnRpbWVudDE0MD9pbnB1dD1TZW5weStpcythd2Vzb21lIw%3D%3D\",\n",
" \"@type\": \"Results\",\n",
" \"entries\": [\n",
" {\n",
" \"@id\": \"prefix:\",\n",
" \"@type\": \"Entry\",\n",
" \"marl:hasOpinion\": [\n",
" {\n",
" \"@type\": \"Sentiment\",\n",
" \"marl:hasPolarity\": \"marl:Positive\",\n",
" \"prov:wasGeneratedBy\": \"prefix:Analysis_1554389335.9803226\"\n",
" }\n",
" ],\n",
" \"nif:isString\": \"Senpy is awesome\",\n",
" \"onyx:hasEmotionSet\": []\n",
" }\n",
" ]\n",
"}\n"
]
}
],
"source": [
"import requests\n",
"res = requests.get('http://senpy.gsi.upm.es/api/sentiment140',\n",
" params={\"input\": \"Senpy is awesome\",})\n",
"print(res.text)"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "68px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 4,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}


@@ -1,10 +0,0 @@
Advanced usage
--------------
.. toctree::
:maxdepth: 1
server-cli
conversion
commandline
development


@@ -1,10 +0,0 @@
Command line
============
Although the main use of senpy is to publish services, the tool can also be used locally to analyze text in the command line.
This is a short video demonstration:
.. image:: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk.png
:width: 100%
:target: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk
:alt: CLI demo


@@ -130,6 +130,7 @@ html_theme_options = {
'github_user': 'gsi-upm', 'github_user': 'gsi-upm',
'github_repo': 'senpy', 'github_repo': 'senpy',
'github_banner': True, 'github_banner': True,
'sidebar_collapse': True,
} }
@@ -292,11 +293,10 @@ texinfo_documents = [
#texinfo_no_detailmenu = False #texinfo_no_detailmenu = False
nbsphinx_prolog = """ nbsphinx_prolog = """
.. note:: This page has been auto-generated from a Jupyter notebook using nbsphinx_. .. note:: This is an `auto-generated <https://nbsphinx.readthedocs.io>`_ static view of a Jupyter notebook.
The original source is available at: https://github.com/gsi-upm/senpy/tree/master/docs//{{ env.doc2path(env.docname, base=None) }} To run the code examples in your computer, you may download the original notebook from the repository: https://github.com/gsi-upm/senpy/tree/master/docs/{{ env.doc2path(env.docname, base=None) }}
.. _nbsphinx: https://nbsphinx.readthedocs.io/
---- ----
""" """


@@ -1,93 +1,152 @@
Conversion Automatic Model Conversion
---------- --------------------------
Senpy includes experimental support for emotion/sentiment conversion plugins. Senpy includes support for emotion and sentiment conversion.
When a user requests a specific model, senpy will choose a strategy to convert the model that the service usually outputs and the model requested by the user.
Out of the box, senpy can convert from the `emotionml:pad` (pleasure-arousal-dominance) dimensional model to `emoml:big6` (Ekman's big-6) categories, and vice versa.
This specific conversion uses a series of dimensional centroids (`emotionml:pad`) for each emotion category (`emotionml:big6`).
A dimensional value is converted to a category by looking for the nearest centroid.
The centroids are calculated according to this article:
.. code-block:: text
Kim, S. M., Valitutti, A., & Calvo, R. A. (2010, June).
Evaluation of unsupervised emotion models to textual affect recognition.
In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text (pp. 62-70).
Association for Computational Linguistics.
It is possible to add new conversion strategies by `Developing a conversion plugin`_.
Use Use
=== ===
Consider the original query: http://127.0.0.1:5000/api/?i=hello&algo=emotion-random Consider the following query to an emotion service: http://senpy.gsi.upm.es/api/emotion-anew?i=good
The requested plugin (emotion-random) returns emotions using Ekman's model (or big6 in EmotionML): The requested plugin (emotion-anew) returns emotions using the VAD space (FSRE dimensions in EmotionML):
.. code:: json .. code:: json
... rest of the document ... [
{ {
"@type": "emotionSet", "@type": "EmotionSet",
"onyx:hasEmotion": { "onyx:hasEmotion": [
"@type": "emotion", {
"onyx:hasEmotionCategory": "emoml:big6anger" "@type": "Emotion",
}, "emoml:pad-dimensions_arousal": 5.43,
"prov:wasGeneratedBy": "plugins/emotion-random_0.1" "emoml:pad-dimensions_dominance": 6.41,
} "emoml:pad-dimensions_pleasure": 7.47,
"prov:wasGeneratedBy": "prefix:Analysis_1562744784.8789825"
}
],
"prov:wasGeneratedBy": "prefix:Analysis_1562744784.8789825"
}
]
To get these emotions in VAD space (FSRE dimensions in EmotionML), we'd do this:
http://127.0.0.1:5000/api/?i=hello&algo=emotion-random&emotionModel=emoml:fsre-dimensions To get the equivalent of these emotions in Ekman's categories (i.e., Ekman's Big 6 in EmotionML), we'd do this:
http://senpy.gsi.upm.es/api/emotion-anew?i=good&emotion-model=emoml:big6
This call, provided there is a valid conversion plugin from Ekman's to VAD, would return something like this: This call, provided there is a valid conversion plugin from Ekman's to VAD, would return something like this:
.. code:: json .. code:: json
[
... rest of the document ... {
"@type": "EmotionSet",
"onyx:hasEmotion": [
{ {
"@type": "emotionSet", "@type": "Emotion",
"onyx:hasEmotion": { "onyx:algorithmConfidence": 4.4979,
"@type": "emotion", "onyx:hasEmotionCategory": "emoml:big6happiness"
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emotion-random.1"
}, {
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"A": 7.22,
"D": 6.28,
"V": 8.6
},
"prov:wasGeneratedBy": "plugins/Ekman2VAD_0.1"
} }
],
"prov:wasDerivedFrom": {
"@id": "Emotions0",
"@type": "EmotionSet",
"onyx:hasEmotion": [
{
"@id": "Emotion0",
"@type": "Emotion",
"emoml:pad-dimensions_arousal": 5.43,
"emoml:pad-dimensions_dominance": 6.41,
"emoml:pad-dimensions_pleasure": 7.47,
"prov:wasGeneratedBy": "prefix:Analysis_1562745220.1553965"
}
],
"prov:wasGeneratedBy": "prefix:Analysis_1562745220.1553965"
},
"prov:wasGeneratedBy": "prefix:Analysis_1562745220.1570725"
}
]
That is called a *full* response, as it simply adds the converted emotion alongside. That is called a *full* response, as it simply adds the converted emotion alongside.
It is also possible to get the original emotion nested within the new converted emotion, using the `conversion=nested` parameter: It is also possible to get the original emotion nested within the new converted emotion, using the `conversion=nested` parameter:
http://senpy.gsi.upm.es/api/emotion-anew?i=good&emotion-model=emoml:big6&conversion=nested
.. code:: json .. code:: json
[
{
"@type": "EmotionSet",
"onyx:hasEmotion": [
{
"@type": "Emotion",
"onyx:algorithmConfidence": 4.4979,
"onyx:hasEmotionCategory": "emoml:big6happiness"
}
],
"prov:wasDerivedFrom": {
"@id": "Emotions0",
"@type": "EmotionSet",
"onyx:hasEmotion": [
{
"@id": "Emotion0",
"@type": "Emotion",
"emoml:pad-dimensions_arousal": 5.43,
"emoml:pad-dimensions_dominance": 6.41,
"emoml:pad-dimensions_pleasure": 7.47,
"prov:wasGeneratedBy": "prefix:Analysis_1562744962.896306"
}
],
"prov:wasGeneratedBy": "prefix:Analysis_1562744962.896306"
},
"prov:wasGeneratedBy": "prefix:Analysis_1562744962.8978968"
}
]
... rest of the document ...
{
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"onyx:hasEmotionCategory": "emoml:big6anger"
},
"prov:wasGeneratedBy": "plugins/emotion-random.1"
"onyx:wasDerivedFrom": {
"@type": "emotionSet",
"onyx:hasEmotion": {
"@type": "emotion",
"A": 7.22,
"D": 6.28,
"V": 8.6
},
"prov:wasGeneratedBy": "plugins/Ekman2VAD_0.1"
}
}
Lastly, `conversion=filtered` would only return the converted emotions. Lastly, `conversion=filtered` would only return the converted emotions.
.. code:: json
[
{
"@type": "EmotionSet",
"onyx:hasEmotion": [
{
"@type": "Emotion",
"onyx:algorithmConfidence": 4.4979,
"onyx:hasEmotionCategory": "emoml:big6happiness"
}
],
"prov:wasGeneratedBy": "prefix:Analysis_1562744925.7322266"
}
]
Developing a conversion plugin Developing a conversion plugin
================================ ==============================
Conversion plugins are discovered by the server just like any other plugin. Conversion plugins are discovered by the server just like any other plugin.
The difference is the slightly different API, and the need to specify the `source` and `target` of the conversion. The difference is the slightly different API, and the need to specify the `source` and `target` of the conversion.
@@ -106,7 +165,6 @@ For instance, an emotion conversion plugin needs the following:
.. code:: python .. code:: python
@@ -114,3 +172,6 @@ For instance, an emotion conversion plugin needs the following:
def convert(self, emotionSet, fromModel, toModel, params): def convert(self, emotionSet, fromModel, toModel, params):
pass pass
More implementation details are shown in the `centroids plugin <https://github.com/gsi-upm/senpy/blob/master/senpy/plugins/postprocessing/emotion/centroids.py>`_.
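To make the required API concrete, here is a hedged sketch of a centroid-based converter in the spirit described above. The base class name (EmotionConversionPlugin), the dict-style model access and the centroid values are assumptions for illustration only; the authoritative logic is in the centroids plugin linked above.

.. code:: python

    from senpy.plugins import EmotionConversionPlugin  # assumed base class
    from senpy.models import Emotion

    # Made-up example centroids in V/A/D space (not the published values)
    CENTROIDS = {
        'emoml:big6happiness': {'V': 8.2, 'A': 5.5, 'D': 7.0},
        'emoml:big6sadness': {'V': 2.4, 'A': 2.8, 'D': 3.8},
    }

    def _sqdist(a, b):
        # Squared euclidean distance between two V/A/D points
        return sum((a[k] - b[k]) ** 2 for k in 'VAD')

    class PAD2Big6(EmotionConversionPlugin):
        '''Convert a dimensional (V/A/D) emotion to the nearest big6 centroid.'''

        def convert(self, emotionSet, fromModel, toModel, params):
            for emotion in emotionSet['onyx:hasEmotion']:
                point = {k: emotion[k] for k in 'VAD'}
                nearest = min(CENTROIDS,
                              key=lambda c: _sqdist(point, CENTROIDS[c]))
                converted = Emotion()
                converted['onyx:hasEmotionCategory'] = nearest
                yield converted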


@@ -2,7 +2,7 @@ Demo
---- ----
There is a demo available on http://senpy.gsi.upm.es/, where you can test a live instance of Senpy, with several open source plugins. There is a demo available on http://senpy.gsi.upm.es/, where you can test a live instance of Senpy, with several open source plugins.
You can use the playground (a web interface) or make HTTP requests to the service API. You can use the playground (a web interface) or the HTTP API.
.. image:: playground-0.20.png .. image:: playground-0.20.png
:target: http://senpy.gsi.upm.es :target: http://senpy.gsi.upm.es


@@ -19,6 +19,7 @@ Sharing your sentiment analysis with the world has never been easier!
.. toctree:: .. toctree::
:maxdepth: 1 :maxdepth: 1
server-cli
plugins-quickstart plugins-quickstart
plugins-faq plugins-faq
plugins-definition plugins-definition

BIN: docs/eval_table.png (new binary file, 77 KiB, not shown)

@@ -2,7 +2,7 @@
"@context": [ "@context": [
"http://mixedemotions-project.eu/ns/context.jsonld", "http://mixedemotions-project.eu/ns/context.jsonld",
{ {
"emovoc": "http://www.gsi.dit.upm.es/ontologies/onyx/vocabularies/emotionml/ns#" "emovoc": "http://www.gsi.upm.es/ontologies/onyx/vocabularies/emotionml/ns#"
} }
], ],
"@id": "me:Result1", "@id": "me:Result1",


@@ -5,31 +5,102 @@ Welcome to Senpy's documentation!
:target: http://senpy.readthedocs.io/en/latest/ :target: http://senpy.readthedocs.io/en/latest/
.. image:: https://badge.fury.io/py/senpy.svg .. image:: https://badge.fury.io/py/senpy.svg
:target: https://badge.fury.io/py/senpy :target: https://badge.fury.io/py/senpy
.. image:: https://lab.gsi.upm.es/senpy/senpy/badges/master/build.svg .. image:: https://travis-ci.org/gsi-upm/senpy.svg
:target: https://lab.gsi.upm.es/senpy/senpy/commits/master :target: https://github.com/gsi-upm/senpy/senpy/tree/master
.. image:: https://lab.gsi.upm.es/senpy/senpy/badges/master/coverage.svg
:target: https://lab.gsi.upm.es/senpy/senpy/commits/master
.. image:: https://img.shields.io/pypi/l/requests.svg .. image:: https://img.shields.io/pypi/l/requests.svg
:target: https://lab.gsi.upm.es/senpy/senpy/ :target: https://lab.gsi.upm.es/senpy/senpy/
Senpy is a framework to build sentiment and emotion analysis services.
It provides functionalities for:
Senpy is a framework for sentiment and emotion analysis services. - developing sentiment and emotion classifiers and exposing them as an HTTP service
Senpy services are interchangeable and easy to use because they share a common semantic :doc:`apischema`. - requesting sentiment and emotion analysis from different providers (e.g. Vader, Sentiment140, ...) using the same interface (:doc:`apischema`). In this way, applications do not depend on the API offered by these services.
- combining services that use different sentiment models (e.g. polarity between [-1, 1] or [0, 1]) or emotion models (e.g. Ekman or VAD)
- evaluating sentiment algorithms with well-known datasets
If you are interested in consuming Senpy services, read :doc:`Quickstart`.
Using senpy services is as simple as sending an HTTP request with your favourite tool or library.
Let's analyze the sentiment of the text "Senpy is awesome".
We can call the `Sentiment140 <http://www.sentiment140.com/>`_ service with an HTTP request using curl:
.. code:: shell
:emphasize-lines: 14,18
$ curl "http://senpy.gsi.upm.es/api/sentiment140" \
--data-urlencode "input=Senpy is awesome"
{
"@context": "http://senpy.gsi.upm.es/api/contexts/YXBpL3NlbnRpbWVudDE0MD8j",
"@type": "Results",
"entries": [
{
"@id": "prefix:",
"@type": "Entry",
"marl:hasOpinion": [
{
"@type": "Sentiment",
"marl:hasPolarity": "marl:Positive",
"prov:wasGeneratedBy": "prefix:Analysis_1554389334.6431913"
}
],
"nif:isString": "Senpy is awesome",
"onyx:hasEmotionSet": []
}
]
}
Congratulations, you've used your first senpy service!
You can observe the result: the polarity is positive (marl:Positive). The reason for this prefix is that Senpy follows a linked data approach.
You can analyze the same sentence using a different sentiment service (e.g. Vader) and requesting a different format (e.g. turtle):
.. code:: shell
$ curl "http://senpy.gsi.upm.es/api/sentiment-vader" \
--data-urlencode "input=Senpy is awesome" \
--data-urlencode "outformat=turtle"
@prefix : <http://www.gsi.upm.es/onto/senpy/ns#> .
@prefix endpoint: <http://senpy.gsi.upm.es/api/> .
@prefix marl: <http://www.gsi.upm.es/ontologies/marl/ns#> .
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix prefix: <http://senpy.invalid/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix senpy: <http://www.gsi.upm.es/onto/senpy/ns#> .
prefix: a senpy:Entry ;
nif:isString "Senpy is awesome" ;
marl:hasOpinion [ a senpy:Sentiment ;
marl:hasPolarity "marl:Positive" ;
marl:polarityValue 6.72e-01 ;
prov:wasGeneratedBy prefix:Analysis_1562668175.9808676 ] .
[] a senpy:Results ;
prov:used prefix: .
As you can see, Vader also returns the polarity value (0.67) in addition to the category (positive).
If you are interested in consuming Senpy services, read :doc:`Quickstart`.
To get familiar with the concepts behind Senpy, and what it can offer for service developers, check out :doc:`development`. To get familiar with the concepts behind Senpy, and what it can offer for service developers, check out :doc:`development`.
:doc:`apischema` contains information about the semantic models and vocabularies used by Senpy. :doc:`apischema` contains information about the semantic models and vocabularies used by Senpy.
.. toctree:: .. toctree::
:caption: Learn more about senpy: :caption: Learn more about senpy:
:maxdepth: 2 :maxdepth: 2
:hidden:
senpy senpy
demo demo
Quickstart.ipynb Quickstart.ipynb
installation installation
conversion
Evaluation.ipynb
apischema apischema
advanced development
publications publications
projects


@@ -1,10 +1,10 @@
Installation Installation
------------ ------------
The stable version can be used in two ways: as a system/user library through pip, or as a docker image. The stable version can be used in two ways: as a system/user library through pip, or from a docker image.
The docker image is the recommended way because it is self-contained and isolated from the system, which means: Using docker is recommended because the image is self-contained, reproducible and isolated from the system, which means:
* Downloading and using it is just one command * It can be downloaded and run with just one simple command
* All dependencies are included * All dependencies are included
* It is OS-independent (MacOS, Windows, GNU/Linux) * It is OS-independent (MacOS, Windows, GNU/Linux)
* Several versions may coexist in the same machine without additional virtual environments * Several versions may coexist in the same machine without additional virtual environments
@@ -17,40 +17,39 @@ Through PIP
.. code:: bash .. code:: bash
pip install senpy
# Or with --user if you get permission errors:
pip install --user senpy pip install --user senpy
..
Alternatively, you can use the development version:
Alternatively, you can use the development version: .. code:: bash
.. code:: bash git clone git@github.com:gsi-upm/senpy
cd senpy
pip install --user .
git clone git@github.com:gsi-upm/senpy Each version is automatically tested on GNU/Linux, macOS and Windows 10.
cd senpy If you have trouble with the installation, please file an `issue on GitHub <https://github.com/gsi-upm/senpy/issues>`_.
pip install --user .
If you want to install senpy globally, use sudo instead of the ``--user`` flag.
Docker Image Docker Image
************ ************
The base image of senpy comes with some builtin plugins that you can use:
The base image of senpy comes with some built-in plugins that you can use:
.. code:: bash .. code:: bash
docker run -ti -p 5000:5000 gsiupm/senpy --host 0.0.0.0 docker run -ti -p 5000:5000 gsiupm/senpy --host 0.0.0.0
To add your custom plugins, you can use a docker volume: To use your custom plugins, you can add a volume to the container:
.. code:: bash .. code:: bash
docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --host 0.0.0.0 --plugins -f /plugins docker run -ti -p 5000:5000 -v <PATH OF PLUGINS>:/plugins gsiupm/senpy --host 0.0.0.0 --plugins-folder /plugins
There is a Senpy image for **python 2**, too:
.. code:: bash
docker run -ti -p 5000:5000 gsiupm/senpy:python2.7 --host 0.0.0.0
Alias Alias


@@ -9,20 +9,21 @@ Lastly, it is also possible to add new plugins programmatically.
.. contents:: :local: .. contents:: :local:
What is a plugin? ..
================= What is a plugin?
=================
A plugin is a program that, given a text, will add annotations to it. A plugin is a program that, given a text, will add annotations to it.
In practice, a plugin consists of at least two files: In practice, a plugin consists of at least two files:
- Definition file: a `.senpy` file that describes the plugin (e.g. what input parameters it accepts, what emotion model it uses). - Definition file: a `.senpy` file that describes the plugin (e.g. what input parameters it accepts, what emotion model it uses).
- Python module: the actual code that will add annotations to each input. - Python module: the actual code that will add annotations to each input.
This separation allows us to deploy plugins that use the same code but employ different parameters. This separation allows us to deploy plugins that use the same code but employ different parameters.
For instance, one could use the same classifier and processing in several plugins, but train with different datasets. For instance, one could use the same classifier and processing in several plugins, but train with different datasets.
This scenario is particularly useful for evaluation purposes. This scenario is particularly useful for evaluation purposes.
The only limitation is that the name of each plugin needs to be unique. The only limitation is that the name of each plugin needs to be unique.
Definition files Definition files
================ ================
@@ -109,5 +110,3 @@ Now, in a file named ``helloworld.py``:
sentiment['marl:hasPolarity'] = 'marl:Negative' sentiment['marl:hasPolarity'] = 'marl:Negative'
entry.sentiments.append(sentiment) entry.sentiments.append(sentiment)
yield entry yield entry
The complete code of the example plugin is available `here <https://lab.gsi.upm.es/senpy/plugin-prueba>`__.
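As a complement to the excerpt above, this is a hedged reconstruction of what such a module can look like. The class name, the trigger rule and the attributes are illustrative; the complete, authoritative example is at the link above.

.. code:: python

    from senpy import SentimentPlugin, Sentiment

    class HelloWorld(SentimentPlugin):
        '''Toy plugin: negative polarity when the text contains "sad".'''
        author = 'example'
        version = '0.1'

        def analyse_entry(self, entry, params):
            sentiment = Sentiment()
            if 'sad' in entry['nif:isString'].lower():  # illustrative rule
                sentiment['marl:hasPolarity'] = 'marl:Negative'
            else:
                sentiment['marl:hasPolarity'] = 'marl:Positive'
            entry.sentiments.append(sentiment)
            yield entry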


@@ -23,7 +23,7 @@ In practice, this is what a plugin looks like, tests included:
.. literalinclude:: ../example-plugins/rand_plugin.py .. literalinclude:: ../example-plugins/rand_plugin.py
:emphasize-lines: 5-11 :emphasize-lines: 21-28
:language: python :language: python


@@ -37,7 +37,8 @@ The framework consists of two main modules: Senpy core, which is the building bl
What is a plugin? What is a plugin?
################# #################
A plugin is a python object that can process entries. Given an entry, it will modify it, add annotations to it, or generate new entries. A plugin is a python object that can process entries.
Given an entry, it will modify it, add annotations to it, or generate new entries.
What is an entry? What is an entry?

docs/projects.rst (new file, 49 lines)

@@ -0,0 +1,49 @@
Projects using Senpy
--------------------
Are you using Senpy in your work? We would love to hear from you!
Here is a list of ongoing and past projects that have benefited from senpy:
MixedEmotions
,,,,,,,,,,,,,
`MixedEmotions <https://mixedemotions-project.eu/>`_ develops innovative multilingual multi-modal Big Data analytics applications.
The analytics relies on a common toolbox for multi-modal sentiment and emotion analysis.
The NLP parts of the toolbox are based on senpy and its API.
The toolbox is featured in this publication:
.. code-block:: text
Buitelaar, P., Wood, I. D., Arcan, M., McCrae, J. P., Abele, A., Robin, C., … Tummarello, G. (2018).
MixedEmotions: An Open-Source Toolbox for Multi-Modal Emotion Analysis.
IEEE Transactions on Multimedia.
EuroSentiment
,,,,,,,,,,,,,
The aim of the EUROSENTIMENT project was to create a pool for multilingual language resources and services for Sentiment Analysis.
The EuroSentiment project was the main motivation behind the development of Senpy, and some early versions were used:
.. code-block:: text
Sánchez-Rada, J. F., Vulcu, G., Iglesias, C. A., & Buitelaar, P. (2014).
EUROSENTIMENT: Linked Data Sentiment Analysis.
Proceedings of the ISWC 2014 Posters & Demonstrations Track
13th International Semantic Web Conference (ISWC 2014) (Vol. 1272, pp. 145-148).
SoMeDi
,,,,,,
`SoMeDi <https://itea3.org/project/somedi.html>`_ is an ITEA3 project that researches machine learning and artificial intelligence techniques to turn digital interaction data into Digital Interaction Intelligence, as well as approaches to effectively enter and act in social media and to automate this process.
SoMeDi exploits senpy's interoperability of services in their customizable data enrichment and NLP workflows.
TRIVALENT
,,,,,,,,,
`TRIVALENT <https://trivalent-project.eu/>`_ is an EU-funded project which aims at a better understanding of the root causes of violent radicalisation in Europe in order to develop appropriate countermeasures, ranging from early detection methodologies to counter-narrative techniques.
In addition to sentiment and emotion analysis services, TRIVALENT provides other types of senpy services such as radicalism and writing style analysis.


@@ -2,7 +2,7 @@ Publications
============ ============
If you use Senpy in your research, please cite `Senpy: A Pragmatic Linked Sentiment Analysis Framework <http://gsi.dit.upm.es/index.php/es/investigacion/publicaciones?view=publication&task=show&id=417>`__ (`BibTex <http://gsi.dit.upm.es/index.php/es/investigacion/publicaciones?controller=publications&task=export&format=bibtex&id=417>`__): And if you use Senpy in your research, please cite `Senpy: A Pragmatic Linked Sentiment Analysis Framework <http://gsi.upm.es/index.php/es/investigacion/publicaciones?view=publication&task=show&id=417>`__ (`BibTex <http://gsi.upm.es/index.php/es/investigacion/publicaciones?controller=publications&task=export&format=bibtex&id=417>`__):
.. code-block:: text .. code-block:: text
@@ -12,7 +12,6 @@ If you use Senpy in your research, please cite `Senpy: A Pragmatic Linked Sentim
2016 IEEE International Conference on (pp. 735-742). IEEE. 2016 IEEE International Conference on (pp. 735-742). IEEE.
Senpy uses Onyx for emotion representation, first introduced in: Senpy uses Onyx for emotion representation, first introduced in:
.. code-block:: text .. code-block:: text
@@ -28,15 +27,6 @@ Senpy uses Marl for sentiment representation, which was presented in:
Westerski, A., Iglesias Fernandez, C. A., & Tapia Rico, F. (2011). Westerski, A., Iglesias Fernandez, C. A., & Tapia Rico, F. (2011).
Linked opinions: Describing sentiments on the structured web of data. Linked opinions: Describing sentiments on the structured web of data.
Senpy has been used extensively in the toolbox of the MixedEmotions project:
.. code-block:: text
Buitelaar, P., Wood, I. D., Arcan, M., McCrae, J. P., Abele, A., Robin, C., … Tummarello, G. (2018).
MixedEmotions: An Open-Source Toolbox for Multi-Modal Emotion Analysis.
IEEE Transactions on Multimedia.
The representation models, formats and challenges are partially covered in a chapter of the book Sentiment Analysis in Social Networks: The representation models, formats and challenges are partially covered in a chapter of the book Sentiment Analysis in Social Networks:
.. code-block:: text .. code-block:: text


@@ -1,5 +1,8 @@
Server Command line tool
====== =================
Basic usage
-----------
The senpy server is launched via the `senpy` command: The senpy server is launched via the `senpy` command:
@@ -7,8 +10,8 @@ The senpy server is launched via the `senpy` command:
usage: senpy [-h] [--level logging_level] [--log-format log_format] [--debug] usage: senpy [-h] [--level logging_level] [--log-format log_format] [--debug]
[--no-default-plugins] [--host HOST] [--port PORT] [--no-default-plugins] [--host HOST] [--port PORT]
[--plugins-folder PLUGINS_FOLDER] [--only-install] [--only-test] [--plugins-folder PLUGINS_FOLDER] [--install]
[--test] [--only-list] [--data-folder DATA_FOLDER] [--test] [--no-run] [--data-folder DATA_FOLDER]
[--no-threaded] [--no-deps] [--version] [--allow-fail] [--no-threaded] [--no-deps] [--version] [--allow-fail]
Run a Senpy server Run a Senpy server
@@ -25,10 +28,9 @@ The senpy server is launched via the `senpy` command:
--port PORT, -p PORT Port to listen on. --port PORT, -p PORT Port to listen on.
--plugins-folder PLUGINS_FOLDER, -f PLUGINS_FOLDER --plugins-folder PLUGINS_FOLDER, -f PLUGINS_FOLDER
Where to look for plugins. Where to look for plugins.
--only-install, -i Do not run a server, only install plugin dependencies --install, -i Install plugin dependencies before launching the server.
--only-test Do not run a server, just test all plugins
--test, -t Test all plugins before launching the server --test, -t Test all plugins before launching the server
--only-list, --list Do not run a server, only list plugins found --no-run Do not launch the server
--data-folder DATA_FOLDER, --data DATA_FOLDER --data-folder DATA_FOLDER, --data DATA_FOLDER
Where to look for data. It can be set with the SENPY_DATA Where to look for data. It can be set with the SENPY_DATA
environment variable as well. environment variable as well.
@@ -70,3 +72,14 @@ For instance, to accept connections on port 6000 on any interface:
senpy --host 0.0.0.0 --port 6000 senpy --host 0.0.0.0 --port 6000
For more options, see the `--help` page. For more options, see the `--help` page.
Sentiment analysis in the command line
--------------------------------------
Although the main use of senpy is to publish services, the tool can also be used locally to analyze text in the command line.
This is a short video demonstration:
.. image:: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk.png
:width: 100%
:target: https://asciinema.org/a/9uwef1ghkjk062cw2t4mhzpyk
:alt: CLI demo
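Once the server is running locally (for instance with the invocation above), the same HTTP API as the public demo becomes available. A minimal sketch, assuming the server listens on port 6000 and the sentiment-vader plugin is loaded:

.. code:: python

    import requests

    # Query a locally launched senpy server (senpy --host 0.0.0.0 --port 6000)
    res = requests.get('http://127.0.0.1:6000/api/sentiment-vader',
                       params={'input': 'Senpy is awesome'})
    print(res.text)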


@@ -13,9 +13,9 @@ An overview of the vocabularies and their use can be found in [4].
[1] Guidelines for developing NIF-based NLP services, Final Community Group Report 22 December 2015 Available at: https://www.w3.org/2015/09/bpmlod-reports/nif-based-nlp-webservices/ [1] Guidelines for developing NIF-based NLP services, Final Community Group Report 22 December 2015 Available at: https://www.w3.org/2015/09/bpmlod-reports/nif-based-nlp-webservices/
[2] Marl Ontology Specification, available at http://www.gsi.dit.upm.es/ontologies/marl/ [2] Marl Ontology Specification, available at http://www.gsi.upm.es/ontologies/marl/
[3] Onyx Ontology Specification, available at http://www.gsi.dit.upm.es/ontologies/onyx/ [3] Onyx Ontology Specification, available at http://www.gsi.upm.es/ontologies/onyx/
[4] Iglesias, C. A., Sánchez-Rada, J. F., Vulcu, G., & Buitelaar, P. (2017). Linked Data Models for Sentiment and Emotion Analysis in Social Networks. In Sentiment Analysis in Social Networks (pp. 49-69). [4] Iglesias, C. A., Sánchez-Rada, J. F., Vulcu, G., & Buitelaar, P. (2017). Linked Data Models for Sentiment and Emotion Analysis in Social Networks. In Sentiment Analysis in Social Networks (pp. 49-69).


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import AnalysisPlugin from senpy import AnalysisPlugin
import multiprocessing import multiprocessing


@@ -1,5 +1,21 @@
#!/usr/local/bin/python #!/usr/local/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
emoticons = { emoticons = {
'pos': [':)', ':]', '=)', ':D'], 'pos': [':)', ':]', '=)', ':D'],


@@ -1,5 +1,20 @@
#!/usr/local/bin/python #!/usr/local/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import easy_test, models, plugins from senpy import easy_test, models, plugins


@@ -1,5 +1,20 @@
#!/usr/local/bin/python #!/usr/local/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import easy_test, SentimentBox from senpy import easy_test, SentimentBox


@@ -1,5 +1,21 @@
#!/usr/local/bin/python #!/usr/local/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import easy_test, SentimentBox from senpy import easy_test, SentimentBox


@@ -1,5 +1,21 @@
#!/usr/local/bin/python #!/usr/local/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import easy_test, models, plugins from senpy import easy_test, models, plugins


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import AnalysisPlugin, easy from senpy import AnalysisPlugin, easy


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import AnalysisPlugin, easy from senpy import AnalysisPlugin, easy


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import random import random
from senpy.plugins import EmotionPlugin from senpy.plugins import EmotionPlugin


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import noop import noop
from senpy.plugins import SentimentPlugin from senpy.plugins import SentimentPlugin


@@ -1,3 +1,4 @@
module: mynoop module: mynoop
optional: true
requirements: requirements:
- noop - noop


@@ -1,5 +1,21 @@
#!/usr/local/bin/python #!/usr/local/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import easy_test, models, plugins from senpy import easy_test, models, plugins


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import random import random
from senpy import SentimentPlugin, Sentiment, Entry from senpy import SentimentPlugin, Sentiment, Entry


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
''' '''
Create a dummy dataset. Create a dummy dataset.
Messages with a happy emoticon are labelled positive Messages with a happy emoticon are labelled positive


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from sklearn.pipeline import Pipeline from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split from sklearn.model_selection import train_test_split
@@ -15,7 +31,7 @@ pipeline = Pipeline([('cv', count_vec),
('clf', clf3)]) ('clf', clf3)])
pipeline.fit(X_train, y_train) pipeline.fit(X_train, y_train)
print('Feature names: {}'.format(count_vec.get_feature_names())) print('Feature names: {}'.format(count_vec.get_feature_names_out()))
print('Class count: {}'.format(clf3.class_count_)) print('Class count: {}'.format(clf3.class_count_))
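The only functional change in this hunk is the scikit-learn rename: get_feature_names() was deprecated in scikit-learn 1.0 and later removed in favour of get_feature_names_out(). A standalone check of the new name:

.. code:: python

    from sklearn.feature_extraction.text import CountVectorizer

    cv = CountVectorizer()
    cv.fit(['senpy is awesome', 'senpy is a framework'])
    # Renamed API: get_feature_names() -> get_feature_names_out()
    print(cv.get_feature_names_out())
    # -> ['awesome' 'framework' 'is' 'senpy']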


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import SentimentBox, easy_test from senpy import SentimentBox, easy_test
from mypipeline import pipeline from mypipeline import pipeline


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy.plugins import AnalysisPlugin from senpy.plugins import AnalysisPlugin
from time import sleep from time import sleep


@@ -1 +1,6 @@
gsitk>0.1.9.1 gsitk>0.1.9.1
flask_cors==3.0.10
Pattern==3.6
lxml==4.9.3
pandas==2.1.1
textblob==0.17.1


@@ -1,20 +1,24 @@
--- ---
apiVersion: extensions/v1beta1 apiVersion: apps/v1
kind: Deployment kind: Deployment
metadata: metadata:
name: senpy-latest name: senpy-latest
spec: spec:
replicas: 1 replicas: 1
selector:
matchLabels:
app: senpy-latest
template: template:
metadata: metadata:
labels: labels:
app: senpy-latest
role: senpy-latest role: senpy-latest
app: test
spec: spec:
containers: containers:
- name: senpy-latest - name: senpy-latest
image: $IMAGEWTAG image: $IMAGEWTAG
imagePullPolicy: Always imagePullPolicy: Always
args: ["--enable-cors"]
resources: resources:
limits: limits:
memory: "512Mi" memory: "512Mi"
@@ -22,3 +26,11 @@ spec:
ports: ports:
- name: web - name: web
containerPort: 5000 containerPort: 5000
volumeMounts:
- name: senpy-data
mountPath: /senpy-data
subPath: data
volumes:
- name: senpy-data
persistentVolumeClaim:
claimName: pvc-senpy


@@ -1,21 +1,29 @@
--- ---
apiVersion: extensions/v1beta1 apiVersion: networking.k8s.io/v1
kind: Ingress kind: Ingress
metadata: metadata:
name: senpy-ingress name: senpy-ingress
labels:
app: senpy-latest
spec: spec:
rules: rules:
- host: latest.senpy.cluster.gsi.dit.upm.es - host: senpy-latest.gsi.upm.es
http: http:
paths: paths:
- path: / - path: /
pathType: Prefix
backend: backend:
serviceName: senpy-latest service:
servicePort: 5000 name: senpy-latest
port:
number: 5000
- host: latest.senpy.gsi.upm.es - host: latest.senpy.gsi.upm.es
http: http:
paths: paths:
- path: / - path: /
pathType: Prefix
backend: backend:
serviceName: senpy-latest service:
servicePort: 5000 name: senpy-latest
port:
number: 5000


@@ -3,10 +3,12 @@ apiVersion: v1
kind: Service kind: Service
metadata: metadata:
name: senpy-latest name: senpy-latest
labels:
app: senpy-latest
spec: spec:
type: ClusterIP type: ClusterIP
ports: ports:
- port: 5000 - port: 5000
protocol: TCP protocol: TCP
selector: selector:
role: senpy-latest app: senpy-latest


@@ -7,8 +7,7 @@ future
jsonschema jsonschema
jsonref jsonref
PyYAML PyYAML
rdflib rdflib==6.1.1
rdflib-jsonld
numpy numpy
scipy scipy
scikit-learn>=0.20 scikit-learn>=0.20


@@ -1,7 +1,7 @@
#!/usr/bin/python #!/usr/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes #
# DIT, UPM # Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
# #
# Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
@@ -14,6 +14,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
#
""" """
Sentiment analysis server in Python Sentiment analysis server in Python
""" """


@@ -1,7 +1,6 @@
#!/usr/bin/python #!/usr/bin/python
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes # Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
# DIT, UPM
# #
# Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
@@ -23,6 +22,8 @@ the server.
from flask import Flask from flask import Flask
from senpy.extensions import Senpy from senpy.extensions import Senpy
from senpy.utils import easy_test from senpy.utils import easy_test
from senpy.plugins import list_dependencies
from senpy import config
import logging import logging
import os import os
@@ -42,6 +43,11 @@ def main():
type=str, type=str,
default="INFO", default="INFO",
help='Logging level') help='Logging level')
parser.add_argument(
'--no-proxy-fix',
action='store_true',
default=False,
help='Do not assume senpy will be running behind a proxy (e.g., nginx)')
parser.add_argument( parser.add_argument(
'--log-format', '--log-format',
metavar='log_format', metavar='log_format',
@@ -77,16 +83,21 @@ def main():
action='append', action='append',
help='Where to look for plugins.') help='Where to look for plugins.')
parser.add_argument( parser.add_argument(
'--only-install', '--install',
'-i', '-i',
action='store_true', action='store_true',
default=False, default=False,
help='Do not run a server, only install plugin dependencies') help='Install plugin dependencies before running.')
parser.add_argument( parser.add_argument(
'--only-test', '--dependencies',
action='store_true', action='store_true',
default=False, default=False,
help='Do not run a server, just test all plugins') help='List plugin dependencies')
parser.add_argument(
'--strict',
action='store_true',
default=config.strict,
help='Fail if optional plugins cannot be loaded.')
parser.add_argument( parser.add_argument(
'--test', '--test',
'-t', '-t',
@@ -94,11 +105,10 @@ def main():
default=False, default=False,
help='Test all plugins before launching the server') help='Test all plugins before launching the server')
parser.add_argument( parser.add_argument(
'--only-list', '--no-run',
'--list',
action='store_true', action='store_true',
default=False, default=False,
help='Do not run a server, only list plugins found') help='Do not launch the server.')
parser.add_argument( parser.add_argument(
'--data-folder', '--data-folder',
'--data', '--data',
@@ -128,6 +138,12 @@ def main():
action='store_true', action='store_true',
default=False, default=False,
help='Do not exit if some plugins fail to activate') help='Do not exit if some plugins fail to activate')
parser.add_argument(
'--enable-cors',
'--cors',
action='store_true',
default=False,
help='Enable CORS for all domains (requires flask-cors to be installed)')
args = parser.parse_args() args = parser.parse_args()
print('Senpy version {}'.format(senpy.__version__)) print('Senpy version {}'.format(senpy.__version__))
print(sys.version) print(sys.version)
@@ -142,9 +158,12 @@ def main():
app = Flask(__name__) app = Flask(__name__)
app.debug = args.debug app.debug = args.debug
sp = Senpy(app, sp = Senpy(app,
plugin_folder=None, plugin_folder=None,
default_plugins=not args.no_default_plugins, default_plugins=not args.no_default_plugins,
install=args.install,
strict=args.strict,
data_folder=args.data_folder) data_folder=args.data_folder)
folders = list(args.plugins_folder) if args.plugins_folder else [] folders = list(args.plugins_folder) if args.plugins_folder else []
if not folders: if not folders:
@@ -164,20 +183,54 @@ def main():
fpath, fpath,
maxname=maxname, maxname=maxname,
maxversion=maxversion)) maxversion=maxversion))
if args.only_list: if args.dependencies:
return print('Listing dependencies')
if not args.no_deps: missing = []
installed = []
for plug in sp.plugins(is_activated=False):
inst, miss, nltkres = list_dependencies(plug)
if not any([inst, miss, nltkres]):
continue
print(f'Plugin: {plug.id}')
for m in miss:
missing.append(f'{m} # {plug.id}')
for i in inst:
installed.append(f'{i} # {plug.id}')
if installed:
print('Installed packages:')
for i in installed:
print(f'\t{i}')
if missing:
print('Missing packages:')
for m in missing:
print(f'\t{m}')
if args.install:
sp.install_deps() sp.install_deps()
if args.only_install:
if args.test:
sp.activate_all(sync=True)
easy_test(sp.plugins(is_activated=True), debug=args.debug)
if args.no_run:
return return
sp.activate_all(allow_fail=args.allow_fail)
if args.test or args.only_test: sp.activate_all(sync=True)
easy_test(sp.plugins(), debug=args.debug) if sp.strict:
if args.only_test: inactive = sp.plugins(is_activated=False)
return assert not inactive
print('Senpy version {}'.format(senpy.__version__)) print('Senpy version {}'.format(senpy.__version__))
print('Server running on port %s:%d. Ctrl+C to quit' % (args.host, print('Server running on port %s:%d. Ctrl+C to quit' % (args.host,
args.port)) args.port))
if args.enable_cors:
from flask_cors import CORS
CORS(app)
if not args.no_proxy_fix:
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(app.wsgi_app)
try: try:
app.run(args.host, app.run(args.host,
args.port, args.port,


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 from future.utils import iteritems
 from .models import Error, Results, Entry, from_string
 import logging


@@ -1,19 +1,20 @@
 #!/usr/bin/python
 # -*- coding: utf-8 -*-
-# Copyright 2014 J. Fernando Sánchez Rada - Grupo de Sistemas Inteligentes
-# DIT, UPM
 #
+# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
+#
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 # http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+#
 """
 Blueprints for Senpy
 """
@@ -24,7 +25,7 @@ from . import api
 from .version import __version__
 from functools import wraps
-from .gsitk_compat import GSITK_AVAILABLE
+from .gsitk_compat import GSITK_AVAILABLE, datasets
 import logging
 import json
@@ -203,8 +204,8 @@ def basic_api(f):
     return decorated_function

-@api_blueprint.route('/', defaults={'plugins': None}, methods=['POST', 'GET'])
-@api_blueprint.route('/<path:plugins>', methods=['POST', 'GET'])
+@api_blueprint.route('/', defaults={'plugins': None}, methods=['POST', 'GET'], strict_slashes=False)
+@api_blueprint.route('/<path:plugins>', methods=['POST', 'GET'], strict_slashes=False)
 @basic_api
 def api_root(plugins):
     if plugins:
@@ -239,7 +240,7 @@ def api_root(plugins):
     return results

-@api_blueprint.route('/evaluate/', methods=['POST', 'GET'])
+@api_blueprint.route('/evaluate', methods=['POST', 'GET'], strict_slashes=False)
 @basic_api
 def evaluate():
     if request.parameters['help']:
@@ -252,7 +253,7 @@ def evaluate():
     return response

-@api_blueprint.route('/plugins/', methods=['POST', 'GET'])
+@api_blueprint.route('/plugins', methods=['POST', 'GET'], strict_slashes=False)
 @basic_api
 def plugins():
     sp = current_app.senpy
@@ -263,17 +264,15 @@ def plugins():
     return dic

-@api_blueprint.route('/plugins/<plugin>/', methods=['POST', 'GET'])
+@api_blueprint.route('/plugins/<plugin>', methods=['POST', 'GET'], strict_slashes=False)
 @basic_api
 def plugin(plugin):
     sp = current_app.senpy
     return sp.get_plugin(plugin)

-@api_blueprint.route('/datasets/', methods=['POST', 'GET'])
+@api_blueprint.route('/datasets', methods=['POST', 'GET'], strict_slashes=False)
 @basic_api
-def datasets():
-    sp = current_app.senpy
-    datasets = sp.datasets
+def get_datasets():
     dic = Datasets(datasets=list(datasets.values()))
     return dic
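With `strict_slashes=False`, each endpoint now answers both with and without a trailing slash instead of redirecting or returning a 404. For example, against a hypothetical local instance:

```
curl http://localhost:5000/api/plugins
curl http://localhost:5000/api/plugins/   # same handler as above
```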


@@ -1,3 +1,18 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 from __future__ import print_function
 import sys


@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 import requests
 import logging
 from . import models

senpy/config.py (new file)

@@ -0,0 +1,7 @@
import os
strict = os.environ.get('SENPY_STRICT', '').lower() not in ["", "false", "f"]
data_folder = os.environ.get('SENPY_DATA', None)
if data_folder:
data_folder = os.path.abspath(data_folder)
testing = os.environ.get('SENPY_TESTING', "") != ""
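A quick sketch of how these settings behave, assuming the file above is importable as `senpy.config` (illustrative values only):

```python
import importlib
import os

# SENPY_STRICT is false only when unset, empty, 'false' or 'f' (case-insensitive)
os.environ['SENPY_STRICT'] = 'true'
# SENPY_DATA, when set, is normalised to an absolute path
os.environ['SENPY_DATA'] = 'senpy_data'

from senpy import config
importlib.reload(config)  # re-read the environment after changing it

assert config.strict is True
assert os.path.isabs(config.data_folder)
```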


@@ -1,3 +1,18 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
""" """
Main class for Senpy. Main class for Senpy.
It orchestrates plugin (de)activation and analysis. It orchestrates plugin (de)activation and analysis.
@@ -5,6 +20,7 @@ It orchestrates plugin (de)activation and analysis.
from future import standard_library from future import standard_library
standard_library.install_aliases() standard_library.install_aliases()
from . import config
from . import plugins, api from . import plugins, api
from .models import Error, AggregatedEvaluation from .models import Error, AggregatedEvaluation
from .plugins import AnalysisPlugin from .plugins import AnalysisPlugin
@@ -29,8 +45,11 @@ class Senpy(object):
                  app=None,
                  plugin_folder=".",
                  data_folder=None,
+                 install=False,
+                 strict=None,
                  default_plugins=False):

         default_data = os.path.join(os.getcwd(), 'senpy_data')
         self.data_folder = data_folder or os.environ.get('SENPY_DATA', default_data)
         try:
@@ -42,6 +61,8 @@ class Senpy(object):
             raise

         self._default = None
+        self.strict = strict if strict is not None else config.strict
+        self.install = install
         self._plugins = {}
         if plugin_folder:
             self.add_folder(plugin_folder)
@@ -133,7 +154,8 @@ class Senpy(object):
         logger.debug("Adding folder: %s", folder)
         if os.path.isdir(folder):
             new_plugins = plugins.from_folder([folder],
-                                              data_folder=self.data_folder)
+                                              data_folder=self.data_folder,
+                                              strict=self.strict)
             for plugin in new_plugins:
                 self.add_plugin(plugin)
         else:
@@ -158,7 +180,7 @@ class Senpy(object):
         logger.info('Installing dependencies')
         # If a plugin is activated, its dependencies should already be installed
         # Otherwise, it would've failed to activate.
-        plugins.install_deps(*self.plugins(is_activated=False))
+        plugins.install_deps(*self._plugins.values())
     def analyse(self, request, analyses=None):
         """
@@ -274,36 +296,16 @@ class Senpy(object):
         return response

     def _get_datasets(self, request):
-        if not self.datasets:
-            raise Error(
-                status=404,
-                message=("No datasets found."
-                         " Please verify DatasetManager"))
         datasets_name = request.parameters.get('dataset', None).split(',')
         for dataset in datasets_name:
-            if dataset not in self.datasets:
+            if dataset not in gsitk_compat.datasets:
                 logger.debug(("The dataset '{}' is not valid\n"
                               "Valid datasets: {}").format(
-                                  dataset, self.datasets.keys()))
+                                  dataset, gsitk_compat.datasets.keys()))
                 raise Error(
                     status=404,
                     message="The dataset '{}' is not valid".format(dataset))
-        dm = gsitk_compat.DatasetManager()
-        datasets = dm.prepare_datasets(datasets_name)
-        return datasets
-
-    @property
-    def datasets(self):
-        self._dataset_list = {}
-        dm = gsitk_compat.DatasetManager()
-        for item in dm.get_datasets():
-            for key in item:
-                if key in self._dataset_list:
-                    continue
-                properties = item[key]
-                properties['@id'] = key
-                self._dataset_list[key] = properties
-        return self._dataset_list
+        return datasets_name

     def evaluate(self, params):
         logger.debug("evaluating request: {}".format(params))
@@ -345,13 +347,13 @@ class Senpy(object):
         else:
             self._default = self._plugins[value.lower()]

-    def activate_all(self, sync=True, allow_fail=False):
+    def activate_all(self, sync=True):
         ps = []
         for plug in self._plugins.keys():
             try:
                 self.activate_plugin(plug, sync=sync)
             except Exception as ex:
-                if not allow_fail:
+                if self.strict:
                     raise
                 logger.error('Could not activate {}: {}'.format(plug, ex))
         return ps
@@ -363,15 +365,20 @@ class Senpy(object):
         return ps

     def _activate(self, plugin):
-        success = False
         with plugin._lock:
             if plugin.is_activated:
                 return
-            plugin._activate()
-            msg = "Plugin activated: {}".format(plugin.name)
-            logger.info(msg)
-            success = plugin.is_activated
-        return success
+            try:
+                logger.info("Activating plugin: {}".format(plugin.name))
+                assert plugin._activate()
+                logger.info(f"Plugin activated: {plugin.name}")
+            except Exception as ex:
+                if getattr(plugin, "optional", False) and not self.strict:
+                    logger.info(f"Plugin could NOT be activated: {plugin.name}")
+                    return False
+                raise
+        return plugin.is_activated

     def activate_plugin(self, plugin_name, sync=True):
         plugin_name = plugin_name.lower()


@@ -1,4 +1,21 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 import logging
+import os
 from pkg_resources import parse_version, get_distribution, DistributionNotFound
@@ -17,15 +34,34 @@ try:
     gsitk_distro = get_distribution("gsitk")
     GSITK_VERSION = parse_version(gsitk_distro.version)

+    if not os.environ.get('DATA_PATH'):
+        os.environ['DATA_PATH'] = os.environ.get('SENPY_DATA', 'senpy_data')
+
     from gsitk.datasets.datasets import DatasetManager
     from gsitk.evaluation.evaluation import Evaluation as Eval  # noqa: F401
     from gsitk.evaluation.evaluation import EvalPipeline  # noqa: F401
     from sklearn.pipeline import Pipeline
     modules = locals()
     GSITK_AVAILABLE = True
+
+    datasets = {}
+    manager = DatasetManager()
+    for item in manager.get_datasets():
+        for key in item:
+            if key in datasets:
+                continue
+            properties = item[key]
+            properties['@id'] = key
+            datasets[key] = properties
+
+    def prepare(ds, *args, **kwargs):
+        return manager.prepare_datasets(ds, *args, **kwargs)
+
 except (DistributionNotFound, ImportError) as err:
     logger.debug('Error importing GSITK: {}'.format(err))
     logger.warning(IMPORTMSG)
     GSITK_AVAILABLE = False
     GSITK_VERSION = ()
-    DatasetManager = Eval = Pipeline = raise_exception
+    DatasetManager = Eval = Pipeline = prepare = raise_exception
+    datasets = {}


@@ -1,3 +1,18 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 '''
 Meta-programming for the models.
 '''
@@ -8,7 +23,8 @@ import inspect
 import copy
 from abc import ABCMeta
-from collections import MutableMapping, namedtuple
+from collections import namedtuple
+from collections.abc import MutableMapping

 class BaseMeta(ABCMeta):


@@ -1,3 +1,18 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 '''
 Senpy Models.
@@ -201,7 +216,7 @@ class BaseModel(with_metaclass(BaseMeta, CustomDict)):
             logger.debug(
                 'Parsing with prefix: {}'.format(kwargs.get('prefix')))
             content = g.serialize(format=format,
-                                  prefix=prefix).decode('utf-8')
+                                  prefix=prefix)
             mimetype = 'text/{}'.format(format)
         else:
             raise Error('Unknown outformat: {}'.format(format))


@@ -1,5 +1,21 @@
 #!/usr/local/bin/python
 # -*- coding: utf-8 -*-
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 from future import standard_library
 standard_library.install_aliases()
@@ -30,6 +46,7 @@ from .. import models, utils
 from .. import api
 from .. import gsitk_compat
 from .. import testing
from .. import config
 logger = logging.getLogger(__name__)
@@ -45,7 +62,7 @@ class PluginMeta(models.BaseMeta):
         plugin_type.add(name)
         alias = attrs.get('name', name).lower()
         attrs['_plugin_type'] = plugin_type
-        logger.debug('Adding new plugin class', name, bases, attrs, plugin_type)
+        logger.debug('Adding new plugin class: %s %s %s %s', name, bases, attrs, plugin_type)
         attrs['name'] = alias
         if 'description' not in attrs:
             doc = attrs.get('__doc__', None)
@@ -59,7 +76,7 @@ class PluginMeta(models.BaseMeta):
         cls = super(PluginMeta, mcs).__new__(mcs, name, bases, attrs)
         if alias in mcs._classes:
-            if os.environ.get('SENPY_TESTING', ""):
+            if config.testing:
                 raise Exception(
                     ('The type of plugin {} already exists. '
                      'Please, choose a different name').format(name))
@@ -94,7 +111,7 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
         Provides a canonical name for plugins and serves as base for other
         kinds of plugins.
         """
-        logger.debug("Initialising {}".format(info))
+        logger.debug("Initialising %s", info)
         super(Plugin, self).__init__(**kwargs)
         if info:
             self.update(info)
@@ -106,7 +123,13 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
         self._directory = os.path.abspath(
             os.path.dirname(inspect.getfile(self.__class__)))

-        data_folder = data_folder or os.getcwd()
+        if not data_folder:
+            data_folder = config.data_folder
+        if not data_folder:
+            data_folder = os.getcwd()
+        data_folder = os.path.abspath(data_folder)
+
         subdir = os.path.join(data_folder, self.name)
         self._data_paths = [
@@ -139,8 +162,11 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
         return os.path.dirname(inspect.getfile(self.__class__))

     def _activate(self):
+        if self.is_activated:
+            return
         self.activate()
         self.is_activated = True
+        return self.is_activated

     def _deactivate(self):
         self.is_activated = False
@@ -164,8 +190,7 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
     def process_entries(self, entries, activity):
         for entry in entries:
-            self.log.debug('Processing entry with plugin {}: {}'.format(
-                self, entry))
+            self.log.debug('Processing entry with plugin %s: %s', self, entry)
             results = self.process_entry(entry, activity)
             if inspect.isgenerator(results):
                 for result in results:
@@ -185,6 +210,8 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
         )

     def test(self, test_cases=None):
+        if not self.is_activated:
+            self._activate()
         if not test_cases:
             if not hasattr(self, 'test_cases'):
                 raise AttributeError(
@@ -245,11 +272,13 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
             assert not should_fail

     def find_file(self, fname):
+        tried = []
         for p in self._data_paths:
-            alternative = os.path.join(p, fname)
+            alternative = os.path.abspath(os.path.join(p, fname))
             if os.path.exists(alternative):
                 return alternative
-        raise IOError('File does not exist: {}'.format(fname))
+            tried.append(alternative)
+        raise IOError(f'File does not exist: {fname}. Tried: {tried}')

     def path(self, fpath):
         if not os.path.isabs(fpath):
@@ -273,6 +302,28 @@ class Plugin(with_metaclass(PluginMeta, models.Plugin)):
SenpyPlugin = Plugin
class FailedPlugin(Plugin):
"""A plugin that has failed to initialize."""
version = 0
def __init__(self, info, function):
super().__init__(info)
a = info.get('name', info.get('module', self.name))
self['name'] = a
self._function = function
self.is_activated = False
def retry(self):
return self._function()
def test(self):
'''
A module that failed to load cannot be tested. But non-optional
plugins should not fail to load in strict mode.
'''
assert self.optional and not config.strict
 class Analyser(Plugin):
     '''
     A subclass of Plugin that analyses text and provides an annotation.
@@ -347,6 +398,9 @@ class Evaluable(Plugin):
     def evaluate_func(self, X, activity=None):
         raise Exception('Implement the evaluate_func function')
def evaluate(self, *args, **kwargs):
return evaluate([self], *args, **kwargs)
 class SentimentPlugin(Analyser, Evaluable, models.SentimentPlugin):
     '''
@@ -613,11 +667,17 @@ class ShelfMixin(object):
     def shelf_file(self, value):
         self._shelf_file = value

-    def save(self):
-        self.log.debug('Saving pickle')
-        if hasattr(self, '_sh') and self._sh is not None:
-            with self.open(self.shelf_file, 'wb') as f:
-                pickle.dump(self._sh, f)
+    def save(self, ignore_errors=False):
+        try:
+            self.log.debug('Saving pickle')
+            if hasattr(self, '_sh') and self._sh is not None:
+                with self.open(self.shelf_file, 'wb') as f:
+                    pickle.dump(self._sh, f)
+        except Exception as ex:
+            self.log.warning("Could not save shelf state. Check folder permissions for: "
+                             f" {self.shelf_file}. Error: { ex }")
+            if not ignore_errors:
+                raise
 def pfilter(plugins, plugin_type=Analyser, **kwargs):
@@ -679,23 +739,31 @@ def missing_requirements(reqs):
         res = pool.apply_async(pkg_resources.get_distribution, (req,))
         queue.append((req, res))
     missing = []
+    installed = []
     for req, job in queue:
         try:
-            job.get(1)
+            installed.append(job.get(1))
         except Exception:
             missing.append(req)
-    return missing
+    return installed, missing
def list_dependencies(*plugins):
'''List all dependencies (python and nltk) for the given list of plugins'''
nltk_resources = set()
missing = []
installed = []
for info in plugins:
reqs = info.get('requirements', [])
if reqs:
inst, miss = missing_requirements(reqs)
installed += inst
missing += miss
nltk_resources |= set(info.get('nltk_resources', []))
return installed, missing, nltk_resources
 def install_deps(*plugins):
+    _, requirements, nltk_resources = list_dependencies(*plugins)
     installed = False
-    nltk_resources = set()
-    requirements = []
-    for info in plugins:
-        requirements = info.get('requirements', [])
-        if requirements:
-            requirements += missing_requirements(requirements)
-        nltk_resources |= set(info.get('nltk_resources', []))
     if requirements:
         logger.info('Installing requirements: ' + str(requirements))
         pip_args = [sys.executable, '-m', 'pip', 'install']
@@ -709,8 +777,7 @@ def install_deps(*plugins):
     if exitcode != 0:
         raise models.Error(
             "Dependencies not properly installed: {}".format(pip_args))
-    installed |= download(list(nltk_resources))
-    return installed
+    return installed or download(list(nltk_resources))
is_plugin_file = re.compile(r'.*\.senpy$|senpy_[a-zA-Z0-9_]+\.py$|' is_plugin_file = re.compile(r'.*\.senpy$|senpy_[a-zA-Z0-9_]+\.py$|'
@@ -727,7 +794,7 @@ def find_plugins(folders):
             yield fpath

-def from_path(fpath, install_on_fail=False, **kwargs):
+def from_path(fpath, **kwargs):
     logger.debug("Loading plugin from {}".format(fpath))
     if fpath.endswith('.py'):
         # We assume root is the dir of the file, and module is the name of the file
@@ -737,18 +804,18 @@ def from_path(fpath, install_on_fail=False, **kwargs):
             yield instance
     else:
         info = parse_plugin_info(fpath)
-        yield from_info(info, install_on_fail=install_on_fail, **kwargs)
+        yield from_info(info, **kwargs)

 def from_folder(folders, loader=from_path, **kwargs):
     plugins = []
     for fpath in find_plugins(folders):
         for plugin in loader(fpath, **kwargs):
-            plugins.append(plugin)
+            if plugin:
+                plugins.append(plugin)
     return plugins

-def from_info(info, root=None, install_on_fail=True, **kwargs):
+def from_info(info, root=None, strict=False, **kwargs):
     if any(x not in info for x in ('module', )):
         raise ValueError('Plugin info is not valid: {}'.format(info))
     module = info["module"]
@@ -760,8 +827,10 @@ def from_info(info, root=None, install_on_fail=True, **kwargs):
     try:
         return fun()
     except (ImportError, LookupError):
-        install_deps(info)
-        return fun()
+        if strict or not str(info.get("optional", "false")).lower() in ["True", "true", "t"]:
+            raise
+        print(f"Could not import plugin: { info }")
+        return FailedPlugin(info, fun)

 def parse_plugin_info(fpath):
@@ -831,6 +900,9 @@ def evaluate(plugins, datasets, **kwargs):
         if not hasattr(plug, 'as_pipe'):
             raise models.Error('Plugin {} cannot be evaluated'.format(plug.name))

+    if not isinstance(datasets, dict):
+        datasets = gsitk_compat.prepare(datasets, download=True)
+
     tuples = list(product(plugins, datasets))
     missing = []
     for (p, d) in tuples:
@@ -844,12 +916,12 @@ def evaluate(plugins, datasets, **kwargs):
     new_ev = evaluations_to_JSONLD(results, **kwargs)
     for ev in new_ev:
         dataset = ev.evaluatesOn
-        model = ev.evaluates.rstrip('__' + dataset)
+        model = ev.evaluates
         cached_evs[(model, dataset)] = ev

     evaluations = []
-    print(tuples, 'Cached evs', cached_evs)
+    logger.debug('%s. Cached evs: %s', tuples, cached_evs)
     for (p, d) in tuples:
-        print('Adding', d, p)
+        logger.debug('Adding %s, %s', d, p)
         evaluations.append(cached_evs[(p.id, d)])
     return evaluations
@@ -868,7 +940,7 @@ def evaluations_to_JSONLD(results, flatten=False):
         if row.get('CV', True):
             evaluation['@type'] = ['StaticCV', 'Evaluation']
         evaluation.evaluatesOn = row['Dataset']
-        evaluation.evaluates = row['Model']
+        evaluation.evaluates = row['Model'].rstrip('__' + row['Dataset'])
         i = 0
         if flatten:
             metric = models.Metric()


@@ -0,0 +1,60 @@
# Plugin emotion-anew
This plugin is an **emotion classifier** that detects six possible emotions:
- Anger : general-dislike.
- Fear : negative-fear.
- Disgust : shame.
- Joy : gratitude, affective, enthusiasm, love, joy, liking.
- Sadness : ingrattitude, daze, humility, compassion, despair, anxiety, sadness.
- Neutral: no particular emotion detected.
The plugin uses the **ANEW lexicon** to calculate the VAD (valence-arousal-dominance) score of a sentence and determine which emotion is closest to that value. For this comparison, each emotion is assigned a centroid, calculated according to this article: http://www.aclweb.org/anthology/W10-0208.
The plugin looks up the words of the sentence that appear in the ANEW dictionary and calculates the average VAD score for the sentence. Once this score is calculated, it picks the emotion whose centroid is closest to it.
The response of this plugin uses [Onyx ontology](https://www.gsi.dit.upm.es/ontologies/onyx/) developed at GSI UPM, to express the information.
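A minimal sketch of the scoring step described above, using a toy ANEW-style dictionary (the real plugin loads the full lexicon and then picks the emotion whose centroid is nearest to the averaged VAD):

```python
# Toy lexicon: word -> (valence, arousal, dominance); values are illustrative
ANEW = {'love': (8.72, 6.44, 7.11), 'madrid': (6.38, 5.00, 5.42)}

def sentence_vad(tokens):
    """Average the VAD scores of the tokens found in the lexicon."""
    hits = [ANEW[t] for t in tokens if t in ANEW]
    if not hits:
        return None  # the plugin labels this case as 'neutral'
    return tuple(sum(dim) / len(hits) for dim in zip(*hits))

print(sentence_vad('i love madrid'.split()))  # (7.55, 5.72, 6.265)
```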
## Installation
* Download
```
git clone https://lab.cluster.gsi.dit.upm.es/senpy/emotion-anew.git
```
* Get data
```
cd emotion-anew
git submodule update --init --recursive
```
* Run
```
docker run -p 5000:5000 -v $PWD:/plugins gsiupm/senpy:python2.7 -f /plugins
```
## Data format
`data/Corpus/affective-isear.tsv` contains data from ISEAR Databank: http://emotion-research.net/toolbox/toolboxdatabase.2006-10-13.2581092615
## Usage
Params accepted:
- Language: English (en) and Spanish (es).
- Input: input text to analyse.
Example request:
```
http://senpy.cluster.gsi.dit.upm.es/api/?algo=emotion-anew&language=en&input=I%20love%20Madrid
```
Example response: This plugin follows the standard for the senpy plugin response. For more information, please visit [senpy documentation](http://senpy.readthedocs.io). Specifically, NIF API section.
# Known issues
- To obtain Anew dictionary you can download from here: <https://github.com/hcorona/SMC2015/blob/master/resources/ANEW2010All.txt>
- This plugin only supports **Python2**
![alt GSI Logo][logoGSI]
[logoES]: https://www.gsi.dit.upm.es/ontologies/onyx/img/eurosentiment_logo.png "EuroSentiment logo"
[logoGSI]: http://www.gsi.dit.upm.es/images/stories/logos/gsi.png "GSI Logo"


@@ -0,0 +1,269 @@
# -*- coding: utf-8 -*-
import re
import nltk
import csv
import sys
import os
import unicodedata
import string
import xml.etree.ElementTree as ET
import math
from sklearn.svm import LinearSVC
from sklearn.feature_extraction import DictVectorizer
from nltk import bigrams
from nltk import trigrams
from nltk.corpus import stopwords
from pattern.en import parse as parse_en
from pattern.es import parse as parse_es
from senpy.plugins import EmotionPlugin, SenpyPlugin
from senpy.models import Results, EmotionSet, Entry, Emotion
### BEGIN WORKAROUND FOR PATTERN
# See: https://github.com/clips/pattern/issues/308
import os.path
import pattern.text
from pattern.helpers import decode_string
from codecs import BOM_UTF8
BOM_UTF8 = BOM_UTF8.decode("utf-8")
decode_utf8 = decode_string
MODEL = "emoml:pad-dimensions_"
VALENCE = f"{MODEL}_valence"
AROUSAL = f"{MODEL}_arousal"
DOMINANCE = f"{MODEL}_dominance"
def _read(path, encoding="utf-8", comment=";;;"):
"""Returns an iterator over the lines in the file at the given path,
stripping comments and decoding each line to Unicode.
"""
if path:
if isinstance(path, str) and os.path.exists(path):
# From file path.
f = open(path, "r", encoding="utf-8")
elif isinstance(path, str):
# From string.
f = path.splitlines()
else:
# From file or buffer.
f = path
for i, line in enumerate(f):
line = line.strip(BOM_UTF8) if i == 0 and isinstance(line, str) else line
line = line.strip()
line = decode_utf8(line, encoding)
if not line or (comment and line.startswith(comment)):
continue
yield line
pattern.text._read = _read
## END WORKAROUND
class ANEW(EmotionPlugin):
description = "This plugin consists on an emotion classifier using ANEW lexicon dictionary. It averages the VAD (valence-arousal-dominance) value of each word in the text that is also in the ANEW dictionary. To obtain a categorical value (e.g., happy) use the emotion conversion API (e.g., `emotion-model=emoml:big6`)."
author = "@icorcuera"
version = "0.5.2"
name = "emotion-anew"
extra_params = {
"language": {
"description": "language of the input",
"aliases": ["language", "l"],
"required": True,
"options": ["es","en"],
"default": "en"
}
}
anew_path_es = "Dictionary/Redondo(2007).csv"
anew_path_en = "Dictionary/ANEW2010All.txt"
onyx__usesEmotionModel = MODEL
nltk_resources = ['stopwords']
def activate(self, *args, **kwargs):
self._stopwords = stopwords.words('english')
dictionary={}
dictionary['es'] = {}
with self.open(self.anew_path_es,'r') as tabfile:
reader = csv.reader(tabfile, delimiter='\t')
for row in reader:
dictionary['es'][row[2]]={}
dictionary['es'][row[2]]['V']=row[3]
dictionary['es'][row[2]]['A']=row[5]
dictionary['es'][row[2]]['D']=row[7]
dictionary['en'] = {}
with self.open(self.anew_path_en,'r') as tabfile:
reader = csv.reader(tabfile, delimiter='\t')
for row in reader:
dictionary['en'][row[0]]={}
dictionary['en'][row[0]]['V']=row[2]
dictionary['en'][row[0]]['A']=row[4]
dictionary['en'][row[0]]['D']=row[6]
self._dictionary = dictionary
def _my_preprocessor(self, text):
regHttp = re.compile('(http://)[a-zA-Z0-9]*.[a-zA-Z0-9/]*(.[a-zA-Z0-9]*)?')
regHttps = re.compile('(https://)[a-zA-Z0-9]*.[a-zA-Z0-9/]*(.[a-zA-Z0-9]*)?')
regAt = re.compile('@([a-zA-Z0-9]*[*_/&%#@$]*)*[a-zA-Z0-9]*')
text = re.sub(regHttp, '', text)
text = re.sub(regAt, '', text)
text = re.sub('RT : ', '', text)
text = re.sub(regHttps, '', text)
text = re.sub('[0-9]', '', text)
text = self._delete_punctuation(text)
return text
def _delete_punctuation(self, text):
exclude = set(string.punctuation)
s = ''.join(ch for ch in text if ch not in exclude)
return s
def _extract_ngrams(self, text, lang):
unigrams_lemmas = []
unigrams_words = []
pos_tagged = []
if lang == 'es':
sentences = list(parse_es(text, lemmata=True).split())
else:
sentences = list(parse_en(text, lemmata=True).split())
for sentence in sentences:
for token in sentence:
if token[0].lower() not in self._stopwords:
unigrams_words.append(token[0].lower())
unigrams_lemmas.append(token[4])
pos_tagged.append(token[1])
return unigrams_lemmas,unigrams_words,pos_tagged
def _find_ngrams(self, input_list, n):
return zip(*[input_list[i:] for i in range(n)])
def _extract_features(self, tweet,dictionary,lang):
feature_set={}
ngrams_lemmas,ngrams_words,pos_tagged = self._extract_ngrams(tweet,lang)
pos_tags={'NN':'NN', 'NNS':'NN', 'JJ':'JJ', 'JJR':'JJ', 'JJS':'JJ', 'RB':'RB', 'RBR':'RB',
'RBS':'RB', 'VB':'VB', 'VBD':'VB', 'VGB':'VB', 'VBN':'VB', 'VBP':'VB', 'VBZ':'VB'}
totalVAD=[0,0,0]
matches=0
for word in range(len(ngrams_lemmas)):
VAD=[]
if ngrams_lemmas[word] in dictionary:
matches+=1
totalVAD = [totalVAD[0]+float(dictionary[ngrams_lemmas[word]]['V']),
totalVAD[1]+float(dictionary[ngrams_lemmas[word]]['A']),
totalVAD[2]+float(dictionary[ngrams_lemmas[word]]['D'])]
elif ngrams_words[word] in dictionary:
matches+=1
totalVAD = [totalVAD[0]+float(dictionary[ngrams_words[word]]['V']),
totalVAD[1]+float(dictionary[ngrams_words[word]]['A']),
totalVAD[2]+float(dictionary[ngrams_words[word]]['D'])]
if matches==0:
emotion='neutral'
else:
totalVAD=[totalVAD[0]/matches,totalVAD[1]/matches,totalVAD[2]/matches]
feature_set['V'] = totalVAD[0]
feature_set['A'] = totalVAD[1]
feature_set['D'] = totalVAD[2]
return feature_set
def analyse_entry(self, entry, activity):
params = activity.params
text_input = entry.text
text = self._my_preprocessor(text_input)
dictionary = self._dictionary[params['language']]
feature_set=self._extract_features(text, dictionary, params['language'])
emotions = EmotionSet()
emotions.id = "Emotions0"
emotion1 = Emotion(id="Emotion0")
emotion1[VALENCE] = feature_set['V']
emotion1[AROUSAL] = feature_set['A']
emotion1[DOMINANCE] = feature_set['D']
emotion1.prov(activity)
emotions.prov(activity)
emotions.onyx__hasEmotion.append(emotion1)
entry.emotions = [emotions, ]
yield entry
test_cases = [
{
'name': 'anger with VAD=(2.12, 6.95, 5.05)',
'input': 'I hate you',
'expected': {
'onyx:hasEmotionSet': [{
'onyx:hasEmotion': [{
AROUSAL: 6.95,
DOMINANCE: 5.05,
VALENCE: 2.12,
}]
}]
}
}, {
'input': 'i am sad',
'expected': {
'onyx:hasEmotionSet': [{
'onyx:hasEmotion': [{
f"{MODEL}_arousal": 4.13,
}]
}]
}
}, {
'name': 'joy',
'input': 'i am happy with my marks',
'expected': {
'onyx:hasEmotionSet': [{
'onyx:hasEmotion': [{
AROUSAL: 6.49,
DOMINANCE: 6.63,
VALENCE: 8.21,
}]
}]
}
}, {
'name': 'negative-feat',
'input': 'This movie is scary',
'expected': {
'onyx:hasEmotionSet': [{
'onyx:hasEmotion': [{
AROUSAL: 5.8100000000000005,
DOMINANCE: 4.33,
VALENCE: 5.050000000000001,
}]
}]
}
}, {
'name': 'negative-fear',
'input': 'this cake is disgusting' ,
'expected': {
'onyx:hasEmotionSet': [{
'onyx:hasEmotion': [{
AROUSAL: 5.09,
DOMINANCE: 4.4,
VALENCE: 5.109999999999999,
}]
}]
}
}
]


@@ -0,0 +1,12 @@
---
module: emotion-anew
optional: true
requirements:
- numpy
- pandas
- nltk
- scipy
- scikit-learn
- textblob
- pattern
- lxml


@@ -0,0 +1,179 @@
#!/usr/local/bin/python
# coding: utf-8
from future import standard_library
standard_library.install_aliases()
import os
import re
import sys
import string
import numpy as np
from six.moves import urllib
from nltk.corpus import stopwords
from senpy import EmotionBox, models
def ignore(dchars):
deletechars = "".join(dchars)
tbl = str.maketrans("", "", deletechars)
ignore = lambda s: s.translate(tbl)
return ignore
class DepecheMood(EmotionBox):
'''
Plugin that uses the DepecheMood emotion lexicon.
DepecheMood is an emotion lexicon automatically generated from news articles where users expressed their associated emotions. It contains two languages (English and Italian), as well as three types of word representations (token, lemma and lemma#PoS). For English, the lexicon contains 165k tokens, while the Italian version contains 116k. Unsupervised techniques can be applied to generate simple but effective baselines. To learn more, please visit https://github.com/marcoguerini/DepecheMood and http://www.depechemood.eu/
'''
author = 'Oscar Araque'
name = 'emotion-depechemood'
version = '0.1'
requirements = ['pandas']
optional = True
nltk_resources = ["stopwords"]
onyx__usesEmotionModel = 'wna:WNAModel'
EMOTIONS = ['wna:negative-fear',
'wna:amusement',
'wna:anger',
'wna:annoyance',
'wna:indifference',
'wna:joy',
'wna:awe',
'wna:sadness']
DM_EMOTIONS = ['AFRAID', 'AMUSED', 'ANGRY', 'ANNOYED', 'DONT_CARE', 'HAPPY', 'INSPIRED', 'SAD',]
def __init__(self, *args, **kwargs):
super(DepecheMood, self).__init__(*args, **kwargs)
self.LEXICON_URL = "https://github.com/marcoguerini/DepecheMood/raw/master/DepecheMood%2B%2B/DepecheMood_english_token_full.tsv"
self._denoise = ignore(set(string.punctuation)|set('«»'))
self._stop_words = []
self._lex_vocab = None
self._lex = None
def activate(self):
self._lex = self.download_lex()
self._lex_vocab = set(list(self._lex.keys()))
self._stop_words = stopwords.words('english') + ['']
def clean_str(self, string):
string = re.sub(r"[^A-Za-z0-9().,!?\'\`]", " ", string)
string = re.sub(r"[0-9]+", " num ", string)
string = re.sub(r"\'s", " \'s", string)
string = re.sub(r"\'ve", " \'ve", string)
string = re.sub(r"n\'t", " n\'t", string)
string = re.sub(r"\'re", " \'re", string)
string = re.sub(r"\'d", " \'d", string)
string = re.sub(r"\'ll", " \'ll", string)
string = re.sub(r"\.", " . ", string)
string = re.sub(r",", " , ", string)
string = re.sub(r"!", " ! ", string)
string = re.sub(r"\(", " ( ", string)
string = re.sub(r"\)", " ) ", string)
string = re.sub(r"\?", " ? ", string)
string = re.sub(r"\s{2,}", " ", string)
return string.strip().lower()
def preprocess(self, text):
if text is None:
return None
tokens = self._denoise(self.clean_str(text)).split(' ')
tokens = [tok for tok in tokens if tok not in self._stop_words]
return tokens
def estimate_emotion(self, tokens, emotion):
s = []
for tok in tokens:
s.append(self._lex[tok][emotion])
dividend = np.sum(s) if np.sum(s) > 0 else 0
divisor = len(s) if len(s) > 0 else 1
S = np.sum(s) / divisor
return S
def estimate_all_emotions(self, tokens):
S = []
intersection = set(tokens) & self._lex_vocab
for emotion in self.DM_EMOTIONS:
s = self.estimate_emotion(intersection, emotion)
S.append(s)
return S
def download_lex(self, file_path='DepecheMood_english_token_full.tsv', freq_threshold=10):
import pandas as pd
try:
file_path = self.find_file(file_path)
except IOError:
file_path = self.path(file_path)
filename, _ = urllib.request.urlretrieve(self.LEXICON_URL, file_path)
lexicon = pd.read_csv(file_path, sep='\t', index_col=0)
lexicon = lexicon[lexicon['freq'] >= freq_threshold]
lexicon.drop('freq', axis=1, inplace=True)
lexicon = lexicon.T.to_dict()
return lexicon
def predict_one(self, features, **kwargs):
tokens = self.preprocess(features[0])
estimation = self.estimate_all_emotions(tokens)
return estimation
test_cases = [
{
'entry': {
'nif:isString': 'My cat is very happy',
},
'expected': {
'onyx:hasEmotionSet': [
{
'onyx:hasEmotion': [
{
'onyx:hasEmotionCategory': 'wna:negative-fear',
'onyx:hasEmotionIntensity': 0.05278117640010922
},
{
'onyx:hasEmotionCategory': 'wna:amusement',
'onyx:hasEmotionIntensity': 0.2114806151413433,
},
{
'onyx:hasEmotionCategory': 'wna:anger',
'onyx:hasEmotionIntensity': 0.05726119426520887
},
{
'onyx:hasEmotionCategory': 'wna:annoyance',
'onyx:hasEmotionIntensity': 0.12295990731053638,
},
{
'onyx:hasEmotionCategory': 'wna:indifference',
'onyx:hasEmotionIntensity': 0.1860159893608025,
},
{
'onyx:hasEmotionCategory': 'wna:joy',
'onyx:hasEmotionIntensity': 0.12904050973724163,
},
{
'onyx:hasEmotionCategory': 'wna:awe',
'onyx:hasEmotionIntensity': 0.17973650399862967,
},
{
'onyx:hasEmotionCategory': 'wna:sadness',
'onyx:hasEmotionIntensity': 0.060724103786128455,
},
]
}
]
}
}
]
if __name__ == '__main__':
from senpy.utils import easy_test
easy_test(debug=False)


@@ -0,0 +1,3 @@
FROM gsiupm/senpy:python{{PYVERSION}}
MAINTAINER manuel.garcia-amado.sancho@alumnos.upm.es


@@ -0,0 +1,9 @@
NAME:=wnaffect
VERSIONFILE:=VERSION
IMAGENAME:=registry.cluster.gsi.dit.upm.es/senpy/emotion-wnaffect
PYVERSIONS:=2.7 3.5
DEVPORT:=5000
include .makefiles/base.mk
include .makefiles/k8s.mk
include .makefiles/python.mk


@@ -0,0 +1,62 @@
# WordNet-Affect plugin
This plugin uses WordNet-Affect (http://wndomains.fbk.eu/wnaffect.html) to calculate the percentage of each emotion. The plugin classifies among five different emotions: anger, fear, disgust, joy and sadness. An emotion mapping is used to enlarge the set of emotions (a sketch follows this list):
- anger : general-dislike
- fear : negative-fear
- disgust : shame
- joy : gratitude, affective, enthusiasm, love, joy, liking
- sadness : ingrattitude, daze, humility, compassion, despair, anxiety, sadness
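An illustrative sketch of how this mapping turns WNAffect category hits into the percentages mentioned above (the plugin itself normalises hit counts by the number of matches; see `emotion-wnaffect.py` below):

```python
from collections import Counter

CATEGORIES = {
    'anger': ['general-dislike'],
    'fear': ['negative-fear'],
    'disgust': ['shame'],
    'joy': ['gratitude', 'affective', 'enthusiasm', 'love', 'joy', 'liking'],
    'sadness': ['ingrattitude', 'daze', 'humility', 'compassion', 'despair',
                'anxiety', 'sadness'],
}

def percentages(wn_labels):
    """wn_labels: WNAffect categories found for the words of a sentence."""
    hits = Counter()
    for emotion, labels in CATEGORIES.items():
        hits[emotion] = sum(1 for lab in wn_labels if lab in labels)
    matches = sum(hits.values()) or 1
    return {emotion: n / matches for emotion, n in hits.items()}

print(percentages(['joy', 'liking', 'negative-fear']))
# {'anger': 0.0, 'fear': 0.33..., 'disgust': 0.0, 'joy': 0.66..., 'sadness': 0.0}
```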
## Installation
* Download
```
git clone https://lab.cluster.gsi.dit.upm.es/senpy/emotion-wnaffect.git
```
* Get data
```
cd emotion-wnaffect
git submodule update --init --recursive
```
* Run
```
docker run -p 5000:5000 -v $PWD:/plugins gsiupm/senpy -f /plugins
```
## Data format
`data/a-hierarchy.xml` is an XML file
`data/a-synsets.xml` is an XML file
## Usage
The parameters accepted are:
- Language: English (en).
- Input: Text to analyse.
Example request:
```
http://senpy.cluster.gsi.dit.upm.es/api/?algo=emotion-wnaffect&language=en&input=I%20love%20Madrid
```
Example response: This plugin follows the standard for the senpy plugin response. For more information, please visit [senpy documentation](http://senpy.readthedocs.io). Specifically, NIF API section.
The response of this plugin uses [Onyx ontology](https://www.gsi.dit.upm.es/ontologies/onyx/) developed at GSI UPM for semantic web.
This plugin uses WNAffect labels for emotion analysis.
The emotion-wnaffect.senpy file can be copied and modified to use different versions of wnaffect with the same python code.
## Known issues
- This plugin runs on **Python2.7** and **Python3.5**
- Wnaffect and corpora files are not included in the repository, but can be easily added either to the docker image (using a volume) or in a new docker image.
- You can download Wordnet 1.6 here: <http://wordnetcode.princeton.edu/1.6/wn16.unix.tar.gz> and extract the dict folder.
- The hierarchy and synsets files can be found here: <https://github.com/larsmans/wordnet-domains-sentiwords/tree/master/wn-domains/wn-affect-1.1>
![alt GSI Logo][logoGSI]
[logoGSI]: http://www.gsi.dit.upm.es/images/stories/logos/gsi.png "GSI Logo"


@@ -0,0 +1,278 @@
# -*- coding: utf-8 -*-
from __future__ import division
import re
import nltk
import os
import string
import xml.etree.ElementTree as ET
from nltk.corpus import stopwords
from nltk.corpus import WordNetCorpusReader
from nltk.stem import wordnet
from emotion import Emotion as Emo
from senpy.plugins import EmotionPlugin, AnalysisPlugin, ShelfMixin
from senpy.models import Results, EmotionSet, Entry, Emotion
class WNAffect(EmotionPlugin, ShelfMixin):
'''
Emotion classifier using WordNet-Affect to calculate the percentage
of each emotion. This plugin classifies among 6 emotions: anger, fear, disgust, joy, sadness
or neutral. The only available language is English (en)
'''
name = 'emotion-wnaffect'
author = ["@icorcuera", "@balkian"]
version = '0.2'
extra_params = {
'language': {
"@id": 'lang_wnaffect',
'description': 'language of the input',
'aliases': ['language', 'l'],
'required': True,
'options': ['en',]
}
}
synsets_path = "a-synsets.xml"
hierarchy_path = "a-hierarchy.xml"
wn16_path = "wordnet1.6/dict"
onyx__usesEmotionModel = "emoml:big6"
nltk_resources = ['stopwords', 'averaged_perceptron_tagger', 'wordnet']
def _load_synsets(self, synsets_path):
"""Returns a dictionary POS tag -> synset offset -> emotion (str -> int -> str)."""
tree = ET.parse(synsets_path)
root = tree.getroot()
pos_map = {"noun": "NN", "adj": "JJ", "verb": "VB", "adv": "RB"}
synsets = {}
for pos in ["noun", "adj", "verb", "adv"]:
tag = pos_map[pos]
synsets[tag] = {}
for elem in root.findall(
".//{0}-syn-list//{0}-syn".format(pos, pos)):
offset = int(elem.get("id")[2:])
if not offset: continue
if elem.get("categ"):
synsets[tag][offset] = Emo.emotions[elem.get(
"categ")] if elem.get(
"categ") in Emo.emotions else None
elif elem.get("noun-id"):
synsets[tag][offset] = synsets[pos_map["noun"]][int(
elem.get("noun-id")[2:])]
return synsets
def _load_emotions(self, hierarchy_path):
"""Loads the hierarchy of emotions from the WordNet-Affect xml."""
tree = ET.parse(hierarchy_path)
root = tree.getroot()
for elem in root.findall("categ"):
name = elem.get("name")
if name == "root":
Emo.emotions["root"] = Emo("root")
else:
Emo.emotions[name] = Emo(name, elem.get("isa"))
def activate(self, *args, **kwargs):
self._stopwords = stopwords.words('english')
self._wnlemma = wordnet.WordNetLemmatizer()
self._syntactics = {'N': 'n', 'V': 'v', 'J': 'a', 'S': 's', 'R': 'r'}
local_path = os.environ.get("SENPY_DATA")
self._categories = {
'anger': [
'general-dislike',
],
'fear': [
'negative-fear',
],
'disgust': [
'shame',
],
'joy':
['gratitude', 'affective', 'enthusiasm', 'love', 'joy', 'liking'],
'sadness': [
'ingrattitude', 'daze', 'humility', 'compassion', 'despair',
'anxiety', 'sadness'
]
}
self._wnaffect_mappings = {
'anger': 'anger',
'fear': 'negative-fear',
'disgust': 'disgust',
'joy': 'joy',
'sadness': 'sadness'
}
self._load_emotions(self.find_file(self.hierarchy_path))
if 'total_synsets' not in self.sh:
total_synsets = self._load_synsets(self.find_file(self.synsets_path))
self.sh['total_synsets'] = total_synsets
self._total_synsets = self.sh['total_synsets']
self._wn16_path = self.wn16_path
self._wn16 = WordNetCorpusReader(self.find_file(self._wn16_path), nltk.data.find(self.find_file(self._wn16_path)))
def deactivate(self, *args, **kwargs):
self.save(ignore_errors=True)
def _my_preprocessor(self, text):
regHttp = re.compile(
'(http://)[a-zA-Z0-9]*.[a-zA-Z0-9/]*(.[a-zA-Z0-9]*)?')
regHttps = re.compile(
'(https://)[a-zA-Z0-9]*.[a-zA-Z0-9/]*(.[a-zA-Z0-9]*)?')
regAt = re.compile('@([a-zA-Z0-9]*[*_/&%#@$]*)*[a-zA-Z0-9]*')
text = re.sub(regHttp, '', text)
text = re.sub(regAt, '', text)
text = re.sub('RT : ', '', text)
text = re.sub(regHttps, '', text)
text = re.sub('[0-9]', '', text)
text = self._delete_punctuation(text)
return text
def _delete_punctuation(self, text):
exclude = set(string.punctuation)
s = ''.join(ch for ch in text if ch not in exclude)
return s
def _extract_ngrams(self, text):
unigrams_lemmas = []
pos_tagged = []
unigrams_words = []
tokens = text.split()
for token in nltk.pos_tag(tokens):
unigrams_words.append(token[0])
pos_tagged.append(token[1])
if token[1][0] in self._syntactics.keys():
unigrams_lemmas.append(
self._wnlemma.lemmatize(token[0], self._syntactics[token[1]
[0]]))
else:
unigrams_lemmas.append(token[0])
return unigrams_words, unigrams_lemmas, pos_tagged
def _find_ngrams(self, input_list, n):
return zip(*[input_list[i:] for i in range(n)])
def _clean_pos(self, pos_tagged):
pos_tags = {
'NN': 'NN',
'NNP': 'NN',
'NNP-LOC': 'NN',
'NNS': 'NN',
'JJ': 'JJ',
'JJR': 'JJ',
'JJS': 'JJ',
'RB': 'RB',
'RBR': 'RB',
'RBS': 'RB',
'VB': 'VB',
'VBD': 'VB',
'VGB': 'VB',
'VBN': 'VB',
'VBP': 'VB',
'VBZ': 'VB'
}
for i in range(len(pos_tagged)):
if pos_tagged[i] in pos_tags:
pos_tagged[i] = pos_tags[pos_tagged[i]]
return pos_tagged
def _extract_features(self, text):
feature_set = {k: 0 for k in self._categories}
ngrams_words, ngrams_lemmas, pos_tagged = self._extract_ngrams(text)
matches = 0
pos_tagged = self._clean_pos(pos_tagged)
tag_wn = {
'NN': self._wn16.NOUN,
'JJ': self._wn16.ADJ,
'VB': self._wn16.VERB,
'RB': self._wn16.ADV
}
for i in range(len(pos_tagged)):
if pos_tagged[i] in tag_wn:
synsets = self._wn16.synsets(ngrams_words[i],
tag_wn[pos_tagged[i]])
if synsets:
offset = synsets[0].offset()
if offset in self._total_synsets[pos_tagged[i]]:
if self._total_synsets[pos_tagged[i]][offset] is None:
continue
else:
emotion = self._total_synsets[pos_tagged[i]][
offset].get_level(5).name
matches += 1
for i in self._categories:
if emotion in self._categories[i]:
feature_set[i] += 1
if matches == 0:
matches = 1
for i in feature_set:
feature_set[i] = (feature_set[i] / matches)
return feature_set
def analyse_entry(self, entry, activity):
params = activity.params
text_input = entry['nif:isString']
text = self._my_preprocessor(text_input)
feature_text = self._extract_features(text)
emotionSet = EmotionSet(id="Emotions0")
emotions = emotionSet.onyx__hasEmotion
for i in feature_text:
emotions.append(
Emotion(
onyx__hasEmotionCategory=self._wnaffect_mappings[i],
onyx__hasEmotionIntensity=feature_text[i]))
entry.emotions = [emotionSet]
yield entry
def test(self, *args, **kwargs):
results = list()
params = {'algo': 'emotion-wnaffect',
'intype': 'direct',
'expanded-jsonld': 0,
'informat': 'text',
'prefix': '',
'plugin_type': 'analysisPlugin',
'urischeme': 'RFC5147String',
'outformat': 'json-ld',
'i': 'Hello World',
'input': 'Hello World',
'conversion': 'full',
'language': 'en',
'algorithm': 'emotion-wnaffect'}
self.activate()
texts = {'I hate you': 'anger',
'i am sad': 'sadness',
'i am happy with my marks': 'joy',
'This movie is scary': 'negative-fear'}
for text in texts:
response = next(self.analyse_entry(Entry(nif__isString=text),
self.activity(params)))
expected = texts[text]
emotionSet = response.emotions[0]
max_emotion = max(emotionSet['onyx:hasEmotion'], key=lambda x: x['onyx:hasEmotionIntensity'])
assert max_emotion['onyx:hasEmotionCategory'] == expected


@@ -0,0 +1,7 @@
---
module: emotion-wnaffect
optional: true
requirements:
- nltk>=3.0.5
- lxml>=3.4.2
async: false


@@ -0,0 +1,95 @@
# -*- coding: utf-8 -*-
"""
Clement Michard (c) 2015
"""
class Emotion:
"""Defines an emotion."""
emotions = {} # name to emotion (str -> Emotion)
def __init__(self, name, parent_name=None):
"""Initializes an Emotion object.
name -- name of the emotion (str)
parent_name -- name of the parent emotion (str)
"""
self.name = name
self.parent = None
self.level = 0
self.children = []
if parent_name:
self.parent = Emotion.emotions[parent_name] if parent_name else None
self.parent.children.append(self)
self.level = self.parent.level + 1
def get_level(self, level):
"""Returns the parent of self at the given level.
level -- level in the hierarchy (int)
"""
em = self
while em.level > level and em.level >= 0:
em = em.parent
return em
def __str__(self):
"""Returns the emotion string formatted."""
return self.name
def nb_children(self):
"""Returns the number of children of the emotion."""
return sum(child.nb_children() for child in self.children) + 1
@staticmethod
def printTree(emotion=None, indent="", last='updown'):
"""Prints the hierarchy of emotions.
emotion -- root emotion (Emotion)
"""
if not emotion:
emotion = Emotion.emotions["root"]
size_branch = {child: child.nb_children() for child in emotion.children}
leaves = sorted(emotion.children, key=lambda emotion: emotion.nb_children())
up, down = [], []
if leaves:
while sum(size_branch[e] for e in down) < sum(size_branch[e] for e in leaves):
down.append(leaves.pop())
up = leaves
for leaf in up:
            next_last = 'up' if up.index(leaf) == 0 else ''
            next_indent = '{0}{1}{2}'.format(indent, ' ' if 'up' in last else '│', " " * len(emotion.name))
            Emotion.printTree(leaf, indent=next_indent, last=next_last)
        if last == 'up':
            start_shape = '┌'
        elif last == 'down':
            start_shape = '└'
        elif last == 'updown':
            start_shape = ' '
        else:
            start_shape = '├'
        if up:
            end_shape = '┤'
        elif down:
            end_shape = '┐'
        else:
            end_shape = ''
        print('{0}{1}{2}{3}'.format(indent, start_shape, emotion.name, end_shape))
        for leaf in down:
            next_last = 'down' if down.index(leaf) == len(down) - 1 else ''
            next_indent = '{0}{1}{2}'.format(indent, ' ' if 'down' in last else '│', " " * len(emotion.name))
            Emotion.printTree(leaf, indent=next_indent, last=next_last)
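A small usage sketch for the class above (the emotion names here are hypothetical; WordNet-Affect loads the real hierarchy from `a-hierarchy.xml`):

```python
# Build a tiny hierarchy; each instance must be registered in Emotion.emotions
root = Emotion('root')
Emotion.emotions['root'] = root
negative = Emotion('negative', 'root')
Emotion.emotions['negative'] = negative
fear = Emotion('fear', 'negative')
Emotion.emotions['fear'] = fear

print(fear.level)         # 2
print(fear.get_level(1))  # negative
Emotion.printTree()       # renders the whole hierarchy
```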


@@ -0,0 +1,94 @@
# coding: utf-8
# -*- coding: utf-8 -*-
"""
Clement Michard (c) 2015
"""
import os
import sys
import nltk
from emotion import Emotion
from nltk.corpus import WordNetCorpusReader
import xml.etree.ElementTree as ET
class WNAffect:
"""WordNet-Affect resource."""
nltk_resources = ['averaged_perceptron_tagger']
def __init__(self, wordnet16_dir, wn_domains_dir):
"""Initializes the WordNet-Affect object."""
cwd = os.getcwd()
nltk.data.path.append(cwd)
wn16_path = "{0}/dict".format(wordnet16_dir)
self.wn16 = WordNetCorpusReader(os.path.abspath("{0}/{1}".format(cwd, wn16_path)), nltk.data.find(wn16_path))
self.flat_pos = {'NN':'NN', 'NNS':'NN', 'JJ':'JJ', 'JJR':'JJ', 'JJS':'JJ', 'RB':'RB', 'RBR':'RB', 'RBS':'RB', 'VB':'VB', 'VBD':'VB', 'VGB':'VB', 'VBN':'VB', 'VBP':'VB', 'VBZ':'VB'}
self.wn_pos = {'NN':self.wn16.NOUN, 'JJ':self.wn16.ADJ, 'VB':self.wn16.VERB, 'RB':self.wn16.ADV}
self._load_emotions(wn_domains_dir)
self.synsets = self._load_synsets(wn_domains_dir)
def _load_synsets(self, wn_domains_dir):
"""Returns a dictionary POS tag -> synset offset -> emotion (str -> int -> str)."""
tree = ET.parse("{0}/a-synsets.xml".format(wn_domains_dir))
root = tree.getroot()
pos_map = { "noun": "NN", "adj": "JJ", "verb": "VB", "adv": "RB" }
synsets = {}
for pos in ["noun", "adj", "verb", "adv"]:
tag = pos_map[pos]
synsets[tag] = {}
for elem in root.findall(".//{0}-syn-list//{0}-syn".format(pos, pos)):
offset = int(elem.get("id")[2:])
if not offset: continue
if elem.get("categ"):
synsets[tag][offset] = Emotion.emotions[elem.get("categ")] if elem.get("categ") in Emotion.emotions else None
elif elem.get("noun-id"):
synsets[tag][offset] = synsets[pos_map["noun"]][int(elem.get("noun-id")[2:])]
return synsets
def _load_emotions(self, wn_domains_dir):
"""Loads the hierarchy of emotions from the WordNet-Affect xml."""
tree = ET.parse("{0}/a-hierarchy.xml".format(wn_domains_dir))
root = tree.getroot()
for elem in root.findall("categ"):
name = elem.get("name")
if name == "root":
Emotion.emotions["root"] = Emotion("root")
else:
Emotion.emotions[name] = Emotion(name, elem.get("isa"))
def get_emotion(self, word, pos):
"""Returns the emotion of the word.
word -- the word (str)
pos -- part-of-speech (str)
"""
if pos in self.flat_pos:
pos = self.flat_pos[pos]
synsets = self.wn16.synsets(word, self.wn_pos[pos])
if synsets:
offset = synsets[0].offset()
if offset in self.synsets[pos]:
return self.synsets[pos][offset]
return None
if __name__ == "__main__":
wordnet16, wndomains32, word, pos = sys.argv[1:5]
wna = WNAffect(wordnet16, wndomains32)
print(wna.get_emotion(word, pos))
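
Beyond the command-line entry point above, the class can be used directly. A minimal sketch, assuming local copies of the two required corpora at the hypothetical paths shown (and the NLTK tagger data installed):

```python
# Hypothetical resource paths; WordNet 1.6 and WordNet-Domains must be downloaded separately.
wna = WNAffect('wordnet-1.6/', 'wn-domains-3.2/')
emotion = wna.get_emotion('angry', 'JJ')  # look up the adjective "angry"
if emotion is not None:
    print(emotion, emotion.get_level(5))  # the category and its level-5 ancestor
```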

View File

@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy.plugins import Transformation
from senpy.models import Entry
from nltk.tokenize.punkt import PunktSentenceTokenizer

View File

@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy.plugins import EmotionConversionPlugin
from senpy.models import EmotionSet, Emotion, Error
@@ -85,7 +101,13 @@ class CentroidConversion(EmotionConversionPlugin):
 def distance(centroid):
     return sum(distance_k(centroid, original, k) for k in dimensions)
-emotion = min(centroids, key=lambda x: distance(centroids[x]))
+distances = {k: distance(centroids[k]) for k in centroids}
+logger.debug('Converting %s', original)
+logger.debug('Centroids: %s', centroids)
+logger.debug('Distances: %s', distances)
+emotion = min(distances, key=lambda x: distances[x])
 result = Emotion(onyx__hasEmotionCategory=emotion)
 result.onyx__algorithmConfidence = distance(centroids[emotion])

View File

@@ -9,30 +9,30 @@ centroids:
 anger:
   A: 6.95
   D: 5.1
-  V: 2.7
+  P: 2.7
 disgust:
   A: 5.3
   D: 8.05
-  V: 2.7
+  P: 2.7
 fear:
   A: 6.5
   D: 3.6
-  V: 3.2
+  P: 3.2
 happiness:
   A: 7.22
   D: 6.28
-  V: 8.6
+  P: 8.6
 sadness:
   A: 5.21
   D: 2.82
-  V: 2.21
+  P: 2.21
 centroids_direction:
   - emoml:big6
-  - emoml:pad
+  - emoml:pad-dimensions
 aliases: # These are aliases for any key in the centroid, to avoid repeating a long name several times
-  A: emoml:pad-dimensions:arousal
-  V: emoml:pad-dimensions:valence
-  D: emoml:pad-dimensions:dominance
+  P: emoml:pad-dimensions_pleasure
+  A: emoml:pad-dimensions_arousal
+  D: emoml:pad-dimensions_dominance
 anger: emoml:big6anger
 disgust: emoml:big6disgust
 fear: emoml:big6fear
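
The conversion in the previous file reduces to a nearest-centroid search over these values. A self-contained sketch of that computation, using the PAD centroids from this file and a plain squared distance (the plugin's actual distance_k may differ in detail):

```python
# PAD centroids copied from the YAML above (A: arousal, D: dominance, P: pleasure).
centroids = {
    'anger':     {'A': 6.95, 'D': 5.1,  'P': 2.7},
    'disgust':   {'A': 5.3,  'D': 8.05, 'P': 2.7},
    'fear':      {'A': 6.5,  'D': 3.6,  'P': 3.2},
    'happiness': {'A': 7.22, 'D': 6.28, 'P': 8.6},
    'sadness':   {'A': 5.21, 'D': 2.82, 'P': 2.21},
}

def nearest_centroid(point):
    """Return the big6 category whose centroid is closest to a PAD point."""
    distances = {name: sum((c[k] - point[k]) ** 2 for k in point)
                 for name, c in centroids.items()}
    return min(distances, key=distances.get)

print(nearest_centroid({'A': 7.0, 'D': 6.0, 'P': 8.0}))  # -> happiness
```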

View File

@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from senpy import PostProcessing, easy_test

View File

@@ -0,0 +1,28 @@
# Sentiment basic plugin
This plugin is based on the classifier developed for the TASS 2015 competition. It has been developed for Spanish and English. This is a demo plugin that uses only some features from the TASS 2015 classifier. To use the fully functional classifier, you can use the service at: http://senpy.cluster.gsi.dit.upm.es
More information is available in:
- Aspect based Sentiment Analysis of Spanish Tweets, Oscar Araque and Ignacio Corcuera-Platas and Constantino Román-Gómez and Carlos A. Iglesias and J. Fernando Sánchez-Rada. http://gsi.dit.upm.es/es/investigacion/publicaciones?view=publication&task=show&id=376
## Usage
Params accepted:
- Language: Spanish (es).
- Input: text to analyse.
Example request:
```
http://senpy.cluster.gsi.dit.upm.es/api/?algo=sentiment-basic&language=es&input=I%20love%20Madrid
```
Example response: this plugin follows the standard senpy plugin response format. For more information, please visit the [senpy documentation](http://senpy.readthedocs.io), specifically the NIF API section.
This plugin only supports **python2**
![alt GSI Logo][logoGSI]
[logoGSI]: http://www.gsi.dit.upm.es/images/stories/logos/gsi.png "GSI Logo"
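
For programmatic access, the same request can be issued with any HTTP client. A sketch using Python's requests library against the demo endpoint mentioned above (whose availability is not guaranteed):

```python
import requests

# Demo endpoint from this README; availability is not guaranteed.
resp = requests.get('http://senpy.cluster.gsi.dit.upm.es/api/',
                    params={'algo': 'sentiment-basic',
                            'language': 'es',
                            'input': 'Odio ir al cine'})
print(resp.json())  # JSON-LD results, including the marl:hasOpinion entries
```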

View File

@@ -0,0 +1,177 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import sys
import string
import nltk
import pickle
from sentiwn import SentiWordNet
from nltk.corpus import wordnet as wn
from textblob import TextBlob
from scipy.interpolate import interp1d
from os import path
from senpy.plugins import SentimentBox, SenpyPlugin
from senpy.models import Results, Entry, Sentiment, Error
if sys.version_info[0] >= 3:
unicode = str
class SentimentBasic(SentimentBox):
'''
Sentiment classifier using rule-based classification for Spanish. Based on translation to English and SentiWordNet sentiment knowledge. This is a demo plugin that uses only some features from the TASS 2015 classifier. To use the fully functional classifier, you can use the service at: http://senpy.cluster.gsi.dit.upm.es.
'''
name = "sentiment-basic"
author = "github.com/nachtkatze"
version = "0.1.1"
extra_params = {
"language": {
"description": "language of the text",
"aliases": ["language", "l"],
"required": True,
"options": ["en","es", "it", "fr"],
"default": "en"
}
}
sentiword_path = "SentiWordNet_3.0.txt"
pos_path = "unigram_spanish.pickle"
maxPolarityValue = 1
minPolarityValue = -1
nltk_resources = ['punkt','wordnet', 'omw', 'omw-1.4']
with_polarity = False
def _load_swn(self):
self.swn_path = self.find_file(self.sentiword_path)
swn = SentiWordNet(self.swn_path)
return swn
def _load_pos_tagger(self):
self.pos_path = self.find_file(self.pos_path)
with open(self.pos_path, 'rb') as f:
tagger = pickle.load(f)
return tagger
def activate(self, *args, **kwargs):
self._swn = self._load_swn()
self._pos_tagger = self._load_pos_tagger()
def _remove_punctuation(self, tokens):
return [t for t in tokens if t not in string.punctuation]
def _tokenize(self, text):
sentence_ = {}
words = nltk.word_tokenize(text)
sentence_['sentence'] = text
tokens_ = [w.lower() for w in words]
sentence_['tokens'] = self._remove_punctuation(tokens_)
return sentence_
def _pos(self, tokens):
tokens['tokens'] = self._pos_tagger.tag(tokens['tokens'])
return tokens
def _compare_synsets(self, synsets, tokens):
for synset in synsets:
for word, lemmas in tokens['lemmas'].items():
for lemma in lemmas:
synset_ = lemma.synset()
if synset == synset_:
return synset
return None
def predict_one(self, features, activity):
language = activity.param("language")
text = features[0]
tokens = self._tokenize(text)
tokens = self._pos(tokens)
sufixes = {'es':'spa','en':'eng','it':'ita','fr':'fra'}
tokens['lemmas'] = {}
for w in tokens['tokens']:
lemmas = wn.lemmas(w[0], lang=sufixes[language])
if len(lemmas) == 0:
continue
tokens['lemmas'][w[0]] = lemmas
if language == "en":
trans = TextBlob(unicode(text))
else:
try:
trans = TextBlob(unicode(text)).translate(from_lang=language,to='en')
except Exception as ex:
raise Error('Could not translate the text from "{}" to "{}": {}'.format(language,
'en',
str(ex)))
useful_synsets = {}
for w_i, t_w in enumerate(trans.sentences[0].words):
synsets = wn.synsets(trans.sentences[0].words[w_i])
if len(synsets) == 0:
continue
eq_synset = self._compare_synsets(synsets, tokens)
useful_synsets[t_w] = eq_synset
scores = {}
if useful_synsets is not None:
for word in useful_synsets:
if useful_synsets[word] is None:
continue
temp_scores = self._swn.get_score(useful_synsets[word].name().split('.')[0].replace(' ',' '))
for score in temp_scores:
if score['synset'] == useful_synsets[word]:
t_score = score['pos'] - score['neg']
f_score = 'neu'
if t_score > 0:
f_score = 'pos'
elif t_score < 0:
f_score = 'neg'
score['score'] = f_score
scores[word] = score
break
g_score = 0.5
for i in scores:
n_pos = 0.0
n_neg = 0.0
for w in scores:
if scores[w]['score'] == 'pos':
n_pos += 1.0
elif scores[w]['score'] == 'neg':
n_neg += 1.0
inter = interp1d([-1.0, 1.0], [0.0, 1.0])
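# interp1d([-1.0, 1.0], [0.0, 1.0]) builds a linear map from the signed ratio
# (n_pos - n_neg) / (n_pos + n_neg) onto [0, 1], so 0.5 marks a balanced mix
# of positive and negative lemma scores.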
try:
g_score = (n_pos - n_neg) / (n_pos + n_neg)
g_score = float(inter(g_score))
except ZeroDivisionError:
if n_pos == 0 and n_neg == 0:
g_score = 0.5
if g_score > 0.5: # Positive
return [1, 0, 0]
elif g_score < 0.5: # Negative
return [0, 0, 1]
else:
return [0, 1, 0]
test_cases = [
{
'input': 'Odio ir al cine',
'params': {'language': 'es'},
'polarity': 'marl:Negative'
},
{
'input': 'El cielo está nublado',
'params': {'language': 'es'},
'polarity': 'marl:Neutral'
},
{
'input': 'Esta tarta está muy buena',
'params': {'language': 'es'},
'polarity': 'marl:Negative' # SURPRISINGLY!
}
]

View File

@@ -0,0 +1,8 @@
---
module: sentiment-basic
optional: true
requirements:
- nltk>=3.0.5
- scipy>=0.14.0
- textblob

View File

@@ -0,0 +1,70 @@
#!/usr/bin/env python
"""
Author : Jaganadh Gopinadhan <jaganadhg@gmail.com>
Copyright (C) : Jaganadh Gopinadhan
Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
import sys,os
import re
from nltk.corpus import wordnet
class SentiWordNet(object):
"""
Interface to SentiWordNet
"""
def __init__(self,swn_file):
"""
"""
self.swn_file = swn_file
self.pos_synset = self.__parse_swn_file()
def __parse_swn_file(self):
"""
Parse the SentiWordNet file and populate the POS and SynsetID hash
"""
pos_synset_hash = {}
swn_data = open(self.swn_file,'r').readlines()
head_less_swn_data = filter((lambda line: not re.search(r"^\s*#",\
line)), swn_data)
for data in head_less_swn_data:
fields = data.strip().split("\t")
try:
pos,syn_set_id,pos_score,neg_score,syn_set_score,\
gloss = fields
except ValueError:
print("Found data without all details")
continue
if pos and syn_set_score:
pos_synset_hash[(pos,int(syn_set_id))] = (float(pos_score),\
float(neg_score))
return pos_synset_hash
def get_score(self,word,pos=None):
"""
Get score for a given word/word pos combination
"""
senti_scores = []
synsets = wordnet.synsets(word,pos)
for synset in synsets:
if (synset.pos(), synset.offset()) in self.pos_synset:
pos_val, neg_val = self.pos_synset[(synset.pos(), synset.offset())]
senti_scores.append({"pos":pos_val,"neg":neg_val,\
"obj": 1.0 - (pos_val - neg_val),'synset':synset})
return senti_scores
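
A minimal usage sketch for this class, assuming a local copy of the SentiWordNet 3.0 data file at the hypothetical path shown and the NLTK WordNet corpus installed:

```python
# Hypothetical path to the SentiWordNet 3.0 data file.
swn = SentiWordNet('SentiWordNet_3.0.txt')
for score in swn.get_score('happy'):
    # each entry carries pos/neg/obj values plus the matching WordNet synset
    print(score['synset'], score['pos'], score['neg'])
```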

View File

@@ -0,0 +1,280 @@
# -*- coding: utf-8 -*-
'''
The MeaningCloud plugin uses the MeaningCloud API to perform sentiment analysis.
For more information about MeaningCloud and its services, please visit: https://www.meaningcloud.com/developer/apis
## Usage
To use this plugin, you need to obtain an API key from MeaningCloud by signing up here: https://www.meaningcloud.com/developer/login
Once you have obtained your MeaningCloud API key, provide it to the plugin using the **apiKey** param.
To use this plugin, send a GET request with the following params:
Params:
- Language: English (en) and Spanish (es). (default: en)
- API Key: the API key from Meaning Cloud. Aliases: ["apiKey","meaningCloud-key"]. (required)
- Input: text to analyse.(required)
- Model: model provided to Meaning Cloud API (for general domain). (default: general)
## Example of Usage
Example request:
```
http://senpy.gsi.upm.es/api/?algo=meaningCloud&language=en&apiKey=<put here your API key>&input=I%20love%20Madrid
```
'''
import time
import requests
import json
import string
import os
from os import path
from senpy.plugins import SentimentPlugin
from senpy.models import Results, Entry, Entity, Topic, Sentiment, Error
from senpy.utils import check_template
class MeaningCloudPlugin(SentimentPlugin):
'''
Sentiment analysis with the MeaningCloud service.
To use this plugin, you need to obtain an API key from MeaningCloud by signing up here:
https://www.meaningcloud.com/developer/login
Once you have obtained your MeaningCloud API key, provide it to the plugin using the apiKey param.
Example request:
http://senpy.cluster.gsi.dit.upm.es/api/?algo=meaningCloud&language=en&apiKey=YOUR_API_KEY&input=I%20love%20Madrid.
'''
name = 'sentiment-meaningcloud'
author = 'GSI UPM'
version = "1.1"
maxPolarityValue = 1
minPolarityValue = -1
extra_params = {
"language": {
"description": "language of the input",
"aliases": ["language", "l"],
"required": True,
"options": ["en","es","ca","it","pt","fr","auto"],
"default": "auto"
},
"apikey":{
"description": "API key for the meaningcloud service. See https://www.meaningcloud.com/developer/login",
"aliases": ["apiKey", "meaningcloud-key", "meaningcloud-apikey"],
"required": True
}
}
def _polarity(self, value):
polarity = 'marl:Neutral'
polarityValue = 0
if not value or 'NONE' in value or 'NEU' in value:
return polarity, polarityValue
if 'N' in value:
polarity = 'marl:Negative'
polarityValue = -1
elif 'P' in value:
polarity = 'marl:Positive'
polarityValue = 1
return polarity, polarityValue
def analyse_entry(self, entry, activity):
params = activity.params
txt = entry['nif:isString']
api = 'http://api.meaningcloud.com/'
lang = params.get("language")
model = "general"
key = params["apikey"]
parameters = {
'key': key,
'model': model,
'lang': lang,
'of': 'json',
'txt': txt,
'tt': 'a'
}
try:
r = requests.post(
api + "sentiment-2.1", params=parameters, timeout=3)
parameters['lang'] = r.json()['model'].split('_')[1]
lang = parameters['lang']
r2 = requests.post(
api + "topics-2.0", params=parameters, timeout=3)
except requests.exceptions.Timeout:
raise Error("Meaning Cloud API does not response")
api_response = r.json()
api_response_topics = r2.json()
if not api_response.get('score_tag'):
raise Error(r.json())
entry['language_detected'] = lang
self.log.debug(api_response)
agg_polarity, agg_polarityValue = self._polarity(
api_response.get('score_tag', None))
agg_opinion = Sentiment(
id="Opinion0",
marl__hasPolarity=agg_polarity,
marl__polarityValue=agg_polarityValue,
marl__opinionCount=len(api_response['sentence_list']))
agg_opinion.prov(self)
entry.sentiments.append(agg_opinion)
self.log.debug(api_response['sentence_list'])
count = 1
for sentence in api_response['sentence_list']:
for nopinion in sentence['segment_list']:
self.log.debug(nopinion)
polarity, polarityValue = self._polarity(
nopinion.get('score_tag', None))
opinion = Sentiment(
id="Opinion{}".format(count),
marl__hasPolarity=polarity,
marl__polarityValue=polarityValue,
marl__aggregatesOpinion=agg_opinion.get('id'),
nif__anchorOf=nopinion.get('text', None),
nif__beginIndex=int(nopinion.get('inip', None)),
nif__endIndex=int(nopinion.get('endp', None)))
count += 1
opinion.prov(self)
entry.sentiments.append(opinion)
mapper = {'es': 'es.', 'en': '', 'ca': 'es.', 'it':'it.', 'fr':'fr.', 'pt':'pt.'}
for sent_entity in api_response_topics['entity_list']:
resource = "_".join(sent_entity.get('form', None).split())
entity = Entity(
id="Entity{}".format(sent_entity.get('id')),
itsrdf__taIdentRef="http://{}dbpedia.org/resource/{}".format(
mapper[lang], resource),
nif__anchorOf=sent_entity.get('form', None),
nif__beginIndex=int(sent_entity['variant_list'][0].get('inip', None)),
nif__endIndex=int(sent_entity['variant_list'][0].get('endp', None)))
sementity = sent_entity['sementity'].get('type', None).split(">")[-1]
entity['@type'] = "ODENTITY_{}".format(sementity)
entity.prov(self)
if 'senpy:hasEntity' not in entry:
entry['senpy:hasEntity'] = []
entry['senpy:hasEntity'].append(entity)
for topic in api_response_topics['concept_list']:
if 'semtheme_list' in topic:
for theme in topic['semtheme_list']:
concept = Topic()
concept.id = "Topic{}".format(topic.get('id'))
concept['@type'] = "ODTHEME_{}".format(theme['type'].split(">")[-1])
concept['fam:topic-reference'] = "http://dbpedia.org/resource/{}".format(theme['type'].split('>')[-1])
entry.prov(self)
if 'senpy:hasTopic' not in entry:
entry['senpy:hasTopic'] = []
entry['senpy:hasTopic'].append(concept)
yield entry
test_cases = [
{
'params': {
'algo': 'sentiment-meaningCloud',
'intype': 'direct',
'expanded-jsonld': 0,
'informat': 'text',
'prefix': '',
'plugin_type': 'analysisPlugin',
'urischeme': 'RFC5147String',
'outformat': 'json-ld',
'conversion': 'full',
'language': 'en',
'apikey': '00000',
'algorithm': 'sentiment-meaningCloud'
},
'input': 'Hello World Obama',
'expected': {
'marl:hasOpinion': [
{'marl:hasPolarity': 'marl:Neutral'}],
'senpy:hasEntity': [
{'itsrdf:taIdentRef': 'http://dbpedia.org/resource/Obama'}],
'senpy:hasTopic': [
{'fam:topic-reference': 'http://dbpedia.org/resource/Astronomy'}]
},
'responses': [
{
'url': 'http://api.meaningcloud.com/sentiment-2.1',
'method': 'POST',
'json': {
'model': 'general_en',
'sentence_list': [{
'text':
'Hello World',
'endp':
'10',
'inip':
'0',
'segment_list': [{
'text':
'Hello World',
'segment_type':
'secondary',
'confidence':
'100',
'inip':
'0',
'agreement':
'AGREEMENT',
'endp':
'10',
'polarity_term_list': [],
'score_tag':
'NONE'
}],
'score_tag':
'NONE',
}],
'score_tag':
'NONE'
}
}, {
'url': 'http://api.meaningcloud.com/topics-2.0',
'method': 'POST',
'json': {
'entity_list': [{
'form':
'Obama',
'id':
'__1265958475430276310',
'variant_list': [{
'endp': '16',
'form': 'Obama',
'inip': '12'
}],
'sementity': {
'fiction': 'nonfiction',
'confidence': 'uncertain',
'class': 'instance',
'type': 'Top>Person'
}
}],
'concept_list': [{
'form':
'world',
'id':
'5c053cd39d',
'relevance':
'100',
'semtheme_list': [{
'id': 'ODTHEME_ASTRONOMY',
'type': 'Top>NaturalSciences>Astronomy'
}]
}],
}
}]
}
]
if __name__ == '__main__':
from senpy import easy_test
easy_test()
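
As the docstring notes, the plugin expects the MeaningCloud key at request time. A sketch of a call through a running senpy instance (the local URL and the key are placeholders):

```python
import requests

# Placeholder server URL and API key; adjust to your deployment.
resp = requests.get('http://localhost:5000/api/',
                    params={'algo': 'sentiment-meaningcloud',
                            'language': 'en',
                            'apikey': 'YOUR_API_KEY',
                            'input': 'I love Madrid'})
print(resp.json())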

View File

@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import requests
import json

View File

@@ -0,0 +1,42 @@
# Sentiment-vader plugin
Vader is a plugin developed at GSI UPM for sentiment analysis.
The response of this plugin uses [Marl ontology](https://www.gsi.dit.upm.es/ontologies/marl/) developed at GSI UPM for semantic web.
## Acknowledgements
This plugin uses the vaderSentiment module underneath, which is described in the paper:
VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text
C.J. Hutto and Eric Gilbert
Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
If you use this plugin in your research, please cite the above paper.
For more information about the functionality, check the official repository
https://github.com/cjhutto/vaderSentiment
## Usage
Parameters:
- Language: es (Spanish), en (English).
- Input: Text to analyse.
Example request:
```
http://senpy.cluster.gsi.dit.upm.es/api/?algo=sentiment-vader&language=en&input=I%20love%20Madrid
```
Example response: this plugin follows the standard senpy plugin response format. For more information, please visit the [senpy documentation](http://senpy.readthedocs.io), specifically the NIF API section.
This plugin supports **python3**
![alt GSI Logo][logoGSI]
[logoGSI]: http://www.gsi.dit.upm.es/images/stories/logos/gsi.png "GSI Logo"

View File

@@ -0,0 +1,368 @@
#!/usr/bin/python
# coding: utf-8
'''
Created on July 04, 2013
@author: C.J. Hutto
Citation Information
If you use any of the VADER sentiment analysis tools
(VADER sentiment lexicon or Python code for rule-based sentiment
analysis engine) in your work or research, please cite the paper.
For example:
Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for
Sentiment Analysis of Social Media Text. Eighth International Conference on
Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
'''
import os, math, re, sys, fnmatch, string
import codecs
def make_lex_dict(f):
maps = {}
with codecs.open(f, encoding='iso-8859-1') as f:
for wmsr in f:
w, m = wmsr.strip().split('\t')[:2]
maps[w] = m
return maps
f = 'vader_sentiment_lexicon.txt' # empirically derived valence ratings for words, emoticons, slang, swear words, acronyms/initialisms
try:
word_valence_dict = make_lex_dict(f)
except IOError:
f = os.path.join(os.path.dirname(__file__),'vader_sentiment_lexicon.txt')
word_valence_dict = make_lex_dict(f)
# for removing punctuation
regex_remove_punctuation = re.compile('[%s]' % re.escape(string.punctuation))
def sentiment(text):
"""
Returns a float for sentiment strength based on the input text.
Positive values are positive valence, negative values are negative valence.
"""
wordsAndEmoticons = str(text).split() #doesn't separate words from adjacent punctuation (keeps emoticons & contractions)
text_mod = regex_remove_punctuation.sub('', text) # removes punctuation (but loses emoticons & contractions)
wordsOnly = str(text_mod).split()
# get rid of empty items or single letter "words" like 'a' and 'I' from wordsOnly
wordsOnly = [word for word in wordsOnly if len(word) > 1]
# now remove adjacent & redundant punctuation from [wordsAndEmoticons] while keeping emoticons and contractions
puncList = [".", "!", "?", ",", ";", ":", "-", "'", "\"",
"!!", "!!!", "??", "???", "?!?", "!?!", "?!?!", "!?!?"]
for word in wordsOnly:
for p in puncList:
pword = p + word
x1 = wordsAndEmoticons.count(pword)
while x1 > 0:
i = wordsAndEmoticons.index(pword)
wordsAndEmoticons.remove(pword)
wordsAndEmoticons.insert(i, word)
x1 = wordsAndEmoticons.count(pword)
wordp = word + p
x2 = wordsAndEmoticons.count(wordp)
while x2 > 0:
i = wordsAndEmoticons.index(wordp)
wordsAndEmoticons.remove(wordp)
wordsAndEmoticons.insert(i, word)
x2 = wordsAndEmoticons.count(wordp)
# get rid of residual empty items or single letter "words" like 'a' and 'I' from wordsAndEmoticons
wordsAndEmoticons = [word for word in wordsAndEmoticons if len(word) > 1]
# remove stopwords from [wordsAndEmoticons]
#stopwords = [str(word).strip() for word in open('stopwords.txt')]
#for word in wordsAndEmoticons:
# if word in stopwords:
# wordsAndEmoticons.remove(word)
# check for negation
negate = ["aint", "arent", "cannot", "cant", "couldnt", "darent", "didnt", "doesnt",
"ain't", "aren't", "can't", "couldn't", "daren't", "didn't", "doesn't",
"dont", "hadnt", "hasnt", "havent", "isnt", "mightnt", "mustnt", "neither",
"don't", "hadn't", "hasn't", "haven't", "isn't", "mightn't", "mustn't",
"neednt", "needn't", "never", "none", "nope", "nor", "not", "nothing", "nowhere",
"oughtnt", "shant", "shouldnt", "uhuh", "wasnt", "werent",
"oughtn't", "shan't", "shouldn't", "uh-uh", "wasn't", "weren't",
"without", "wont", "wouldnt", "won't", "wouldn't", "rarely", "seldom", "despite"]
def negated(words, nWords=None, includeNT=True):
nWords = list(nWords or []) + negate
for word in nWords:
if word in words:
return True
if includeNT:
for word in words:
if "n't" in word:
return True
if "least" in words:
i = words.index("least")
if i > 0 and words[i-1] != "at":
return True
return False
def normalize(score, alpha=15):
# normalize the score to be between -1 and 1 using an alpha that approximates the max expected value
normScore = score/math.sqrt( ((score*score) + alpha) )
return normScore
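# e.g. with the default alpha=15: normalize(2) ~= 0.46, normalize(5) ~= 0.79,
# normalize(15) ~= 0.97 -- the result saturates towards +/-1 as |score| grows.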
def wildCardMatch(patternWithWildcard, listOfStringsToMatchAgainst):
listOfMatches = fnmatch.filter(listOfStringsToMatchAgainst, patternWithWildcard)
return listOfMatches
def isALLCAP_differential(wordList):
countALLCAPS= 0
for w in wordList:
if str(w).isupper():
countALLCAPS += 1
cap_differential = len(wordList) - countALLCAPS
if cap_differential > 0 and cap_differential < len(wordList):
isDiff = True
else: isDiff = False
return isDiff
isCap_diff = isALLCAP_differential(wordsAndEmoticons)
b_incr = 0.293 #(empirically derived mean sentiment intensity rating increase for booster words)
b_decr = -0.293
# booster/dampener 'intensifiers' or 'degree adverbs' http://en.wiktionary.org/wiki/Category:English_degree_adverbs
booster_dict = {"absolutely": b_incr, "amazingly": b_incr, "awfully": b_incr, "completely": b_incr, "considerably": b_incr,
"decidedly": b_incr, "deeply": b_incr, "effing": b_incr, "enormously": b_incr,
"entirely": b_incr, "especially": b_incr, "exceptionally": b_incr, "extremely": b_incr,
"fabulously": b_incr, "flipping": b_incr, "flippin": b_incr,
"fricking": b_incr, "frickin": b_incr, "frigging": b_incr, "friggin": b_incr, "fully": b_incr, "fucking": b_incr,
"greatly": b_incr, "hella": b_incr, "highly": b_incr, "hugely": b_incr, "incredibly": b_incr,
"intensely": b_incr, "majorly": b_incr, "more": b_incr, "most": b_incr, "particularly": b_incr,
"purely": b_incr, "quite": b_incr, "really": b_incr, "remarkably": b_incr,
"so": b_incr, "substantially": b_incr,
"thoroughly": b_incr, "totally": b_incr, "tremendously": b_incr,
"uber": b_incr, "unbelievably": b_incr, "unusually": b_incr, "utterly": b_incr,
"very": b_incr,
"almost": b_decr, "barely": b_decr, "hardly": b_decr, "just enough": b_decr,
"kind of": b_decr, "kinda": b_decr, "kindof": b_decr, "kind-of": b_decr,
"less": b_decr, "little": b_decr, "marginally": b_decr, "occasionally": b_decr, "partly": b_decr,
"scarcely": b_decr, "slightly": b_decr, "somewhat": b_decr,
"sort of": b_decr, "sorta": b_decr, "sortof": b_decr, "sort-of": b_decr}
sentiments = []
for item in wordsAndEmoticons:
v = 0
i = wordsAndEmoticons.index(item)
if (i < len(wordsAndEmoticons)-1 and str(item).lower() == "kind" and \
str(wordsAndEmoticons[i+1]).lower() == "of") or str(item).lower() in booster_dict:
sentiments.append(v)
continue
item_lowercase = str(item).lower()
if item_lowercase in word_valence_dict:
#get the sentiment valence
v = float(word_valence_dict[item_lowercase])
#check if sentiment laden word is in ALLCAPS (while others aren't)
c_incr = 0.733 #(empirically derived mean sentiment intensity rating increase for using ALLCAPs to emphasize a word)
if str(item).isupper() and isCap_diff:
if v > 0: v += c_incr
else: v -= c_incr
#check if the preceding words increase, decrease, or negate/nullify the valence
def scalar_inc_dec(word, valence):
scalar = 0.0
word_lower = str(word).lower()
if word_lower in booster_dict:
scalar = booster_dict[word_lower]
if valence < 0: scalar *= -1
#check if booster/dampener word is in ALLCAPS (while others aren't)
if str(word).isupper() and isCap_diff:
if valence > 0: scalar += c_incr
else: scalar -= c_incr
return scalar
n_scalar = -0.74
if i > 0 and str(wordsAndEmoticons[i-1]).lower() not in word_valence_dict:
s1 = scalar_inc_dec(wordsAndEmoticons[i-1], v)
v = v+s1
if negated([wordsAndEmoticons[i-1]]): v = v*n_scalar
if i > 1 and str(wordsAndEmoticons[i-2]).lower() not in word_valence_dict:
s2 = scalar_inc_dec(wordsAndEmoticons[i-2], v)
if s2 != 0: s2 = s2*0.95
v = v+s2
# check for special use of 'never' as valence modifier instead of negation
if wordsAndEmoticons[i-2] == "never" and (wordsAndEmoticons[i-1] == "so" or wordsAndEmoticons[i-1] == "this"):
v = v*1.5
# otherwise, check for negation/nullification
elif negated([wordsAndEmoticons[i-2]]): v = v*n_scalar
if i > 2 and str(wordsAndEmoticons[i-3]).lower() not in word_valence_dict:
s3 = scalar_inc_dec(wordsAndEmoticons[i-3], v)
if s3 != 0: s3 = s3*0.9
v = v+s3
# check for special use of 'never' as valence modifier instead of negation
if wordsAndEmoticons[i-3] == "never" and \
(wordsAndEmoticons[i-2] == "so" or wordsAndEmoticons[i-2] == "this") or \
(wordsAndEmoticons[i-1] == "so" or wordsAndEmoticons[i-1] == "this"):
v = v*1.25
# otherwise, check for negation/nullification
elif negated([wordsAndEmoticons[i-3]]): v = v*n_scalar
# check for special case idioms using a sentiment-laden keyword known to SAGE
special_case_idioms = {"the shit": 3, "the bomb": 3, "bad ass": 1.5, "yeah right": -2,
"cut the mustard": 2, "kiss of death": -1.5, "hand to mouth": -2}
# future work: consider other sentiment-laden idioms
#other_idioms = {"back handed": -2, "blow smoke": -2, "blowing smoke": -2, "upper hand": 1, "break a leg": 2,
# "cooking with gas": 2, "in the black": 2, "in the red": -2, "on the ball": 2,"under the weather": -2}
onezero = "{} {}".format(str(wordsAndEmoticons[i-1]), str(wordsAndEmoticons[i]))
twoonezero = "{} {} {}".format(str(wordsAndEmoticons[i-2]), str(wordsAndEmoticons[i-1]), str(wordsAndEmoticons[i]))
twoone = "{} {}".format(str(wordsAndEmoticons[i-2]), str(wordsAndEmoticons[i-1]))
threetwoone = "{} {} {}".format(str(wordsAndEmoticons[i-3]), str(wordsAndEmoticons[i-2]), str(wordsAndEmoticons[i-1]))
threetwo = "{} {}".format(str(wordsAndEmoticons[i-3]), str(wordsAndEmoticons[i-2]))
if onezero in special_case_idioms: v = special_case_idioms[onezero]
elif twoonezero in special_case_idioms: v = special_case_idioms[twoonezero]
elif twoone in special_case_idioms: v = special_case_idioms[twoone]
elif threetwoone in special_case_idioms: v = special_case_idioms[threetwoone]
elif threetwo in special_case_idioms: v = special_case_idioms[threetwo]
if len(wordsAndEmoticons)-1 > i:
zeroone = "{} {}".format(str(wordsAndEmoticons[i]), str(wordsAndEmoticons[i+1]))
if zeroone in special_case_idioms: v = special_case_idioms[zeroone]
if len(wordsAndEmoticons)-1 > i+1:
zeroonetwo = "{} {}".format(str(wordsAndEmoticons[i]), str(wordsAndEmoticons[i+1]), str(wordsAndEmoticons[i+2]))
if zeroonetwo in special_case_idioms: v = special_case_idioms[zeroonetwo]
# check for booster/dampener bi-grams such as 'sort of' or 'kind of'
if threetwo in booster_dict or twoone in booster_dict:
v = v+b_decr
# check for negation case using "least"
if i > 1 and str(wordsAndEmoticons[i-1]).lower() not in word_valence_dict \
and str(wordsAndEmoticons[i-1]).lower() == "least":
if (str(wordsAndEmoticons[i-2]).lower() != "at" and str(wordsAndEmoticons[i-2]).lower() != "very"):
v = v*n_scalar
elif i > 0 and str(wordsAndEmoticons[i-1]).lower() not in word_valence_dict \
and str(wordsAndEmoticons[i-1]).lower() == "least":
v = v*n_scalar
sentiments.append(v)
# check for modification in sentiment due to contrastive conjunction 'but'
if 'but' in wordsAndEmoticons or 'BUT' in wordsAndEmoticons:
try: bi = wordsAndEmoticons.index('but')
except ValueError: bi = wordsAndEmoticons.index('BUT')
for si, s in enumerate(sentiments):
if si < bi:
sentiments[si] = s*0.5
elif si > bi:
sentiments[si] = s*1.5
if sentiments:
sum_s = float(sum(sentiments))
#print sentiments, sum_s
# check for added emphasis resulting from exclamation points (up to 4 of them)
ep_count = str(text).count("!")
if ep_count > 4: ep_count = 4
ep_amplifier = ep_count*0.292 #(empirically derived mean sentiment intensity rating increase for exclamation points)
if sum_s > 0: sum_s += ep_amplifier
elif sum_s < 0: sum_s -= ep_amplifier
# check for added emphasis resulting from question marks (2 or 3+)
qm_count = str(text).count("?")
qm_amplifier = 0
if qm_count > 1:
if qm_count <= 3: qm_amplifier = qm_count*0.18
else: qm_amplifier = 0.96
if sum_s > 0: sum_s += qm_amplifier
elif sum_s < 0: sum_s -= qm_amplifier
compound = normalize(sum_s)
# want separate positive versus negative sentiment scores
pos_sum = 0.0
neg_sum = 0.0
neu_count = 0
for sentiment_score in sentiments:
if sentiment_score > 0:
pos_sum += (float(sentiment_score) +1) # compensates for neutral words that are counted as 1
if sentiment_score < 0:
neg_sum += (float(sentiment_score) -1) # when used with math.fabs(), compensates for neutrals
if sentiment_score == 0:
neu_count += 1
if pos_sum > math.fabs(neg_sum): pos_sum += (ep_amplifier+qm_amplifier)
elif pos_sum < math.fabs(neg_sum): neg_sum -= (ep_amplifier+qm_amplifier)
total = pos_sum + math.fabs(neg_sum) + neu_count
pos = math.fabs(pos_sum / total)
neg = math.fabs(neg_sum / total)
neu = math.fabs(neu_count / total)
else:
compound = 0.0; pos = 0.0; neg = 0.0; neu = 0.0
s = {"neg" : round(neg, 3),
"neu" : round(neu, 3),
"pos" : round(pos, 3),
"compound" : round(compound, 4)}
return s
if __name__ == '__main__':
# --- examples -------
sentences = [
"VADER is smart, handsome, and funny.", # positive sentence example
"VADER is smart, handsome, and funny!", # punctuation emphasis handled correctly (sentiment intensity adjusted)
"VADER is very smart, handsome, and funny.", # booster words handled correctly (sentiment intensity adjusted)
"VADER is VERY SMART, handsome, and FUNNY.", # emphasis for ALLCAPS handled
"VADER is VERY SMART, handsome, and FUNNY!!!",# combination of signals - VADER appropriately adjusts intensity
"VADER is VERY SMART, really handsome, and INCREDIBLY FUNNY!!!",# booster words & punctuation make this close to ceiling for score
"The book was good.", # positive sentence
"The book was kind of good.", # qualified positive sentence is handled correctly (intensity adjusted)
"The plot was good, but the characters are uncompelling and the dialog is not great.", # mixed negation sentence
"A really bad, horrible book.", # negative sentence with booster words
"At least it isn't a horrible book.", # negated negative sentence with contraction
":) and :D", # emoticons handled
"", # an empty string is correctly handled
"Today sux", # negative slang handled
"Today sux!", # negative slang with punctuation emphasis handled
"Today SUX!", # negative slang with capitalization emphasis
"Today kinda sux! But I'll get by, lol" # mixed sentiment example with slang and constrastive conjunction "but"
]
paragraph = "It was one of the worst movies I've seen, despite good reviews. \
Unbelievably bad acting!! Poor direction. VERY poor production. \
The movie was bad. Very bad movie. VERY bad movie. VERY BAD movie. VERY BAD movie!"
from nltk import tokenize
lines_list = tokenize.sent_tokenize(paragraph)
sentences.extend(lines_list)
tricky_sentences = [
"Most automated sentiment analysis tools are shit.",
"VADER sentiment analysis is the shit.",
"Sentiment analysis has never been good.",
"Sentiment analysis with VADER has never been this good.",
"Warren Beatty has never been so entertaining.",
"I won't say that the movie is astounding and I wouldn't claim that the movie is too banal either.",
"I like to hate Michael Bay films, but I couldn't fault this one",
"It's one thing to watch an Uwe Boll film, but another thing entirely to pay for it",
"The movie was too good",
"This movie was actually neither that funny, nor super witty.",
"This movie doesn't care about cleverness, wit or any other kind of intelligent humor.",
"Those who find ugly meanings in beautiful things are corrupt without being charming.",
"There are slow and repetitive parts, BUT it has just enough spice to keep it interesting.",
"The script is not fantastic, but the acting is decent and the cinematography is EXCELLENT!",
"Roger Dodger is one of the most compelling variations on this theme.",
"Roger Dodger is one of the least compelling variations on this theme.",
"Roger Dodger is at least compelling as a variation on the theme.",
"they fall in love with the product",
"but then it breaks",
"usually around the time the 90 day warranty expires",
"the twin towers collapsed today",
"However, Mr. Carter solemnly argues, his client carried out the kidnapping under orders and in the ''least offensive way possible.''"
]
sentences.extend(tricky_sentences)
for sentence in sentences:
print(sentence)
ss = sentiment(sentence)
print("\t" + str(ss))
print("\n\n Done!")

View File

@@ -0,0 +1,74 @@
# -*- coding: utf-8 -*-
from vaderSentiment import sentiment
from senpy.plugins import SentimentBox, SenpyPlugin
from senpy.models import Results, Sentiment, Entry
import logging
class VaderSentimentPlugin(SentimentBox):
'''
Sentiment classifier using vaderSentiment module. Params accepted: Language: {en, es}. The output uses Marl ontology developed at GSI UPM for semantic web.
'''
name = "sentiment-vader"
module = "sentiment-vader"
author = "@icorcuera"
version = "0.1.1"
extra_params = {
"language": {
"description": "language of the input",
"@id": "lang_rand",
"aliases": ["language", "l"],
"default": "auto",
"options": ["es", "en", "auto"]
},
"aggregate": {
"description": "Show only the strongest sentiment (aggregate) or all sentiments",
"aliases": ["aggregate","agg"],
"options": [True, False],
"default": False
}
}
requirements = {}
_VADER_KEYS = ['pos', 'neu', 'neg']
binary = False
def predict_one(self, features, activity):
text_input = ' '.join(features)
scores = sentiment(text_input)
sentiments = []
for k in self._VADER_KEYS:
sentiments.append(scores[k])
if activity.param('aggregate'):
m = max(sentiments)
sentiments = [k if k==m else None for k in sentiments]
return sentiments
test_cases = [
{
'input': 'I am tired :(',
'polarity': 'marl:Negative'
},
{
'input': 'I love pizza :(',
'polarity': 'marl:Positive'
},
{
'input': 'I enjoy going to the cinema :)',
'polarity': 'marl:Negative'
},
{
'input': 'This cake is disgusting',
'polarity': 'marl:Negative'
},
]
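
The `aggregate` switch in predict_one above keeps only the strongest of the three VADER scores. A tiny illustration of that masking step in isolation:

```python
# Isolated illustration of the aggregate masking in predict_one above.
sentiments = [0.61, 0.29, 0.10]  # pos, neu, neg scores from sentiment()
m = max(sentiments)
print([k if k == m else None for k in sentiments])  # [0.61, None, None]
```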

File diff suppressed because it is too large

View File

@@ -5,10 +5,10 @@
"senpy": "http://www.gsi.upm.es/onto/senpy/ns#", "senpy": "http://www.gsi.upm.es/onto/senpy/ns#",
"prov": "http://www.w3.org/ns/prov#", "prov": "http://www.w3.org/ns/prov#",
"nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#", "nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
"marl": "http://www.gsi.dit.upm.es/ontologies/marl/ns#", "marl": "http://www.gsi.upm.es/ontologies/marl/ns#",
"onyx": "http://www.gsi.dit.upm.es/ontologies/onyx/ns#", "onyx": "http://www.gsi.upm.es/ontologies/onyx/ns#",
"wna": "http://www.gsi.dit.upm.es/ontologies/wnaffect/ns#", "wna": "http://www.gsi.upm.es/ontologies/wnaffect/ns#",
"emoml": "http://www.gsi.dit.upm.es/ontologies/onyx/vocabularies/emotionml/ns#", "emoml": "http://www.gsi.upm.es/ontologies/onyx/vocabularies/emotionml/ns#",
"xsd": "http://www.w3.org/2001/XMLSchema#", "xsd": "http://www.w3.org/2001/XMLSchema#",
"fam": "http://vocab.fusepool.info/fam#", "fam": "http://vocab.fusepool.info/fam#",
"topics": { "topics": {

View File

@@ -1,4 +1,4 @@
-var ONYX = "http://www.gsi.dit.upm.es/ontologies/onyx/ns#";
+var ONYX = "http://www.gsi.upm.es/ontologies/onyx/ns#";
 var RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
 var plugins_params = default_params = {};
 var plugins = [];
@@ -174,10 +174,18 @@ function add_plugin_pipeline(){
 function draw_datasets(){
     html = "";
-    repeated_html = "<input class=\"checks-datasets\" type=\"checkbox\" value=\"";
     for (dataset in datasets){
-        html += repeated_html+datasets[dataset]["@id"]+"\">"+datasets[dataset]["@id"];
-        html += "<br>"
+        ds = datasets[dataset]
+        // html += repeated_html+datasets[dataset]["@id"]+"\">"+datasets[dataset]["@id"];
+        html += `
+        <span class="d-inline-block" tabindex="0" data-toggle="tooltip" title="Instances: ${ds["stats"]["instances"]}">
+          <div class="form-check form-check-inline">
+            <input class="form-check-input checks-datasets" type="checkbox" value="${ds["@id"]}">
+            <label class="form-check-label" for="defaultCheck1">${ds["@id"]}</label>
+          </div>
+        </span>
+        `
     }
     document.getElementById("datasets").innerHTML = html;
 }

View File

@@ -1,7 +1,7 @@
 ns = {
-    'http://www.gsi.dit.upm.es/ontologies/marl/ns#': 'marl',
+    'http://www.gsi.upm.es/ontologies/marl/ns#': 'marl',
-    'http://www.gsi.dit.upm.es/ontologies/onyx/ns#': 'onyx',
+    'http://www.gsi.upm.es/ontologies/onyx/ns#': 'onyx',
-    'http://www.gsi.dit.upm.es/ontologies/senpy/ns#': 'onyx',
+    'http://www.gsi.upm.es/ontologies/senpy/ns#': 'onyx',
     'http://www.gsi.upm.es/onto/senpy/ns#': 'senpy',
     'http://www.w3.org/ns/prov#': 'prov',
     'http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#': 'nif'

View File

@@ -64,7 +64,7 @@
 These are some of the things you can do with the API:
 <ul>
 <li>List all available plugins: <a href="/api/plugins">/api/plugins</a></li>
-<li>Get information about the default plugin: <a href="/api/plugins/default">/api/plugins/default</a></li>
+<li>Get information about the default plugin: <a href="/api/plugins/default/">/api/plugins/default</a></li>
 <li>List all available datasets: <a href="/api/datasets">/api/datasets</a></li>
 <li>Download the JSON-LD context used: <a href="/api/contexts/Results.jsonld">/api/contexts/Results.jsonld</a></li>
 </ul>
@@ -233,28 +233,43 @@ In Data Science and Advanced Analytics (DSAA),
<div class="tab-pane" role="tabpanel" aria-labelledby="nav-evaluate" id="evaluate"> <div class="tab-pane" role="tabpanel" aria-labelledby="nav-evaluate" id="evaluate">
<div class="card my-2"> <div class="card my-2">
<div class="card-body"> <div class="card-body">
<p>Automatically evaluate the classification performance of your plugin in several public datasets, and compare it with other plugins.</p>
<p>The datasets will be automatically downloaded if they are not already available locally. Depending on the size of the dataset and the speed of the plugin, the evaluation may take a long time.</p>
<form id="form" class="container" onsubmit="" accept-charset="utf-8"> <form id="form" class="container" onsubmit="" accept-charset="utf-8">
<div> <div class="card my-2">
<p>Automatically evaluate the classification performance of your plugin in several public datasets, and compare it with other plugins.</p> <div class="card-header">
<p>The datasets will be automatically downloaded if they are not already available locally. Depending on the size of the dataset and the speed of the plugin, the evaluation may take a long time.</p> <h5>
<label>Select the plugin:</label> Select the plugin.
<select id="plugins-eval" name="plugins-eval" class=plugin onchange="draw_extra_parameters()"> </h5>
</select> </div>
</div> <div id="plugin_selection" class="card-body">
<div> <select id="plugins-eval" name="plugins-eval" class=plugin onchange="draw_extra_parameters()">
<label>Select the datasets:</label> </select>
<div id="datasets" name="datasets" >
</select>
</div> </div>
<button id="doevaluate" class="btn btn-lg btn-primary" onclick="evaluate_JSON()">Evaluate Plugin</button>
<!--<button id="visualise" name="type" type="button">Visualise!</button>-->
</div> </div>
<div class="card my-2">
<div class="card-header">
<h5>
Select the dataset.
</h5>
</div>
<div id="dataset_selection" class="card-body">
<div id="datasets" name="datasets" >
</div>
</div>
</div>
<!--<button id="visualise" name="type" type="button">Visualise!</button>-->
<button id="doevaluate" class="btn btn-lg btn-primary" onclick="evaluate_JSON()">Evaluate Plugin</button>
</form> </form>
</div> </div>
</div>
<div class="card my-2">
<div id="loading-results" class="loading"></div> <div id="loading-results" class="loading"></div>
<span id="input_request_eval"></span> <div id="input_request_eval"></div>
<div id="evaluate-div"> <div id="evaluate-div">
<ul class="nav nav-pills" role="tablist"> <ul class="nav nav-pills" role="tablist">
@@ -273,23 +288,25 @@ In Data Science and Advanced Analytics (DSAA),
</div> </div>
</div> </div>
<div class="tab-pane" role="tabpanel" aria-labelledby="" id="evaluate-table"> <div class="tab-pane" role="tabpanel" aria-labelledby="" id="evaluate-table">
<table id="eval_table" class="table table-condensed"> <div>
<thead> <table id="eval_table" class="table table-condensed">
<tr> <thead>
<th>Plugin</th> <tr>
<th>Dataset</th> <th>Plugin</th>
<th>Accuracy</th> <th>Dataset</th>
<th>Precision_macro</th> <th>Accuracy</th>
<th>Recall_macro</th> <th>Precision_macro</th>
<th>F1_macro</th> <th>Recall_macro</th>
<th>F1_weighted</th> <th>F1_macro</th>
<th>F1_micro</th> <th>F1_weighted</th>
<th>F1</th> <th>F1_micro</th>
</tr> <th>F1</th>
</thead> </tr>
<tbody> </thead>
</tbody> <tbody>
</table> </tbody>
</table>
</div>
</div> </div>
</div> </div>
</div> </div>
@@ -309,11 +326,11 @@ In Data Science and Advanced Analytics (DSAA),
</p> </p>
</div> </div>
<div id="site-logos"> <div id="site-logos">
<a href="http://www.gsi.dit.upm.es" target="_blank"><img id="mixedemotions-logo"src="static/img/me.png"/></a> <a href="http://www.gsi.upm.es" target="_blank"><img id="mixedemotions-logo"src="static/img/me.png"/></a>
</div> </div>
</div> </div>
</div> </div>
</body> </body>
<link href='http://fonts.googleapis.com/css?family=Architects+Daughter' rel='stylesheet' type='text/css'> <link href='//fonts.googleapis.com/css?family=Architects+Daughter' rel='stylesheet' type='text/css'>
</html> </html>

View File

@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from past.builtins import basestring
import os

View File

@@ -1,5 +1,20 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
 from . import models, __version__
-from collections import MutableMapping
+from collections.abc import MutableMapping
 import pprint
 import pdb
@@ -29,6 +44,10 @@ def check_template(indict, template):
 raise models.Error(('Element not found.'
                     '\nExpected: {}\nIn: {}').format(pprint.pformat(e),
                                                      pprint.pformat(indict)))
+elif isinstance(template, float) and isinstance(indict, float):
+    diff = abs(indict - template)
+    if (diff > 0) and diff/(abs(indict+template)) > 0.05:
+        raise models.Error('Differences greater than 10% found.\n')
 else:
     if indict != template:
         raise models.Error(('Differences found.\n'
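
The new float branch accepts two values whose difference is within about 10% of their mean; since the 0.05 threshold is taken relative to the sum of the two values, it corresponds to 10% of their average. A quick check of the arithmetic:

```python
# diff / abs(a + b) > 0.05  is equivalent to  diff / mean(a, b) > 0.1
a, b = 1.0, 1.08
print(abs(a - b) / abs(a + b))  # ~0.038 -> within tolerance, accepted
a, b = 1.0, 1.25
print(abs(a - b) / abs(a + b))  # ~0.111 -> rejected as a difference > 10%
```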

View File

@@ -1,3 +1,19 @@
#
# Copyright 2014 Grupo de Sistemas Inteligentes (GSI) DIT, UPM
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import logging
@@ -13,7 +29,7 @@ def read_version(versionfile=DEFAULT_FILE):
     return f.read().strip()
 except IOError: # pragma: no cover
     logger.error('Running an unknown version of senpy. Be careful!.')
-    return '0.0'
+    return 'devel'
 __version__ = read_version()

View File

@@ -13,7 +13,7 @@ max-line-length = 100
 [bdist_wheel]
 universal=1
 [tool:pytest]
-addopts = --cov=senpy --cov-report term-missing
+addopts = -v --cov=senpy --cov-report term-missing
 filterwarnings =
     ignore:the matrix subclass:PendingDeprecationWarning
 [coverage:report]

View File

@@ -1,17 +1,44 @@
+'''
+Copyright 2014 GSI DIT UPM
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+'''
 from setuptools import setup
+from os import path
+import io
-with open('senpy/VERSION') as f:
-    __version__ = f.read().strip()
-    assert __version__
+try:
+    with io.open('senpy/VERSION') as f:
+        __version__ = f.read().strip()
+        assert __version__
+except IOError:  # pragma: no cover
+    print('Installing a development version of senpy. Proceed with caution!')
+    __version__ = 'devel'
 def parse_requirements(filename):
     """ load requirements from a pip requirements file """
-    with open(filename, 'r') as f:
+    with io.open(filename, 'r') as f:
         lineiter = list(line.strip() for line in f)
     return [line for line in lineiter if line and not line.startswith("#")]
+this_directory = path.abspath(path.dirname(__file__))
+with io.open(path.join(this_directory, 'README.rst'), encoding='utf-8') as f:
+    long_description = f.read()
 install_reqs = parse_requirements("requirements.txt")
 test_reqs = parse_requirements("test-requirements.txt")
 extra_reqs = parse_requirements("extra-requirements.txt")
@@ -19,12 +46,14 @@ extra_reqs = parse_requirements("extra-requirements.txt")
 setup(
     name='senpy',
-    python_requires='>3.3',
+    python_requires='>3.6',
     packages=['senpy'],  # this must be the same as the name above
     version=__version__,
     description=('A sentiment analysis server implementation. '
                  'Designed to be extensible, so new algorithms '
                  'and sources can be used.'),
+    long_description=long_description,
+    long_description_content_type='text/x-rst',
     author='J. Fernando Sanchez',
     author_email='balkian@gmail.com',
     url='https://github.com/gsi-upm/senpy',  # use the URL to the github repo
@@ -38,7 +67,8 @@ setup(
     tests_require=test_reqs,
     setup_requires=['pytest-runner', ],
     extras_require={
-        'evaluation': extra_reqs
+        'evaluation': extra_reqs,
+        'extras': extra_reqs
     },
     include_package_data=True,
     entry_points={

Some files were not shown because too many files have changed in this diff.