mirror of https://github.com/gsi-upm/sitc synced 2025-09-13 10:22:20 +00:00

185 Commits

Author SHA1 Message Date
Carlos A. Iglesias
9844820e66 Delete xai/readme 2025-06-06 17:24:29 +03:00
Carlos A. Iglesias
d10434362e Add files via upload 2025-06-06 17:24:05 +03:00
Carlos A. Iglesias
fb2135cea6 Create readme 2025-06-06 17:23:37 +03:00
Carlos A. Iglesias
ba6e533e0b Add files via upload
XAI notebook
2025-06-06 17:23:16 +03:00
Carlos A. Iglesias
4f5e976918 Create readme 2025-06-06 17:22:33 +03:00
Carlos A. Iglesias
b58370a19a Update .gitignore 2025-06-02 17:23:44 +03:00
Carlos A. Iglesias
5c203b0884 Update spiral.py
Fixed typo
2025-06-02 17:22:55 +03:00
Carlos A. Iglesias
5bf815f60f Update 2_4_2_Exercise_Optional.ipynb
Changed image path
2025-06-02 17:22:16 +03:00
Carlos A. Iglesias
90a3ff098b Update 2_4_1_Exercise.ipynb
Changed image path
2025-06-02 17:21:25 +03:00
Carlos A. Iglesias
945a8a7fb6 Update 2_4_0_Intro_NN.ipynb
Changed image path
2025-06-02 17:19:19 +03:00
Carlos A. Iglesias
6532ef1b27 Update 2_8_Conclusions.ipynb
Changed image path
2025-06-02 17:18:31 +03:00
Carlos A. Iglesias
3a73b2b286 Update 2_7_Model_Persistence.ipynb
Changed image path
2025-06-02 17:17:43 +03:00
Carlos A. Iglesias
2e4ec3cfdc Update 2_6_Model_Tuning.ipynb 2025-06-02 17:16:53 +03:00
Carlos A. Iglesias
21e7ae2f57 Update 2_5_2_Decision_Tree_Model.ipynb
Changed image path
2025-06-02 17:13:49 +03:00
Carlos A. Iglesias
7b4d16964d Update 2_5_1_kNN_Model.ipynb
Changed image path
2025-06-02 17:11:45 +03:00
Carlos A. Iglesias
c5967746ea Update 2_5_0_Machine_Learning.ipynb 2025-06-02 17:09:42 +03:00
Carlos A. Iglesias
ed7f0f3e1c Update 2_5_0_Machine_Learning.ipynb 2025-06-02 17:09:13 +03:00
Carlos A. Iglesias
9324516c19 Update 2_5_0_Machine_Learning.ipynb
Changed image path
2025-06-02 17:08:03 +03:00
Carlos A. Iglesias
6fc5565ea0 Update 2_2_Read_Data.ipynb 2025-06-02 17:05:17 +03:00
Carlos A. Iglesias
1113485833 Add files via upload 2025-06-02 17:03:20 +03:00
Carlos A. Iglesias
0c3f317a85 Add files via upload 2025-06-02 17:02:46 +03:00
Carlos A. Iglesias
0b550c837b Update 2_2_Read_Data.ipynb
Added figures
2025-06-02 17:00:58 +03:00
Carlos A. Iglesias
d7ce6df7fe Update 2_2_Read_Data.ipynb 2025-06-02 16:57:54 +03:00
Carlos A. Iglesias
e2edae6049 Update 2_2_Read_Data.ipynb 2025-06-02 16:54:37 +03:00
Carlos A. Iglesias
4ea0146def Update 2_2_Read_Data.ipynb 2025-06-02 16:54:06 +03:00
Carlos A. Iglesias
e7b2cee795 Add files via upload 2025-06-02 16:31:20 +03:00
Carlos A. Iglesias
9e1d0e5534 Add files via upload 2025-06-02 16:30:13 +03:00
Carlos A. Iglesias
f82203f371 Update 2_4_Preprocessing.ipynb
Changed image path
2025-06-02 16:29:26 +03:00
Carlos A. Iglesias
b9ecccdeab Update 2_3_1_Advanced_Visualisation.ipynb 2025-06-02 16:28:06 +03:00
Carlos A. Iglesias
44a555ac2d Update 2_3_1_Advanced_Visualisation.ipynb
Changed image path
2025-06-02 16:09:55 +03:00
Carlos A. Iglesias
ec11ff2d5e Update 2_3_0_Visualisation.ipynb
Changed image path
2025-06-02 16:06:53 +03:00
Carlos A. Iglesias
ec02125396 Update 2_2_Read_Data.ipynb 2025-06-02 16:04:57 +03:00
Carlos A. Iglesias
b5f1a7dd22 Update 2_0_0_Intro_ML.ipynb 2025-06-02 16:03:03 +03:00
Carlos A. Iglesias
1cc1e45673 Update 2_2_Read_Data.ipynb
Changed image path
2025-06-02 16:02:45 +03:00
Carlos A. Iglesias
a2ad2c0e92 Update 2_1_Intro_ScikitLearn.ipynb
Changed images path
2025-06-02 16:00:59 +03:00
Carlos A. Iglesias
1add6a4c8e Update 2_0_1_Objectives.ipynb
Changed image path
2025-06-02 15:58:32 +03:00
Carlos A. Iglesias
af78e6480d Update 2_0_0_Intro_ML.ipynb
changed path to image
2025-06-02 15:57:25 +03:00
Carlos A. Iglesias
cae7d8cbb2 Updated LLM 2025-05-05 16:39:20 +02:00
Carlos A. Iglesias
f58aa6c0b8 Delete nlp/0_1_LLM.ipynb 2025-05-05 16:38:41 +02:00
Carlos A. Iglesias
6e8448f22f Update 0_2_NLP_Assignment.ipynb 2025-04-24 18:31:56 +02:00
Carlos A. Iglesias
8f2a5c17d8 Update 0_1_NLP_Slides.ipynb 2025-04-24 18:30:18 +02:00
Carlos A. Iglesias
36d117e417 Delete nlp/spacy/readme.md 2025-04-21 18:59:11 +02:00
Carlos A. Iglesias
2fc057f6f9 Add files via upload 2025-04-21 18:58:47 +02:00
Carlos A. Iglesias
5b0d4f2a5d Add files via upload 2025-04-21 18:58:15 +02:00
Carlos A. Iglesias
7afa2b3b22 Create readme.md 2025-04-21 18:57:59 +02:00
Carlos A. Iglesias
4e0f9159e8 Update 2_5_1_Exercise.ipynb 2025-04-03 18:54:52 +02:00
Carlos A. Iglesias
82aa552976 Update 2_5_1_Exercise.ipynb 2025-04-03 18:53:35 +02:00
Carlos A. Iglesias
3ebff69cf8 Update 2_5_1_Exercise.ipynb 2025-04-03 18:43:58 +02:00
Carlos A. Iglesias
0f228bbec3 Update 2_5_1_Exercise.ipynb 2025-04-03 18:43:34 +02:00
Carlos A. Iglesias
64c8854741 Update 2_5_1_Exercise.ipynb 2025-04-03 18:41:49 +02:00
Carlos A. Iglesias
3e081e5d83 Update 2_5_1_Exercise.ipynb 2025-04-03 18:38:26 +02:00
Carlos A. Iglesias
065797b886 Update 2_5_1_Exercise.ipynb 2025-04-03 18:37:26 +02:00
Carlos A. Iglesias
8d2f625b7e Update 2_5_1_Exercise.ipynb 2025-04-03 18:36:31 +02:00
Carlos A. Iglesias
26eda30a71 Update 2_5_1_Exercise.ipynb 2025-04-03 18:35:53 +02:00
Carlos A. Iglesias
55365ae927 Update 2_5_1_Exercise.ipynb 2025-04-03 18:34:50 +02:00
Carlos A. Iglesias
152125b3da Update 2_5_1_Exercise.ipynb 2025-04-03 18:33:47 +02:00
Carlos A. Iglesias
97362545ea Update 2_5_1_Exercise.ipynb
Added https://sklearn-genetic-opt.readthedocs.io/en/stable/index.html
2025-04-03 18:32:32 +02:00
cif
c49c866a2e Update notebook with pivot_table examples 2025-03-06 16:05:16 +01:00
Carlos A. Iglesias
3f7694e330 Add files via upload
Added ttl
2025-02-20 19:14:13 +01:00
Carlos A. Iglesias
bf684d6e6e Updated index 2024-06-07 17:54:18 +03:00
Carlos A. Iglesias
d935b85b26 Add files via upload
Added images
2024-06-03 14:44:28 +02:00
Carlos A. Iglesias
1d8e777236 Create .p 2024-06-03 15:42:13 +03:00
Carlos A. Iglesias
23ebe2f390 Update 3_1_Read_Data.ipynb
Updated table markdown
2024-05-21 14:30:26 +02:00
Carlos A. Iglesias
01eb89ada4 New notebook about transformers 2024-05-14 09:55:02 +02:00
Carlos A. Iglesias
e4fdcd65a1 Update 2_6_1_Q-Learning_Basic.ipynb
Updated installation with new version of gymnasium
2024-04-24 18:46:54 +02:00
Carlos A. Iglesias
9f46c534f7 Update 2_5_1_Exercise.ipynb
Added optional exercises.
2024-04-18 18:04:43 +02:00
Carlos A. Iglesias
743c57691f Delete sna/t.txt 2024-04-17 17:24:12 +02:00
Carlos A. Iglesias
2c53b81299 Uploaded SNA files 2024-04-17 17:23:28 +02:00
Carlos A. Iglesias
dd6c053109 Add files via upload 2024-04-17 17:22:36 +02:00
Carlos A. Iglesias
e35e0a11e9 Create t.txt 2024-04-17 17:22:20 +02:00
Carlos A. Iglesias
7315b681e4 Update README.md 2024-04-17 17:21:21 +02:00
Carlos A. Iglesias
3fac9c6f78 Add files via upload 2024-04-04 18:27:48 +02:00
Carlos A. Iglesias
21819abeae Added visualization notebooks 2024-04-03 22:53:02 +02:00
Carlos A. Iglesias
0d4c0c706d Added images 2024-04-03 22:51:58 +02:00
Carlos A. Iglesias
8de629b495 Create .gitkeep 2024-04-03 22:51:19 +02:00
Carlos A. Iglesias
86114b4a56 Added preprocessing notebooks 2024-04-03 22:50:36 +02:00
Carlos A. Iglesias
1a3f618995 Add files via upload 2024-04-03 21:52:25 +02:00
Carlos A. Iglesias
a1121c03a5 Create .gitkeep - Added preprocessing notebooks 2024-04-03 21:51:34 +02:00
Carlos A. Iglesias
715d0cb77f Create .gitkeep
Added new set of exercises
2024-04-03 21:50:50 +02:00
Carlos A. Iglesias
0150ce7cf7 Update 3_7_SVM.ipynb
Updated formatted table
2024-02-22 12:23:08 +01:00
Carlos A. Iglesias
08dfe5c147 Update 3_4_Visualisation_Pandas.ipynb
Updated code to last version of seaborn
2024-02-22 11:55:35 +01:00
Carlos A. Iglesias
78e62af098 Update 3_3_Data_Munging_with_Pandas.ipynb
Updated to last version of scikit
2024-02-21 12:29:04 +01:00
Carlos A. Iglesias
3f5eba3e84 Update 3_2_Pandas.ipynb
Updated links
2024-02-21 12:16:12 +01:00
Carlos A. Iglesias
2de1cda8f1 Update 3_1_Read_Data.ipynb
Updated links
2024-02-21 12:14:25 +01:00
Carlos A. Iglesias
cc442c35f3 Update 3_0_0_Intro_ML_2.ipynb
Updated links
2024-02-21 12:12:14 +01:00
Carlos A. Iglesias
1100c352fa Update 2_6_Model_Tuning.ipynb
updated links
2024-02-21 11:47:34 +01:00
Carlos A. Iglesias
9b573d292d Update 2_5_2_Decision_Tree_Model.ipynb
Updated links
2024-02-21 11:41:42 +01:00
Carlos A. Iglesias
dd8a4f50d8 Update 2_5_2_Decision_Tree_Model.ipynb
Updated links
2024-02-21 11:40:59 +01:00
Carlos A. Iglesias
47148f2ccc Update util_ds.py
Updated links
2024-02-21 11:40:06 +01:00
Carlos A. Iglesias
8ffda8123a Update 2_5_1_kNN_Model.ipynb
Updated links
2024-02-21 11:07:38 +01:00
Carlos A. Iglesias
6629837e7d Update 2_5_0_Machine_Learning.ipynb
Updated links
2024-02-21 11:06:21 +01:00
Carlos A. Iglesias
ba08a9a264 Update 2_4_Preprocessing.ipynb
Updated links
2024-02-21 11:02:09 +01:00
Carlos A. Iglesias
4b8fd30f42 Update 2_3_1_Advanced_Visualisation.ipynb
Updated links
2024-02-21 11:00:53 +01:00
Carlos A. Iglesias
d879369930 Update 2_3_0_Visualisation.ipynb
Updated links
2024-02-21 10:57:34 +01:00
Carlos A. Iglesias
4da01f3ae6 Update 2_0_0_Intro_ML.ipynb
Updated links
2024-02-21 10:44:43 +01:00
Carlos A. Iglesias
da9a01e26b Update 2_0_1_Objectives.ipynb
Updated links
2024-02-21 10:43:40 +01:00
Carlos A. Iglesias
dc23b178d7 Delete python/plurals.py 2024-02-08 18:32:43 +01:00
Carlos A. Iglesias
5410d6115d Delete python/catalog.py 2024-02-08 18:32:18 +01:00
Carlos A. Iglesias
6749aa5deb Added files for modules 2024-02-08 18:26:08 +01:00
Carlos A. Iglesias
c31e6c1676 Update 1_2_Numbers_Strings.ipynb 2024-02-08 17:47:42 +01:00
Carlos A. Iglesias
1c7496c8ac Update 1_2_Numbers_Strings.ipynb
Improved formatting.
2024-02-08 17:46:18 +01:00
Carlos A. Iglesias
35b1ae4ec8 Update 1_8_Classes.ipynb
Improved formatting.
2024-02-08 17:43:25 +01:00
Carlos A. Iglesias
58fc6f5e9c Update 1_4_Sets.ipynb
Typo corrected.
2024-02-08 17:42:45 +01:00
Carlos A. Iglesias
91147becee Update 1_3_Sequences.ipynb
Formatting improvement.
2024-02-08 17:41:15 +01:00
Carlos A. Iglesias
1530995243 Update 1_0_Intro_Python.ipynb
Updated links.
2024-02-08 17:36:46 +01:00
Carlos A. Iglesias
0c0960cec7 Update 1_7_Variables.ipynb typo in bold markdown
Typo in bold markdown
2024-02-08 17:33:48 +01:00
cif
3363c953f4 Deleted previous version 2023-04-27 15:43:44 +02:00
cif
542ce2708d Updated exercise to gymnasium and extended it 2023-04-27 15:42:01 +02:00
cif
380340d66d Updated 4_4 to use get_feature_names_out() instead of get_feature_names 2023-04-23 16:41:53 +02:00
cif
7f49f8990b Updated 4_4 - using feature_log_prob_ instead of coef_ (deprecated) 2023-04-23 16:37:48 +02:00
Carlos A. Iglesias
419ea57824 Slides with Spacy 2023-04-20 18:20:44 +02:00
Carlos A. Iglesias
7d6010114d Upload data for assignment 2023-04-20 18:17:12 +02:00
Carlos A. Iglesias
f9d8234e14 Added exercise with Spacy 2023-04-20 16:20:28 +02:00
Carlos A. Iglesias
d41fa61c65 Delete 0_2_NLP_Assignment.ipynb 2023-04-20 16:19:57 +02:00
Carlos A. Iglesias
05a4588acf Exercise with Spacy 2023-04-20 16:18:47 +02:00
Carlos A. Iglesias
50933f6c94 Update 3_7_SVM.ipynb
Fixed typo and updated link
2023-03-09 18:04:14 +01:00
J. Fernando Sánchez
68ba528dd7 Fix typos 2023-02-20 19:43:36 +01:00
J. Fernando Sánchez
897bb487b1 Update LOD exercises 2023-02-13 18:26:14 +01:00
Oscar Araque
41d3bdea75 minor typos in ml1 2022-09-05 18:20:29 +02:00
Carlos A. Iglesias
0a9cd3bd5e Update 3_7_SVM.ipynb
Fixed typo in a comment
2022-03-17 17:58:09 +01:00
Carlos A. Iglesias
2c7c9e58e0 Update 3_7_SVM.ipynb
Fixed bug in ROC curve visualization
2022-03-17 17:50:27 +01:00
cif
f0278aea33 Updated 2022-03-07 14:19:44 +01:00
cif
7bf0fb6479 Updated 2022-03-07 14:17:02 +01:00
cif
4d87b07ed9 Updated visualization 2022-03-07 14:16:14 +01:00
cif
7d71ba5f7a Updated references 2022-03-07 13:03:48 +01:00
cif
1124c9129c Fixed URL 2022-03-07 13:01:21 +01:00
cif
df6449b55f Updated to last version of seaborn 2022-03-07 12:57:17 +01:00
cif
d99eeb733a Updated median with only numeric values 2022-03-07 12:44:14 +01:00
cif
a43fb4c78c Updated references 2022-03-07 12:28:10 +01:00
Carlos A. Iglesias
bf21e3ceab Update 3_1_Read_Data.ipynb
Updated references
2022-03-07 11:01:34 +01:00
Carlos A. Iglesias
e41d233828 Update 3_0_0_Intro_ML_2.ipynb
Updated bibliography
2022-03-07 10:58:29 +01:00
Carlos A. Iglesias
a7c6be5b96 Update 2_6_Model_Tuning.ipynb
Fixed typo.
2022-02-28 12:51:18 +01:00
Carlos A. Iglesias
11a1ea80d3 Update 2_6_Model_Tuning.ipynb
Fixed typos.
2022-02-28 12:45:40 +01:00
Carlos A. Iglesias
a209d18a5b Update 2_5_1_kNN_Model.ipynb
Fixed typo.
2022-02-28 12:38:27 +01:00
cif
ffefd8c2e3 Updated bibliography 2022-02-21 13:55:09 +01:00
cif
f43cde73e4 Updated bibliography 2022-02-21 13:51:21 +01:00
cif
8784fdc773 Updated bibliography 2022-02-21 13:39:33 +01:00
cif
a6d5f9ddeb Updated bibliography 2022-02-21 13:32:07 +01:00
cif
2e72a4d729 Updated bibliography 2022-02-21 13:29:33 +01:00
cif
9426b4c061 Updated bibliography 2022-02-21 13:26:24 +01:00
cif
5e5979d515 Updated links 2022-02-21 13:22:46 +01:00
cif
270dcec611 Updated links 2022-02-21 13:09:21 +01:00
Carlos A. Iglesias
e6e52b43ee Update 2_4_Preprocessing.ipynb
Updated Packt link.
2022-02-21 12:57:53 +01:00
Carlos A. Iglesias
3b7675fa3f Update 2_3_0_Visualisation.ipynb
Updated Packt bibliography link.
2022-02-21 12:56:22 +01:00
Carlos A. Iglesias
44c63412f9 Update 2_2_Read_Data.ipynb
Updated scikit url
2022-02-21 12:26:30 +01:00
Carlos A. Iglesias
5febbc21a4 Update 2_1_Intro_ScikitLearn.ipynb
Fixed erratum in dimensionality.
2022-02-21 12:22:15 +01:00
J. Fernando Sánchez
66ed4ba258 Minor changes LOD 01 and 03 2022-02-15 20:48:49 +01:00
Carlos A. Iglesias
95cd25aef4 Update 1__10_Modules_Packages.ipynb
Fixed link to module tutorial
2022-02-10 17:51:32 +01:00
J. Fernando Sánchez
955e74fc8e Add requirements
Now the dependencies should be automatically installed if you open the repo
through Jupyter Binder
2021-11-10 08:48:54 +01:00
cif2cif
6743dad100 Cleaned output 2021-06-07 10:38:53 +02:00
cif2cif
729f7684c2 Cleaned output 2021-06-07 10:36:12 +02:00
cif2cif
ae8d3d3ba2 Updated with the new libraries 2021-05-07 11:10:21 +02:00
cif2cif
2ba0e2f3d9 updated to last version of OpenGym 2021-04-19 19:10:03 +02:00
cif2cif
c9114cc796 Fixed broken link and bug of sklearn-deap with scikit 0.24 2021-04-19 17:47:22 +02:00
cif2cif
b80c097362 Merge branch 'master' of https://github.com/gsi-upm/sitc 2021-04-06 10:21:25 +02:00
cif2cif
161cd8492b Fixed bug in substrings_in_string and set default df[AgeGroup] to np.nan 2021-04-06 10:20:29 +02:00
Oscar Araque
3d6d96dd8a updated ml1/2_6: using scorer to avoid traning warnings 2021-03-11 16:28:14 +01:00
cif2cif
44aa3d24fb Updated joblib import to sklearn 0.23 2021-02-27 21:30:21 +01:00
cif2cif
8925a4a3c1 Clean 2_5_2 2021-02-27 20:51:52 +01:00
cif2cif
23913811df Clean 2_5_1 2021-02-27 20:50:03 +01:00
cif2cif
7b4391f187 Updated reference to module six modified in scikit-learn 0.23 2021-02-27 20:21:15 +01:00
cif2cif
0c100dbadc Merge branch 'master' of https://github.com/gsi-upm/sitc 2021-02-27 20:12:26 +01:00
cif2cif
2f7cbe9e45 Updated util_knn.py to new version of scikit 2021-02-27 20:11:17 +01:00
J. Fernando Sánchez
b43125ca59 LOD: minor changes 2021-02-22 17:32:31 +01:00
cif2cif
5144b7f228 Added intro RDF and tutorial 2021-02-22 12:55:40 +01:00
cif2cif
8b6d6de169 added tutorial SPARQL 2021-02-18 18:10:59 +01:00
Carlos A. Iglesias
7271b5e632 Update README.md
Added clone comment
2021-02-09 19:54:56 +01:00
Carlos A. Iglesias
bd99321d6b Update README.md
Fixed a typo
2021-02-09 19:53:55 +01:00
Carlos A. Iglesias
91b8f66056 Update 1_1_Notebooks.ipynb
Changed URL for Anaconda
2021-02-09 19:53:14 +01:00
Carlos A. Iglesias
242a0a9252 Update 4_7_Exercises.ipynb 2020-04-29 18:46:31 +02:00
Carlos A. Iglesias
d8d25c4dc3 Update 4_7_Exercises.ipynb 2020-04-29 18:11:50 +02:00
Carlos A. Iglesias
4979fe6877 Update 4_7_Exercises.ipynb 2020-04-29 18:10:10 +02:00
Carlos A. Iglesias
c5e0f146c4 Update 4_7_Exercises.ipynb 2020-04-29 18:06:24 +02:00
Carlos A. Iglesias
167475029e Updated exercise 1 since the code of the previous link was outdated 2020-04-29 18:04:18 +02:00
Carlos A. Iglesias
da79a18bfc Fixed broken link 2020-04-23 23:25:09 +02:00
Carlos A. Iglesias
47761c11aa Fixed erratum in algorithms 2020-03-05 17:19:56 +01:00
J. Fernando Sánchez
fd5aa4a1fd fix typo 2020-02-20 17:38:02 +01:00
J. Fernando Sánchez
396a7b17ca update RDF example 2020-02-20 17:36:07 +01:00
J. Fernando Sánchez
2248188219 Updated URL rdf and LOD 2020-02-20 11:28:55 +01:00
Carlos A. Iglesias
21dc5ec3de Update 1__10_Modules_Packages.ipynb 2020-02-13 18:10:08 +01:00
Carlos A. Iglesias
db99033727 Update 1__10_Modules_Packages.ipynb 2020-02-13 18:08:25 +01:00
Carlos A. Iglesias
5459e801d5 Update 1_7_Variables.ipynb 2020-02-13 17:56:36 +01:00
Carlos A. Iglesias
75f08ea170 Merge pull request #5 from gsi-upm/dveni-patch-2
Update 4_1_Lexical_Processing.ipynb
2019-11-27 10:19:12 +01:00
Dani Vera
19ea5dff09 Update 4_1_Lexical_Processing.ipynb 2019-11-26 15:14:40 +01:00
Carlos A. Iglesias
e70689072f Merge pull request #4 from gsi-upm/dveni-patch-1
Update 3_3_Data_Munging_with_Pandas.ipynb
2019-09-19 10:46:19 +02:00
177 changed files with 156562 additions and 1755 deletions

README.md

@@ -1,19 +1,21 @@
 # sitc
 Exercises for Intelligent Systems Course at Universidad Politécnica de Madrid, Telecommunication Engineering School. This material is used in the subjects
-- SITC (Sistemas Inteligentes y Tecnologías del Conocimiento) - Master Universitario de Ingeniería de Telecomunicación (MUIT)
-- TIAD (Tecnologías Inteligentes de Análisis de Datos) - Master Universitario en Ingeniería de Redes y Servicios Telemáticos
+- CDAW (Ciencia de datos y aprendizaje automático en la web de datos) - Master Universitario de Ingeniería de Telecomunicación (MUIT)
+- ABID (Analítica de Big Data) - Master Universitario en Ingeniería de Redes y Servicios Telemáticos
 For following this course:
 - Follow the instructions to install the environment: https://github.com/gsi-upm/sitc/blob/master/python/1_1_Notebooks.ipynb (Just install 'conda')
-- Download the course: use 'https://github.com/gsi-upm/sitc'
+- Download the course: use 'https://github.com/gsi-upm/sitc' (or clone the repository to receive updates).
-- Run in a terminal in the foloder sitc: jupyter notebook (and enjoy)
+- Run in a terminal in the folder sitc: jupyter notebook (and enjoy)
 Topics
-* Python: quick introduction to Python
+* Python: a quick introduction to Python
 * ML-1: introduction to machine learning with scikit-learn
 * ML-2: introduction to machine learning with pandas and scikit-learn
+* ML-21: preprocessing and visualisation
 * ML-3: introduction to machine learning. Neural Computing
 * ML-4: introduction to Evolutionary Computing
 * ML-5: introduction to Reinforcement Learning
 * NLP: introduction to NLP
 * LOD: Linked Open Data, exercises and example code
+* SNA: Social Network Analysis

images/.p (new, empty file)

New binary files added (contents not shown in the diff view):
- images/EscUpmPolit_p.gif (3.1 KiB)
- images/cart.png (95 KiB)
- images/data-chart-type.png (34 KiB)
- unnamed binary file (54 KiB)
- images/frozenlake-world.png (67 KiB)
- images/gym-maze.gif (222 KiB)
- images/iris-classes.png (1.4 MiB)
- images/iris-dataset.jpg (44 KiB)
- images/iris-features.png (944 KiB)
- unnamed binary file (237 KiB)
- unnamed binary file (87 KiB)
- unnamed binary file (56 KiB)
- unnamed binary file (87 KiB)
- unnamed binary file (58 KiB)
- images/qlearning-algo.png (85 KiB)
- images/recording.gif (1.8 MiB)
- images/titanic.jpg (152 KiB)


@@ -0,0 +1,484 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<header style=\"width:100%;position:relative\">\n",
" <div style=\"width:80%;float:right;\">\n",
" <h1>Course Notes for Learning Intelligent Systems</h1>\n",
" <h3>Department of Telematic Engineering Systems</h3>\n",
" <h5>Universidad Politécnica de Madrid. © Carlos A. Iglesias </h5>\n",
" </div>\n",
" <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
"</header>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This lecture provides an introduction to RDF and the SPARQL query language.\n",
"\n",
"This is the first in a series of notebooks about SPARQL, which consists of:\n",
"\n",
"* This notebook, which explains basic concepts of RDF and SPARQL\n",
"* [A notebook](01_SPARQL_Introduction.ipynb) that provides an introduction to SPARQL through a collection of exercises of increasing difficulty.\n",
"* [An optional notebook](02_SPARQL_Custom_Endpoint.ipynb) with queries to a custom dataset.\n",
"That notebook is meant to be done after the [RDF exercises](../rdf/RDF.ipynb) and is out of the scope of this course.\n",
"You can consult it if you are interested."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# RDF basics\n",
"This section is taken from [[1](#1), [2](#2)].\n",
"\n",
"RDF allows us to make statements about resources. The format of these statements is simple. A statement always has the following structure:\n",
"\n",
" <subject> <predicate> <object>\n",
" \n",
"An RDF statement expresses a relationship between two resources. The **subject** and the **object** represent the two resources being related; the **predicate** represents the nature of their relationship.\n",
"The relationship is phrased in a directional way (from subject to object).\n",
"In RDF this relationship is known as a **property**.\n",
"Because RDF statements consist of three elements they are called **triples**.\n",
"\n",
"Here are some examples of RDF triples (informally expressed in pseudocode):\n",
"\n",
" <Bob> <is a> <person>.\n",
" <Bob> <is a friend of> <Alice>.\n",
" \n",
"Resources are identified by [IRIs](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier), which can appear in all three positions of a triple. For example, the IRI for Leonardo da Vinci in DBpedia is:\n",
"\n",
" <http://dbpedia.org/resource/Leonardo_da_Vinci>\n",
"\n",
"IRIs can be abbreviated as *prefixed names*. For example, \n",
" PREFIX dbr: <http://dbpedia.org/resource/>\n",
" <dbr:Leonardo_da_Vinci>\n",
" \n",
"Objects can be literals: \n",
" * strings (e.g., \"plain string\" or \"string with language\"@en)\n",
" * numbers (e.g., \"13.4\"^^xsd:float)\n",
" * dates (e.g., )\n",
" * booleans\n",
" * etc.\n",
" \n",
"RDF data is stored in RDF repositories that expose SPARQL endpoints.\n",
"Let's query one of the most famous RDF repositories: [dbpedia](https://wiki.dbpedia.org/).\n",
"First, we should learn how to execute SPARQL in a notebook."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Executing SPARQL in a notebook\n",
"There are several ways to execute SPARQL in a notebook.\n",
"Some of the most popular are:\n",
"\n",
"* using libraries such as [sparql-client](https://pypi.org/project/sparql-client/) or [rdflib](https://rdflib.dev/sparqlwrapper/) that enable executing SPARQL within a Python3 kernel\n",
"* using other libraries. In our case, a lightweight library (the file helpers.py) has been developed for accessing SPARQL endpoints over an HTTP connection.\n",
"* using the [graph notebook package](https://pypi.org/project/graph-notebook/)\n",
"* using a SPARQL kernel [sparql kernel](https://github.com/paulovn/sparql-kernel) instead of the Python3 kernel\n",
"\n",
"\n",
"We are going to use the second option to avoid installing new packages.\n",
"\n",
"To use the library, you need to:\n",
"\n",
"1. Import `sparql` from helpers (i.e., `helpers.py`, a file that is available in the GitHub repository)\n",
"2. Use the `%%sparql` magic command to indicate the SPARQL endpoint and then the SPARQL code.\n",
"\n",
"Let's try it!\n",
"\n",
"# Queries against DBpedia\n",
"\n",
"We are going to execute a SPARQL query against DBpedia. This section is based on [[8](#8)].\n",
"\n",
"First, we just create a query to retrieve arbitrary triples (subject, predicate, object) without any restriction (besides limiting the result to 10 triples)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from helpers import sparql"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"SELECT ?s ?p ?o\n",
"WHERE {\n",
" ?s ?p ?o\n",
"}\n",
"LIMIT 10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, it worked, but the results are not particularly interesting.\n",
"Let's search for a famous football player, Fernando Torres.\n",
"\n",
"To do so, we will search for entities whose English \"human-readable representation\" (i.e., label) matches \"Fernando Torres\":"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" ?athlete rdfs:label \"Fernando Torres\"@en \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Great, we found the IRI of the node: `http://dbpedia.org/resource/Fernando_Torres`\n",
"\n",
"Now we can start asking for more properties.\n",
"\n",
"To do so, go to http://dbpedia.org/resource/Fernando_Torres and you will see all the information available about Fernando Torres. Pay attention to the names of predicates to be able to create new queries. For example, we are interested in knowing where Fernando Torres was born (`dbo:birthPlace`).\n",
"\n",
"Let's go!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" ?athlete rdfs:label \"Fernando Torres\"@en ;\n",
" dbo:birthPlace ?birthPlace . \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we examine the SPARQL query, we find three blocks:\n",
"\n",
"* **PREFIX** section: IRIs of vocabularies and the prefix used below, to avoid long IRIs. e.g., by defining the `dbo` prefix in our example, the `dbo:birthPlace` below expands to `http://dbpedia.org/ontology/birthPlace`.\n",
"* **SELECT** section: variables we want to return (`*` is an abbreviation that selects all of the variables in a query)\n",
"* **WHERE** clause: triple patterns where some elements are variables. These variables are bound during query processing, and the bound variables are returned.\n",
"\n",
"Now take a closer look at the **WHERE** section.\n",
"We said earlier that triples are made out of three elements and each triple pattern should finish with a period (`.`) (although the last pattern can omit this).\n",
"However, when two or more triple patterns share the same subject, we can omit it in all but the first one and use `;` as a separator.\n",
"If both the subject and the predicate are the same, we can use a comma (`,`) instead.\n",
"This allows us to avoid repetition and make queries more readable.\n",
"But don't forget the space before your separators (`;` and `.`).\n",
"\n",
"The result is interesting: we know he was born in Fuenlabrada, but we also see an additional (wrong) value, the Spanish national football team. The conversion process from Wikipedia to DBpedia still needs some tuning :).\n",
"\n",
"We can *fix* this by adding some more constraints.\n",
"In our case, we only want a birth place that is also a municipality (i.e., its type is `http://dbpedia.org/resource/Municipalities_of_Spain`).\n",
"Let's see!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" ?athlete rdfs:label \"Fernando Torres\"@en ;\n",
" dbo:birthPlace ?birthPlace .\n",
" ?birthPlace dbo:type dbr:Municipalities_of_Spain \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Great. Now it looks better.\n",
"Notice that we added a new prefix.\n",
"\n",
"Now, is Fuenlabrada a big city?\n",
"Let's find out.\n",
"\n",
"**Hint**: you can find more subject / object / predicate nodes related to [Fuenlabrada](http://dbpedia.org/resource/Fuenlabrada) in the RDF graph just as we did before.\n",
"That is how we found the `dbo:areaTotal` property."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" dbr:Fuenlabrada dbo:areaTotal ?area \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, it shows 39.1 km$^2$.\n",
"\n",
"Let's go back to our Fernando Torres.\n",
"What we are really interested in is the name of the city he was born in, not its IRI.\n",
"As we saw before, the human-readable name is provided by the `rdfs:label` property:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbp: <http://dbpedia.org/property/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" ?player rdfs:label \"Fernando Torres\"@en ;\n",
" dbo:birthPlace ?birthPlace .\n",
" ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
" rdfs:label ?placeName \n",
" \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, we are almost there. We see that we receive the city name in many languages. We want just the English name. Let's filter!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbp: <http://dbpedia.org/property/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" ?player rdfs:label \"Fernando Torres\"@en ;\n",
" dbo:birthPlace ?birthPlace .\n",
" ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
" rdfs:label ?placeName .\n",
" FILTER ( LANG ( ?placeName ) = 'en' )\n",
" \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Awesome!\n",
"\n",
"But we said we don't care about the IRI of the place. We only want two pieces of data: Fernando's birth date and the name of his birthplace.\n",
"\n",
"Let's tune our query a bit more."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
"\n",
    "SELECT ?birthDate ?placeName\n",
"WHERE\n",
" {\n",
" ?player rdfs:label \"Fernando Torres\"@en ;\n",
" dbo:birthDate ?birthDate ;\n",
" dbo:birthPlace ?birthPlace .\n",
" ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
" rdfs:label ?placeName .\n",
" FILTER ( LANG ( ?placeName ) = 'en' )\n",
" \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Great 😃\n",
"\n",
"Are there many football players born in Fuenlabrada? Let's find out!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbp: <http://dbpedia.org/property/>\n",
"\n",
"SELECT *\n",
"WHERE\n",
" {\n",
" ?player a dbo:SoccerPlayer ; \n",
" dbo:birthPlace dbr:Fuenlabrada . \n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "Well, not that many. Observe that we have used `a`.\n",
    "It is just an abbreviation for `rdf:type`; the two can be used interchangeably.\n",
"\n",
    "If you want additional examples, you can follow the notebook by [Shawn Graham](https://github.com/o-date/sparql-and-lod/blob/master/sparql-intro.ipynb), which is based on the SPARQL tutorial by Matthew Lincoln, available [here in English](https://programminghistorian.org/en/lessons/retired/graph-databases-and-SPARQL) and [here in Spanish](https://programminghistorian.org/es/lecciones/retirada/sparql-datos-abiertos-enlazados). You also have a local copy of these tutorials together with this notebook [here in English](https://htmlpreview.github.io/?https://github.com/gsi-upm/sitc/blob/master/lod/tutorial/graph-databases-and-SPARQL.html) and [here in Spanish](https://htmlpreview.github.io/?https://github.com/gsi-upm/sitc/blob/master/lod/tutorial/sparql-datos-abiertos-enlazados.html).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* <a id=\"1\">[1]</a> [SPARQL by Example. A Tutorial. Lee Feigenbaum. W3C, 2009](https://www.w3.org/2009/Talks/0615-qbe/#q1)\n",
"* <a id=\"2\">[2]</a> [RDF Primer W3C](https://www.w3.org/TR/rdf11-primer/)\n",
"* <a id=\"3\">[3]</a> [SPARQL queries of Beatles recording sessions](http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html)\n",
"* <a id=\"4\">[4]</a> [RDFLib documentation](https://rdflib.readthedocs.io/en/stable/).\n",
"* <a id=\"5\">[5]</a> [Wikidata Query Service query examples](https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples)\n",
"* <a id=\"6\">[6]</a> [RDF Graph Data Model. Learn about the RDF graph model used by Stardog.](https://www.stardog.com/tutorials/data-model)\n",
"* <a id=\"7\">[7]</a> [Learn SPARQL Write Knowledge Graph queries using SPARQL with step-by-step examples.](https://www.stardog.com/tutorials/sparql/)\n",
"* <a id=\"8\">[8]</a> [Running Basic SPARQL Queries Against DBpedia.](https://medium.com/virtuoso-blog/dbpedia-basic-queries-bc1ac172cc09)\n",
    "* <a id=\"9\">[9]</a> [Intro SPARQL based on painters](https://github.com/o-date/sparql-and-lod/blob/master/sparql-intro.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Licence\n",
    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]
}
],
"metadata": {
"datacleaner": {
"position": {
"top": "50px"
},
"python": {
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
},
"window_display": false
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}
"checksum": "e4e898c8a16b8aa5865dfde2f6e68ec6",
"grade": false, "grade": false,
"grade_id": "cell-d750b6d64c6aa0a7", "grade_id": "cell-d750b6d64c6aa0a7",
"locked": false, "locked": false,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n", "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
"\n", "\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n", "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
"PREFIX s: <http://learningsparql.com/ns/schema/>\n", "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
@@ -1469,7 +1509,9 @@
"source": [ "source": [
"Now, count how many instruments each musician have played in a song.\n", "Now, count how many instruments each musician have played in a song.\n",
"\n", "\n",
"**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**." "**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**.\n",
"\n",
"Use `?musician` for the musician and `?number` for the count."
] ]
}, },
{ {
@@ -1478,17 +1520,18 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"checksum": "2d82df272d43f678d3b19bf0b41530c1", "cell_type": "code",
"checksum": "fade6ab714376e0eabfa595dd6bd6a8b",
"grade": false, "grade": false,
"grade_id": "cell-2f5aa516f8191787", "grade_id": "cell-2f5aa516f8191787",
"locked": false, "locked": false,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n", "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
"\n", "\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n", "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
"PREFIX s: <http://learningsparql.com/ns/schema/>\n", "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
@@ -1513,12 +1556,13 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"checksum": "bc83dd9577c9111b1f0ef5bd40c4ec08", "cell_type": "code",
"checksum": "33e93ec2a3d1f9eb4b0310d4651b74c2",
"grade": true, "grade": true,
"grade_id": "cell-bcd0f7e26b6c11c2", "grade_id": "cell-bcd0f7e26b6c11c2",
"locked": true, "locked": true,
"points": 0, "points": 1,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -1533,7 +1577,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Which songs had Ringo in dums OR Lennon in lead vocals? (UNION)" "### Which songs had Ringo in drums OR Lennon in lead vocals? (UNION)"
] ]
}, },
{ {
@@ -1567,17 +1611,18 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"checksum": "a1e20e2be817a592683dea89eed0120e", "cell_type": "code",
"checksum": "09262d81449c498c37e4b9d9b1dcdfed",
"grade": false, "grade": false,
"grade_id": "cell-d3a742bd87d9c793", "grade_id": "cell-d3a742bd87d9c793",
"locked": false, "locked": false,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n", "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
"\n", "\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n", "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
"PREFIX s: <http://learningsparql.com/ns/schema/>\n", "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
@@ -1597,18 +1642,19 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"checksum": "087630476d73bb415b065fafbd6024f0", "cell_type": "code",
"checksum": "11061e79ec06ccb3a9c496319a528366",
"grade": true, "grade": true,
"grade_id": "cell-409402df0e801d09", "grade_id": "cell-409402df0e801d09",
"locked": true, "locked": true,
"points": 0, "points": 1,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"assert len(solution()['tuples']) == 246" "assert len(solution()['tuples']) == 209"
] ]
}, },
{ {
@@ -1648,17 +1694,18 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"checksum": "1d2cb88412c89c35861a4f9fccea3bf2", "cell_type": "code",
"checksum": "9ddd2d1f50f841b889bfd29b175d06da",
"grade": false, "grade": false,
"grade_id": "cell-9d1ec854eb530235", "grade_id": "cell-9d1ec854eb530235",
"locked": false, "locked": false,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n", "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
"\n", "\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n", "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
"\n", "\n",
@@ -1680,12 +1727,13 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"checksum": "aa20aa4d11632ea5bd6004df3187d979", "cell_type": "code",
"checksum": "0ea5496acd1c3edd9e188b351690a533",
"grade": true, "grade": true,
"grade_id": "cell-a79c688b4566dbe8", "grade_id": "cell-a79c688b4566dbe8",
"locked": true, "locked": true,
"points": 0, "points": 1,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -1729,7 +1777,9 @@
"\n", "\n",
"Using `GROUP_CONCAT`, get a list of the instruments that each musician could play.\n", "Using `GROUP_CONCAT`, get a list of the instruments that each musician could play.\n",
"\n", "\n",
"You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/)." "You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/).\n",
"\n",
"Use `?musician` for the musician and `?instruments` for the list of instruments."
] ]
}, },
{ {
@@ -1738,17 +1788,18 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"checksum": "508b7f8656e849838aa93cd38f1c6635", "cell_type": "code",
"checksum": "d18e8b6e1d32aed395a533febb29fcb5",
"grade": false, "grade": false,
"grade_id": "cell-7ea1f5154cdd8324", "grade_id": "cell-7ea1f5154cdd8324",
"locked": false, "locked": false,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n", "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n", "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
"PREFIX s: <http://learningsparql.com/ns/schema/>\n", "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n", "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1773,7 +1824,9 @@
"\n", "\n",
"You can check if a string or URI matches a regular expression with `regex(?variable, \"<regex>\", \"i\")`.\n", "You can check if a string or URI matches a regular expression with `regex(?variable, \"<regex>\", \"i\")`.\n",
"\n", "\n",
"The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/)." "The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/).\n",
"\n",
"Use `?instrument` for the instrument and `?ins` for the url of the type."
] ]
}, },
{ {
@@ -1782,17 +1835,18 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"checksum": "cff1f9c034393f8af055e1f930d5fe32", "cell_type": "code",
"checksum": "f926fa3a3568d122454a12312859cda1",
"grade": false, "grade": false,
"grade_id": "cell-b6bee887a1b1fc60", "grade_id": "cell-b6bee887a1b1fc60",
"locked": false, "locked": false,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n", "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n", "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
"PREFIX s: <http://learningsparql.com/ns/schema/>\n", "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n", "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1830,7 +1884,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -1844,9 +1898,22 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.2" "version": "3.8.10"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
} }
}, },
"nbformat": 4, "nbformat": 4,
"nbformat_minor": 2 "nbformat_minor": 4
} }


@@ -6,11 +6,12 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"cell_type": "markdown",
"checksum": "7276f055a8c504d3c80098c62ed41a4f", "checksum": "7276f055a8c504d3c80098c62ed41a4f",
"grade": false, "grade": false,
"grade_id": "cell-0bfe38f97f6ab2d2", "grade_id": "cell-0bfe38f97f6ab2d2",
"locked": true, "locked": true,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -31,11 +32,12 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"cell_type": "markdown",
"checksum": "42642609861283bc33914d16750b7efa", "checksum": "42642609861283bc33914d16750b7efa",
"grade": false, "grade": false,
"grade_id": "cell-0cd673883ee592d1", "grade_id": "cell-0cd673883ee592d1",
"locked": true, "locked": true,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -59,11 +61,12 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"cell_type": "markdown",
"checksum": "a3ecb4b300a5ab82376a4a8cb01f7e6b", "checksum": "a3ecb4b300a5ab82376a4a8cb01f7e6b",
"grade": false, "grade": false,
"grade_id": "cell-10264483046abcc4", "grade_id": "cell-10264483046abcc4",
"locked": true, "locked": true,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -80,11 +83,12 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"cell_type": "markdown",
"checksum": "2fedf0d73fc90104d1ab72c3413dfc83", "checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
"grade": false, "grade": false,
"grade_id": "cell-4f8492996e74bf20", "grade_id": "cell-4f8492996e74bf20",
"locked": true, "locked": true,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -100,11 +104,12 @@
"deletable": false, "deletable": false,
"editable": false, "editable": false,
"nbgrader": { "nbgrader": {
"cell_type": "markdown",
"checksum": "c5f8646518bd832a47d71f9d3218237a", "checksum": "c5f8646518bd832a47d71f9d3218237a",
"grade": false, "grade": false,
"grade_id": "cell-eb13908482825e42", "grade_id": "cell-eb13908482825e42",
"locked": true, "locked": true,
"schema_version": 1, "schema_version": 3,
"solution": false "solution": false
} }
}, },
@@ -148,7 +153,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n", "%%sparql http://fuseki.gsi.upm.es/hotels\n",
" \n", " \n",
"SELECT ?g (COUNT(?s) as ?count) WHERE {\n", "SELECT ?g (COUNT(?s) as ?count) WHERE {\n",
" GRAPH ?g {\n", " GRAPH ?g {\n",
@@ -160,14 +165,12 @@
] ]
}, },
{ {
"cell_type": "code", "cell_type": "markdown",
"execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [],
"source": [ "source": [
"You should see many graphs, with different triple counts.\n", "You should see many graphs, with different triple counts.\n",
"\n", "\n",
"The biggest one should be http://fuseki.cluster.gsi.dit.upm.es/synthetic" "The biggest one should be http://fuseki.gsi.upm.es/synthetic"
] ]
}, },
{ {
@@ -183,11 +186,11 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n", "%%sparql http://fuseki.gsi.upm.es/hotels\n",
" \n", " \n",
"SELECT *\n", "SELECT *\n",
"WHERE {\n", "WHERE {\n",
" GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic>{\n", " GRAPH <http://fuseki.gsi.upm.es/synthetic>{\n",
" ?s ?p ?o .\n", " ?s ?p ?o .\n",
" }\n", " }\n",
"}\n", "}\n",
@@ -233,13 +236,13 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n", "%%sparql http://fuseki.gsi.upm.es/hotels\n",
"\n", "\n",
"PREFIX schema: <http://schema.org/>\n", "PREFIX schema: <http://schema.org/>\n",
" \n", " \n",
"SELECT ?s ?o\n", "SELECT ?s ?o\n",
"WHERE {\n", "WHERE {\n",
" GRAPH <http://fuseki.cluster.gsi.dit.upm.es/35c20a49f8c6581be1cf7bd56d12d131>{\n", " GRAPH <http://fuseki.gsi.upm.es/35c20a49f8c6581be1cf7bd56d12d131>{\n",
" ?s a ?o .\n", " ?s a ?o .\n",
" }\n", " }\n",
"\n", "\n",
@@ -264,11 +267,11 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n", "%%sparql http://fuseki.gsi.upm.es/hotels\n",
" \n", " \n",
"SELECT *\n", "SELECT *\n",
"WHERE {\n", "WHERE {\n",
" GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic>{\n", " GRAPH <http://fuseki.gsi.upm.es/synthetic>{\n",
" ?s ?p ?o .\n", " ?s ?p ?o .\n",
" }\n", " }\n",
"}\n", "}\n",
@@ -295,7 +298,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n", "%%sparql http://fuseki.gsi.upm.es/hotels\n",
"\n", "\n",
"PREFIX schema: <http://schema.org/>\n", "PREFIX schema: <http://schema.org/>\n",
" \n", " \n",
@@ -308,7 +311,7 @@
" SELECT ?g\n", " SELECT ?g\n",
" WHERE {\n", " WHERE {\n",
" GRAPH ?g {}\n", " GRAPH ?g {}\n",
" FILTER (str(?g) != 'http://fuseki.cluster.gsi.dit.upm.es/synthetic')\n", " FILTER (str(?g) != 'http://fuseki.gsi.upm.es/synthetic')\n",
" }\n", " }\n",
" }\n", " }\n",
"\n", "\n",
@@ -339,12 +342,13 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"cell_type": "code",
"checksum": "860c3977cd06736f1342d535944dbb63", "checksum": "860c3977cd06736f1342d535944dbb63",
"grade": true, "grade": true,
"grade_id": "cell-9bd08e4f5842cb89", "grade_id": "cell-9bd08e4f5842cb89",
"locked": false, "locked": false,
"points": 0, "points": 0,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
@@ -366,12 +370,13 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"cell_type": "code",
"checksum": "1946a7ed4aba8d168bb3fad898c05651", "checksum": "1946a7ed4aba8d168bb3fad898c05651",
"grade": true, "grade": true,
"grade_id": "cell-9dc1c9033198bb18", "grade_id": "cell-9dc1c9033198bb18",
"locked": false, "locked": false,
"points": 0, "points": 0,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
@@ -393,12 +398,13 @@
"metadata": { "metadata": {
"deletable": false, "deletable": false,
"nbgrader": { "nbgrader": {
"cell_type": "code",
"checksum": "6714abc5226618b76dc4c1aaed6d1a49", "checksum": "6714abc5226618b76dc4c1aaed6d1a49",
"grade": true, "grade": true,
"grade_id": "cell-6c18003ced54be23", "grade_id": "cell-6c18003ced54be23",
"locked": false, "locked": false,
"points": 0, "points": 0,
"schema_version": 1, "schema_version": 3,
"solution": true "solution": true
} }
}, },
@@ -435,7 +441,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -449,7 +455,20 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.2" "version": "3.8.10"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
} }
}, },
"nbformat": 4, "nbformat": 4,

lod/03_SPARQL_Writers.ipynb (new file)

File diff suppressed because it is too large

@@ -0,0 +1,652 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
"grade": false,
"grade_id": "cell-0bfe38f97f6ab2d2",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"<header style=\"width:100%;position:relative\">\n",
" <div style=\"width:80%;float:right;\">\n",
" <h1>Course Notes for Learning Intelligent Systems</h1>\n",
" <h3>Department of Telematic Engineering Systems</h3>\n",
" <h5>Universidad Politécnica de Madrid</h5>\n",
" </div>\n",
" <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
"</header>"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "bd478e6253226d24ba7f33cb9f6ba706",
"grade": false,
"grade_id": "cell-0cd673883ee592d1",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"## Advanced SPARQL\n",
"\n",
"This notebook complements [the SPARQL notebook](./01_SPARQL.ipynb) with some advanced commands.\n",
"\n",
"If you have not completed the exercises in the previous notebook, please do so before continuing.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "9ea4fd529653214745b937d5fc4559e5",
"grade": false,
"grade_id": "cell-10264483046abcc4",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"## Objectives\n",
"\n",
"* To cover some SPARQL concepts that are less frequently used "
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
"grade": false,
"grade_id": "cell-4f8492996e74bf20",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"## Tools\n",
"\n",
"See [the SPARQL notebook](./01_SPARQL_Introduction.ipynb#Tools)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c5f8646518bd832a47d71f9d3218237a",
"grade": false,
"grade_id": "cell-eb13908482825e42",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Run this line to enable the `%%sparql` magic command."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from helpers import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercises"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Working with dates"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To explore dates, we will focus on our Writers example.\n",
"\n",
"First, search for writers born in the XX century.\n",
"You can use a special filter, knowing that `\"2000\"^^xsd:date` is the first date of year 2000."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1a23c8b9a53f7ae28f28b1c23b9706b5",
"grade": false,
"grade_id": "cell-ab7755944d46f9ca",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct: <http://purl.org/dc/terms/>\n",
"PREFIX dbc: <http://dbpedia.org/resource/Category:>\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n",
"\n",
"SELECT ?escritor ?nombre (year(?fechaNac) as ?nac)\n",
"WHERE {\n",
" ?escritor dct:subject dbc:Spanish_novelists ;\n",
" rdfs:label ?nombre ;\n",
" dbo:birthDate ?fechaNac .\n",
" FILTER(lang(?nombre) = \"es\") .\n",
" # YOUR ANSWER HERE\n",
"}\n",
"# YOUR ANSWER HERE\n",
"LIMIT 1000"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e261d808f509c1e29227db94d1eab784",
"grade": true,
"grade_id": "cell-cf3821f2d33fb0f6",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"assert 'Ramiro Ledesma' in solution()['columns']['nombre']\n",
"assert 'Ray Loriga' in solution()['columns']['nombre']\n",
"assert all(int(x) > 1899 and int(x) < 2001 for x in solution()['columns']['nac'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, get the list of Spanish novelists that are still alive.\n",
"\n",
"A person is alive if their death date is not defined and the were born less than 100 years ago.\n",
"\n",
"Remember, we can check whether the optional value for a key was bound in a SPARQL query using `BOUND(?key)`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e4579d551790c33ba4662562c6a05d99",
"grade": false,
"grade_id": "cell-474b1a72dec6827c",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"%%sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
"\n",
"SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac\n",
"\n",
"WHERE {\n",
" ?escritor dct:subject dbc:Spanish_novelists .\n",
" ?escritor rdfs:label ?nombre .\n",
" ?escritor dbo:birthDate ?fechaNac .\n",
"# YOUR ANSWER HERE\n",
" FILTER(lang(?nombre) = \"es\") .\n",
"}\n",
"# YOUR ANSWER HERE\n",
"LIMIT 1000"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "770bbddef5210c28486a1929e4513ada",
"grade": true,
"grade_id": "cell-46b62dd2856bc919",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"assert 'Fernando Arrabal' in solution()['columns']['nombre']\n",
"assert 'Albert Espinosa' in solution()['columns']['nombre']\n",
"for year in solution()['columns']['nac']:\n",
" assert int(year) >= 1918"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with badly formatted dates (OPTIONAL!)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, get the list of Spanish novelists that died before their fifties (i.e. younger than 50 years old), or that aren't 50 years old yet.\n",
"\n",
"For the sake of simplicity, you can use the `year(<date>)` function.\n",
"\n",
"Hint: you can use boolean logic in your filters (e.g. `&&` and `||`).\n",
"\n",
"Hint 2: Some dates are not formatted properly, which makes some queries fail when they shouldn't. As a workaround, you could convert the date to string, and back to date again: `xsd:dateTime(str(?date))`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e55173801ab36337ad356a1bc286dbd1",
"grade": false,
"grade_id": "cell-ceefd3c8fbd39d79",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
"\n",
"SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac, ?fechaDef\n",
"\n",
"WHERE {\n",
" ?escritor dct:subject dbc:Spanish_novelists .\n",
" ?escritor rdfs:label ?nombre .\n",
" ?escritor dbo:birthDate ?fechaNac .\n",
" # YOUR ANSWER HERE\n",
"}\n",
"# YOUR ANSWER HERE\n",
"LIMIT 100"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "1b77cfaefb8b2ec286ce7b0c70804fe0",
"grade": true,
"grade_id": "cell-461cd6ccc6c2dc79",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"assert 'Javier Sierra' in solution()['columns']['nombre']\n",
"assert 'http://dbpedia.org/resource/José_Ángel_Mañas' in solution()['columns']['escritor']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Regular expressions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Regular expressions](https://www.w3.org/TR/rdf-sparql-query/#funcex-regex) are a very powerful tool, but we will only cover the basics in this exercise.\n",
"\n",
"In essence, regular expressions match strings against patterns.\n",
"In their simplest form, they can be used to find substrings within a variable.\n",
"For instance, using `regex(?label, \"substring\")` would only match if and only if the `?label` variable contains `substring`.\n",
"But regular expressions can be more complex than that.\n",
"For instance, we can find patterns such as: a 10 digit number, a 5 character long string, or variables without whitespaces.\n",
"\n",
"The syntax of the regex function is the following:\n",
"\n",
"```\n",
"regex(?variable, \"pattern\", \"flags\")\n",
"```\n",
"\n",
"Flags are optional configuration options for the regular expression, such as *do not care about case* (`i` flag).\n",
"\n",
"As an example, let us find the cities in Madrid that contain \"de\" in their name."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"SELECT ?localidad\n",
"WHERE {\n",
" ?localidad <http://dbpedia.org/ontology/isPartOf> <http://dbpedia.org/resource/Community_of_Madrid> .\n",
" ?localidad rdfs:label ?nombre .\n",
" FILTER (lang(?nombre) = \"es\" ).\n",
" FILTER regex(?nombre, \"de\", \"i\")\n",
"}\n",
"LIMIT 10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, use regular expressions to find Spanish novelists whose **first name** is Juan.\n",
"In other words, their name **starts with** \"Juan\"."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b70a9a4f102c253e864d2e8aec79ce81",
"grade": false,
"grade_id": "cell-a57d3546a812f689",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
"\n",
"# YOUR ANSWER HERE\n",
"\n",
"WHERE {\n",
" {\n",
" ?escritor dct:subject dbc:Spanish_poets .\n",
" }\n",
" UNION {\n",
" ?escritor dct:subject dbc:Spanish_novelists .\n",
" }\n",
" ?escritor rdfs:label ?nombre\n",
" FILTER(lang(?nombre) = \"es\") .\n",
"# YOUR ANSWER HERE\n",
"}\n",
"ORDER BY ?nombre\n",
"LIMIT 1000"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "66db9abddfafa91c2dc25577457f71fb",
"grade": true,
"grade_id": "cell-c149fe65008f39a9",
"locked": true,
"points": 0,
"schema_version": 3,
"solution": false
}
},
"outputs": [],
"source": [
"assert len(solution()['columns']['nombre']) > 15\n",
"for i in solution()['columns']['nombre']:\n",
" assert 'Juan' in i\n",
"assert \"Robert Juan-Cantavella\" not in solution()['columns']['nombre']"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "1be6d6e4d8e74240ef07deffcbe5e71a",
"grade": false,
"grade_id": "cell-0c2f0113d97dc9de",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"## Group concat"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "c8dbb73a781bd24080804f289a1cea0b",
"grade": false,
"grade_id": "asdasdasdddddddddddasdasdsad",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Sometimes, it is useful to aggregate results from form different rows.\n",
"For instance, we might want to get a comma-separated list of the names in each each autonomous community in Spain.\n",
"\n",
"In those cases, we can use the `GROUP_CONCAT` function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
" \n",
"SELECT ?com, GROUP_CONCAT(?name, \",\") as ?places # notice how we rename the variable\n",
"\n",
"WHERE {\n",
" ?com dct:subject dbc:Autonomous_communities_of_Spain .\n",
" ?localidad dbo:subdivision ?com ;\n",
" rdfs:label ?name .\n",
" FILTER (lang(?name)=\"es\")\n",
"}\n",
"\n",
"ORDER BY ?com\n",
"LIMIT 100"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"cell_type": "markdown",
"checksum": "4779fb61645634308d0ed01e0c88e8a4",
"grade": false,
"grade_id": "asdiopjasdoijasdoijasd",
"locked": true,
"schema_version": 3,
"solution": false
}
},
"source": [
"Try it yourself: get a list of works by each of the authors in this query:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "e5d87d1d8eba51c510241ba75981a597",
"grade": false,
"grade_id": "cell-2e3de17c75047652",
"locked": false,
"schema_version": 3,
"solution": true
}
},
"outputs": [],
"source": [
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
"\n",
"# YOUR ANSWER HERE\n",
"\n",
"WHERE {\n",
" ?escritor a dbo:Writer .\n",
" ?escritor rdfs:label ?nombre .\n",
" ?escritor dbo:birthDate ?fechaNac .\n",
" ?escritor dbo:birthPlace dbr:Madrid .\n",
" # YOUR ANSWER HERE\n",
" FILTER(lang(?nombre) = \"es\") .\n",
" FILTER(!bound(?titulo) || lang(?titulo) = \"en\") .\n",
"\n",
"}\n",
"ORDER BY ?nombre\n",
"LIMIT 100"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© 2018 Universidad Politécnica de Madrid."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

lod/BeatlesMusicians.ttl (new file, 4352 lines; diff suppressed: too large)

@@ -12,6 +12,7 @@ from urllib.request import Request, urlopen
 from urllib.parse import quote_plus, urlencode
 from urllib.error import HTTPError
+import ssl
 import json
 import sys
@@ -20,7 +21,9 @@ display_javascript(js, raw=True)
 def send_query(query, endpoint):
-    FORMATS = ",".join(["application/sparql-results+json", "text/javascript", "application/json"])
+    FORMATS = ",".join(["application/sparql-results+json",
+                        "text/javascript",
+                        "application/json"])
     data = {'query': query}
     # b = quote_plus(query)
@@ -30,10 +33,18 @@ def send_query(query, endpoint):
                 headers={'content-type': 'application/x-www-form-urlencoded',
                          'accept': FORMATS},
                 method='POST')
-    res = urlopen(r)
+    context = ssl.create_default_context()
+    context.check_hostname = False
+    context.verify_mode = ssl.CERT_NONE
+    res = urlopen(r, context=context, timeout=2)
     data = res.read().decode('utf-8')
     if res.getcode() == 200:
-        return json.loads(data)
+        try:
+            return json.loads(data)
+        except Exception:
+            print('Got: ', data, file=sys.stderr)
+            raise
     raise Exception('Error getting results: {}'.format(data))
@@ -60,7 +71,7 @@ def solution():
 def query(query, endpoint=None, print_table=False):
     global LAST_QUERY
-    endpoint = endpoint or "http://fuseki.cluster.gsi.dit.upm.es/sitc/"
+    endpoint = endpoint or "http://fuseki.gsi.upm.es/sitc/"
     results = send_query(query, endpoint)
     tuples = to_table(results)

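The patch above hardens the helper in three ways: it splits the `Accept` formats for readability, disables TLS certificate verification (so a self-signed endpoint keeps working) with a 2-second timeout, and prints the raw response before re-raising if the body is not JSON. A minimal standalone sketch of the same request-building pattern — `build_request` is a hypothetical name; the default endpoint, headers, and SSL settings are taken from the diff:

```python
import ssl
from urllib.parse import urlencode
from urllib.request import Request

# Accept headers used by the notebook helper.
FORMATS = ",".join(["application/sparql-results+json",
                    "text/javascript",
                    "application/json"])

def build_request(query, endpoint="http://fuseki.gsi.upm.es/sitc/"):
    """Build the POST request and the relaxed SSL context from the patch."""
    data = urlencode({'query': query}).encode('utf-8')
    r = Request(endpoint, data=data,
                headers={'content-type': 'application/x-www-form-urlencoded',
                         'accept': FORMATS},
                method='POST')
    # Certificate verification is disabled on purpose; the patch then
    # calls urlopen(r, context=context, timeout=2) with this context.
    context = ssl.create_default_context()
    context.check_hostname = False   # must be cleared before verify_mode
    context.verify_mode = ssl.CERT_NONE
    return r, context
```

Note the ordering constraint: `check_hostname` must be set to `False` before `verify_mode = ssl.CERT_NONE`, or the `ssl` module raises a `ValueError`.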
lod/tests.py (new file)

@@ -0,0 +1,61 @@
.highlight { padding-top: 0; margin: 0;}
.highlight .c { color: #999988; font-style: italic } /* Comment */
.highlight .err { color: #a61717; background-color: #e3d2d2 } /* Error */
.highlight .k { color: #000000; font-weight: bold } /* Keyword */
.highlight .o { color: #000000; font-weight: bold } /* Operator */
.highlight .cm { color: #999988; font-style: italic } /* Comment.Multiline */
.highlight .cp { color: #999999; font-weight: bold; font-style: italic } /* Comment.Preproc */
.highlight .c1 { color: #999988; font-style: italic } /* Comment.Single */
.highlight .cs { color: #999999; font-weight: bold; font-style: italic } /* Comment.Special */
.highlight .gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */
.highlight .ge { color: #000000; font-style: italic } /* Generic.Emph */
.highlight .gr { color: #aa0000 } /* Generic.Error */
.highlight .gh { color: #999999 } /* Generic.Heading */
.highlight .gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */
.highlight .go { color: #888888 } /* Generic.Output */
.highlight .gp { color: #555555 } /* Generic.Prompt */
.highlight .gs { font-weight: bold } /* Generic.Strong */
.highlight .gu { color: #aaaaaa } /* Generic.Subheading */
.highlight .gt { color: #aa0000 } /* Generic.Traceback */
.highlight .kc { color: #000000; font-weight: bold } /* Keyword.Constant */
.highlight .kd { color: #000000; font-weight: bold } /* Keyword.Declaration */
.highlight .kn { color: #000000; font-weight: bold } /* Keyword.Namespace */
.highlight .kp { color: #000000; font-weight: bold } /* Keyword.Pseudo */
.highlight .kr { color: #000000; font-weight: bold } /* Keyword.Reserved */
.highlight .kt { color: #445588; font-weight: bold } /* Keyword.Type */
.highlight .m { color: #009999 } /* Literal.Number */
.highlight .s { color: #d01040 } /* Literal.String */
.highlight .na { color: #008080 } /* Name.Attribute */
.highlight .nb { color: #0086B3 } /* Name.Builtin */
.highlight .nc { color: #445588; font-weight: bold } /* Name.Class */
.highlight .no { color: #008080 } /* Name.Constant */
.highlight .nd { color: #3c5d5d; font-weight: bold } /* Name.Decorator */
.highlight .ni { color: #800080 } /* Name.Entity */
.highlight .ne { color: #990000; font-weight: bold } /* Name.Exception */
.highlight .nf { color: #990000; font-weight: bold } /* Name.Function */
.highlight .nl { color: #990000; font-weight: bold } /* Name.Label */
.highlight .nn { color: #555555 } /* Name.Namespace */
.highlight .nt { color: #000080 } /* Name.Tag */
.highlight .nv { color: #008080 } /* Name.Variable */
.highlight .ow { color: #000000; font-weight: bold } /* Operator.Word */
.highlight .w { color: #bbbbbb } /* Text.Whitespace */
.highlight .mf { color: #009999 } /* Literal.Number.Float */
.highlight .mh { color: #009999 } /* Literal.Number.Hex */
.highlight .mi { color: #009999 } /* Literal.Number.Integer */
.highlight .mo { color: #009999 } /* Literal.Number.Oct */
.highlight .sb { color: #d01040 } /* Literal.String.Backtick */
.highlight .sc { color: #d01040 } /* Literal.String.Char */
.highlight .sd { color: #d01040 } /* Literal.String.Doc */
.highlight .s2 { color: #d01040 } /* Literal.String.Double */
.highlight .se { color: #d01040 } /* Literal.String.Escape */
.highlight .sh { color: #d01040 } /* Literal.String.Heredoc */
.highlight .si { color: #d01040 } /* Literal.String.Interpol */
.highlight .sx { color: #d01040 } /* Literal.String.Other */
.highlight .sr { color: #009926 } /* Literal.String.Regex */
.highlight .s1 { color: #d01040 } /* Literal.String.Single */
.highlight .ss { color: #990073 } /* Literal.String.Symbol */
.highlight .bp { color: #999999 } /* Name.Builtin.Pseudo */
.highlight .vc { color: #008080 } /* Name.Variable.Class */
.highlight .vg { color: #008080 } /* Name.Variable.Global */
.highlight .vi { color: #008080 } /* Name.Variable.Instance */
.highlight .il { color: #009999 } /* Literal.Number.Integer.Long */

lod/tutorial/css/style.css (new file, 968 lines)

@@ -0,0 +1,968 @@
@media screen {
body {
font-family: 'Quattrocento', Verdana, sans-serif;
font-size:16px;
background-color:#ffffff;
}
.container {
max-width: 48rem;
overflow: hidden;
text-overflow: ellipsis;
}
/* =============================================================================
Helper classes
========================================================================== */
.noclear {
clear:none;
}
.expanded {
max-width: 58rem;
}
.garnish {
width: 23%;
padding:0;
}
.full-width {
width:80%;
margin: 0 auto;
text-align:center;
}
.float-right {
float:right;
margin-left: 1rem;
margin-bottom: 1rem;
}
.float-left {
margin-right: 1rem;
margin-bottom: 1rem;
}
/* =============================================================================
Home Page
========================================================================== */
.home-block {
padding:3rem 0;
color:#666;
}
.home-block h2 {
margin:0;
font-size:2.8rem;
color:#333;
text-align:center;
}
.home-block p {
margin:0rem;
font-family:'Open Sans';
font-size:1.2rem;
padding-top:2rem;
text-align:justify;
}
.home-block a:visited {
color: #38c;
}
.home-stripe-1 {
color:#eee;
background:#27b;
}
.home-stripe-1 h2, .home-stripe-2 h2 {
color:#fff;
}
.home-stripe-1 a:visited, .home-stripe-1 a:link {
color:#6bf;
}
.home-stripe-2 {
color:#fff;
background:#289;
}
.home-stripe-2 a:visited, .home-stripe-2 a:link {
color:#6cd;
}
.home-image {
width: 75%;
}
.home-logo img {
width: 200px;
}
.home-logo a h1 {
color: #fff;
}
.home-logo {
color: #fff;
}
.home-logo li {
font-size: 1.2rem;
}
.en-back {
background-color: #444444;
}
.es-back {
background-color: #535D7F;
}
.fr-back {
background-color: #3D7C81;
}
.pt-back {
background-color: #d6b664;
}
.sitewide-alert {
position: relative;
margin-bottom: 0;
}
/* =============================================================================
Lesson Headers
========================================================================== */
header {
margin:-3rem 0 3rem 0;
padding:0;
font-family:'Roboto', sans-serif;
color:#ccc;
background: #efefef;
border-top:1px solid #333;
border-bottom:1px solid #333;
text-align:left;
}
header .container-fluid {
margin:0;
padding:1rem;
background: #f5f5f5;
}
header h1 {
margin:0;
padding:0;
font-size:1.8rem;
text-align:left;
}
header h2 {
font-family:'Roboto', sans-serif;
font-size:1.2rem;
color:#333;
margin: 1.5rem 0 1.5rem 0rem;
text-align:left;
}
header h3, header h4 {
font: .9rem/1.1rem 'Roboto Condensed', sans-serif;
text-transform:uppercase;
font-variant:small-caps;
letter-spacing:80%;
color:#666;
margin:.3rem 0 0 0;
padding:0;
}
header h4 {
display:inline;
margin:0;
line-height:1.3rem;
}
header .header-image {
float:left;
border:.2rem solid gray;
margin:0;
padding:0;
max-width: 200px;
}
header .header-abstract {
font: 1rem/1.4rem 'Roboto', sans-serif;
color:#666;
margin:1rem 0;
}
header .header-helpers {
clear:both;
background:#ccc;
color:#fff;
border-top:1px solid #999;
border-bottom:1px solid #999;
}
header ul {
margin:0;
padding:0;
list-style-type: none;
}
header li, header .metarow {
font: .9rem/1.1rem 'Roboto Condensed';
}
header .metarow {
color:#999;
}
header .peer-review, header .open-license {
font-size: 0.9rem;
color: #666;
margin: 0;
}
/* =============================================================================
Lessons Index
========================================================================== */
/*****************
FILTER BUTTONS
******************/
ul.filter, ul.sort-by {
margin: 0 0 1rem 0;
padding: 0px;
text-align:center;
}
li.filter,
li.sort,
#filter-none {
font: .9rem/1.1rem 'Open Sans', sans-serif;
padding: .4rem .6rem;
border:none;
border-radius: 3px;
display:inline-block;
text-transform:uppercase;
text-decoration: none;
}
.filter li:hover,
.sort-by li:hover,
#filter-none:hover {
cursor: pointer;
}
.activities li.current:hover,
.filter li.current:hover,
.sort-by li.current:hover {
cursor:default;
}
.topic li a {
text-decoration: none;
}
.activities li {
background-color:#38c;
color:#fff;
}
.activities li:hover {
background-color:#16a;
}
.activities li.current {
background-color:#059;
}
.topics li {
background-color:#eee;
color: #38a;
}
.topics li:hover {
background-color:#ccc;
}
.topics li.current {
background-color:#aaa;
color: #333;
}
#filter-none {
width:99.5%;
clear:both;
text-align:center;
margin-bottom:1rem;
background-color:#fefefe;
color:#666;
border:1px solid #999;
}
#filter-none:hover {
background-color:#ededed;
}
/*****************
SEARCH
*****************/
.search-input {
width:55%;
clear:both;
margin-bottom:1rem;
background-color:#fefefe;
color:#666;
border:1px solid #999;
font: .9rem/1.1rem 'Open Sans',
sans-serif;
padding: .4rem .6rem;
border-radius: 3px;
display:inline-block;
text-transform:uppercase;
text-decoration: none;
}
#search-button,
#enable-search-button {
background-color: #efefef;
color: rgb(153, 143, 143);
width: 35%;
font: .9rem/1.1rem 'Open Sans',
sans-serif;
padding: .4rem .6rem;
border: none;
border-radius: 3px;
display: inline-block;
text-transform: uppercase;
text-decoration: none;
}
@media only screen and (max-width: 767px) {
/* phones */
#search-button,
#enable-search-button {
width: 80%;
}
}
#search-info-button {
padding: 0.5rem;
color: rgb(153, 143, 143);
}
#search-info {
display: none;
height:0px;
background:#efefef;
overflow:hidden;
transition:0.5s;
-webkit-transition:0.5s;
width: 100%;
text-align: left;
box-sizing: border-box;
}
#search-info.visible {
display: block;
height: fit-content;
height: -moz-max-content;
padding: 10px;
margin-top: 10px;
}
/*****************
SORT BUTTONS
*****************/
li.sort {
background-color: #efefef;
color:#666;
width:49.5%;
}
li.sort:hover {
text-decoration: none;
background-color:#cecece;
}
#current-sort {
font-size:75%;
}
.sort.my-desc:after, .sort-desc:after {
width: 0;
height: 0;
border-left: .4rem solid transparent;
border-right: .4rem solid transparent;
border-top: .4rem solid;
content:"";
position: relative;
top:.75rem;
right:-.3rem;
}
.sort.my-asc:after, .sort-asc:after {
width: 0;
height: 0;
border-left: .4rem solid transparent;
border-right: .4rem solid transparent;
border-bottom: .4rem solid;
content:"";
position: relative;
bottom:.75rem;
right:-.3rem;
}
.sort-desc:after {
top:1rem;
}
.sort-asc:after {
bottom:1rem;
}
/*****************************
LESSON INDEX RESULTS LIST
*****************************/
h2.results-title {
margin:1rem 0;
font: 1.6rem/2rem 'Roboto Condensed';
color:#666;
text-transform:uppercase;
}
#results-value {
color:#000;
}
#lesson-list .list ul {
margin:0;
padding:0;
}
#lesson-list .list li {
list-style-type:none;
margin:0;
}
.lesson-description {
margin-bottom:2rem;
padding:0rem;
min-height:120px;
text-align:left;
}
.lesson-description img {
width:100%;
}
.lesson-image {
width:120px;
float:left;
margin-right:1rem;
}
.above-title {
margin:0 0 .2rem 0;
font: .8rem/1rem 'Roboto Condensed';
color:#999;
text-transform:uppercase;
clear:none;
}
.lesson-description h2.title {
font: 1.2rem/1.3rem 'Crete Round', serif;
margin:0 0 .8rem 0;
clear:none;
}
.list .date,
.lesson-description .activity,
.lesson-description .topics,
.lesson-description .difficulty {
display: none;
}
#pre-loader {
visibility: hidden;
display: flex;
justify-content: center;
align-items: center;
height: 100vh;
width: 100%;
position: fixed;
top: 0;
left: 0;
z-index: 9999;
transition: opacity 0.3s linear;
background: rgba(211, 211, 211, 0.8);
}
/* =============================================================================
Top Navigation Bar
========================================================================== */
.navbar {
padding: .6rem 1rem;
margin: 0 0 3rem 0;
}
.navbar-dark .navbar-nav .nav-link {
font-family:'Open Sans';
text-transform:uppercase;
color:#fff;
font-size:.9rem;
}
.btn-group > .btn-secondary {
border-color: #333333;
background-color: #888888;
}
.lang {
text-transform:lowercase !important;
}
.navbar-dark .navbar-nav .nav-link:hover, .navbar-dark .navbar-brand:hover {
color:#39a;
}
.navbar-toggler-icon {
background-image: url("data:image/svg+xml;charset=utf8,%3Csvg viewBox='0 0 32 32' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath stroke='rgba(255,255,255, 1)' stroke-width='2' stroke-linecap='round' stroke-miterlimit='10' d='M4 8h24M4 16h24M4 24h24'/%3E%3C/svg%3E");
}
.navbar-collapse {
text-align:center;
}
.navbar-dark .navbar-brand {
font-family:'Crete Round', serif;
color:#fff;
letter-spacing: .02em;
}
.btn-group > a.btn {
padding-left: 1rem;
padding-right: 1rem;
}
a.dropdown-item {
border-bottom:1px solid #ccc;
font-family:'Roboto';
}
.dropdown-menu {
position: absolute;
background: #fff;
border: 1px solid #ccc;
margin:0;
padding:0;
}
.dropdown-menu a {
font-size:.8rem;
line-height:2rem;
text-transform:uppercase;
}
.dropdown-menu a:last-child {
border-bottom:none;
}
.dropdown-menu:after, .dropdown-menu:before {
bottom: 100%;
left: 20%;
border: solid transparent;
content: " ";
height: 0;
width: 0;
position: absolute;
pointer-events: none;
}
.dropdown-menu:after {
border-color: rgba(255, 255, 255, 0);
border-bottom-color: #fff;
border-width: 12px;
margin-left: -12px;
}
.dropdown-menu:before {
border-color: rgba(51, 153, 170, 0);
border-bottom-color: #ccc;
border-width: 13px;
margin-left: -13px;
}
.navbar-dark .navbar-nav .nav-link:focus {
color: #ccc;
}
.header-link {
position: absolute;
right: 0.6em;
opacity: 0;
-webkit-transition: opacity 0.2s ease-in-out 0.1s;
-moz-transition: opacity 0.2s ease-in-out 0.1s;
-ms-transition: opacity 0.2s ease-in-out 0.1s;
}
h2:hover .header-link,
h3:hover .header-link,
h4:hover .header-link,
h5:hover .header-link,
h6:hover .header-link {
opacity: 1;
}
/* =============================================================================
Lesson Typography
========================================================================== */
a {text-decoration:none;}
a:link {color: #38c;}
a:visited {color: #39a;}
a:hover {color: #555;}
a:active {color: #555;}
b, strong { font-weight: bold; }
blockquote { margin: 1em 2em; padding: 0 1em 0 1em; font-style: italic; border:1px solid #666; background: #eeeeee;}
hr {
display: block; height: 1px; border: 0; border-top: 1px solid #ccc; margin: 2em 0; padding: 0; }
img {
max-width:100%;
}
ins { background: #ff9; color: #000; text-decoration: none; }
h1,h2,h3,h4,h5 {
font-family:'Crete Round', serif;
font-weight:normal;
clear:both;
}
h1 {
font-size:2rem;
margin-bottom:1.5rem;
letter-spacing:-.03rem;
text-align:center;
}
h2 {
font-size:1.6rem;
margin-top:3rem;
letter-spacing:-.02rem;
}
h3 {
font-size:1.4rem;
margin-top:2.5rem;
}
h4 {
font-size:1.2rem;
margin-top:1.8rem;
}
h5 {
font-size:1.0rem;
margin-top:1.4rem;
}
h1 a, h2 a, h3 a, h4 a, h5 a {
text-decoration:none;
}
h1 a:link { color: #38c; }
h1 a:visited {color: #39a; }
/* select button generated by codeblocks.js */
.fa-align-left {opacity: 0.2;}
.highlight:hover .fa-align-left {opacity: 1;}
q { quotes: none; }
q:before, q:after { content: ""; content: none; }
small { font-size: 85%; }
/* Position subscript and superscript content without affecting line-height: h5bp.com/k */
sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
sup { top: -0.5em; }
sub { bottom: -0.25em; }
li {
margin-bottom:.5rem;
line-height:1.4rem;
}
li.nav-item {
margin-bottom:0;
}
.alert {
font-family: 'Roboto';
}
.alert h2, .alert h3, .alert h4 {
margin-top:0;
}
/* =============================================================================
Code Highlighting
========================================================================== */
code {
font-family: monospace, serif;
font-size:.9rem;
}
.highlight {
margin: 1rem 0 1rem 0;
padding:.5rem .2rem;
font-size:.9rem;
white-space: pre;
word-wrap: normal;
overflow: auto;
border: 1px solid #eee;
background: #fafafa;
}
/* =============================================================================
Figures
========================================================================== */
figure {
margin: 0 auto .5rem;
text-align: center;
display:table;
}
figcaption {
margin-top:.5rem;
font-family:'Open Sans';
font-size:0.8em;
color: #666;
display:block;
caption-side: bottom;
}
.author-info, .citation-info {
border-top:1px solid #333;
padding-top:1rem;
margin-top:2rem;
}
.author-name, .suggested-citation-header {
font-family:'Roboto Condensed';
font-weight: 600;
font-size:1.2rem;
color: #666;
text-transform:uppercase;
}
.author-description p, .suggested-citation-text p {
font-size:0.9rem;
font-family:'Open Sans';
color: #666;
}
/* =============================================================================
Tables
========================================================================== */
table {
width: 100%;
margin-bottom: 1em;
}
th, td {
padding: 10px;
text-align: left;
border-bottom: 1px solid #ddd;
}
thead {
background-color: #535353;
color: #fff;
font-weight: bold;
}
tr:nth-child(even) {background-color: #f2f2f2}
/* =============================================================================
Blog Index and Layout
========================================================================== */
.blog-header {
text-align:center;
}
.blog-header h2 {
margin:0;
line-height: 2rem;
}
.blog-header h3 { /*author*/
margin-top:.4rem;
color: #666;
font-size:1rem;
}
.blog-header h4{
color: #999;
font-size:1rem;
margin-bottom:.2rem;
font-family:'Roboto Condensed';
text-transform:uppercase;
}
.blog-header figure {
max-width:80%;
}
.blog-header figcaption {
text-align: center;
}
.blog-page-header {
margin-bottom:3rem;
}
/* =============================================================================
Project Team
========================================================================== */
.contact-box {
margin-bottom:3rem;
}
/* =============================================================================
Footer
========================================================================== */
footer[role="contentinfo"] {
margin-top: 2rem;
padding: 2rem 0;
font-family:'Open Sans';
font-size:.9rem;
color: #fff;
background-color:#666;
text-align:center;
}
footer a, footer a:link, footer a:visited {
color: #fff;
border-bottom:1px #eee dotted;
}
footer a:hover {
text-decoration: none;
border-bottom:1px #fff solid;
}
footer .fa {
margin: 0 .2rem 0rem 0rem;
}
.footer-head {
font-size:1.1rem;
line-height:1.4rem;
margin-bottom:1rem;
}
} /* end screen */
@media only screen and (max-width: 768px) {
.garnish {
display:none;
}
.dropdown-menu:after, .dropdown-menu:before {
display:none;
}
}
/* Print Styling */
@media screen {
/* Class to hide elements only shown when printing */
.hide-print {
display: none !important;
}
}
@media print {
* { background: transparent !important; color: black !important; box-shadow:none !important; text-shadow: none !important; filter:none !important; -ms-filter: none !important; } /* Black prints faster: h5bp.com/s */
a, a:visited { text-decoration: underline; }
a[href]:after { content: " (" attr(href) ")"; }
abbr[title]:after { content: " (" attr(title) ")"; }
a[href^="javascript:"]:after, a[href^="#"]:after { content: ""; } /* Don't show links for images, or javascript/internal links */
pre, blockquote {
border: 1px solid #999;
page-break-inside: avoid;
margin: 0.5cm;
padding: 0.5cm
}
thead { display: table-header-group; } /* h5bp.com/t */
tr, img { page-break-inside: avoid; }
img { max-width: 100% !important; }
@page {
margin: 1.5cm;
}
body { font-size: 0.85rem;}
p, h2, h3 { orphans: 3; widows: 3; }
h1, h2, h3 { page-break-after: avoid; }
h1 { font-size: 1.4rem; }
h2 { font-size: 1.1rem; }
h3 { font-size: 1rem; }
h4 { font-size: 0.9rem; }
.header-bottom {
margin-bottom: 2rem;
page-break-after: always;
}
.hide-screen {
/* Hide elements that only appear on screen */
display: none !important;
}
.print-header {
/* format navbar for print */
display: block;
z-index:1030;
width: 100%;
height: 3rem;
padding: .6rem 1rem;
margin-bottom: 1rem;
color:#fff;
white-space: nowrap;
font-family: 'Crete Round', serif;
border-bottom: 1px solid lightgrey;
}
}

(three file diffs suppressed: too large; one binary file added, 41 KiB)

@@ -0,0 +1,17 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 19.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
viewBox="0 0 256 256" style="enable-background:new 0 0 256 256;" xml:space="preserve">
<style type="text/css">
.st0{fill:#A6CE39;}
.st1{fill:#FFFFFF;}
</style>
<path class="st0" d="M256,128c0,70.7-57.3,128-128,128C57.3,256,0,198.7,0,128C0,57.3,57.3,0,128,0C198.7,0,256,57.3,256,128z"/>
<g>
<path class="st1" d="M86.3,186.2H70.9V79.1h15.4v48.4V186.2z"/>
<path class="st1" d="M108.9,79.1h41.6c39.6,0,57,28.3,57,53.6c0,27.5-21.5,53.6-56.8,53.6h-41.8V79.1z M124.3,172.4h24.5
c34.9,0,42.9-26.5,42.9-39.7c0-21.5-13.7-39.7-43.7-39.7h-23.7V172.4z"/>
<path class="st1" d="M88.7,56.8c0,5.5-4.5,10.1-10.1,10.1c-5.6,0-10.1-4.6-10.1-10.1c0-5.6,4.5-10.1,10.1-10.1
C84.2,46.7,88.7,51.3,88.7,56.8z"/>
</g>
</svg>

(SVG above: 983 B; 19 further binary image files added, not shown)

@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="262pt" height="124pt"
viewBox="0.00 0.00 261.53 124.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 120)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-120 257.533,-120 257.533,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="gray" cx="49.3505" cy="-98" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="49.3505" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- oil -->
<g id="node3" class="node"><title>oil</title>
<ellipse fill="none" stroke="gray" cx="117.35" cy="-18" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="117.35" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw&#45;&gt;oil -->
<g id="edge1" class="edge"><title>nw&#45;&gt;oil</title>
<path fill="none" stroke="gray" d="M63.7715,-80.4582C73.3018,-69.5265 85.9453,-55.0236 96.5567,-42.8517"/>
<polygon fill="gray" stroke="gray" points="99.2502,-45.0882 103.183,-35.2505 93.9738,-40.4882 99.2502,-45.0882"/>
<text text-anchor="middle" x="108.138" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- wb -->
<g id="node2" class="node"><title>wb</title>
<ellipse fill="none" stroke="gray" cx="185.35" cy="-98" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="185.35" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb&#45;&gt;oil -->
<g id="edge2" class="edge"><title>wb&#45;&gt;oil</title>
<path fill="none" stroke="gray" d="M170.595,-80.0752C161.138,-69.2266 148.718,-54.9801 138.252,-42.9755"/>
<polygon fill="gray" stroke="gray" points="140.589,-40.3299 131.38,-35.0922 135.313,-44.9299 140.589,-40.3299"/>
<text text-anchor="middle" x="176.138" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>

After

Width:  |  Height:  |  Size: 2.2 KiB

View File

@@ -0,0 +1,104 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="438pt" height="212pt"
viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-186" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- rr -->
<g id="node2" class="node"><title>rr</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-98" rx="60.0217" ry="18"/>
<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
</g>
<!-- nw&#45;&gt;rr -->
<g id="edge1" class="edge"><title>nw&#45;&gt;rr</title>
<path fill="none" stroke="gray" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
<polygon fill="gray" stroke="gray" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- oil -->
<g id="node5" class="node"><title>oil</title>
<ellipse fill="none" stroke="gray" cx="253" cy="-98" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw&#45;&gt;oil -->
<g id="edge3" class="edge"><title>nw&#45;&gt;oil</title>
<path fill="none" stroke="gray" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
<polygon fill="gray" stroke="gray" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- nwd -->
<g id="node7" class="node"><title>nwd</title>
<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
</g>
<!-- nw&#45;&gt;nwd -->
<g id="edge2" class="edge"><title>nw&#45;&gt;nwd</title>
<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
</g>
<!-- d -->
<g id="node6" class="node"><title>d</title>
<ellipse fill="none" stroke="gray" cx="263" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
</g>
<!-- rr&#45;&gt;d -->
<g id="edge5" class="edge"><title>rr&#45;&gt;d</title>
<path fill="none" stroke="gray" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
<polygon fill="gray" stroke="gray" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- rrb -->
<g id="node8" class="node"><title>rrb</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
</g>
<!-- rr&#45;&gt;rrb -->
<g id="edge4" class="edge"><title>rr&#45;&gt;rrb</title>
<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
</g>
<!-- jv -->
<g id="node3" class="node"><title>jv</title>
<ellipse fill="none" stroke="gray" cx="372" cy="-98" rx="57.9076" ry="18"/>
<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
</g>
<!-- jv&#45;&gt;d -->
<g id="edge6" class="edge"><title>jv&#45;&gt;d</title>
<path fill="none" stroke="gray" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
<polygon fill="gray" stroke="gray" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- wb -->
<g id="node4" class="node"><title>wb</title>
<ellipse fill="none" stroke="gray" cx="277" cy="-186" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb&#45;&gt;jv -->
<g id="edge7" class="edge"><title>wb&#45;&gt;jv</title>
<path fill="none" stroke="gray" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
<polygon fill="gray" stroke="gray" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- wb&#45;&gt;oil -->
<g id="edge8" class="edge"><title>wb&#45;&gt;oil</title>
<path fill="none" stroke="gray" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
<polygon fill="gray" stroke="gray" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>


@@ -0,0 +1,104 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="438pt" height="212pt"
viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="red" cx="132" cy="-186" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- rr -->
<g id="node2" class="node"><title>rr</title>
<ellipse fill="none" stroke="red" cx="132" cy="-98" rx="60.0217" ry="18"/>
<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
</g>
<!-- nw&#45;&gt;rr -->
<g id="edge1" class="edge"><title>nw&#45;&gt;rr</title>
<path fill="none" stroke="orange" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
<polygon fill="orange" stroke="orange" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- oil -->
<g id="node5" class="node"><title>oil</title>
<ellipse fill="none" stroke="gray" cx="253" cy="-98" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw&#45;&gt;oil -->
<g id="edge3" class="edge"><title>nw&#45;&gt;oil</title>
<path fill="none" stroke="gray" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
<polygon fill="gray" stroke="gray" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- nwd -->
<g id="node7" class="node"><title>nwd</title>
<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
</g>
<!-- nw&#45;&gt;nwd -->
<g id="edge2" class="edge"><title>nw&#45;&gt;nwd</title>
<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
</g>
<!-- d -->
<g id="node6" class="node"><title>d</title>
<ellipse fill="none" stroke="orange" cx="263" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
</g>
<!-- rr&#45;&gt;d -->
<g id="edge5" class="edge"><title>rr&#45;&gt;d</title>
<path fill="none" stroke="orange" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
<polygon fill="orange" stroke="orange" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- rrb -->
<g id="node8" class="node"><title>rrb</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
</g>
<!-- rr&#45;&gt;rrb -->
<g id="edge4" class="edge"><title>rr&#45;&gt;rrb</title>
<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
</g>
<!-- jv -->
<g id="node3" class="node"><title>jv</title>
<ellipse fill="none" stroke="red" cx="372" cy="-98" rx="57.9076" ry="18"/>
<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
</g>
<!-- jv&#45;&gt;d -->
<g id="edge6" class="edge"><title>jv&#45;&gt;d</title>
<path fill="none" stroke="orange" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
<polygon fill="orange" stroke="orange" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- wb -->
<g id="node4" class="node"><title>wb</title>
<ellipse fill="none" stroke="red" cx="277" cy="-186" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb&#45;&gt;jv -->
<g id="edge7" class="edge"><title>wb&#45;&gt;jv</title>
<path fill="none" stroke="orange" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
<polygon fill="orange" stroke="orange" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- wb&#45;&gt;oil -->
<g id="edge8" class="edge"><title>wb&#45;&gt;oil</title>
<path fill="none" stroke="gray" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
<polygon fill="gray" stroke="gray" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>


@@ -0,0 +1,104 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="438pt" height="212pt"
viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="red" cx="132" cy="-186" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- rr -->
<g id="node2" class="node"><title>rr</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-98" rx="60.0217" ry="18"/>
<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
</g>
<!-- nw&#45;&gt;rr -->
<g id="edge1" class="edge"><title>nw&#45;&gt;rr</title>
<path fill="none" stroke="gray" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
<polygon fill="gray" stroke="gray" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- oil -->
<g id="node5" class="node"><title>oil</title>
<ellipse fill="none" stroke="orange" cx="253" cy="-98" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw&#45;&gt;oil -->
<g id="edge3" class="edge"><title>nw&#45;&gt;oil</title>
<path fill="none" stroke="orange" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
<polygon fill="orange" stroke="orange" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- nwd -->
<g id="node7" class="node"><title>nwd</title>
<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
</g>
<!-- nw&#45;&gt;nwd -->
<g id="edge2" class="edge"><title>nw&#45;&gt;nwd</title>
<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
</g>
<!-- d -->
<g id="node6" class="node"><title>d</title>
<ellipse fill="none" stroke="gray" cx="263" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
</g>
<!-- rr&#45;&gt;d -->
<g id="edge5" class="edge"><title>rr&#45;&gt;d</title>
<path fill="none" stroke="gray" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
<polygon fill="gray" stroke="gray" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- rrb -->
<g id="node8" class="node"><title>rrb</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
</g>
<!-- rr&#45;&gt;rrb -->
<g id="edge4" class="edge"><title>rr&#45;&gt;rrb</title>
<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
</g>
<!-- jv -->
<g id="node3" class="node"><title>jv</title>
<ellipse fill="none" stroke="gray" cx="372" cy="-98" rx="57.9076" ry="18"/>
<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
</g>
<!-- jv&#45;&gt;d -->
<g id="edge6" class="edge"><title>jv&#45;&gt;d</title>
<path fill="none" stroke="gray" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
<polygon fill="gray" stroke="gray" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- wb -->
<g id="node4" class="node"><title>wb</title>
<ellipse fill="none" stroke="red" cx="277" cy="-186" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb&#45;&gt;jv -->
<g id="edge7" class="edge"><title>wb&#45;&gt;jv</title>
<path fill="none" stroke="gray" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
<polygon fill="gray" stroke="gray" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- wb&#45;&gt;oil -->
<g id="edge8" class="edge"><title>wb&#45;&gt;oil</title>
<path fill="none" stroke="orange" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
<polygon fill="orange" stroke="orange" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>

Binary file not shown.

@@ -0,0 +1,127 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="636pt" height="364pt"
viewBox="0.00 0.00 636.30 364.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 360)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-360 632.301,-360 632.301,4 -4,4"/>
<!-- o -->
<g id="node1" class="node"><title>o</title>
<ellipse fill="none" stroke="red" cx="274.27" cy="-338" rx="53.3595" ry="18"/>
<text text-anchor="middle" x="274.27" y="-335" font-family="Helvetica,sans-Serif" font-size="10.00">object/PPA82633</text>
</g>
<!-- th1 -->
<g id="node2" class="node"><title>th1</title>
<ellipse fill="none" stroke="red" cx="40.2702" cy="-258" rx="40.0417" ry="18"/>
<text text-anchor="middle" x="40.2702" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">thes:x11409</text>
</g>
<!-- o&#45;&gt;th1 -->
<g id="edge1" class="edge"><title>o&#45;&gt;th1</title>
<path fill="none" stroke="red" d="M224.341,-331.342C190.446,-326.351 145.117,-317.397 107.454,-302 93.6639,-296.363 79.5997,-287.87 67.927,-279.917"/>
<polygon fill="red" stroke="red" points="69.704,-276.888 59.5111,-273.997 65.6765,-282.613 69.704,-276.888"/>
<text text-anchor="middle" x="147.178" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:p45_consists_of</text>
</g>
<!-- dep -->
<g id="node4" class="node"><title>dep</title>
<ellipse fill="none" stroke="red" cx="172.27" cy="-258" rx="74.1479" ry="18"/>
<text text-anchor="middle" x="172.27" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">person&#45;institution/147800</text>
</g>
<!-- o&#45;&gt;dep -->
<g id="edge3" class="edge"><title>o&#45;&gt;dep</title>
<path fill="none" stroke="red" d="M235.65,-325.351C222.058,-319.869 207.403,-312.234 196.239,-302 191.195,-297.376 186.961,-291.439 183.525,-285.462"/>
<polygon fill="red" stroke="red" points="186.46,-283.516 178.779,-276.219 180.234,-286.714 186.46,-283.516"/>
<text text-anchor="middle" x="228.286" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P62_depicts</text>
</g>
<!-- etc -->
<g id="node6" class="node"><title>etc</title>
<ellipse fill="none" stroke="gray" cx="274.27" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="274.27" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">etc...</text>
</g>
<!-- o&#45;&gt;etc -->
<g id="edge10" class="edge"><title>o&#45;&gt;etc</title>
<path fill="none" stroke="gray" d="M274.27,-319.958C274.27,-304.156 274.27,-279.99 274.27,-259 274.27,-259 274.27,-259 274.27,-97 274.27,-80.1099 274.27,-61.1626 274.27,-46.172"/>
<polygon fill="gray" stroke="gray" points="277.77,-46.0417 274.27,-36.0418 270.77,-46.0418 277.77,-46.0417"/>
</g>
<!-- own -->
<g id="node7" class="node"><title>own</title>
<ellipse fill="none" stroke="red" cx="395.27" cy="-258" rx="93.1176" ry="18"/>
<text text-anchor="middle" x="395.27" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">thesIdentifier:the&#45;british&#45;museum</text>
</g>
<!-- o&#45;&gt;own -->
<g id="edge5" class="edge"><title>o&#45;&gt;own</title>
<path fill="none" stroke="red" d="M297.887,-321.776C315.86,-310.19 340.856,-294.077 361.023,-281.077"/>
<polygon fill="red" stroke="red" points="363.112,-283.894 369.621,-275.534 359.319,-278.011 363.112,-283.894"/>
<text text-anchor="middle" x="392.854" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P54_has_current_owner</text>
</g>
<!-- con -->
<g id="node8" class="node"><title>con</title>
<ellipse fill="none" stroke="red" cx="524.27" cy="-178" rx="80.1403" ry="18"/>
<text text-anchor="middle" x="524.27" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">object/PPA82633/concept/1</text>
</g>
<!-- o&#45;&gt;con -->
<g id="edge6" class="edge"><title>o&#45;&gt;con</title>
<path fill="none" stroke="red" d="M322.863,-330.474C381.749,-321.472 475.782,-303.207 497.27,-276 513.047,-256.024 519.612,-227.389 522.339,-206.409"/>
<polygon fill="red" stroke="red" points="525.845,-206.541 523.445,-196.222 518.886,-205.786 525.845,-206.541"/>
<text text-anchor="middle" x="548.839" y="-255.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P128_carries</text>
</g>
<!-- th1lab -->
<g id="node3" class="node"><title>th1lab</title>
<ellipse fill="none" stroke="gray" cx="40.2702" cy="-178" rx="27" ry="18"/>
<text text-anchor="middle" x="40.2702" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">paper</text>
</g>
<!-- th1&#45;&gt;th1lab -->
<g id="edge2" class="edge"><title>th1&#45;&gt;th1lab</title>
<path fill="none" stroke="gray" d="M40.2702,-239.689C40.2702,-229.894 40.2702,-217.422 40.2702,-206.335"/>
<polygon fill="gray" stroke="gray" points="43.7703,-206.262 40.2702,-196.262 36.7703,-206.262 43.7703,-206.262"/>
<text text-anchor="middle" x="66.2858" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
</g>
<!-- deplab -->
<g id="node5" class="node"><title>deplab</title>
<ellipse fill="none" stroke="gray" cx="172.27" cy="-178" rx="66.8537" ry="18"/>
<text text-anchor="middle" x="172.27" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">Julius Caesar Scaliger</text>
</g>
<!-- dep&#45;&gt;deplab -->
<g id="edge4" class="edge"><title>dep&#45;&gt;deplab</title>
<path fill="none" stroke="gray" d="M172.27,-239.689C172.27,-229.894 172.27,-217.422 172.27,-206.335"/>
<polygon fill="gray" stroke="gray" points="175.77,-206.262 172.27,-196.262 168.77,-206.262 175.77,-206.262"/>
<text text-anchor="middle" x="198.286" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
</g>
<!-- contype -->
<g id="node9" class="node"><title>contype</title>
<ellipse fill="none" stroke="gray" cx="431.27" cy="-98" rx="85.8678" ry="18"/>
<text text-anchor="middle" x="431.27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">ecrm:E73_Information_Object</text>
</g>
<!-- con&#45;&gt;contype -->
<g id="edge7" class="edge"><title>con&#45;&gt;contype</title>
<path fill="none" stroke="gray" d="M504.547,-160.458C491.321,-149.365 473.71,-134.595 459.068,-122.314"/>
<polygon fill="gray" stroke="gray" points="461.196,-119.531 451.285,-115.787 456.698,-124.895 461.196,-119.531"/>
<text text-anchor="middle" x="494.61" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">rdf:type</text>
</g>
<!-- concon -->
<g id="node10" class="node"><title>concon</title>
<ellipse fill="none" stroke="gray" cx="576.27" cy="-98" rx="40.8927" ry="18"/>
<text text-anchor="middle" x="576.27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">thes:x12440</text>
</g>
<!-- con&#45;&gt;concon -->
<g id="edge8" class="edge"><title>con&#45;&gt;concon</title>
<path fill="none" stroke="gray" d="M535.553,-160.075C542.599,-149.507 551.795,-135.713 559.664,-123.91"/>
<polygon fill="gray" stroke="gray" points="562.73,-125.619 565.365,-115.357 556.906,-121.736 562.73,-125.619"/>
<text text-anchor="middle" x="588.96" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P129_is_about</text>
</g>
<!-- conlab -->
<g id="node11" class="node"><title>conlab</title>
<ellipse fill="none" stroke="gray" cx="576.27" cy="-18" rx="33.894" ry="18"/>
<text text-anchor="middle" x="576.27" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">academic</text>
</g>
<!-- concon&#45;&gt;conlab -->
<g id="edge9" class="edge"><title>concon&#45;&gt;conlab</title>
<path fill="none" stroke="gray" d="M576.27,-79.6893C576.27,-69.8938 576.27,-57.4218 576.27,-46.335"/>
<polygon fill="gray" stroke="gray" points="579.77,-46.2623 576.27,-36.2623 572.77,-46.2624 579.77,-46.2623"/>
<text text-anchor="middle" x="602.286" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
</g>
</g>
</svg>

Binary file not shown.

Binary file not shown.

Binary file not shown.

@@ -0,0 +1,114 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="367pt" height="364pt"
viewBox="0.00 0.00 367.21 364.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 360)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-360 363.215,-360 363.215,4 -4,4"/>
<!-- obj -->
<g id="node1" class="node"><title>obj</title>
<ellipse fill="none" stroke="gray" cx="148.735" cy="-338" rx="148.97" ry="18"/>
<text text-anchor="middle" x="148.735" y="-335" font-family="Helvetica,sans-Serif" font-size="10.00">http://collection.britishmuseum.org/id/object/PPA82633</text>
</g>
<!-- object_type -->
<g id="node2" class="node"><title>object_type</title>
<ellipse fill="none" stroke="gray" cx="51.7348" cy="-258" rx="38.5366" ry="18"/>
<text text-anchor="middle" x="51.7348" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">object_type</text>
</g>
<!-- obj&#45;&gt;object_type -->
<g id="edge1" class="edge"><title>obj&#45;&gt;object_type</title>
<path fill="none" stroke="gray" d="M98.3951,-320.855C88.4182,-315.964 78.6502,-309.764 70.9106,-302 66.4004,-297.476 62.8568,-291.706 60.1099,-285.873"/>
<polygon fill="gray" stroke="gray" points="63.1969,-284.17 56.2095,-276.205 56.7054,-286.789 63.1969,-284.17"/>
<text text-anchor="middle" x="108.647" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">bmo:PX_object_type</text>
</g>
<!-- production -->
<g id="node4" class="node"><title>production</title>
<ellipse fill="none" stroke="gray" cx="148.735" cy="-258" rx="36.3999" ry="18"/>
<text text-anchor="middle" x="148.735" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">production</text>
</g>
<!-- obj&#45;&gt;production -->
<g id="edge3" class="edge"><title>obj&#45;&gt;production</title>
<path fill="none" stroke="gray" d="M148.735,-319.689C148.735,-309.894 148.735,-297.422 148.735,-286.335"/>
<polygon fill="gray" stroke="gray" points="152.235,-286.262 148.735,-276.262 145.235,-286.262 152.235,-286.262"/>
<text text-anchor="middle" x="203.657" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P108i_was_produced_by</text>
</g>
<!-- other -->
<g id="node8" class="node"><title>other</title>
<text text-anchor="middle" x="281.735" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">Other top&#45;level object attributes</text>
</g>
<!-- obj&#45;&gt;other -->
<g id="edge7" class="edge"><title>obj&#45;&gt;other</title>
<path fill="none" stroke="gray" d="M223.709,-322.337C236.677,-317.399 249.318,-310.804 259.735,-302 264.947,-297.595 269.068,-291.646 272.265,-285.579"/>
<polygon fill="gray" stroke="gray" points="275.598,-286.706 276.554,-276.154 269.227,-283.807 275.598,-286.706"/>
<text text-anchor="middle" x="271.069" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
</g>
<!-- print -->
<g id="node3" class="node"><title>print</title>
<ellipse fill="none" stroke="gray" cx="51.7348" cy="-178" rx="27" ry="18"/>
<text text-anchor="middle" x="51.7348" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">print</text>
</g>
<!-- object_type&#45;&gt;print -->
<g id="edge2" class="edge"><title>object_type&#45;&gt;print</title>
<path fill="none" stroke="gray" d="M51.7348,-239.689C51.7348,-229.894 51.7348,-217.422 51.7348,-206.335"/>
<polygon fill="gray" stroke="gray" points="55.2349,-206.262 51.7348,-196.262 48.2349,-206.262 55.2349,-206.262"/>
<text text-anchor="middle" x="77.7504" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
</g>
<!-- date -->
<g id="node5" class="node"><title>date</title>
<ellipse fill="none" stroke="gray" cx="134.735" cy="-178" rx="27" ry="18"/>
<text text-anchor="middle" x="134.735" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">date</text>
</g>
<!-- production&#45;&gt;date -->
<g id="edge4" class="edge"><title>production&#45;&gt;date</title>
<path fill="none" stroke="gray" d="M143.042,-240.075C141.328,-234.389 139.605,-227.974 138.481,-222 137.543,-217.015 136.839,-211.66 136.311,-206.48"/>
<polygon fill="gray" stroke="gray" points="139.778,-205.937 135.456,-196.264 132.803,-206.521 139.778,-205.937"/>
<text text-anchor="middle" x="175.862" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P9_consists_of</text>
</g>
<!-- other_prod -->
<g id="node9" class="node"><title>other_prod</title>
<text text-anchor="middle" x="234.735" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">Other production info</text>
</g>
<!-- production&#45;&gt;other_prod -->
<g id="edge8" class="edge"><title>production&#45;&gt;other_prod</title>
<path fill="none" stroke="gray" d="M176.351,-246.343C188.633,-240.562 202.579,-232.439 212.735,-222 217.34,-217.267 221.193,-211.376 224.322,-205.489"/>
<polygon fill="gray" stroke="gray" points="227.508,-206.94 228.646,-196.406 221.187,-203.931 227.508,-206.94"/>
<text text-anchor="middle" x="222.069" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
</g>
<!-- timespan -->
<g id="node6" class="node"><title>timespan</title>
<ellipse fill="none" stroke="gray" cx="134.735" cy="-98" rx="32.8294" ry="18"/>
<text text-anchor="middle" x="134.735" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">timespan</text>
</g>
<!-- date&#45;&gt;timespan -->
<g id="edge5" class="edge"><title>date&#45;&gt;timespan</title>
<path fill="none" stroke="gray" d="M134.735,-159.689C134.735,-149.894 134.735,-137.422 134.735,-126.335"/>
<polygon fill="gray" stroke="gray" points="138.235,-126.262 134.735,-116.262 131.235,-126.262 138.235,-126.262"/>
<text text-anchor="middle" x="178.088" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P4_has_time&#45;span</text>
</g>
<!-- start_date -->
<g id="node7" class="node"><title>start_date</title>
<ellipse fill="none" stroke="gray" cx="66.7348" cy="-18" rx="34.828" ry="18"/>
<text text-anchor="middle" x="66.7348" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">start_date</text>
</g>
<!-- timespan&#45;&gt;start_date -->
<g id="edge6" class="edge"><title>timespan&#45;&gt;start_date</title>
<path fill="none" stroke="gray" d="M105.598,-89.5265C91.4682,-84.285 75.76,-75.6942 67.3129,-62 64.4467,-57.3534 63.1708,-51.8529 62.8105,-46.3654"/>
<polygon fill="gray" stroke="gray" points="66.3171,-46.1414 63.0686,-36.0569 59.3193,-45.9661 66.3171,-46.1414"/>
<text text-anchor="middle" x="124.446" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P82a_begin_of_the_begin</text>
</g>
<!-- other_date -->
<g id="node10" class="node"><title>other_date</title>
<text text-anchor="middle" x="202.735" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">End of begin, Start of end...</text>
</g>
<!-- timespan&#45;&gt;other_date -->
<g id="edge9" class="edge"><title>timespan&#45;&gt;other_date</title>
<path fill="none" stroke="gray" d="M155.643,-84.0943C164.163,-78.1192 173.652,-70.4665 180.735,-62 184.802,-57.138 188.386,-51.398 191.417,-45.7224"/>
<polygon fill="gray" stroke="gray" points="194.728,-46.9184 195.983,-36.3981 188.442,-43.8394 194.728,-46.9184"/>
<text text-anchor="middle" x="190.069" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
</g>
</g>
</svg>


lod/tutorial/js/bootstrap-4-navbar.js vendored Normal file

@@ -0,0 +1,34 @@
/*!
* Bootstrap 4 multi dropdown navbar ( https://bootstrapthemes.co/demo/resource/bootstrap-4-multi-dropdown-navbar/ )
* Copyright 2017.
* Licensed under the GPL license
*/
$( document ).ready( function () {
    $( '.mobile-drop a.dropdown-toggle' ).on( 'click', function ( e ) {
        var $el = $( this );
        var $parent = $( this ).offsetParent( ".mobile-drop" );
        if ( $( '.show.mobile-drop' ).length > 0 ) {
            $( '.show.mobile-drop' ).each( function ( item ) {
                $( this ).toggleClass( 'show' );
            } );
        }
        var $subMenu = $( this ).next( ".mobile-drop" );
        $subMenu.toggleClass( 'show' );
        $( this ).parent( "li" ).toggleClass( 'show' );
        $( this ).parents( 'li.nav-item.dropdown.mobile-drop.show' ).on( 'click', function ( e ) {
            $( '.mobile-drop .show' ).removeClass( "show" );
        } );
        if ( !$parent.parent().hasClass( 'navbar-nav' ) ) {
            $el.next().css( { "top": $el[0].offsetTop, "left": $parent.outerWidth() - 4 } );
        }
        return false;
    } );
} );


@@ -0,0 +1,8 @@
$(document).ready(function() {
    $('a').each(function() {
        var a = new RegExp('/' + window.location.host + '/');
        if (!a.test(this.href)) {
            $(this).attr("target", "_blank");
        }
    });
});


@@ -0,0 +1,13 @@
// http://ben.balter.com/2014/03/13/pages-anchor-links/
$(function() {
    return $("h2, h3, h4, h5, h6").each(function(i, el) {
        var $el, icon, id;
        $el = $(el);
        id = $el.attr('id');
        icon = '<i class="fa fa-link" style="font-size: 0.8em"></i>';
        if (id) {
            return $el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon));
        }
    });
});

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

lod/upload.sh Normal file

@@ -0,0 +1,53 @@
#!/bin/bash
# This is a bit messy
if [ "$#" -lt 1 ]; then
    graph_base="http://example.com/sitc/submission"
    endpoint="http://fuseki.gsi.upm.es/hotels/data"
else if [ "$#" -lt 2 ]; then
    endpoint=$1
    graph_base="http://example.com/sitc"
else
    if [ "$#" -lt 3 ]; then
        endpoint=$1
        graph_base=$2
    else
        echo "Usage: $0 [<endpoint>] [<graph_base_uri>]"
        echo
        exit 1
    fi
fi
fi
upload(){
    name=$1
    file=$2
    echo '###'
    echo "Uploading: $file"
    echo "Graph: $graph_base/$name"
    echo "Endpoint: $endpoint"
    curl -X POST \
        --digest -u admin:"$PASSWORD" \
        -H Content-Type:text/turtle \
        -T "$file" \
        --data-urlencode graph="$graph_base/$name" \
        -G "$endpoint"
}
total=0
echo -n "Password: "
read -s PASSWORD
echo "Uploading synthetic"
upload "synthetic" synthetic/reviews.ttl || exit 1
for i in *.ttl; do
    identifier=$(echo "${i%.ttl}" | md5sum | awk '{print $1}')
    echo "Uploading $i"
    upload "$identifier" "$i"
    total=$((total + 1))
done
echo "Uploaded $total"
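The script derives a graph name for each Turtle file by hashing the filename stem with md5, so the uploaded graph URI does not expose the original name. A minimal sketch of that naming step, ported to Python for illustration (the filename `reviews.ttl` is a hypothetical example):

```python
# Sketch of upload.sh's identifier step: echo "${i%.ttl}" | md5sum
import hashlib

name = "reviews.ttl"
stem = name[: -len(".ttl")] + "\n"  # the shell `echo` appends a newline before hashing
identifier = hashlib.md5(stem.encode()).hexdigest()
print(len(identifier))  # 32 hex characters
```

Note that the trailing newline matters: hashing `"reviews"` without it yields a different digest than the shell pipeline produces.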


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -71,8 +71,7 @@
"source": [ "source": [
"* [Scikit-learn web page](http://scikit-learn.org/stable/)\n", "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
"* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n", "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n",
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n", "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
] ]
}, },
{ {
@@ -80,7 +79,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Licence\n", "## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]
@@ -88,7 +87,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -102,7 +101,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -40,10 +40,10 @@
"\n", "\n",
"* Learn to use scikit-learn\n", "* Learn to use scikit-learn\n",
"* Learn the basic steps to apply machine learning techniques: dataset analysis, load, preprocessing, training, validation, optimization and persistence.\n", "* Learn the basic steps to apply machine learning techniques: dataset analysis, load, preprocessing, training, validation, optimization and persistence.\n",
"* Learn how to do a exploratory data analysis\n", "* Learn how to do an exploratory data analysis\n",
"* Learn how to visualise a dataset\n", "* Learn how to visualise a dataset\n",
"* Learn how to load a bundled dataset\n", "* Learn how to load a bundled dataset\n",
"* Learn how to separate the dataset into traning and testing datasets\n", "* Learn how to separate the dataset into training and testing datasets\n",
"* Learn how to train a classifier\n", "* Learn how to train a classifier\n",
"* Learn how to predict with a trained classifier\n", "* Learn how to predict with a trained classifier\n",
"* Learn how to evaluate the predictions\n", "* Learn how to evaluate the predictions\n",
@@ -63,9 +63,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Scikit-learn web page](http://scikit-learn.org/stable/)\n", "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
"* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n", "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n"
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
] ]
}, },
{ {
@@ -73,7 +71,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## LIcence\n", "## LIcence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]
@@ -81,7 +79,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -95,7 +93,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -87,10 +87,10 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"Scikit-learn provides algorithms for solving the following problems:\n", "Scikit-learn provides algorithms for solving the following problems:\n",
"* **Classification**: Identifying to which category an object belongs to. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, kNN, ...), SVM, Random forest, Perceptron, etc. \n", "* **Classification**: Identifying to which category an object belongs. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, C4.5, ...), kNN, SVM, Random forest, Perceptron, etc. \n",
"* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n", "* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n",
"* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n", "* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n",
"* ** Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc." "* **Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
] ]
}, },
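The classification task from the list above can be sketched in a few lines, assuming scikit-learn is installed; kNN is one of the classifiers the notebook names, and the sample vector below is simply the first Iris setosa measurement:

```python
# Minimal sketch: fit a kNN classifier on the bundled Iris dataset.
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
knn = KNeighborsClassifier(n_neighbors=3).fit(iris.data, iris.target)
print(knn.predict([[5.1, 3.5, 1.4, 0.2]]))  # [0] -> class 0 (setosa)
```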
{ {
@@ -105,7 +105,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"In addition, scikit-learn helps in several tasks:\n", "In addition, scikit-learn helps in several tasks:\n",
"* **Model selection**: Comparing, validating, choosing parameters and models, and persisting models. Some of the [available functionalities](http://scikit-learn.org/stable/model_selection.html#model-selection) are cross-validation or grid search for optimizing the parameters. \n", "* **Model selection**: Comparing, validating, choosing parameters and models, and persisting models. Some [available functionalities](http://scikit-learn.org/stable/model_selection.html#model-selection) are cross-validation or grid search for optimizing the parameters. \n",
"* **Preprocessing**: Several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators. Some of the available [preprocessing functions](http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing) are scaling and normalizing data, or imputing missing values." "* **Preprocessing**: Several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators. Some of the available [preprocessing functions](http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing) are scaling and normalizing data, or imputing missing values."
] ]
}, },
@@ -128,9 +128,9 @@
"\n", "\n",
"If it is not installed, install it with conda: `conda install scikit-learn`.\n", "If it is not installed, install it with conda: `conda install scikit-learn`.\n",
"\n", "\n",
"If you have installed scipy and numpy, you can also installed using pip: `pip install -U scikit-learn`.\n", "If you have installed scipy and numpy, you can also install using pip: `pip install -U scikit-learn`.\n",
"\n", "\n",
"It is not recommended to use pip for installing scipy and numpy. Instead, use conda or install the linux package *python-sklearn*." "It is not recommended to use pip to install scipy and numpy. Instead, use conda or install the Linux package *python-sklearn*."
] ]
}, },
{ {
@@ -156,7 +156,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Licence\n", "## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")\n", "![](./images/EscUpmPolit_p.gif \"UPM\")\n",
"\n", "\n",
"# Course Notes for Learning Intelligent Systems\n", "# Course Notes for Learning Intelligent Systems\n",
"\n", "\n",
@@ -34,11 +34,11 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The goal of this notebook is to learn how to read and load a sample dataset.\n", "This notebook aims to learn how to read and load a sample dataset.\n",
"\n", "\n",
"Scikit-learn comes with some bundled [datasets](http://scikit-learn.org/stable/datasets/): iris, digits, boston, etc.\n", "Scikit-learn comes with some bundled [datasets](https://scikit-learn.org/stable/datasets.html): iris, digits, boston, etc.\n",
"\n", "\n",
"In this notebook we are going to use the Iris dataset." "In this notebook, we will use the Iris dataset."
] ]
}, },
{ {
@@ -54,16 +54,25 @@
"source": [ "source": [
"The [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), available at [UCI dataset repository](https://archive.ics.uci.edu/ml/datasets/Iris), is a classic dataset for classification.\n", "The [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), available at [UCI dataset repository](https://archive.ics.uci.edu/ml/datasets/Iris), is a classic dataset for classification.\n",
"\n", "\n",
"The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features.\n", "The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, a machine learning model will learn to differentiate the species of Iris.\n",
"\n", "\n",
"![Iris](files/images/iris-dataset.jpg)" "![Iris dataset](./images/iris-dataset.jpg \"Iris\")"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In ordert to read the dataset, we import the datasets bundle and then load the Iris dataset. " "Here you can see the species and the features.\n",
"![Iris features](./images/iris-features.png \"Iris features\")\n",
"![Iris classes](./images/iris-classes.png \"Iris classes\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To read the dataset, we import the datasets bundle and then load the Iris dataset. "
] ]
}, },
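The loading step described in the cell above can be sketched as follows, assuming scikit-learn is installed:

```python
# Minimal sketch: import the datasets bundle and load the Iris dataset.
from sklearn import datasets

iris = datasets.load_iris()
print(iris.data.shape)    # (150, 4): 150 samples, 4 features
print(iris.target_names)  # the three Iris species
```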
{ {
@@ -180,7 +189,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"#Using numpy, I can print the dimensions (here we are working with 2D matriz)\n", "#Using numpy, I can print the dimensions (here we are working with a 2D matrix)\n",
"print(iris.data.ndim)" "print(iris.data.ndim)"
] ]
}, },
@@ -218,7 +227,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In following sessions we will learn how to load a dataset from a file (csv, excel, ...) using the pandas library." "In the following sessions, we will learn how to load a dataset from a file (CSV, Excel, ...) using the pandas library."
] ]
}, },
{ {
@@ -246,7 +255,7 @@
"source": [ "source": [
"## Licence\n", "## Licence\n",
"\n", "\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -49,7 +49,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The goal of this notebook is to learn how to analyse a dataset. We will cover other tasks such as cleaning or munging (changing the format) the dataset in other sessions." "This notebook aims to learn how to analyse a dataset. We will cover other tasks such as cleaning or munging (changing the format) the dataset in other sessions."
] ]
}, },
{ {
@@ -65,13 +65,13 @@
"source": [ "source": [
"This section covers different ways to inspect the distribution of samples per feature.\n", "This section covers different ways to inspect the distribution of samples per feature.\n",
"\n", "\n",
"First of all, let's see how many samples of each class we have, using a [histogram](https://en.wikipedia.org/wiki/Histogram). \n", "First of all, let's see how many samples we have in each class using a [histogram](https://en.wikipedia.org/wiki/Histogram). \n",
"\n", "\n",
"A histogram is a graphical representation of the distribution of numerical data. It is an estimation of the probability distribution of a continuous variable (quantitative variable). \n", "A histogram is a graphical representation of the distribution of numerical data. It estimates the probability distribution of a continuous variable (quantitative variable). \n",
"\n", "\n",
"For building a histogram, we need first to 'bin' the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. \n", "For building a histogram, we need to 'bin' the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. \n",
"\n", "\n",
"In our case, since the values are not continuous and we have only three values, we do not need to bin them." "Since the values are not continuous and we have only three values, we do not need to bin them."
] ]
}, },
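The binning described above can be sketched with numpy, using synthetic values rather than the Iris data:

```python
import numpy as np

# 'Bin' the range of values into intervals, then count values per interval.
values = np.array([1, 2, 2, 3])
counts, edges = np.histogram(values, bins=[1, 2, 3, 4])
print(counts)  # [1 2 1]: one value in [1,2), two in [2,3), one in [3,4]
```

This is exactly the step the Iris class histogram skips, since there are only three discrete class values.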
{ {
@@ -115,7 +115,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"As can be seen, we have the same distribution of samples for every class.\n", "As can be seen, we have the same distribution of samples for every class.\n",
"The next step is to see the distribution of the features" "The next step is to see the distribution of the features."
] ]
}, },
{ {
@@ -184,7 +184,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"As we can see, the Setosa class seems to be linearly separable with these two features.\n", "As we can see, the Setosa class seems linearly separable with these two features.\n",
"\n", "\n",
"Another nice visualisation is given below." "Another nice visualisation is given below."
] ]
@@ -228,7 +228,6 @@
"source": [ "source": [
"* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n", "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
"* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n", "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
"* [Mastering Pandas](http://proquest.safaribooksonline.com/book/programming/python/9781783981960), Femi Anthony, Packt Publishing, 2015.\n",
"* [Matplotlib web page](http://matplotlib.org/index.html)\n", "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
"* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n", "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
"* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n", "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n",
@@ -242,7 +241,7 @@
"source": [ "source": [
"## Licence\n", "## Licence\n",
"\n", "\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]

File diff suppressed because one or more lines are too long


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -76,7 +76,7 @@
"source": [ "source": [
"A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n", "A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
"\n", "\n",
"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)." "We will use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
] ]
}, },
{ {
@@ -122,9 +122,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n", "Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might misbehave if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
"\n", "\n",
"The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set. Later, the same transformation will be applied on the testing set." "The preprocessing module further provides a utility class `StandardScaler` to compute a training set's mean and standard deviation. Later, the same transformation will be applied on the testing set."
] ]
}, },
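The split-and-standardise flow described in the two cells above can be sketched as follows, assuming scikit-learn; the particular seed value is an arbitrary illustration, and the 75/25 ratio corresponds to scikit-learn's `test_size=0.25`:

```python
# Sketch: reproducible 75/25 train/test split, then standardise using
# the mean and standard deviation computed on the training set only.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=33)

scaler = StandardScaler().fit(x_train)  # statistics from the training set
x_train_s = scaler.transform(x_train)
x_test_s = scaler.transform(x_test)     # same transformation on the test set

print(x_train.shape, x_test.shape)      # (112, 4) (38, 4)
```

Fitting the scaler on the training set alone keeps information from the test set from leaking into the model.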
{ {
@@ -163,7 +163,6 @@
"source": [ "source": [
"* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n", "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
"* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n", "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
"* [Mastering Pandas](http://proquest.safaribooksonline.com/book/programming/python/9781783981960), Femi Anthony, Packt Publishing, 2015.\n",
"* [Matplotlib web page](http://matplotlib.org/index.html)\n", "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
"* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n", "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
"* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)" "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)"
@@ -174,7 +173,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Licences\n", "### Licences\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -53,9 +53,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"This is an introduction of general ideas about machine learning and the interface of scikit-learn, taken from the [scikit-learn tutorial](http://www.astroml.org/sklearn_tutorial/general_concepts.html). \n", "This is an introduction to general ideas about machine learning and the interface of scikit-learn, taken from the [scikit-learn tutorial](http://www.astroml.org/sklearn_tutorial/general_concepts.html). \n",
"\n", "\n",
"You can skip it during the lab session and read it later," "You can skip it during the lab session and read it later."
] ]
}, },
{ {
@@ -69,20 +69,20 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Machine learning algorithms are programs that learn a model from a dataset with the aim of making predictions or learning structures to organize the data.\n", "Machine learning algorithms are programs that learn a model from a dataset to make predictions or learn structures to organize the data.\n",
"\n", "\n",
"In scikit-learn, machine learning algorithms take as an input a *numpy* array (n_samples, n_features), where\n", "In scikit-learn, machine learning algorithms take as input a *numpy* array (n_samples, n_features), where\n",
"* **n_samples**: number of samples. Each sample is an item to process (i.e. classify). A sample can be a document, a picture, a sound, a video, a row in database or CSV file, or whatever you can describe with a fixed set of quantitative traits.\n", "* **n_samples**: number of samples. Each sample is an item to process (i.e., classify). A sample can be a document, a picture, a sound, a video, a row in a database or CSV file, or whatever you can describe with a fixed set of quantitative traits.\n",
"* **n_features**: The number of features or distinct traits that can be used to describe each item in a quantitative manner.\n", "* **n_features**: The number of features or distinct traits that can be used to describe each item quantitatively.\n",
"\n", "\n",
"The number of features should be defined in advance. There is a specific type of feature sets that are high dimensional (e.g. millions of features), but most of the values are zero for a given sample. Using (numpy) arrays, all those values that are zero would also take up memory. For this reason, these feature sets are often represented with sparse matrices (scipy.sparse) instead of (numpy) arrays.\n", "The number of features should be defined in advance. A specific type of feature set is high-dimensional (e.g., millions of features), but most values are zero for a given sample. Using (numpy) arrays, all those zero values would also take up memory. For this reason, these feature sets are often represented with sparse matrices (scipy.sparse) instead of (numpy) arrays.\n",
"\n", "\n",
"The first step in machine learning is **identifying the relevant features** from the input data, and the second step is **extracting the features** from the input data. \n", "The first step in machine learning is **identifying the relevant features** from the input data, and the second step is **extracting the features** from the input data. \n",
"\n", "\n",
"[Machine learning algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/) can be classified according to learning style into:\n", "[Machine learning algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/) can be classified according to learning style into:\n",
"* **Supervised learning**: input data (training dataset) has a known label or result. Example problems are classification and regression. A model is prepared through a training process where it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.\n", "* **Supervised learning**: input data (training dataset) has a known label or result. Example problems are classification and regression. A model is prepared through a training process where it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.\n",
"* **Unsupervised learning**: input data is not labeled. A model is prepared by deducing structures present in the input data. This may be to extract general rules. Example problems are clustering, dimensionality reduction and association rule learning.\n", "* **Unsupervised learning**: input data is not labeled. A model is prepared by deducing structures present in the input data. This may be to extract general rules. Example problems are clustering, dimensionality reduction, and association rule learning.\n",
"* **Semi-supervised learning**:i nput data is a mixture of labeled and unlabeled examples. There is a desired prediction problem but the model must learn the structures to organize the data as well as make predictions. Example problems are classification and regression." "* **Semi-supervised learning**: input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem, but the model must learn the structures to organize the data and make predictions. Example problems are classification and regression."
] ]
}, },
{ {
@@ -96,8 +96,8 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In *supervised machine learning models*, the machine learning algorithm takes as an input a training dataset, composed of feature vectors and labels, and produces a predictive model which is used for make prediction on new data.\n", "In *supervised machine learning models*, the machine learning algorithm takes as input a training dataset, composed of feature vectors and labels, and produces a predictive model used to predict new data.\n",
"![](files/images/plot_ML_flow_chart_1.png)" "![](./images/plot_ML_flow_chart_1.png)"
] ]
}, },
{ {
@@ -111,7 +111,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In *unsupervised machine learning models*, the machine learning model algorithm takes as an input the feature vectors and produces a predictive model that is used to fit its parameters so as to best summarize regularities found in the data.\n", "In *unsupervised machine learning models*, the machine learning model algorithm takes as input the feature vectors. It produces a predictive model that is used to fit its parameters to summarize the best regularities found in the data.\n",
"![](files/images/plot_ML_flow_chart_3.png)" "![](files/images/plot_ML_flow_chart_3.png)"
] ]
}, },
@@ -129,15 +129,15 @@
"scikit-learn has a uniform interface for all the estimators, some methods are only available if the estimator is supervised or unsupervised:\n", "scikit-learn has a uniform interface for all the estimators, some methods are only available if the estimator is supervised or unsupervised:\n",
"\n", "\n",
"* Available in *all estimators*:\n", "* Available in *all estimators*:\n",
" * **model.fit()**: fit training data. For supervised learning applications, this accepts two arguments: the data X and the labels y (e.g. model.fit(X, y)). For unsupervised learning applications, this accepts only a single argument, the data X (e.g. model.fit(X)).\n", " * **model.fit()**: fit training data. For supervised learning applications, this accepts two arguments: the data X and the labels y (e.g., model.fit(X, y)). For unsupervised learning applications, this accepts only a single argument, the data X (e.g. model.fit(X)).\n",
"\n", "\n",
"* Available in *supervised estimators*:\n", "* Available in *supervised estimators*:\n",
" * **model.predict()**: given a trained model, predict the label of a new set of data. This method accepts one argument, the new data X_new (e.g. model.predict(X_new)), and returns the learned label for each object in the array.\n", " * **model.predict()**: given a trained model, predict the label of a new dataset. This method accepts one argument, the new data X_new (e.g., model.predict(X_new)), and returns the learned label for each object in the array.\n",
" * **model.predict_proba()**: For classification problems, some estimators also provide this method, which returns the probability that a new observation has each categorical label. In this case, the label with the highest probability is returned by model.predict().\n", " * **model.predict_proba()**: For classification problems, some estimators also provide this method, which returns the probability that a new observation has each categorical label. In this case, the label with the highest probability is returned by model.predict().\n",
"\n", "\n",
"* Available in *unsupervised estimators*:\n", "* Available in *unsupervised estimators*:\n",
" * **model.transform()**: given an unsupervised model, transform new data into the new basis. This also accepts one argument X_new, and returns the new representation of the data based on the unsupervised model.\n", " * **model.transform()**: given an unsupervised model, transform new data into the new basis. This also accepts one argument X_new, and returns the new representation of the data based on the unsupervised model.\n",
" * **model.fit_transform()**: some estimators implement this method, which performs a fit and a transform on the same input data.\n", " * **model.fit_transform()**: Some estimators implement this method, which performs a fit and a transform on the same input data.\n",
"\n", "\n",
"\n", "\n",
"![](files/images/plot_ML_flow_chart_2.png)" "![](files/images/plot_ML_flow_chart_2.png)"
@@ -154,7 +154,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [General concepts of machine learning with scikit-learn](http://www.astroml.org/sklearn_tutorial/general_concepts.html)\n", "* [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/index.html)\n",
"* [A Tour of Machine Learning Algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/)" "* [A Tour of Machine Learning Algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/)"
] ]
}, },
@@ -169,7 +169,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]
@@ -177,7 +177,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -191,7 +191,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.5.6" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,

File diff suppressed because one or more lines are too long


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -56,9 +56,9 @@
"source": [ "source": [
"The goal of this notebook is to learn how to create a classification object using a [decision tree learning algorithm](https://en.wikipedia.org/wiki/Decision_tree_learning). \n", "The goal of this notebook is to learn how to create a classification object using a [decision tree learning algorithm](https://en.wikipedia.org/wiki/Decision_tree_learning). \n",
"\n", "\n",
"There are a number of well known machine learning algorithms for decision tree learning, such as ID3, C4.5, C5.0 and CART. The scikit-learn uses an optimised version of the [CART (Classification and Regression Trees) algorithm](https://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees).\n", "There are several well-known machine learning algorithms for decision tree learning, such as ID3, C4.5, C5.0, and CART. The scikit-learn uses an optimised version of the [CART (Classification and Regression Trees) algorithm](https://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees).\n",
"\n", "\n",
"This notebook will follow the same steps that the previous notebook for learning using the [kNN Model](2_5_1_kNN_Model.ipynb), and details some peculiarities of the decision tree algorithms.\n", "This notebook will follow the same steps as the previous notebook for learning using the [kNN Model](2_5_1_kNN_Model.ipynb), and details some peculiarities of the decision tree algorithms.\n",
"\n", "\n",
"You need to install pydotplus: `conda install pydotplus` for the visualization." "You need to install pydotplus: `conda install pydotplus` for the visualization."
] ]
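A minimal sketch of fitting scikit-learn's CART-based classifier on the iris data (mirroring the steps the notebook takes below; the split ratio and hyperparameters here are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=33)

# CART builds binary splits that minimise Gini impurity by default
model = DecisionTreeClassifier(max_depth=3, random_state=1)
model.fit(x_train, y_train)
print(model.score(x_test, y_test))  # accuracy on the held-out split
```
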
@@ -69,7 +69,7 @@
"source": [ "source": [
"## Load data and preprocessing\n", "## Load data and preprocessing\n",
"\n", "\n",
"Here we repeat the same operations for loading data and preprocessing than in the previous notebooks." "Here we repeat the same operations for loading data and preprocessing as in the previous notebooks."
] ]
}, },
{ {
@@ -130,12 +130,7 @@
{ {
"data": { "data": {
"text/plain": [ "text/plain": [
"DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=3,\n", "DecisionTreeClassifier(max_depth=3, random_state=1)"
" max_features=None, max_leaf_nodes=None,\n",
" min_impurity_decrease=0.0, min_impurity_split=None,\n",
" min_samples_leaf=1, min_samples_split=2,\n",
" min_weight_fraction_leaf=0.0, presort=False, random_state=1,\n",
" splitter='best')"
] ]
}, },
"execution_count": 2, "execution_count": 2,
@@ -267,8 +262,8 @@
"The current version of pydot does not work well in Python 3.\n", "The current version of pydot does not work well in Python 3.\n",
"For obtaining an image, you need to install `pip install pydotplus` and then `conda install graphviz`.\n", "For obtaining an image, you need to install `pip install pydotplus` and then `conda install graphviz`.\n",
"\n", "\n",
"You can skip this example. Since it can require installing additional packages, we include here the result.\n", "You can skip this example. Since it can require installing additional packages, we have included the result here.\n",
"![Decision Tree](files/images/cart.png)" "![Decision Tree](./images/cart.png)"
] ]
}, },
{ {
@@ -277,20 +272,23 @@
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
"ename": "ModuleNotFoundError", "ename": "InvocationException",
"evalue": "No module named 'pydotplus'", "evalue": "GraphViz's executables not found",
"output_type": "error", "output_type": "error",
"traceback": [ "traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", "\u001b[0;31mInvocationException\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-7-1bf5ec7fb043>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mIPython\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdisplay\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mImage\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0msklearn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexternals\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msix\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mpydotplus\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpydot\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mdot_data\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/tmp/ipykernel_47326/3723147494.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 12\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0mgraph\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpydot\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraph_from_dot_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdot_data\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgetvalue\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 14\u001b[0;31m \u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite_png\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'iris-tree.png'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 15\u001b[0m 
\u001b[0mImage\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate_png\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pydotplus'" "\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36m<lambda>\u001b[0;34m(path, f, prog)\u001b[0m\n\u001b[1;32m 1808\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mpath\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1809\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mfrmt\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1810\u001b[0;31m \u001b[0mprog\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprog\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1811\u001b[0m )\n\u001b[1;32m 1812\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36mwrite\u001b[0;34m(self, path, prog, format)\u001b[0m\n\u001b[1;32m 1916\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1917\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1918\u001b[0;31m \u001b[0mfobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1919\u001b[0m \u001b[0;32mfinally\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1920\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mclose\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36mcreate\u001b[0;34m(self, prog, format)\u001b[0m\n\u001b[1;32m 1957\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprogs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfind_graphviz\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1958\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprogs\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1959\u001b[0;31m raise InvocationException(\n\u001b[0m\u001b[1;32m 1960\u001b[0m 'GraphViz\\'s executables not found')\n\u001b[1;32m 1961\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mInvocationException\u001b[0m: GraphViz's executables not found"
] ]
} }
], ],
"source": [ "source": [
"from IPython.display import Image \n", "from IPython.display import Image \n",
"from sklearn.externals.six import StringIO\n", "from six import StringIO\n",
"import pydotplus as pydot\n", "import pydotplus as pydot\n",
"\n", "\n",
"dot_data = StringIO() \n", "dot_data = StringIO() \n",
@@ -332,7 +330,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Next we are going to export the pseudocode of the the learnt decision tree." "Next, we will export the pseudocode of the learnt decision tree."
] ]
}, },
{ {
@@ -380,14 +378,14 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Precision, recall and f-score" "### Precision, recall, and f-score"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"For evaluating classification algorithms, we usually calculate three metrics: precision, recall and F1-score\n", "For evaluating classification algorithms, we usually calculate three metrics: precision, recall, and F1-score\n",
"\n", "\n",
"* **Precision**: This computes the proportion of instances predicted as positives that were correctly evaluated (it measures how right our classifier is when it says that an instance is positive).\n", "* **Precision**: This computes the proportion of instances predicted as positives that were correctly evaluated (it measures how right our classifier is when it says that an instance is positive).\n",
"* **Recall**: This counts the proportion of positive instances that were correctly evaluated (measuring how right our classifier is when faced with a positive instance).\n", "* **Recall**: This counts the proportion of positive instances that were correctly evaluated (measuring how right our classifier is when faced with a positive instance).\n",
@@ -414,7 +412,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Another useful metric is the confusion matrix" "Another useful metric is the confusion matrix."
] ]
}, },
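Both the per-class metrics and the confusion matrix can be computed with `sklearn.metrics`; the following is a self-contained toy example (the labels are made up, not the notebook's own predictions):

```python
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

# Rows = true class, columns = predicted class
print(confusion_matrix(y_true, y_pred))

# Per-class precision, recall and F1 in one call
print(classification_report(y_true, y_pred, zero_division=0))
```
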
{ {
@@ -430,7 +428,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"We see we classify well all the 'setosa' and 'versicolor' samples. " "We classify all the 'setosa' and 'versicolor' samples well. "
] ]
}, },
{ {
@@ -444,7 +442,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In order to avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**.\n", "To avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**.\n",
"\n", "\n",
"Sklearn comes with other strategies for [cross validation](http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation), such as stratified K-fold, label k-fold, Leave-One-Out, Leave-P-Out, Leave-One-Label-Out, Leave-P-Label-Out or Shuffle & Split." "Sklearn comes with other strategies for [cross validation](http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation), such as stratified K-fold, label k-fold, Leave-One-Out, Leave-P-Out, Leave-One-Label-Out, Leave-P-Label-Out or Shuffle & Split."
] ]
@@ -468,7 +466,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n", "# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n", "cv = KFold(10, shuffle=True, random_state=33)\n",
"\n", "\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n", "# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n", "scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"print(scores)" "print(scores)"
] ]
@@ -477,7 +475,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure" "We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure."
] ]
}, },
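The mean-and-standard-error summary can be sketched as follows (assuming a `scores` array of k-fold results like the one produced above; the values here are illustrative):

```python
import numpy as np
from scipy.stats import sem

scores = np.array([0.93, 1.0, 0.87, 0.93, 1.0, 0.93, 0.93, 1.0, 0.87, 1.0])

# Standard error of the mean: sample std / sqrt(k), here over k=10 folds
print("Mean score: {0:.3f} (+/- {1:.3f})".format(np.mean(scores), sem(scores)))
```
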
{ {
@@ -510,10 +508,8 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n", "* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n", "* [Parameter estimation using grid search with cross-validation](https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)\n",
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
"* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)" "* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
] ]
}, },
@@ -522,15 +518,24 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Licence\n", "## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]
} }
], ],
"metadata": { "metadata": {
"datacleaner": {
"position": {
"top": "50px"
},
"python": {
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
},
"window_display": false
},
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -544,7 +549,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -39,7 +39,7 @@
"* [Train classifier](#Train-classifier)\n", "* [Train classifier](#Train-classifier)\n",
"* [More about Pipelines](#More-about-Pipelines)\n", "* [More about Pipelines](#More-about-Pipelines)\n",
"* [Tuning the algorithm](#Tuning-the-algorithm)\n", "* [Tuning the algorithm](#Tuning-the-algorithm)\n",
"\t* [Grid Search for Parameter optimization](#Grid-Search-for-Parameter-optimization)\n", "\t* [Grid Search for Hyperparameter optimization](#Grid-Search-for-Hyperparameter-optimization)\n",
"* [Evaluating the algorithm](#Evaluating-the-algorithm)\n", "* [Evaluating the algorithm](#Evaluating-the-algorithm)\n",
"\t* [K-Fold validation](#K-Fold-validation)\n", "\t* [K-Fold validation](#K-Fold-validation)\n",
"* [References](#References)\n" "* [References](#References)\n"
@@ -56,9 +56,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 9.47. Could we get a better accuracy if we tune the parameters of the estimator?\n", "In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 9.47. Could we get a better accuracy if we tune the hyperparameters of the estimator?\n",
"\n", "\n",
"The goal of this notebook is to learn how to tune an algorithm by opimizing its parameters using grid search." "This notebook aims to learn how to tune an algorithm by optimizing its hyperparameters using grid search."
] ]
}, },
{ {
@@ -137,7 +137,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n", "# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n", "cv = KFold(10, shuffle=True, random_state=33)\n",
"\n", "\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n", "# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n", "scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"\n", "\n",
"from scipy.stats import sem\n", "from scipy.stats import sem\n",
@@ -189,7 +189,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"We can get the list of parameters of the model. As you will observe, the parameters of the estimators in the pipeline can be accessed using the &lt;estimator&gt;__&lt;parameter&gt; syntax. We will use this for tuning the parameters." "We can get the list of model parameters. As you will observe, the parameters of the estimators in the pipeline can be accessed using the &lt;estimator&gt;__&lt;parameter&gt; syntax. We will use this for tuning the parameters."
] ]
}, },
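The `<estimator>__<parameter>` syntax can be sketched with a minimal pipeline (the step names here are made up, not the notebook's exact pipeline):

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('tree', DecisionTreeClassifier()),
])

# Step parameters are addressed as <step name>__<parameter name>
pipe.set_params(tree__max_depth=3)
print(pipe.get_params()['tree__max_depth'])
```
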
{ {
@@ -205,7 +205,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Let's see what happens if we change a parameter" "Let's see what happens if we change a parameter."
] ]
}, },
{ {
@@ -284,7 +284,7 @@
"\n", "\n",
"Look at the [API](http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) of *scikit-learn* to understand better the algorithm, as well as which parameters can be tuned. As you see, we can change several ones, such as *criterion*, *splitter*, *max_features*, *max_depth*, *min_samples_split*, *class_weight*, etc.\n", "Look at the [API](http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) of *scikit-learn* to understand better the algorithm, as well as which parameters can be tuned. As you see, we can change several ones, such as *criterion*, *splitter*, *max_features*, *max_depth*, *min_samples_split*, *class_weight*, etc.\n",
"\n", "\n",
"We can get the full list parameters of an estimator with the method *get_params()*. " "We can get an estimator's full list of parameters with the method *get_params()*. "
] ]
}, },
{ {
@@ -300,30 +300,30 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"You can try different values for these parameters and observe the results." "You can try different values for these hyperparameters and observe the results."
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Grid Search for Parameter optimization" "### Grid Search for Hyperparameter optimization"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Changing manually the parameters to find their optimal values is not practical. Instead, we can consider to find the optimal value of the parameters as an *optimization problem*. \n", "Changing manually the hyperparameters to find their optimal values is not practical. Instead, we can consider finding the optimal value of the hyperparameters as an *optimization problem*. \n",
"\n", "\n",
"The sklearn comes with several optimization techniques for this purpose, such as **grid search** and **randomized search**. In this notebook we are going to introduce the former one." "Sklearn has several optimization techniques, such as **grid search** and **randomized search**. In this notebook, we are going to introduce the former one."
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The sklearn provides an object that, given data, computes the score during the fit of an estimator on a parameter grid and chooses the parameters to maximize the cross-validation score. " "Sklearn provides an object that, given data, computes the score during the fit of an estimator on a hyperparameter grid and chooses the hyperparameters to maximize the cross-validation score. "
] ]
}, },
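That object is `GridSearchCV`; a minimal sketch on the iris data (the grid here is deliberately small and illustrative — the notebook runs a fuller search below):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Every combination in the grid is scored with cross-validation
param_grid = {'max_depth': [2, 3, 4], 'criterion': ['gini', 'entropy']}
gs = GridSearchCV(DecisionTreeClassifier(random_state=1), param_grid, cv=5)
gs.fit(X, y)

print(gs.best_params_, round(gs.best_score_, 3))
```
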
{ {
@@ -351,7 +351,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Now we are going to show the results of grid search" "Now we are going to show the results of the grid search"
] ]
}, },
{ {
@@ -371,7 +371,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"We can now evaluate the KFold with this optimized parameter as follows." "We can now evaluate the KFold with this optimized hyperparameter as follows."
] ]
}, },
{ {
@@ -392,7 +392,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n", "# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n", "cv = KFold(10, shuffle=True, random_state=33)\n",
"\n", "\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n", "# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n", "scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"def mean_score(scores):\n", "def mean_score(scores):\n",
" return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n", " return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
@@ -405,7 +405,7 @@
"source": [ "source": [
"We have got an *improvement* from 0.947 to 0.953 with k-fold.\n", "We have got an *improvement* from 0.947 to 0.953 with k-fold.\n",
"\n", "\n",
"We are now to try to fit the best combination of the parameters of the algorithm. It can take some time to compute it." "We are now trying to fit the best combination of the hyperparameters of the algorithm. It can take some time to compute it."
] ]
}, },
{ {
@@ -414,12 +414,12 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Set the parameters by cross-validation\n", "# Set the hyperparameters by cross-validation\n",
"\n", "\n",
"from sklearn.metrics import classification_report\n", "from sklearn.metrics import classification_report, recall_score, precision_score, make_scorer\n",
"\n", "\n",
"# set of parameters to test\n", "# set of hyperparameters to test\n",
"tuned_parameters = [{'max_depth': np.arange(3, 10),\n", "tuned_hyperparameters = [{'max_depth': np.arange(3, 10),\n",
"# 'max_weights': [1, 10, 100, 1000]},\n", "# 'max_weights': [1, 10, 100, 1000]},\n",
" 'criterion': ['gini', 'entropy'], \n", " 'criterion': ['gini', 'entropy'], \n",
" 'splitter': ['best', 'random'],\n", " 'splitter': ['best', 'random'],\n",
@@ -431,14 +431,19 @@
"scores = ['precision', 'recall']\n", "scores = ['precision', 'recall']\n",
"\n", "\n",
"for score in scores:\n", "for score in scores:\n",
" print(\"# Tuning hyper-parameters for %s\" % score)\n", " print(\"# Tuning hyperparameters for %s\" % score)\n",
" print()\n", " print()\n",
"\n", "\n",
" if score == 'precision':\n",
" scorer = make_scorer(precision_score, average='weighted', zero_division=0)\n",
" elif score == 'recall':\n",
" scorer = make_scorer(recall_score, average='weighted', zero_division=0)\n",
" \n",
" # cv = the fold of the cross-validation cv, defaulted to 5\n", " # cv = the fold of the cross-validation cv, defaulted to 5\n",
" gs = GridSearchCV(DecisionTreeClassifier(), tuned_parameters, cv=10, scoring='%s_weighted' % score)\n", " gs = GridSearchCV(DecisionTreeClassifier(), tuned_hyperparameters, cv=10, scoring=scorer)\n",
" gs.fit(x_train, y_train)\n", " gs.fit(x_train, y_train)\n",
"\n", "\n",
" print(\"Best parameters set found on development set:\")\n", " print(\"Best hyperparameters set found on development set:\")\n",
" print()\n", " print()\n",
" print(gs.best_params_)\n", " print(gs.best_params_)\n",
" print()\n", " print()\n",
@@ -487,7 +492,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n", "# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n", "cv = KFold(10, shuffle=True, random_state=33)\n",
"\n", "\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n", "# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n", "scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"def mean_score(scores):\n", "def mean_score(scores):\n",
" return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n", " return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
@@ -512,10 +517,8 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n", "* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n", "* [Hyperparameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
"* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)" "* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
] ]
}, },
@@ -530,7 +533,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]
@@ -538,7 +541,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -552,7 +555,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -48,9 +48,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The goal of this notebook is to learn how to save a model in the the scikit by using Pythons built-in persistence model, namely pickle\n", "The goal of this notebook is to learn how to save a model in the scikit by using Pythons built-in persistence model, namely pickle\n",
"\n", "\n",
"First we recap the previous tasks: load data, preprocess and train the model." "First, we recap the previous tasks: load data, preprocess, and train the model."
] ]
}, },
{ {
@@ -107,7 +107,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"A more efficient alternative to pickle is joblib, especially for big data problems. In this case the model can only be saved to a file and not to a string." "A more efficient alternative to pickle is joblib, especially for big data problems. In this case, the model can only be saved to a file and not to a string."
] ]
}, },
{ {
@@ -117,7 +117,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# save model\n", "# save model\n",
"from sklearn.externals import joblib\n", "import joblib\n",
"joblib.dump(model, 'filename.pkl') \n", "joblib.dump(model, 'filename.pkl') \n",
"\n", "\n",
"#load model\n", "#load model\n",
@@ -136,7 +136,9 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Tutorial scikit-learn](http://scikit-learn.org/stable/tutorial/basic/tutorial.html)\n", "* [Tutorial scikit-learn](http://scikit-learn.org/stable/tutorial/basic/tutorial.html)\n",
"* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)" "* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)\n",
"* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
"* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
] ]
}, },
{ {
@@ -144,15 +146,24 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Licence\n", "## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]
} }
], ],
"metadata": { "metadata": {
"datacleaner": {
"position": {
"top": "50px"
},
"python": {
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
},
"window_display": false
},
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -166,7 +177,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -4,7 +4,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")" "![](./images/EscUpmPolit_p.gif \"UPM\")"
] ]
}, },
{ {
@@ -52,7 +52,7 @@
"\n", "\n",
"Particularly in high-dimensional spaces, data can more easily be separated linearly and the simplicity of classifiers such as naive Bayes and linear SVMs might lead to better generalization than is achieved by other classifiers.\n", "Particularly in high-dimensional spaces, data can more easily be separated linearly and the simplicity of classifiers such as naive Bayes and linear SVMs might lead to better generalization than is achieved by other classifiers.\n",
"\n", "\n",
"The plots show training points in solid colors and testing points semi-transparent. The lower right shows the classification accuracy on the test set.\n", "The plots show training points in solid colors and testing points in semi-transparent colors. The lower right shows the classification accuracy on the test set.\n",
"\n", "\n",
"The [DummyClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html#sklearn.dummy.DummyClassifier) is a classifier that makes predictions using simple rules. It is useful as a simple baseline to compare with other (real) classifiers. \n", "The [DummyClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html#sklearn.dummy.DummyClassifier) is a classifier that makes predictions using simple rules. It is useful as a simple baseline to compare with other (real) classifiers. \n",
"\n", "\n",
@@ -94,7 +94,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Licence\n", "## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n", "\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid." "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
] ]

BIN ml1/images/iris-classes.png (new binary image, 1.4 MiB)
BIN (new binary image, 944 KiB)


@@ -47,7 +47,7 @@ def get_code(tree, feature_names, target_names,
recurse(left, right, threshold, features, 0, 0) recurse(left, right, threshold, features, 0, 0)
# Taken from http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#example-tree-plot-iris-py # Taken from https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html
import numpy as np import numpy as np
import matplotlib.pyplot as plt import matplotlib.pyplot as plt


@@ -2,6 +2,7 @@ import numpy as np
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap from matplotlib.colors import ListedColormap
from sklearn import neighbors, datasets from sklearn import neighbors, datasets
import seaborn as sns
from sklearn.neighbors import KNeighborsClassifier from sklearn.neighbors import KNeighborsClassifier
# Taken from http://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html # Taken from http://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html
@@ -19,9 +20,9 @@ def plot_classification_iris():
h = .02 # step size in the mesh h = .02 # step size in the mesh
n_neighbors = 15 n_neighbors = 15
# Create color maps # Create color maps
cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF']) cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF']) cmap_bold = ['darkorange', 'c', 'darkblue']
for weights in ['uniform', 'distance']: for weights in ['uniform', 'distance']:
# we create an instance of Neighbours Classifier and fit the data. # we create an instance of Neighbours Classifier and fit the data.
@@ -29,7 +30,7 @@ def plot_classification_iris():
clf.fit(X, y) clf.fit(X, y)
# Plot the decision boundary. For that, we will assign a color to each # Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, m_max]x[y_min, y_max]. # point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1 x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1 y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
@@ -38,14 +39,17 @@ def plot_classification_iris():
# Put the result into a color plot # Put the result into a color plot
Z = Z.reshape(xx.shape) Z = Z.reshape(xx.shape)
plt.figure() plt.figure(figsize=(8, 6))
plt.pcolormesh(xx, yy, Z, cmap=cmap_light) plt.contourf(xx, yy, Z, cmap=cmap_light)
# Plot also the training points # Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold) sns.scatterplot(x=X[:, 0], y=X[:, 1], hue=iris.target_names[y],
palette=cmap_bold, alpha=1.0, edgecolor="black")
plt.xlim(xx.min(), xx.max()) plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max()) plt.ylim(yy.min(), yy.max())
plt.title("3-Class classification (k = %i, weights = '%s')" plt.title("3-Class classification (k = %i, weights = '%s')"
% (n_neighbors, weights)) % (n_neighbors, weights))
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])
plt.show() plt.show()
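The decision-boundary hunks above predict a class for every point of a mesh before plotting. The mesh-prediction part can be checked without any plotting; a sketch using the same first two iris features and `n_neighbors=15` as the script (the step size is the script's):

```python
import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X = iris.data[:, :2]  # first two features, as in the script
y = iris.target

clf = KNeighborsClassifier(n_neighbors=15, weights='uniform')
clf.fit(X, y)

# build the mesh covering the feature space
h = .02  # step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# predict a class for every mesh point, then reshape for contourf
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
```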


@@ -74,9 +74,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n", "* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
"* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n", "* [Scikit-learn videos and notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n"
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
] ]
}, },
{ {
@@ -92,7 +90,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -106,7 +104,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -50,30 +50,30 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"In this session we will work with the Titanic dataset. This dataset is provided by [Kaggle](http://www.kaggle.com). Kaggle is a crowdsourcing platform that organizes competitions where researchers and companies post their data and users compete to obtain the best models.\n", "In this session, we will work with the Titanic dataset. This dataset is provided by [Kaggle](http://www.kaggle.com). Kaggle is a crowdsourcing platform that organizes competitions where researchers and companies post their data and users compete to obtain the best models.\n",
"\n", "\n",
"![Titanic](images/titanic.jpg)\n", "![Titanic](images/titanic.jpg)\n",
"\n", "\n",
"\n", "\n",
"The main objective is predicting which passengers survived the sinking of the Titanic.\n", "The main objective is to predict which passengers survived the sinking of the Titanic.\n",
"\n", "\n",
"The data is available [here](https://www.kaggle.com/c/titanic/data). There are two files, one for training ([train.csv](files/data-titanic/train.csv)) and another file for testing [test.csv](files/data-titanic/test.csv). A local copy has been included in this notebook under the folder *data-titanic*.\n", "The data is available [here](https://www.kaggle.com/c/titanic/data). There are two files, one for training ([train.csv](files/data-titanic/train.csv)) and another file for testing [test.csv](files/data-titanic/test.csv). A local copy has been included in this notebook under the folder *data-titanic*.\n",
"\n", "\n",
"\n", "\n",
"Here follows a description of the variables.\n", "Here follows a description of the variables.\n",
"\n", "\n",
"|Variable | Description| Values|\n", "| Variable | Description | Values |\n",
"|-------------------------------|\n", "|------------|---------------------------------|-----------------|\n",
"| survival| Survival| (0 = No; 1 = Yes)|\n", "| survival | Survival |(0 = No; 1 = Yes)|\n",
"|Pclass |Name | |\n", "| Pclass | Name | |\n",
"|Sex |Sex | male, female|\n", "| Sex | Sex | male, female |\n",
"|Age |Age|\n", "| Age | Age | |\n",
"|SibSp |Number of Siblings/Spouses Aboard||\n", "| SibSp |Number of Siblings/Spouses Aboard| |\n",
"|Parch |Number of Parents/Children Aboard||\n", "| Parch |Number of Parents/Children Aboard| |\n",
"|Ticket|Ticket Number||\n", "| Ticket | Ticket Number | |\n",
"|Fare |Passenger Fare||\n", "| Fare | Passenger Fare | |\n",
"|Cabin |Cabin||\n", "| Cabin | Cabin | |\n",
"|Embarked |Port of Embarkation| (C = Cherbourg; Q = Queenstown; S = Southampton)|\n", "| Embarked | Port of Embarkation | (C = Cherbourg; Q = Queenstown; S = Southampton)|\n",
"\n", "\n",
"\n", "\n",
"The definitions used for SibSp and Parch are:\n", "The definitions used for SibSp and Parch are:\n",
@@ -213,8 +213,7 @@
"* [Pandas API input-output](http://pandas.pydata.org/pandas-docs/stable/api.html#input-output)\n", "* [Pandas API input-output](http://pandas.pydata.org/pandas-docs/stable/api.html#input-output)\n",
"* [Pandas API - pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)\n", "* [Pandas API - pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)\n",
"* [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html)\n", "* [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html)\n",
"* [An introduction to NumPy and Scipy](http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf)\n", "* [An introduction to NumPy and Scipy](https://sites.engineering.ucsb.edu/~shell/che210d/numpy.pdf)\n"
"* [NumPy tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)"
] ]
}, },
{ {


@@ -433,10 +433,9 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Pandas](http://pandas.pydata.org/)\n", "* [Pandas](http://pandas.pydata.org/)\n",
"* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n", "* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
"* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
"* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n", "* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
"* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)" "* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)"
] ]
}, },
{ {
@@ -458,7 +457,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -472,7 +471,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -373,8 +373,8 @@
"source": [ "source": [
"#Mean age of passengers per Passenger class\n", "#Mean age of passengers per Passenger class\n",
"\n", "\n",
"#First we calculate the mean\n", "#First we calculate the mean for the numeric columns\n",
"df.groupby('Pclass').mean()" "df.select_dtypes(np.number).groupby('Pclass').mean()"
] ]
}, },
{ {
@@ -404,7 +404,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"#Mean Age and SibSp of passengers grouped by passenger class and sex\n", "#Mean Age and SibSp of passengers grouped by passenger class and sex\n",
"df.groupby(['Pclass', 'Sex'])['Age','SibSp'].mean()" "df.groupby(['Pclass', 'Sex'])[['Age','SibSp']].mean()"
] ]
}, },
{ {
@@ -414,7 +414,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"#Show mean Age and SibSp for passengers older than 25 grouped by Passenger Class and Sex\n", "#Show mean Age and SibSp for passengers older than 25 grouped by Passenger Class and Sex\n",
"df[df.Age > 25].groupby(['Pclass', 'Sex'])['Age','SibSp'].mean()" "df[df.Age > 25].groupby(['Pclass', 'Sex'])[['Age','SibSp']].mean()"
] ]
}, },
{ {
@@ -424,7 +424,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Mean age, SibSp , Survived of passengers older than 25 which survived, grouped by Passenger Class and Sex \n", "# Mean age, SibSp , Survived of passengers older than 25 which survived, grouped by Passenger Class and Sex \n",
"df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].mean()" "df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])[['Age','SibSp','Survived']].mean()"
] ]
}, },
{ {
@@ -436,7 +436,7 @@
"# We can also decide which function apply in each column\n", "# We can also decide which function apply in each column\n",
"\n", "\n",
"#Show mean Age, mean SibSp, and number of passengers older than 25 that survived, grouped by Passenger Class and Sex\n", "#Show mean Age, mean SibSp, and number of passengers older than 25 that survived, grouped by Passenger Class and Sex\n",
"df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].agg({'Age': np.mean, \n", "df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])[['Age','SibSp','Survived']].agg({'Age': np.mean, \n",
" 'SibSp': np.mean, 'Survived': np.sum})" " 'SibSp': np.mean, 'Survived': np.sum})"
] ]
}, },
@@ -451,7 +451,10 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Pivot tables are an intuitive way to analyze data, and alternative to group columns." "Pivot tables are an intuitive way to analyze data, and an alternative to group columns.\n",
"\n",
"This command makes a table with rows Sex and columns Pclass, and\n",
"averages the result of the column Survived, thereby giving the percentage of survivors in each grouping."
] ]
}, },
{ {
@@ -460,7 +463,14 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"pd.pivot_table(df, index='Sex')" "pd.pivot_table(df, index='Sex', columns='Pclass', values=['Survived'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we want to analyze multi-index, the percentage of survivoers, given sex and age, and distributed by Pclass."
] ]
}, },
{ {
@@ -469,7 +479,14 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"pd.pivot_table(df, index=['Sex', 'Pclass'])" "pd.pivot_table(df, index=['Sex', 'Age'], columns=['Pclass'], values=['Survived'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Nevertheless, this is not very useful since we have a row per age. Thus, we define a partition."
] ]
}, },
{ {
@@ -478,7 +495,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"pd.pivot_table(df, index=['Sex', 'Pclass'], values=['Age', 'SibSp'])" "# Partition each of the passengers into 3 categories based on their age\n",
"age = pd.cut(df['Age'], [0,12,18,80])"
] ]
}, },
{ {
@@ -487,7 +505,14 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"pd.pivot_table(df, index=['Sex', 'Pclass'], values=['Age', 'SibSp'], aggfunc=np.mean)" "pd.pivot_table(df, index=['Sex', age], columns=['Pclass'], values=['Survived'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can change the function used for aggregating each group."
] ]
}, },
{ {
@@ -496,8 +521,18 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Try np.sum, np.size, len\n", "# default\n",
"pd.pivot_table(df, index=['Sex', 'Pclass'], values=['Age', 'SibSp'], aggfunc=[np.mean, np.sum])" "pd.pivot_table(df, index=['Sex', age], columns=['Pclass'], values=['Survived'], aggfunc=np.mean)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Two agg functions\n",
"pd.pivot_table(df, index=['Sex', age], columns=['Pclass'], values=['Survived'], aggfunc=[np.mean, np.sum])"
] ]
}, },
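The pivot-table hunks above combine `pd.cut` age buckets with `pivot_table`. A runnable sketch on a synthetic frame (all values hypothetical; `aggfunc='mean'` is the string equivalent of the notebook's `np.mean`):

```python
import pandas as pd

# synthetic stand-in for the Titanic frame
df = pd.DataFrame({'Sex': ['male', 'female', 'male', 'female'],
                   'Age': [8, 30, 45, 10],
                   'Pclass': [1, 1, 2, 2],
                   'Survived': [0, 1, 0, 1]})

# bucket ages into child / teen / adult bins, as in the notebook
age = pd.cut(df['Age'], [0, 12, 18, 80])

# survival rate per (Sex, age bucket) row and Pclass column
table = pd.pivot_table(df, index=['Sex', age], columns=['Pclass'],
                       values=['Survived'], aggfunc='mean')
print(table)
```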
{ {
@@ -600,8 +635,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Fill missing values with the median\n", "# Fill missing values with the median, we avoid empty (None) values with numeric_only\n",
"df_filled = df.fillna(df.median())\n", "df_filled = df.fillna(df.median(numeric_only=True))\n",
"df_filled[-5:]" "df_filled[-5:]"
] ]
}, },
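The `fillna` hunk above adds `numeric_only=True` so the median is only computed for numeric columns. A sketch on a synthetic frame with a missing age (values hypothetical):

```python
import numpy as np
import pandas as pd

# synthetic frame with a missing Age
df = pd.DataFrame({'Age': [22.0, np.nan, 30.0],
                   'Sex': ['male', 'female', 'male']})

# numeric_only restricts the median to numeric columns,
# so string columns such as Sex neither raise nor fill with None
df_filled = df.fillna(df.median(numeric_only=True))
print(df_filled)
```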
@@ -685,7 +720,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# But we are working on a copy \n", "# But we are working on a copy, so we get a warning\n",
"df.iloc[889]['Sex'] = np.nan" "df.iloc[889]['Sex'] = np.nan"
] ]
}, },
@@ -695,7 +730,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# If we want to change, we should not chain selections\n", "# If we want to change it, we should not chain selections\n",
"# The selection can be done with the column name\n", "# The selection can be done with the column name\n",
"df.loc[889, 'Sex']" "df.loc[889, 'Sex']"
] ]
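The hunks above contrast chained selection with a single `.loc` call. A sketch of the working pattern (the tiny frame is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Sex': ['male', 'female', 'male']})

# chained selection like df.iloc[2]['Sex'] = ... writes to a copy
# and only triggers a warning; a single .loc call with row and
# column labels updates the frame itself
df.loc[2, 'Sex'] = np.nan
```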
@@ -932,11 +967,11 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Pandas](http://pandas.pydata.org/)\n", "* [Pandas](http://pandas.pydata.org/)\n",
"* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n", "* [Learning Pandas, Michael Heydt, Packt Publishing, 2017](https://learning.oreilly.com/library/view/learning-pandas/9781787123137/)\n",
"* [Useful Pandas Snippets](https://gist.github.com/bsweger/e5817488d161f37dcbd2)\n", "* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
"* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
"* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n", "* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
"* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)" "* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)\n",
"* [Useful Pandas Snippets](https://gist.github.com/bsweger/e5817488d161f37dcbd2)"
] ]
}, },
{ {
@@ -958,7 +993,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -972,7 +1007,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.11.5"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -220,7 +220,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Analise distributon\n", "# Analise distribution\n",
"df.hist(figsize=(10,10))\n", "df.hist(figsize=(10,10))\n",
"plt.show()" "plt.show()"
] ]
@@ -233,7 +233,7 @@
"source": [ "source": [
"# We can see the pairwise correlation between variables. A value near 0 means low correlation\n", "# We can see the pairwise correlation between variables. A value near 0 means low correlation\n",
"# while a value near -1 or 1 indicates strong correlation.\n", "# while a value near -1 or 1 indicates strong correlation.\n",
"df.corr()" "df.corr(numeric_only = True)"
] ]
}, },
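The hunk above adds `numeric_only=True` to `df.corr()`, which newer pandas versions require when the frame mixes numeric and string columns. A sketch on a synthetic frame (values hypothetical):

```python
import pandas as pd

# synthetic frame mixing numeric and string columns
df = pd.DataFrame({'Age': [20, 30, 40, 50],
                   'Fare': [10.0, 20.0, 30.0, 40.0],
                   'Sex': ['male', 'female', 'male', 'female']})

# numeric_only=True skips non-numeric columns instead of raising
corr = df.corr(numeric_only=True)
print(corr)
```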
{ {
@@ -249,11 +249,10 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# General description of relationship betweek variables uwing Seaborn PairGrid\n", "# General description of relationship between variables uwing Seaborn PairGrid\n",
"# We use df_clean, since the null values of df would gives us an error, you can check it.\n", "# We use df_clean, since the null values of df would gives us an error, you can check it.\n",
"g = sns.PairGrid(df_clean, hue=\"Survived\")\n", "g = sns.PairGrid(df_clean, hue=\"Survived\")\n",
"g.map_diag(plt.hist)\n", "g.map(sns.scatterplot)\n",
"g.map_offdiag(plt.scatter)\n",
"g.add_legend()" "g.add_legend()"
] ]
}, },
@@ -367,7 +366,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Now we visualise age and survived to see if there is some relationship\n", "# Now we visualise age and survived to see if there is some relationship\n",
"sns.FacetGrid(df, hue=\"Survived\", size=5).map(sns.kdeplot, \"Age\").add_legend()" "sns.FacetGrid(df, hue=\"Survived\", height=5).map(sns.kdeplot, \"Age\").add_legend()"
] ]
}, },
{ {
@@ -567,7 +566,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Plot with seaborn\n", "# Plot with seaborn\n",
"sns.countplot('Sex', data=df)" "sns.countplot(x='Sex', data=df)"
] ]
}, },
{ {
@@ -683,16 +682,6 @@
"df.groupby('Pclass').size()" "df.groupby('Pclass').size()"
] ]
}, },
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Distribution\n",
"sns.countplot('Pclass', data=df)"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -725,7 +714,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"sns.factorplot('Pclass',data=df,hue='Sex',kind='count')" "sns.catplot(x='Pclass',data=df,hue='Sex',kind='count')"
] ]
}, },
{ {
@@ -906,7 +895,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Distribution\n", "# Distribution\n",
"sns.countplot('Embarked', data=df)" "sns.countplot(x='Embarked', data=df)"
] ]
}, },
{ {
@@ -997,7 +986,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Distribution\n", "# Distribution\n",
"sns.countplot('SibSp', data=df)" "sns.countplot(x='SibSp', data=df)"
] ]
}, },
{ {
@@ -1180,7 +1169,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Distribution\n", "# Distribution\n",
"sns.countplot('Parch', data=df)" "sns.countplot(x='Parch', data=df)"
] ]
}, },
{ {
@@ -1233,7 +1222,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df.groupby(['Pclass', 'Sex', 'Parch'])['Parch', 'SibSp', 'Survived'].agg({'Parch': np.size, 'SibSp': np.mean, 'Survived': np.mean})" "df.groupby(['Pclass', 'Sex', 'Parch'])[['Parch', 'SibSp', 'Survived']].agg({'Parch': np.size, 'SibSp': np.mean, 'Survived': np.mean})"
] ]
}, },
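The hunk above fixes a pandas deprecation: selecting several columns from a `GroupBy` with a bare list between single brackets was removed, so the column list must be wrapped in a second pair of brackets, `[['Parch', 'SibSp', 'Survived']]`. A small self-contained sketch of the pattern on toy data (string aliases `'size'`/`'mean'` are used in place of `np.size`/`np.mean`, which newer pandas also deprecates inside `agg`):

```python
import pandas as pd

df = pd.DataFrame({'Pclass': [1, 1, 3],
                   'Parch': [0, 1, 2],
                   'SibSp': [1, 1, 0],
                   'Survived': [1, 0, 0]})

# Double brackets select a sub-DataFrame; agg then maps column -> function
out = df.groupby('Pclass')[['Parch', 'SibSp', 'Survived']].agg(
    {'Parch': 'size', 'SibSp': 'mean', 'Survived': 'mean'})
print(out.loc[1, 'Parch'])     # 2 rows in class 1
print(out.loc[1, 'Survived'])  # 0.5
```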
{ {
@@ -1576,7 +1565,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -1590,7 +1579,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -72,7 +72,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Assign the variable *df* a Dataframe with the Titanic Dataset from the URL https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv\"\n", "Assign the variable *df* a Dataframe with the Titanic Dataset from the URL https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv.\n",
"\n", "\n",
"Print *df*." "Print *df*."
] ]
@@ -214,7 +214,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"df['FamilySize'] = df['SibSp'] + df['Parch']\n", "df['FamilySize'] = df['SibSp'] + df['Parch']\n",
"df.head()" "df"
] ]
}, },
{ {
@@ -377,8 +377,8 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Group ages to simplify machine learning algorithms. 0: 0-5, 1: 6-10, 2: 11-15, 3: 16-59 and 4: 60-80\n", "# Group ages to simplify machine learning algorithms. 0: 0-5, 1: 6-10, 2: 11-15, 3: 16-59 and 4: 60-80\n",
"df['AgeGroup'] = 0\n", "df['AgeGroup'] = np.nan\n",
"df.loc[(.Age<6),'AgeGroup'] = 0\n", "df.loc[(df.Age<6),'AgeGroup'] = 0\n",
"df.loc[(df.Age>=6) & (df.Age < 11),'AgeGroup'] = 1\n", "df.loc[(df.Age>=6) & (df.Age < 11),'AgeGroup'] = 1\n",
"df.loc[(df.Age>=11) & (df.Age < 16),'AgeGroup'] = 2\n", "df.loc[(df.Age>=11) & (df.Age < 16),'AgeGroup'] = 2\n",
"df.loc[(df.Age>=16) & (df.Age < 60),'AgeGroup'] = 3\n", "df.loc[(df.Age>=16) & (df.Age < 60),'AgeGroup'] = 3\n",
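The corrected cell above bins `Age` with chained `df.loc` assignments (and fixes the missing `df` in `(df.Age<6)`). The same 0–5 / 6–10 / 11–15 / 16–59 / 60–80 grouping can be expressed more compactly with `pd.cut`; a sketch under the assumption that ages outside the bins stay `NaN`, matching the `np.nan` initialisation in the fix:

```python
import numpy as np
import pandas as pd

ages = pd.Series([4, 8, 14, 30, 70, np.nan], name='Age')

# Right-open bins equivalent to the notebook's manual .loc thresholds
groups = pd.cut(ages, bins=[0, 6, 11, 16, 60, 81],
                labels=[0, 1, 2, 3, 4], right=False)
print(list(groups[:5]))  # [0, 1, 2, 3, 4]
```

Missing ages fall through `pd.cut` as `NaN` automatically, which is exactly what the `np.nan` default in the fixed cell achieves.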
@@ -404,8 +404,8 @@
" if np.isnan(big_string):\n", " if np.isnan(big_string):\n",
" return 'X'\n", " return 'X'\n",
" for substring in substrings:\n", " for substring in substrings:\n",
" if big_string.find(substring) != 1:\n", " if substring in big_string:\n",
" return substring\n", " return substring[0::]\n",
" print(big_string)\n", " print(big_string)\n",
" return 'X'\n", " return 'X'\n",
" \n", " \n",
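The original `big_string.find(substring) != 1` was a bug: `find` returns `-1` on a miss, so comparing against `1` matched almost every substring. The committed fix uses the `in` operator instead. A standalone sketch of the corrected helper (the function name and title list are illustrative, not the notebook's exact cell):

```python
import numpy as np

def substrings_in_string(big_string, substrings):
    """Return the first substring found in big_string, or 'X' for NaN / no match."""
    if not isinstance(big_string, str) and np.isnan(big_string):
        return 'X'
    for substring in substrings:
        if substring in big_string:   # the old 'find(...) != 1' test misfired here
            return substring
    return 'X'

titles = ['Mrs', 'Mr', 'Master', 'Miss']
print(substrings_in_string('Braund, Mr. Owen Harris', titles))  # Mr
print(substrings_in_string(np.nan, titles))                     # X
```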
@@ -478,8 +478,17 @@
} }
], ],
"metadata": { "metadata": {
"datacleaner": {
"position": {
"top": "50px"
},
"python": {
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
},
"window_display": false
},
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -493,7 +502,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -78,7 +78,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015." "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka and Vahid Mirjalili, Packt Publishing, 2019."
] ]
}, },
{ {
@@ -100,7 +100,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -114,7 +114,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,


@@ -222,7 +222,7 @@
"kernel = types_of_kernels[0]\n", "kernel = types_of_kernels[0]\n",
"gamma = 3.0\n", "gamma = 3.0\n",
"\n", "\n",
"# Create kNN model\n", "# Create SVM model\n",
"model = SVC(kernel=kernel, probability=True, gamma=gamma)" "model = SVC(kernel=kernel, probability=True, gamma=gamma)"
] ]
}, },
@@ -276,7 +276,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"We can evaluate the accuracy if the model always predicts the most frequent class, following this [refeference](http://blog.kaggle.com/2015/10/23/scikit-learn-video-9-better-evaluation-of-classification-models/)." "We can evaluate the accuracy if the model always predicts the most frequent class, following this [reference](https://medium.com/analytics-vidhya/model-validation-for-classification-5ff4a0373090)."
] ]
}, },
{ {
@@ -351,10 +351,10 @@
"We can obtain more information from the confusion matrix and the metric F1-score.\n", "We can obtain more information from the confusion matrix and the metric F1-score.\n",
"In a confusion matrix, we can see:\n", "In a confusion matrix, we can see:\n",
"\n", "\n",
"||**Predicted**: 0| **Predicted: 1**|\n", "| |**Predicted**: 0| **Predicted: 1**|\n",
"|---------------------------|\n", "|-------------|----------------|-----------------|\n",
"|**Actual: 0**| TN | FP |\n", "|**Actual: 0**| TN | FP |\n",
"|**Actual: 1**| FN|TP|\n", "|**Actual: 1**| FN | TP |\n",
"\n", "\n",
"* **True negatives (TN)**: actual negatives that were predicted as negatives\n", "* **True negatives (TN)**: actual negatives that were predicted as negatives\n",
"* **False positives (FP)**: actual negatives that were predicted as positives\n", "* **False positives (FP)**: actual negatives that were predicted as positives\n",
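The rebuilt table above defines the four confusion-matrix cells. They can be computed by hand in a few lines, which is a useful cross-check against `sklearn.metrics.confusion_matrix`; a dependency-free sketch with made-up labels:

```python
# Made-up actual vs. predicted labels for illustration
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]

tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
print(tn, fp, fn, tp)  # 2 1 1 2

# F1 combines precision and recall, as used later in the notebook
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.667
```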
@@ -418,7 +418,7 @@
"plt.ylim([0.0, 1.0])\n", "plt.ylim([0.0, 1.0])\n",
"plt.title('ROC curve for Titanic')\n", "plt.title('ROC curve for Titanic')\n",
"plt.xlabel('False Positive Rate (1 - Recall)')\n", "plt.xlabel('False Positive Rate (1 - Recall)')\n",
"plt.xlabel('True Positive Rate (Sensitivity)')\n", "plt.ylabel('True Positive Rate (Sensitivity)')\n",
"plt.grid(True)" "plt.grid(True)"
] ]
}, },
@@ -535,13 +535,13 @@
"source": [ "source": [
"# This step will take some time\n", "# This step will take some time\n",
"# Cross-validation\n", "# Cross-validation\n",
"cv = KFold(n_splits=5, shuffle=False, random_state=33)\n", "cv = KFold(n_splits=5, shuffle=True, random_state=33)\n",
"# StratifiedKFold is a variation of k-fold which returns stratified folds:\n", "# StratifiedKFold is a variation of k-fold which returns stratified folds:\n",
"# each set contains approximately the same percentage of samples of each target class as the complete set.\n", "# each set contains approximately the same percentage of samples of each target class as the complete set.\n",
"#cv = StratifiedKFold(y, n_folds=3, shuffle=False, random_state=33)\n", "#cv = StratifiedKFold(y, n_folds=3, shuffle=True, random_state=33)\n",
"scores = cross_val_score(model, X, y, cv=cv)\n", "scores = cross_val_score(model, X, y, cv=cv)\n",
"print(\"Scores in every iteration\", scores)\n", "print(\"Scores in every iteration\", scores)\n",
"print(\"Accuracy: %0.2f (+/- %0.2f)\" % (scores.mean(), scores.std() * 2))\n" "print(\"Accuracy: %0.2f (+/- %0.2f)\" % (scores.mean(), scores.std() * 2))"
] ]
}, },
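The fix above sets `shuffle=True`: since scikit-learn 0.24, passing a `random_state` to `KFold` with `shuffle=False` raises an error, because the seed would have no effect on unshuffled splits. The shuffled-fold idea itself is a few lines of NumPy; this is a sketch of the concept, not scikit-learn's implementation:

```python
import numpy as np

def kfold_indices(n_samples, n_splits, random_state):
    """Yield (train, test) index arrays for shuffled k-fold splitting."""
    rng = np.random.RandomState(random_state)
    idx = rng.permutation(n_samples)          # shuffle once, up front
    for fold in np.array_split(idx, n_splits):
        train = np.setdiff1d(idx, fold)       # everything not in the test fold
        yield train, fold

folds = list(kfold_indices(10, n_splits=5, random_state=33))
print(len(folds))        # 5 folds
print(len(folds[0][1]))  # 2 test samples per fold
```

Fixing `random_state` makes the permutation, and therefore every fold, reproducible across runs, which is why the notebook pins it to 33.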
{ {
@@ -644,7 +644,7 @@
"source": [ "source": [
"* [Titanic Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n", "* [Titanic Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
"* [API SVC scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)\n", "* [API SVC scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)\n",
"* [Better evaluation of classification models](http://blog.kaggle.com/2015/10/23/scikit-learn-video-9-better-evaluation-of-classification-models/)" "* [How to choose the right metric for evaluating an ML model](https://www.kaggle.com/vipulgandhi/how-to-choose-right-metric-for-evaluating-ml-model)"
] ]
}, },
{ {
@@ -666,7 +666,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@@ -680,7 +680,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.7.1" "version": "3.8.12"
}, },
"latex_envs": { "latex_envs": {
"LaTeX_envs_menu_present": true, "LaTeX_envs_menu_present": true,
