Compare commits
104 Commits
dveni-patc
...
0150ce7cf7
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0150ce7cf7 | | |
| | 08dfe5c147 | | |
| | 78e62af098 | | |
| | 3f5eba3e84 | | |
| | 2de1cda8f1 | | |
| | cc442c35f3 | | |
| | 1100c352fa | | |
| | 9b573d292d | | |
| | dd8a4f50d8 | | |
| | 47148f2ccc | | |
| | 8ffda8123a | | |
| | 6629837e7d | | |
| | ba08a9a264 | | |
| | 4b8fd30f42 | | |
| | d879369930 | | |
| | 4da01f3ae6 | | |
| | da9a01e26b | | |
| | dc23b178d7 | | |
| | 5410d6115d | | |
| | 6749aa5deb | | |
| | c31e6c1676 | | |
| | 1c7496c8ac | | |
| | 35b1ae4ec8 | | |
| | 58fc6f5e9c | | |
| | 91147becee | | |
| | 1530995243 | | |
| | 0c0960cec7 | | |
| | 3363c953f4 | | |
| | 542ce2708d | | |
| | 380340d66d | | |
| | 7f49f8990b | | |
| | 419ea57824 | | |
| | 7d6010114d | | |
| | f9d8234e14 | | |
| | d41fa61c65 | | |
| | 05a4588acf | | |
| | 50933f6c94 | | |
| | 68ba528dd7 | | |
| | 897bb487b1 | | |
| | 41d3bdea75 | | |
| | 0a9cd3bd5e | | |
| | 2c7c9e58e0 | | |
| | f0278aea33 | | |
| | 7bf0fb6479 | | |
| | 4d87b07ed9 | | |
| | 7d71ba5f7a | | |
| | 1124c9129c | | |
| | df6449b55f | | |
| | d99eeb733a | | |
| | a43fb4c78c | | |
| | bf21e3ceab | | |
| | e41d233828 | | |
| | a7c6be5b96 | | |
| | 11a1ea80d3 | | |
| | a209d18a5b | | |
| | ffefd8c2e3 | | |
| | f43cde73e4 | | |
| | 8784fdc773 | | |
| | a6d5f9ddeb | | |
| | 2e72a4d729 | | |
| | 9426b4c061 | | |
| | 5e5979d515 | | |
| | 270dcec611 | | |
| | e6e52b43ee | | |
| | 3b7675fa3f | | |
| | 44c63412f9 | | |
| | 5febbc21a4 | | |
| | 66ed4ba258 | | |
| | 95cd25aef4 | | |
| | 955e74fc8e | | |
| | 6743dad100 | | |
| | 729f7684c2 | | |
| | ae8d3d3ba2 | | |
| | 2ba0e2f3d9 | | |
| | c9114cc796 | | |
| | b80c097362 | | |
| | 161cd8492b | | |
| | 3d6d96dd8a | | |
| | 44aa3d24fb | | |
| | 8925a4a3c1 | | |
| | 23913811df | | |
| | 7b4391f187 | | |
| | 0c100dbadc | | |
| | 2f7cbe9e45 | | |
| | b43125ca59 | | |
| | 5144b7f228 | | |
| | 8b6d6de169 | | |
| | 7271b5e632 | | |
| | bd99321d6b | | |
| | 91b8f66056 | | |
| | 242a0a9252 | | |
| | d8d25c4dc3 | | |
| | 4979fe6877 | | |
| | c5e0f146c4 | | |
| | 167475029e | | |
| | da79a18bfc | | |
| | 47761c11aa | | |
| | fd5aa4a1fd | | |
| | 396a7b17ca | | |
| | 2248188219 | | |
| | 21dc5ec3de | | |
| | db99033727 | | |
| | 5459e801d5 | | |
| | 75f08ea170 | | |
@@ -5,8 +5,8 @@ Exercises for Intelligent Systems Course at Universidad Politécnica de Madrid,
|
|||||||
|
|
||||||
For following this course:
|
For following this course:
|
||||||
- Follow the instructions to install the environment: https://github.com/gsi-upm/sitc/blob/master/python/1_1_Notebooks.ipynb (Just install 'conda')
|
- Follow the instructions to install the environment: https://github.com/gsi-upm/sitc/blob/master/python/1_1_Notebooks.ipynb (Just install 'conda')
|
||||||
- Download the course: use 'https://github.com/gsi-upm/sitc'
|
- Download the course: use 'https://github.com/gsi-upm/sitc' (or clone the repository to receive updates).
|
||||||
- Run in a terminal in the foloder sitc: jupyter notebook (and enjoy)
|
- Run in a terminal in the folder sitc: jupyter notebook (and enjoy)
|
||||||
|
|
||||||
Topics
|
Topics
|
||||||
* Python: quick introduction to Python
|
* Python: quick introduction to Python
|
||||||
|
484
lod/00_SPARQL_Tutorial.ipynb
Normal file
@@ -0,0 +1,484 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<header style=\"width:100%;position:relative\">\n",
|
||||||
|
" <div style=\"width:80%;float:right;\">\n",
|
||||||
|
" <h1>Course Notes for Learning Intelligent Systems</h1>\n",
|
||||||
|
" <h3>Department of Telematic Engineering Systems</h3>\n",
|
||||||
|
" <h5>Universidad Politécnica de Madrid. © Carlos A. Iglesias </h5>\n",
|
||||||
|
" </div>\n",
|
||||||
|
" <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
|
||||||
|
"</header>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Introduction\n",
|
||||||
|
"\n",
|
||||||
|
"This lecture provides an introduction to RDF and the SPARQL query language.\n",
|
||||||
|
"\n",
|
||||||
|
"This is the first in a series of notebooks about SPARQL, which consists of:\n",
|
||||||
|
"\n",
|
||||||
|
"* This notebook, which explains basic concepts of RDF and SPARQL\n",
|
||||||
|
"* [A notebook](01_SPARQL_Introduction.ipynb) that provides an introduction to SPARQL through a collection of exercises of increasing difficulty.\n",
|
||||||
|
"* [An optional notebook](02_SPARQL_Custom_Endpoint.ipynb) with queries to a custom dataset.\n",
|
||||||
|
"The dataset is meant to be done after the [RDF exercises](../rdf/RDF.ipynb) and it is out of the scope of this course.\n",
|
||||||
|
"You can consult it if you are interested."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# RDF basics\n",
|
||||||
|
"This section is taken from [[1](#1), [2](#2)].\n",
|
||||||
|
"\n",
|
||||||
|
"RDF allows us to make statements about resources. The format of these statements is simple. A statement always has the following structure:\n",
|
||||||
|
"\n",
|
||||||
|
" <subject> <predicate> <object>\n",
|
||||||
|
" \n",
|
||||||
|
"An RDF statement expresses a relationship between two resources. The **subject** and the **object** represent the two resources being related; the **predicate** represents the nature of their relationship.\n",
|
||||||
|
"The relationship is phrased in a directional way (from subject to object).\n",
|
||||||
|
"In RDF this relationship is known as a **property**.\n",
|
||||||
|
"Because RDF statements consist of three elements they are called **triples**.\n",
|
||||||
|
"\n",
|
||||||
|
"Here are some examples of RDF triples (informally expressed in pseudocode):\n",
|
||||||
|
"\n",
|
||||||
|
" <Bob> <is a> <person>.\n",
|
||||||
|
" <Bob> <is a friend of> <Alice>.\n",
|
||||||
|
" \n",
|
||||||
|
"Resources are identified by [IRIs](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier), which can appear in all three positions of a triple. For example, the IRI for Leonardo da Vinci in DBpedia is:\n",
|
||||||
|
"\n",
|
||||||
|
" <http://dbpedia.org/resource/Leonardo_da_Vinci>\n",
|
||||||
|
"\n",
|
||||||
|
"IRIs can be abbreviated as *prefixed names*. For example, \n",
|
||||||
|
" PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
||||||
|
" <dbr:Leonardo_da_Vinci>\n",
|
||||||
|
" \n",
|
||||||
|
"Objects can be literals: \n",
|
||||||
|
" * strings (e.g., \"plain string\" or \"string with language\"@en)\n",
|
||||||
|
" * numbers (e.g., \"13.4\"^^xsd:float)\n",
|
||||||
|
" * dates (e.g., )\n",
|
||||||
|
" * booleans\n",
|
||||||
|
" * etc.\n",
|
||||||
|
" \n",
|
||||||
|
"RDF data is stored in RDF repositories that expose SPARQL endpoints.\n",
|
||||||
|
"Let's query one of the most famous RDF repositories: [dbpedia](https://wiki.dbpedia.org/).\n",
|
||||||
|
"First, we should learn how to execute SPARQL in a notebook."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Executing SPARQL in a notebook\n",
|
||||||
|
"There are several ways to execute SPARQL in a notebook.\n",
|
||||||
|
"Some of the most popular are:\n",
|
||||||
|
"\n",
|
||||||
|
"* using libraries such as [sparql-client](https://pypi.org/project/sparql-client/) or [rdflib](https://rdflib.dev/sparqlwrapper/) that enable executing SPARQL within a Python3 kernel\n",
|
||||||
|
"* using other libraries. In our case, a light library has been developed (the file helpers.py) for accessing SPARQL endpoints using an HTTP connection.\n",
|
||||||
|
"* using the [graph notebook package](https://pypi.org/project/graph-notebook/)\n",
|
||||||
|
"* using a SPARQL kernel [sparql kernel](https://github.com/paulovn/sparql-kernel) instead of the Python3 kernel\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"We are going to use the second option to avoid installing new packages.\n",
|
||||||
|
"\n",
|
||||||
|
"To use the library, you need to:\n",
|
||||||
|
"\n",
|
||||||
|
"1. Import `sparql` from helpers (i.e., `helpers.py`, a file that is available in the github repository)\n",
|
||||||
|
"2. Use the `%%sparql` magic command to indicate the SPARQL endpoint and then the SPARQL code.\n",
|
||||||
|
"\n",
|
||||||
|
"Let's try it!\n",
|
||||||
|
"\n",
|
||||||
|
"# Queries agains DBPedia\n",
|
||||||
|
"\n",
|
||||||
|
"We are going to execute a SPARQL query against DBPedia. This section is based on [[8](#8)].\n",
|
||||||
|
"\n",
|
||||||
|
"First, we just create a query to retrieve arbitrary triples (subject, predicate, object) without any restriction (besides limiting the result to 10 triples)."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from helpers import sparql"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT ?s ?p ?o\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?s ?p ?o\n",
|
||||||
|
"}\n",
|
||||||
|
"LIMIT 10"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Well, it worked, but the results are not particulary interesting. \n",
|
||||||
|
"Let's search for a famous football player, Fernando Torres.\n",
|
||||||
|
"\n",
|
||||||
|
"To do so, we will search for entities whose English \"human-readable representation\" (i.e., label) matches \"Fernando Torres\":"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?athlete rdfs:label \"Fernando Torres\"@en \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Great, we found the IRI of the node: `http://dbpedia.org/resource/Fernando_Torres`\n",
|
||||||
|
"\n",
|
||||||
|
"Now we can start asking for more properties.\n",
|
||||||
|
"\n",
|
||||||
|
"To do so, go to http://dbpedia.org/resource/Fernando_Torres and you will see all the information available about Fernando Torres. Pay attention to the names of predicates to be able to create new queries. For example, we are interesting in knowing where Fernando Torres was born (`dbo:birthPlace`).\n",
|
||||||
|
"\n",
|
||||||
|
"Let's go!"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?athlete rdfs:label \"Fernando Torres\"@en ;\n",
|
||||||
|
" dbo:birthPlace ?birthPlace . \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"If we examine the SPARQL query, we find three blocks:\n",
|
||||||
|
"\n",
|
||||||
|
"* **PREFIX** section: IRIs of vocabularies and the prefix used below, to avoid long IRIs. e.g., by defining the `dbo` prefix in our example, the `dbo:birthPlace` below expands to `http://dbpedia.org/ontology/birthPlace`.\n",
|
||||||
|
"* **SELECT** section: variables we want to return (`*` is an abbreviation that selects all of the variables in a query)\n",
|
||||||
|
"* **WHERE** clause: triples where some elements are variables. These variables are bound during the query processing process and bounded variables are returned.\n",
|
||||||
|
"\n",
|
||||||
|
"Now take a closer look at the **WHERE** section.\n",
|
||||||
|
"We said earlier that triples are made out of three elements and each triple pattern should finish with a period (`.`) (although the last pattern can omit this).\n",
|
||||||
|
"However, when two or more triple patterns share the same subject, we omit it all but the first one, and use ` ;` as separator.\n",
|
||||||
|
"If if both the subject and predicate are the same, we could use a coma `,` instead.\n",
|
||||||
|
"This allows us to avoid repetition and make queries more readable.\n",
|
||||||
|
"But don't forget the space before your separators (`;` and `.`).\n",
|
||||||
|
"\n",
|
||||||
|
"The result is interesting, we know he was born in Fuenlabrada, but we see an additional (wrong) value, the Spanish national football team. The conversion process from Wikipedia to DBPedia should still be tuned :).\n",
|
||||||
|
"\n",
|
||||||
|
"We can *fix* it, by adding some more constaints.\n",
|
||||||
|
"In our case, only want a birth place that is also a municipality (i.e., its type is `http://dbpedia.org/resource/Municipalities_of_Spain`).\n",
|
||||||
|
"Let's see!"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?athlete rdfs:label \"Fernando Torres\"@en ;\n",
|
||||||
|
" dbo:birthPlace ?birthPlace .\n",
|
||||||
|
" ?birthPlace dbo:type dbr:Municipalities_of_Spain \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Great. Now it looks better.\n",
|
||||||
|
"Notice that we added a new prefix.\n",
|
||||||
|
"\n",
|
||||||
|
"Now, is Fuenlabrada is a big city?\n",
|
||||||
|
"Let's find out.\n",
|
||||||
|
"\n",
|
||||||
|
"**Hint**: you can find more subject / object / predicate nodes related to [Fuenlabrada])http://dbpedia.org/resource/Fuenlabrada) in the RDF graph just as we did before.\n",
|
||||||
|
"That is how we found the `dbo:areaTotal` property."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" dbr:Fuenlabrada dbo:areaTotal ?area \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Well, it shows 39.1 km$^2$.\n",
|
||||||
|
"\n",
|
||||||
|
"Let's go back to our Fernando Torres.\n",
|
||||||
|
"What we are really insterested in is the name of the city he was born in, not its IRI.\n",
|
||||||
|
"As we saw before, the human-readable name is provided by the `rdfs:label` property:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbp: <http://dbpedia.org/property/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?player rdfs:label \"Fernando Torres\"@en ;\n",
|
||||||
|
" dbo:birthPlace ?birthPlace .\n",
|
||||||
|
" ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
|
||||||
|
" rdfs:label ?placeName \n",
|
||||||
|
" \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Well, we are almost there. We see that we receive the city name in many languages. We want just the English name. Let's filter!"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbp: <http://dbpedia.org/property/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?player rdfs:label \"Fernando Torres\"@en ;\n",
|
||||||
|
" dbo:birthPlace ?birthPlace .\n",
|
||||||
|
" ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
|
||||||
|
" rdfs:label ?placeName .\n",
|
||||||
|
" FILTER ( LANG ( ?placeName ) = 'en' )\n",
|
||||||
|
" \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Awesome!\n",
|
||||||
|
"\n",
|
||||||
|
"But we said we don't care about the IRI of the place. We only want two pieces of data: Fernando's birth date and the name of his birthplace.\n",
|
||||||
|
"\n",
|
||||||
|
"Let's tune our query a bit more."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbp: <http://dbpedia.org/property/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT ?birthDate, ?placeName\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?player rdfs:label \"Fernando Torres\"@en ;\n",
|
||||||
|
" dbo:birthDate ?birthDate ;\n",
|
||||||
|
" dbo:birthPlace ?birthPlace .\n",
|
||||||
|
" ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
|
||||||
|
" rdfs:label ?placeName .\n",
|
||||||
|
" FILTER ( LANG ( ?placeName ) = 'en' )\n",
|
||||||
|
" \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Great 😃\n",
|
||||||
|
"\n",
|
||||||
|
"Are there many football players born in Fuenlabrada? Let's find out!"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbp: <http://dbpedia.org/property/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT *\n",
|
||||||
|
"WHERE\n",
|
||||||
|
" {\n",
|
||||||
|
" ?player a dbo:SoccerPlayer ; \n",
|
||||||
|
" dbo:birthPlace dbr:Fuenlabrada . \n",
|
||||||
|
" }"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Well, not that many. Observe we have used `a`.\n",
|
||||||
|
"It is just an abbreviation for `rdf:type`, both can be used interchangeably.\n",
|
||||||
|
"\n",
|
||||||
|
"If you want additional examples, you can follow the notebook by [Shawn Graham](https://github.com/o-date/sparql-and-lod/blob/master/sparql-intro.ipynb), which is based on the SPARQL tutorial by Matthew Lincoln, available [here in English](https://programminghistorian.org/en/lessons/retired/graph-databases-and-SPARQL) and [here in Spanish](https://programminghistorian.org/es/lecciones/retirada/sparql-datos-abiertos-enlazados]). You have also a local copy of these tutorials together with this notebook [here in English](https://htmlpreview.github.io/?https://github.com/gsi-upm/sitc/blob/master/lod/tutorial/graph-databases-and-SPARQL.html) and [here in Spanish](https://htmlpreview.github.io/?https://github.com/gsi-upm/sitc/blob/master/lod/tutorial/sparql-datos-abiertos-enlazados.html). \n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## References"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"* <a id=\"1\">[1]</a> [SPARQL by Example. A Tutorial. Lee Feigenbaum. W3C, 2009](https://www.w3.org/2009/Talks/0615-qbe/#q1)\n",
|
||||||
|
"* <a id=\"2\">[2]</a> [RDF Primer W3C](https://www.w3.org/TR/rdf11-primer/)\n",
|
||||||
|
"* <a id=\"3\">[3]</a> [SPARQL queries of Beatles recording sessions](http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html)\n",
|
||||||
|
"* <a id=\"4\">[4]</a> [RDFLib documentation](https://rdflib.readthedocs.io/en/stable/).\n",
|
||||||
|
"* <a id=\"5\">[5]</a> [Wikidata Query Service query examples](https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples)\n",
|
||||||
|
"* <a id=\"6\">[6]</a> [RDF Graph Data Model. Learn about the RDF graph model used by Stardog.](https://www.stardog.com/tutorials/data-model)\n",
|
||||||
|
"* <a id=\"7\">[7]</a> [Learn SPARQL Write Knowledge Graph queries using SPARQL with step-by-step examples.](https://www.stardog.com/tutorials/sparql/)\n",
|
||||||
|
"* <a id=\"8\">[8]</a> [Running Basic SPARQL Queries Against DBpedia.](https://medium.com/virtuoso-blog/dbpedia-basic-queries-bc1ac172cc09)\n",
|
||||||
|
"* <a id=\"8\">[9]</a> [Intro SPARQL based on painters.](https://github.com/o-date/sparql-and-lod/blob/master/sparql-intro.ipynb)."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Licence\n",
|
||||||
|
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
||||||
|
"\n",
|
||||||
|
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.8.10"
|
||||||
|
},
|
||||||
|
"latex_envs": {
|
||||||
|
"LaTeX_envs_menu_present": true,
|
||||||
|
"autocomplete": true,
|
||||||
|
"bibliofile": "biblio.bib",
|
||||||
|
"cite_by": "apalike",
|
||||||
|
"current_citInitial": 1,
|
||||||
|
"eqLabelWithNumbers": true,
|
||||||
|
"eqNumInitial": 1,
|
||||||
|
"hotkeys": {
|
||||||
|
"equation": "Ctrl-E",
|
||||||
|
"itemize": "Ctrl-I"
|
||||||
|
},
|
||||||
|
"labels_anchors": false,
|
||||||
|
"latex_user_defs": false,
|
||||||
|
"report_style_numbering": false,
|
||||||
|
"user_envs_cfg": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 1
|
||||||
|
}
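Note on the `%%sparql` cells in `lod/00_SPARQL_Tutorial.ipynb`: the magic sends the query text to the endpoint over HTTP. The following is a minimal standard-library sketch of that round trip, not the code in `helpers.py`. It assumes the endpoint accepts the query in a `query` GET parameter (as the SPARQL 1.1 Protocol describes) and answers in the SPARQL JSON results format. The names `build_query_url` and `rows_from_results` and the `SAMPLE` document are made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(endpoint, query):
    """Return a GET URL carrying `query` in the `query` parameter (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query})


def rows_from_results(results):
    """Flatten a SPARQL JSON results document into a list of {variable: value} dicts."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable may be unbound in a given solution, so skip missing keys.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


QUERY = """
SELECT *
WHERE {
    ?athlete rdfs:label "Fernando Torres"@en
}
"""

url = build_query_url("https://dbpedia.org/sparql", QUERY)

# Decode the URL again to check that the query survives percent-encoding.
decoded = parse_qs(urlsplit(url).query)["query"][0]

# Hand-written sample in the results format; this is NOT real endpoint output.
SAMPLE = {
    "head": {"vars": ["athlete"]},
    "results": {
        "bindings": [
            {"athlete": {"type": "uri",
                         "value": "http://dbpedia.org/resource/Fernando_Torres"}}
        ]
    },
}

rows = rows_from_results(SAMPLE)
print(rows)
```

To run it against a live endpoint, one would fetch `url` with an `Accept: application/sparql-results+json` header and pass the decoded JSON to `rows_from_results`.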
|
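The notebook's explanation of the **WHERE** clause (triple patterns whose variables get bound during query processing) can also be illustrated without any endpoint. This is a toy sketch, assuming a hand-written list of triples rather than real DBpedia data. `match` and `solve` are hypothetical helper names, and literals are modelled as `(text, language)` tuples.

```python
# Toy data written by hand for illustration; not retrieved from DBpedia.
TRIPLES = [
    ("dbr:Fernando_Torres", "rdfs:label", ("Fernando Torres", "en")),
    ("dbr:Fernando_Torres", "dbo:birthPlace", "dbr:Fuenlabrada"),
    ("dbr:Fuenlabrada", "rdfs:label", ("Fuenlabrada", "en")),
    ("dbr:Fuenlabrada", "rdfs:label", ("Fuenlabrada", "es")),
]


def is_var(term):
    """Variables are strings that start with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def solve(patterns, triples):
    """Evaluate the patterns as a conjunction, one pattern at a time."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for t in triples:
                m = match(pattern, t, b)
                if m is not None:
                    next_bindings.append(m)
        bindings = next_bindings
    return bindings


patterns = [
    ("?player", "rdfs:label", ("Fernando Torres", "en")),
    ("?player", "dbo:birthPlace", "?birthPlace"),
    ("?birthPlace", "rdfs:label", "?placeName"),
]

solutions = solve(patterns, TRIPLES)

# Keep only English labels, in the spirit of FILTER ( LANG ( ?placeName ) = 'en' ).
english = [s for s in solutions if s["?placeName"][1] == "en"]
names = [s["?placeName"][0] for s in english]
print(names)
```

The shared `?player` and `?birthPlace` variables are what join the three patterns, which is the role the `;` shorthand and the repeated variable play in the notebook's queries.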
@@ -6,11 +6,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
|
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-0bfe38f97f6ab2d2",
|
"grade_id": "cell-0bfe38f97f6ab2d2",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -56,11 +57,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "99aecbad8f94966d92d72dc911d3ff99",
|
"cell_type": "markdown",
|
||||||
|
"checksum": "40ccd05ad0704781327031a84dfb9939",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-4f8492996e74bf20",
|
"grade_id": "cell-4f8492996e74bf20",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -69,7 +71,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"* This notebook\n",
|
"* This notebook\n",
|
||||||
"* External SPARQL editors (optional)\n",
|
"* External SPARQL editors (optional)\n",
|
||||||
" * YASGUI-GSI http://yasgui.cluster.gsi.dit.upm.es\n",
|
" * YASGUI-GSI http://yasgui.gsi.upm.es\n",
|
||||||
" * DBpedia virtuoso http://dbpedia.org/sparql\n",
|
" * DBpedia virtuoso http://dbpedia.org/sparql\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Using the YASGUI-GSI editor has several advantages over other options.\n",
|
"Using the YASGUI-GSI editor has several advantages over other options.\n",
|
||||||
@@ -93,18 +95,19 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "99e3107f9987cdddae7866dded27f165",
|
"cell_type": "markdown",
|
||||||
|
"checksum": "81894e9d65e5dd9f3b6e1c5f66804bf6",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-70ac24910356c3cf",
|
"grade_id": "cell-70ac24910356c3cf",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"source": [
|
"source": [
|
||||||
"## Instructions\n",
|
"## Instructions\n",
|
||||||
"\n",
|
"\n",
|
||||||
"We will be using a semantic server, available at: http://fuseki.cluster.gsi.dit.upm.es/sitc.\n",
|
"We will be using a semantic server, available at: http://fuseki.gsi.upm.es/sitc.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"This server contains a dataset about [Beatles songs](http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html), which we will query with SPARQL.\n",
|
"This server contains a dataset about [Beatles songs](http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html), which we will query with SPARQL.\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -122,11 +125,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "1d332d3d11fd6b57f0ec0ac3c358c6cb",
|
"checksum": "1d332d3d11fd6b57f0ec0ac3c358c6cb",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-eb13908482825e42",
|
"grade_id": "cell-eb13908482825e42",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -144,11 +148,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
"checksum": "aca7c5538b8fc53e99c92e94e6818c83",
|
"checksum": "aca7c5538b8fc53e99c92e94e6818c83",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-b3f3d92fa2100c3d",
|
"grade_id": "cell-b3f3d92fa2100c3d",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -163,11 +168,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "e896b6560e45d5c385a43aa85e3523c7",
|
"checksum": "e896b6560e45d5c385a43aa85e3523c7",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-04410e75828c388d",
|
"grade_id": "cell-04410e75828c388d",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -193,11 +199,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "96ca90572d6b275fa515c6b976115257",
|
"cell_type": "markdown",
|
||||||
|
"checksum": "34710d3bb8e2cf826833a43adb7fb448",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-2a44c0da2c206d01",
|
"grade_id": "cell-2a44c0da2c206d01",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -210,7 +217,7 @@
|
|||||||
"Some examples are:\n",
|
"Some examples are:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"* DBpedia's virtuoso query editor https://dbpedia.org/sparql\n",
|
"* DBpedia's virtuoso query editor https://dbpedia.org/sparql\n",
|
||||||
"* A javascript based client hosted at GSI: http://yasgui.cluster.gsi.dit.upm.es/\n",
|
"* A javascript based client hosted at GSI: http://yasgui.gsi.upm.es/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"[^1]: http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html"
|
"[^1]: http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html"
|
||||||
]
|
]
|
||||||
@@ -221,11 +228,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "79c60bd3d4c13f380aae5778c5ce7245",
|
"checksum": "79c60bd3d4c13f380aae5778c5ce7245",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-d645128d3af18117",
|
"grade_id": "cell-d645128d3af18117",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -241,11 +249,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "f7428fe79cd33383dfd3b09a0d951b6e",
|
"checksum": "f7428fe79cd33383dfd3b09a0d951b6e",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-8391a5322a9ad4a7",
|
"grade_id": "cell-8391a5322a9ad4a7",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -260,11 +269,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "f6b5da583694dd5cc9326c670830875d",
|
"checksum": "f6b5da583694dd5cc9326c670830875d",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-4f56a152e4d70c02",
|
"grade_id": "cell-4f56a152e4d70c02",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -313,17 +323,18 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "7a9dc62ab639143c9fc13593e50500d4",
|
"cell_type": "code",
|
||||||
|
"checksum": "3bc71f851a33fa401d18ea3ab02cf61f",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-8ce8c954513f17e7",
|
"grade_id": "cell-8ce8c954513f17e7",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"SELECT ?entity ?type\n",
|
"SELECT ?entity ?type\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
@@ -338,11 +349,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "d6a79c2f5fd005a9e15a8f67dcfd4784",
|
"checksum": "d6a79c2f5fd005a9e15a8f67dcfd4784",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-3d6d622c717c3950",
|
"grade_id": "cell-3d6d622c717c3950",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -375,17 +387,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "69e016b0224f410f03f6217ac30c03a8",
|
"cell_type": "code",
|
||||||
|
"checksum": "65be7168bedb4f6dc2f19e2138bab232",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-6e904d692b5facad",
|
"grade_id": "cell-6e904d692b5facad",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"SELECT ?entity ?prop\n",
|
"SELECT ?entity ?prop\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
@@ -401,12 +414,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "97bd5d5383bd94a72c7452bc33e4b0f9",
|
"cell_type": "code",
|
||||||
|
"checksum": "e78b57fa9baab578f5a4bd22dc499fca",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-3fc0d3c43dfd04a3",
|
"grade_id": "cell-3fc0d3c43dfd04a3",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -440,7 +454,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"SELECT ?type\n",
|
"SELECT ?type\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
@@ -465,7 +479,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"SELECT DISTINCT ?type\n",
|
"SELECT DISTINCT ?type\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
@@ -507,17 +521,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "47c4f68e342ffe59a3804de7b6a3909b",
|
"cell_type": "code",
|
||||||
|
"checksum": "35563ff455c7e8b1c91f61db97b2011b",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-e615f9a77c4bc9a5",
|
"grade_id": "cell-e615f9a77c4bc9a5",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"SELECT DISTINCT ?property\n",
|
"SELECT DISTINCT ?property\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
@@ -532,12 +547,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "c9ffeba2d4ffc3e0b95f15a0ec6012c5",
|
"cell_type": "code",
|
||||||
|
"checksum": "7603c90d8c177e2e6678baa2f1b6af36",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-9168718938ab7347",
|
"grade_id": "cell-9168718938ab7347",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -569,7 +585,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -638,17 +654,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "8b0faf938efc1a64a70515da3c132605",
|
"cell_type": "code",
|
||||||
|
"checksum": "069811507dbac4b86dc5d3adc82ba4ec",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-0223a51f609edcf9",
|
"grade_id": "cell-0223a51f609edcf9",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -667,12 +684,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "e93d7336fd125d95996e60fd312a4e4d",
|
"cell_type": "code",
|
||||||
|
"checksum": "9833a3efa75c7e2784ef5d60aae2a13e",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-3c7943c6382c62f5",
|
"grade_id": "cell-3c7943c6382c62f5",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -706,17 +724,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "271f2b194c2db4c558a46e8312b593e6",
|
"cell_type": "code",
|
||||||
|
"checksum": "b68a279085a1ed087f5e474a6602299e",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-8f43547dd788bb33",
|
"grade_id": "cell-8f43547dd788bb33",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -735,12 +754,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "9f1f7cec8ce4674971543728ada86674",
|
"cell_type": "code",
|
||||||
|
"checksum": "b4461d243cc058b1828769cc906d4947",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-e13a1c921af2f6eb",
|
"grade_id": "cell-e13a1c921af2f6eb",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -770,11 +790,12 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"SELECT *\n",
|
"SELECT *\n",
|
||||||
"WHERE { ... }\n",
|
"WHERE { ... }\n",
|
||||||
"ORDER BY <variable> <variable> ... DESC(<variable>) ASC(<variable>)\n",
|
"ORDER BY <variable> <variable> ... \n",
|
||||||
"... other statements like LIMIT ...\n",
|
"... other statements like LIMIT ...\n",
|
||||||
"```\n",
|
"```\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The results can be sorted in ascending or descending order, and using several variables."
|
"The results can be sorted in ascending or descending order, and using several variables.\n",
|
||||||
|
"By default the results are ordered in ascending order, but you can indicate the order using an optional modifier (`ASC(<variable>)`, or `DESC(<variable>)`). \n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -790,17 +811,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "9dcd9c6d51a61ac129cffa06e1463c66",
|
"cell_type": "code",
|
||||||
|
"checksum": "335403f01e484ce5563ff059e9764ff4",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-a0f0b9d9b05c9631",
|
"grade_id": "cell-a0f0b9d9b05c9631",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -820,12 +842,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "a044b3fd6b8bd4e098bbe4d818cb4e9f",
|
"cell_type": "code",
|
||||||
|
"checksum": "45530eb91cbc5b3fddcc93d96f07e579",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-bc012ca9d7ad2867",
|
"grade_id": "cell-bc012ca9d7ad2867",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -858,7 +881,7 @@
|
|||||||
" rdfs:label \"Ringo Starr\" .\n",
|
" rdfs:label \"Ringo Starr\" .\n",
|
||||||
"```\n",
|
"```\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Using this structure, and the SPARQL statements you already know, to get the **names** of all musicians that collaborated in at least one song.\n"
|
"Using this structure, and the SPARQL statements you already know, get the **names** of all musicians that collaborated in at least one song.\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -867,17 +890,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "9da7a62b6237078f5eab7e593a8eb590",
|
"cell_type": "code",
|
||||||
|
"checksum": "8fb253675d2e8510e2c6780b960721e5",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-523b963fa4e288d0",
|
"grade_id": "cell-523b963fa4e288d0",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -898,12 +922,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "c8e3a929faf2afa72207c6921382654c",
|
"cell_type": "code",
|
||||||
|
"checksum": "f4474b302bc2f634b3b2ee6e1c7e7257",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-aa9a4e18d6fda225",
|
"grade_id": "cell-aa9a4e18d6fda225",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -930,13 +955,13 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"Results can be aggregated using different functions.\n",
|
"Results can be aggregated using different functions.\n",
|
||||||
"One of the simplest functions is `COUNT`.\n",
|
"One of the simplest functions is `COUNT`.\n",
|
||||||
"The syntax for COUNT is:\n",
|
"The syntax for `COUNT` is:\n",
|
||||||
" \n",
|
" \n",
|
||||||
"```sparql\n",
|
"```sparql\n",
|
||||||
"SELECT (COUNT(?variable) as ?count_name)\n",
|
"SELECT (COUNT(?variable) as ?count_name)\n",
|
||||||
"```\n",
|
"```\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Use `COUNT` to get the number of songs in which Ringo collaborated."
|
"Use `COUNT` to get the number of songs in which Ringo collaborated. Your query should return a column named `number`."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -945,17 +970,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "d8419711d2db43ad657e2658a1ea86c4",
|
"cell_type": "code",
|
||||||
|
"checksum": "c7b6620f5ba28b482197ab693cb7142a",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-e89d08031e30b299",
|
"grade_id": "cell-e89d08031e30b299",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX m: <http://learningsparql.com/ns/musician/>\n",
|
"PREFIX m: <http://learningsparql.com/ns/musician/>\n",
|
||||||
@@ -975,12 +1001,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "29404e07edf639cdc0ce0d82e654ec31",
|
"cell_type": "code",
|
||||||
|
"checksum": "c90e1427d7e48d9ae8abab40ff92e3b0",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-903d2be00885e1d2",
|
"grade_id": "cell-903d2be00885e1d2",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -1012,7 +1039,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"Once results are grouped, they can be aggregated using any aggregation function, such as `COUNT`.\n",
|
"Once results are grouped, they can be aggregated using any aggregation function, such as `COUNT`.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Using `GROUP BY` and `COUNT`, get the count of songs that use each instrument:"
|
"Using `GROUP BY` and `COUNT`, get the count of songs in which Ringo Starr has played each of the instruments:"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1021,17 +1048,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "7a0a7206384e7e1d9eb4450dd9e9871f",
|
"cell_type": "code",
|
||||||
|
"checksum": "7556bacb20c1fbd059dec165c982908d",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-1429e4eb5400dbc7",
|
"grade_id": "cell-1429e4eb5400dbc7",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX m: <http://learningsparql.com/ns/musician/>\n",
|
"PREFIX m: <http://learningsparql.com/ns/musician/>\n",
|
||||||
@@ -1053,12 +1081,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "bd4dc379fea969d513be0ea97ee75922",
|
"cell_type": "code",
|
||||||
|
"checksum": "34a8432e8d4cea70994c8214ed0e5eb6",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-907aaf6001e27e50",
|
"grade_id": "cell-907aaf6001e27e50",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -1094,7 +1123,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -1115,7 +1144,9 @@
|
|||||||
"Now, use the same principle to get the count of **different** instruments in each song.\n",
|
"Now, use the same principle to get the count of **different** instruments in each song.\n",
|
||||||
"Some songs have several musicians playing the same instrument, but we only care about *different* instruments in each song.\n",
|
"Some songs have several musicians playing the same instrument, but we only care about *different* instruments in each song.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Use `?number` for the count."
|
"Use `?song` for the song and `?number` for the count.\n",
|
||||||
|
"\n",
|
||||||
|
"Take into consideration that instruments are entities of type `s:Instrument`."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1124,17 +1155,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "4a231b4d6874dad435512b988c17c39e",
|
"cell_type": "code",
|
||||||
|
"checksum": "3139d9b7e620266946ffe1ae0cf67581",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-ee208c762d00da9c",
|
"grade_id": "cell-ee208c762d00da9c",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
@@ -1144,6 +1176,8 @@
|
|||||||
" [] a s:Song ;\n",
|
" [] a s:Song ;\n",
|
||||||
" rdfs:label ?song ;\n",
|
" rdfs:label ?song ;\n",
|
||||||
" ?instrument ?musician .\n",
|
" ?instrument ?musician .\n",
|
||||||
|
" \n",
|
||||||
|
"?instrument a s:Instrument .\n",
|
||||||
"}\n",
|
"}\n",
|
||||||
"# YOUR ANSWER HERE\n",
|
"# YOUR ANSWER HERE\n",
|
||||||
"ORDER BY DESC(?number)"
|
"ORDER BY DESC(?number)"
|
||||||
@@ -1156,19 +1190,20 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "8118099bf14d9f0eb241c4d93ea6f0b9",
|
"cell_type": "code",
|
||||||
|
"checksum": "5abf6eb7a67ebc9f7612b876105c1960",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-ddeec32b8ac3d894",
|
"grade_id": "cell-ddeec32b8ac3d894",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"s = solution()\n",
|
"s = solution()\n",
|
||||||
"assert s['columns']['number'][0] == '27'"
|
"assert s['columns']['number'][0] == '25'"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1193,7 +1228,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
||||||
@@ -1213,10 +1248,10 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"However, there are some songs that do not have a vocalist (at least, in the dataset).\n",
|
"However, there are some songs that do not have a vocalist (at least, in the dataset).\n",
|
||||||
"Those songs will not appear in the list above, because we they do not match part of the `WHERE` clause.\n",
|
"Those songs will not appear in the list above, because they do not match part of the `WHERE` clause.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"In these cases, we can specify optional values in a query using the `OPTIONAL` keyword.\n",
|
"In these cases, we can specify optional values in a query using the `OPTIONAL` keyword.\n",
|
||||||
"When a set of clauses is inside an OPTIONAL group, the SPARQL endpoint will try to use them in the query.\n",
|
"When a set of clauses is inside an `OPTIONAL` group, the SPARQL endpoint will try to use them in the query.\n",
|
||||||
"If there are no results for that part of the query, the variables it specifies will not be bound (i.e. they will be empty).\n",
|
"If there are no results for that part of the query, the variables it specifies will not be bound (i.e. they will be empty).\n",
|
||||||
"\n",
|
"\n",
|
||||||
"To exemplify this, we can use a property that **does not exist in the dataset**:"
|
"To exemplify this, we can use a property that **does not exist in the dataset**:"
|
||||||
@@ -1228,7 +1263,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
||||||
@@ -1261,17 +1296,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "4b0a0854457c37640aad67f375ed3a17",
|
"cell_type": "code",
|
||||||
|
"checksum": "3bc508872193750d57d07efbf334c212",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-dcd68c45c1608a28",
|
"grade_id": "cell-dcd68c45c1608a28",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
||||||
@@ -1294,12 +1330,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "f7122b2284b5d59d59ce4a2925f0bb21",
|
"cell_type": "code",
|
||||||
|
"checksum": "69edef3121b8dfab385a00cd181c956f",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-1e706b9c1c1331bc",
|
"grade_id": "cell-1e706b9c1c1331bc",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -1343,17 +1380,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "09621e7af911faf39a834e8281bc6d1f",
|
"cell_type": "code",
|
||||||
|
"checksum": "300df0a3cf9729dd4814b3153b2fedb4",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-0c7cc924a13d792a",
|
"grade_id": "cell-0c7cc924a13d792a",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
||||||
@@ -1379,12 +1417,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "cebff8ce42f3f36923e81e083a23d24c",
|
"cell_type": "code",
|
||||||
|
"checksum": "22d6fcdb72a8b2c5ab496cdbb5e2740a",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-2541abc93ab4d506",
|
"grade_id": "cell-2541abc93ab4d506",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -1416,17 +1455,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "ea9797f3b2d001ea41d7fa7a5170d5fb",
|
"cell_type": "code",
|
||||||
|
"checksum": "e4e898c8a16b8aa5865dfde2f6e68ec6",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-d750b6d64c6aa0a7",
|
"grade_id": "cell-d750b6d64c6aa0a7",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
@@ -1469,7 +1509,9 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"Now, count how many instruments each musician has played in a song.\n",
|
"Now, count how many instruments each musician has played in a song.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**."
|
"**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**.\n",
|
||||||
|
"\n",
|
||||||
|
"Use `?musician` for the musician and `?number` for the count."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1478,17 +1520,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "2d82df272d43f678d3b19bf0b41530c1",
|
"cell_type": "code",
|
||||||
|
"checksum": "fade6ab714376e0eabfa595dd6bd6a8b",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-2f5aa516f8191787",
|
"grade_id": "cell-2f5aa516f8191787",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
@@ -1513,12 +1556,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "bc83dd9577c9111b1f0ef5bd40c4ec08",
|
"cell_type": "code",
|
||||||
|
"checksum": "33e93ec2a3d1f9eb4b0310d4651b74c2",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-bcd0f7e26b6c11c2",
|
"grade_id": "cell-bcd0f7e26b6c11c2",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -1533,7 +1577,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Which songs had Ringo in dums OR Lennon in lead vocals? (UNION)"
|
"### Which songs had Ringo in drums OR Lennon in lead vocals? (UNION)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1567,17 +1611,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "a1e20e2be817a592683dea89eed0120e",
|
"cell_type": "code",
|
||||||
|
"checksum": "09262d81449c498c37e4b9d9b1dcdfed",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-d3a742bd87d9c793",
|
"grade_id": "cell-d3a742bd87d9c793",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
@@ -1597,18 +1642,19 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "087630476d73bb415b065fafbd6024f0",
|
"cell_type": "code",
|
||||||
|
"checksum": "11061e79ec06ccb3a9c496319a528366",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-409402df0e801d09",
|
"grade_id": "cell-409402df0e801d09",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"assert len(solution()['tuples']) == 246"
|
"assert len(solution()['tuples']) == 209"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1648,17 +1694,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "1d2cb88412c89c35861a4f9fccea3bf2",
|
"cell_type": "code",
|
||||||
|
"checksum": "9ddd2d1f50f841b889bfd29b175d06da",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-9d1ec854eb530235",
|
"grade_id": "cell-9d1ec854eb530235",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -1680,12 +1727,13 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "aa20aa4d11632ea5bd6004df3187d979",
|
"cell_type": "code",
|
||||||
|
"checksum": "0ea5496acd1c3edd9e188b351690a533",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-a79c688b4566dbe8",
|
"grade_id": "cell-a79c688b4566dbe8",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"points": 0,
|
"points": 1,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -1729,7 +1777,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"Using `GROUP_CONCAT`, get a list of the instruments that each musician could play.\n",
|
"Using `GROUP_CONCAT`, get a list of the instruments that each musician could play.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/)."
|
"You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/).\n",
|
||||||
|
"\n",
|
||||||
|
"Use `?musician` for the musician and `?instruments` for the list of instruments."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1738,17 +1788,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "508b7f8656e849838aa93cd38f1c6635",
|
"cell_type": "code",
|
||||||
|
"checksum": "d18e8b6e1d32aed395a533febb29fcb5",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-7ea1f5154cdd8324",
|
"grade_id": "cell-7ea1f5154cdd8324",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
||||||
@@ -1773,7 +1824,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"You can check if a string or URI matches a regular expression with `regex(?variable, \"<regex>\", \"i\")`.\n",
|
"You can check if a string or URI matches a regular expression with `regex(?variable, \"<regex>\", \"i\")`.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/)."
|
"The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/).\n",
|
||||||
|
"\n",
|
||||||
|
"Use `?instrument` for the instrument and `?ins` for the url of the type."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1782,17 +1835,18 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
"checksum": "cff1f9c034393f8af055e1f930d5fe32",
|
"cell_type": "code",
|
||||||
|
"checksum": "f926fa3a3568d122454a12312859cda1",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-b6bee887a1b1fc60",
|
"grade_id": "cell-b6bee887a1b1fc60",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
|
"%%sparql http://fuseki.gsi.upm.es/sitc/\n",
|
||||||
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \n",
|
||||||
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
"PREFIX s: <http://learningsparql.com/ns/schema/>\n",
|
||||||
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
"PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
|
||||||
@@ -1830,7 +1884,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -1844,9 +1898,22 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.2"
|
"version": "3.8.10"
|
||||||
|
},
|
||||||
|
"toc": {
|
||||||
|
"base_numbering": 1,
|
||||||
|
"nav_menu": {},
|
||||||
|
"number_sections": true,
|
||||||
|
"sideBar": true,
|
||||||
|
"skip_h1_title": false,
|
||||||
|
"title_cell": "Table of Contents",
|
||||||
|
"title_sidebar": "Contents",
|
||||||
|
"toc_cell": false,
|
||||||
|
"toc_position": {},
|
||||||
|
"toc_section_display": true,
|
||||||
|
"toc_window_display": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
"nbformat_minor": 2
|
"nbformat_minor": 4
|
||||||
}
|
}
|
||||||
|
@@ -6,11 +6,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
|
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-0bfe38f97f6ab2d2",
|
"grade_id": "cell-0bfe38f97f6ab2d2",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -31,11 +32,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "42642609861283bc33914d16750b7efa",
|
"checksum": "42642609861283bc33914d16750b7efa",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-0cd673883ee592d1",
|
"grade_id": "cell-0cd673883ee592d1",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -59,11 +61,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "a3ecb4b300a5ab82376a4a8cb01f7e6b",
|
"checksum": "a3ecb4b300a5ab82376a4a8cb01f7e6b",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-10264483046abcc4",
|
"grade_id": "cell-10264483046abcc4",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -80,11 +83,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
|
"checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-4f8492996e74bf20",
|
"grade_id": "cell-4f8492996e74bf20",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -100,11 +104,12 @@
|
|||||||
"deletable": false,
|
"deletable": false,
|
||||||
"editable": false,
|
"editable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
"checksum": "c5f8646518bd832a47d71f9d3218237a",
|
"checksum": "c5f8646518bd832a47d71f9d3218237a",
|
||||||
"grade": false,
|
"grade": false,
|
||||||
"grade_id": "cell-eb13908482825e42",
|
"grade_id": "cell-eb13908482825e42",
|
||||||
"locked": true,
|
"locked": true,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": false
|
"solution": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -148,7 +153,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
|
"%%sparql http://fuseki.gsi.upm.es/hotels\n",
|
||||||
" \n",
|
" \n",
|
||||||
"SELECT ?g (COUNT(?s) as ?count) WHERE {\n",
|
"SELECT ?g (COUNT(?s) as ?count) WHERE {\n",
|
||||||
" GRAPH ?g {\n",
|
" GRAPH ?g {\n",
|
||||||
@@ -160,14 +165,12 @@
|
|||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "markdown",
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
"source": [
|
||||||
"You should see many graphs, with different triple counts.\n",
|
"You should see many graphs, with different triple counts.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The biggest one should be http://fuseki.cluster.gsi.dit.upm.es/synthetic"
|
"The biggest one should be http://fuseki.gsi.upm.es/synthetic"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -183,11 +186,11 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
|
"%%sparql http://fuseki.gsi.upm.es/hotels\n",
|
||||||
" \n",
|
" \n",
|
||||||
"SELECT *\n",
|
"SELECT *\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
" GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic>{\n",
|
" GRAPH <http://fuseki.gsi.upm.es/synthetic>{\n",
|
||||||
" ?s ?p ?o .\n",
|
" ?s ?p ?o .\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
"}\n",
|
"}\n",
|
||||||
@@ -233,13 +236,13 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
|
"%%sparql http://fuseki.gsi.upm.es/hotels\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX schema: <http://schema.org/>\n",
|
"PREFIX schema: <http://schema.org/>\n",
|
||||||
" \n",
|
" \n",
|
||||||
"SELECT ?s ?o\n",
|
"SELECT ?s ?o\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
" GRAPH <http://fuseki.cluster.gsi.dit.upm.es/35c20a49f8c6581be1cf7bd56d12d131>{\n",
|
" GRAPH <http://fuseki.gsi.upm.es/35c20a49f8c6581be1cf7bd56d12d131>{\n",
|
||||||
" ?s a ?o .\n",
|
" ?s a ?o .\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -264,11 +267,11 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
|
"%%sparql http://fuseki.gsi.upm.es/hotels\n",
|
||||||
" \n",
|
" \n",
|
||||||
"SELECT *\n",
|
"SELECT *\n",
|
||||||
"WHERE {\n",
|
"WHERE {\n",
|
||||||
" GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic>{\n",
|
" GRAPH <http://fuseki.gsi.upm.es/synthetic>{\n",
|
||||||
" ?s ?p ?o .\n",
|
" ?s ?p ?o .\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
"}\n",
|
"}\n",
|
||||||
@@ -295,7 +298,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
|
"%%sparql http://fuseki.gsi.upm.es/hotels\n",
|
||||||
"\n",
|
"\n",
|
||||||
"PREFIX schema: <http://schema.org/>\n",
|
"PREFIX schema: <http://schema.org/>\n",
|
||||||
" \n",
|
" \n",
|
||||||
@@ -308,7 +311,7 @@
|
|||||||
" SELECT ?g\n",
|
" SELECT ?g\n",
|
||||||
" WHERE {\n",
|
" WHERE {\n",
|
||||||
" GRAPH ?g {}\n",
|
" GRAPH ?g {}\n",
|
||||||
" FILTER (str(?g) != 'http://fuseki.cluster.gsi.dit.upm.es/synthetic')\n",
|
" FILTER (str(?g) != 'http://fuseki.gsi.upm.es/synthetic')\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -339,12 +342,13 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
"checksum": "860c3977cd06736f1342d535944dbb63",
|
"checksum": "860c3977cd06736f1342d535944dbb63",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-9bd08e4f5842cb89",
|
"grade_id": "cell-9bd08e4f5842cb89",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"points": 0,
|
"points": 0,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -366,12 +370,13 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
"checksum": "1946a7ed4aba8d168bb3fad898c05651",
|
"checksum": "1946a7ed4aba8d168bb3fad898c05651",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-9dc1c9033198bb18",
|
"grade_id": "cell-9dc1c9033198bb18",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"points": 0,
|
"points": 0,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -393,12 +398,13 @@
|
|||||||
"metadata": {
|
"metadata": {
|
||||||
"deletable": false,
|
"deletable": false,
|
||||||
"nbgrader": {
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
"checksum": "6714abc5226618b76dc4c1aaed6d1a49",
|
"checksum": "6714abc5226618b76dc4c1aaed6d1a49",
|
||||||
"grade": true,
|
"grade": true,
|
||||||
"grade_id": "cell-6c18003ced54be23",
|
"grade_id": "cell-6c18003ced54be23",
|
||||||
"locked": false,
|
"locked": false,
|
||||||
"points": 0,
|
"points": 0,
|
||||||
"schema_version": 1,
|
"schema_version": 3,
|
||||||
"solution": true
|
"solution": true
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
@@ -435,7 +441,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -449,7 +455,20 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.2"
|
"version": "3.8.10"
|
||||||
|
},
|
||||||
|
"toc": {
|
||||||
|
"base_numbering": 1,
|
||||||
|
"nav_menu": {},
|
||||||
|
"number_sections": true,
|
||||||
|
"sideBar": true,
|
||||||
|
"skip_h1_title": false,
|
||||||
|
"title_cell": "Table of Contents",
|
||||||
|
"title_sidebar": "Contents",
|
||||||
|
"toc_cell": false,
|
||||||
|
"toc_position": {},
|
||||||
|
"toc_section_display": true,
|
||||||
|
"toc_window_display": false
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
1429
lod/03_SPARQL_Writers.ipynb
Normal file
652
lod/04_SPARQL_Advanced.ipynb
Normal file
@@ -0,0 +1,652 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-0bfe38f97f6ab2d2",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"<header style=\"width:100%;position:relative\">\n",
|
||||||
|
" <div style=\"width:80%;float:right;\">\n",
|
||||||
|
" <h1>Course Notes for Learning Intelligent Systems</h1>\n",
|
||||||
|
" <h3>Department of Telematic Engineering Systems</h3>\n",
|
||||||
|
" <h5>Universidad Politécnica de Madrid</h5>\n",
|
||||||
|
" </div>\n",
|
||||||
|
" <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
|
||||||
|
"</header>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "bd478e6253226d24ba7f33cb9f6ba706",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-0cd673883ee592d1",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Advanced SPARQL\n",
|
||||||
|
"\n",
|
||||||
|
"This notebook complements [the SPARQL notebook](./01_SPARQL.ipynb) with some advanced commands.\n",
|
||||||
|
"\n",
|
||||||
|
"If you have not completed the exercises in the previous notebook, please do so before continuing.\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "9ea4fd529653214745b937d5fc4559e5",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-10264483046abcc4",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Objectives\n",
|
||||||
|
"\n",
|
||||||
|
"* To cover some SPARQL concepts that are less frequently used "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-4f8492996e74bf20",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Tools\n",
|
||||||
|
"\n",
|
||||||
|
"See [the SPARQL notebook](./01_SPARQL_Introduction.ipynb#Tools)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "c5f8646518bd832a47d71f9d3218237a",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-eb13908482825e42",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"Run this line to enable the `%%sparql` magic command."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from helpers import *"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Exercises"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Working with dates"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"To explore dates, we will focus on our Writers example.\n",
|
||||||
|
"\n",
|
||||||
|
"First, search for writers born in the XX century.\n",
|
||||||
|
"You can use a special filter, knowing that `\"2000\"^^xsd:date` is the first date of year 2000."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "1a23c8b9a53f7ae28f28b1c23b9706b5",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-ab7755944d46f9ca",
|
||||||
|
"locked": false,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": true
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
|
"PREFIX dct: <http://purl.org/dc/terms/>\n",
|
||||||
|
"PREFIX dbc: <http://dbpedia.org/resource/Category:>\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT ?escritor ?nombre (year(?fechaNac) as ?nac)\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?escritor dct:subject dbc:Spanish_novelists ;\n",
|
||||||
|
" rdfs:label ?nombre ;\n",
|
||||||
|
" dbo:birthDate ?fechaNac .\n",
|
||||||
|
" FILTER(lang(?nombre) = \"es\") .\n",
|
||||||
|
" # YOUR ANSWER HERE\n",
|
||||||
|
"}\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
"LIMIT 1000"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "e261d808f509c1e29227db94d1eab784",
|
||||||
|
"grade": true,
|
||||||
|
"grade_id": "cell-cf3821f2d33fb0f6",
|
||||||
|
"locked": true,
|
||||||
|
"points": 0,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"assert 'Ramiro Ledesma' in solution()['columns']['nombre']\n",
|
||||||
|
"assert 'Ray Loriga' in solution()['columns']['nombre']\n",
|
||||||
|
"assert all(int(x) > 1899 and int(x) < 2001 for x in solution()['columns']['nac'])"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now, get the list of Spanish novelists that are still alive.\n",
|
||||||
|
"\n",
|
||||||
|
"A person is alive if their death date is not defined and the were born less than 100 years ago.\n",
|
||||||
|
"\n",
|
||||||
|
"Remember, we can check whether the optional value for a key was bound in a SPARQL query using `BOUND(?key)`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "e4579d551790c33ba4662562c6a05d99",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-474b1a72dec6827c",
|
||||||
|
"locked": false,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": true
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
||||||
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
||||||
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac\n",
|
||||||
|
"\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?escritor dct:subject dbc:Spanish_novelists .\n",
|
||||||
|
" ?escritor rdfs:label ?nombre .\n",
|
||||||
|
" ?escritor dbo:birthDate ?fechaNac .\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
" FILTER(lang(?nombre) = \"es\") .\n",
|
||||||
|
"}\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
"LIMIT 1000"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "770bbddef5210c28486a1929e4513ada",
|
||||||
|
"grade": true,
|
||||||
|
"grade_id": "cell-46b62dd2856bc919",
|
||||||
|
"locked": true,
|
||||||
|
"points": 0,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"assert 'Fernando Arrabal' in solution()['columns']['nombre']\n",
|
||||||
|
"assert 'Albert Espinosa' in solution()['columns']['nombre']\n",
|
||||||
|
"for year in solution()['columns']['nac']:\n",
|
||||||
|
" assert int(year) >= 1918"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Working with badly formatted dates (OPTIONAL!)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now, get the list of Spanish novelists that died before their fifties (i.e. younger than 50 years old), or that aren't 50 years old yet.\n",
|
||||||
|
"\n",
|
||||||
|
"For the sake of simplicity, you can use the `year(<date>)` function.\n",
|
||||||
|
"\n",
|
||||||
|
"Hint: you can use boolean logic in your filters (e.g. `&&` and `||`).\n",
|
||||||
|
"\n",
|
||||||
|
"Hint 2: Some dates are not formatted properly, which makes some queries fail when they shouldn't. As a workaround, you could convert the date to string, and back to date again: `xsd:dateTime(str(?date))`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "e55173801ab36337ad356a1bc286dbd1",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-ceefd3c8fbd39d79",
|
||||||
|
"locked": false,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": true
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
||||||
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
||||||
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac, ?fechaDef\n",
|
||||||
|
"\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?escritor dct:subject dbc:Spanish_novelists .\n",
|
||||||
|
" ?escritor rdfs:label ?nombre .\n",
|
||||||
|
" ?escritor dbo:birthDate ?fechaNac .\n",
|
||||||
|
" # YOUR ANSWER HERE\n",
|
||||||
|
"}\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
"LIMIT 100"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "1b77cfaefb8b2ec286ce7b0c70804fe0",
|
||||||
|
"grade": true,
|
||||||
|
"grade_id": "cell-461cd6ccc6c2dc79",
|
||||||
|
"locked": true,
|
||||||
|
"points": 0,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"assert 'Javier Sierra' in solution()['columns']['nombre']\n",
|
||||||
|
"assert 'http://dbpedia.org/resource/José_Ángel_Mañas' in solution()['columns']['escritor']"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Regular expressions"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"[Regular expressions](https://www.w3.org/TR/rdf-sparql-query/#funcex-regex) are a very powerful tool, but we will only cover the basics in this exercise.\n",
|
||||||
|
"\n",
|
||||||
|
"In essence, regular expressions match strings against patterns.\n",
|
||||||
|
"In their simplest form, they can be used to find substrings within a variable.\n",
|
||||||
|
"For instance, using `regex(?label, \"substring\")` would only match if and only if the `?label` variable contains `substring`.\n",
|
||||||
|
"But regular expressions can be more complex than that.\n",
|
||||||
|
"For instance, we can find patterns such as: a 10 digit number, a 5 character long string, or variables without whitespaces.\n",
|
||||||
|
"\n",
|
||||||
|
"The syntax of the regex function is the following:\n",
|
||||||
|
"\n",
|
||||||
|
"```\n",
|
||||||
|
"regex(?variable, \"pattern\", \"flags\")\n",
|
||||||
|
"```\n",
|
||||||
|
"\n",
|
||||||
|
"Flags are optional configuration options for the regular expression, such as *do not care about case* (`i` flag).\n",
|
||||||
|
"\n",
|
||||||
|
"As an example, let us find the cities in Madrid that contain \"de\" in their name."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"SELECT ?localidad\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?localidad <http://dbpedia.org/ontology/isPartOf> <http://dbpedia.org/resource/Community_of_Madrid> .\n",
|
||||||
|
" ?localidad rdfs:label ?nombre .\n",
|
||||||
|
" FILTER (lang(?nombre) = \"es\" ).\n",
|
||||||
|
" FILTER regex(?nombre, \"de\", \"i\")\n",
|
||||||
|
"}\n",
|
||||||
|
"LIMIT 10"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now, use regular expressions to find Spanish novelists whose **first name** is Juan.\n",
|
||||||
|
"In other words, their name **starts with** \"Juan\"."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "b70a9a4f102c253e864d2e8aec79ce81",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-a57d3546a812f689",
|
||||||
|
"locked": false,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": true
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
||||||
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
||||||
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
||||||
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
||||||
|
"\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
"\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" {\n",
|
||||||
|
" ?escritor dct:subject dbc:Spanish_poets .\n",
|
||||||
|
" }\n",
|
||||||
|
" UNION {\n",
|
||||||
|
" ?escritor dct:subject dbc:Spanish_novelists .\n",
|
||||||
|
" }\n",
|
||||||
|
" ?escritor rdfs:label ?nombre\n",
|
||||||
|
" FILTER(lang(?nombre) = \"es\") .\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
"}\n",
|
||||||
|
"ORDER BY ?nombre\n",
|
||||||
|
"LIMIT 1000"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "66db9abddfafa91c2dc25577457f71fb",
|
||||||
|
"grade": true,
|
||||||
|
"grade_id": "cell-c149fe65008f39a9",
|
||||||
|
"locked": true,
|
||||||
|
"points": 0,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"assert len(solution()['columns']['nombre']) > 15\n",
|
||||||
|
"for i in solution()['columns']['nombre']:\n",
|
||||||
|
" assert 'Juan' in i\n",
|
||||||
|
"assert \"Robert Juan-Cantavella\" not in solution()['columns']['nombre']"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "1be6d6e4d8e74240ef07deffcbe5e71a",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-0c2f0113d97dc9de",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Group concat"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "c8dbb73a781bd24080804f289a1cea0b",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "asdasdasdddddddddddasdasdsad",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"Sometimes, it is useful to aggregate results from form different rows.\n",
|
||||||
|
"For instance, we might want to get a comma-separated list of the names in each each autonomous community in Spain.\n",
|
||||||
|
"\n",
|
||||||
|
"In those cases, we can use the `GROUP_CONCAT` function."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
||||||
|
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
||||||
|
" \n",
|
||||||
|
"SELECT ?com, GROUP_CONCAT(?name, \",\") as ?places # notice how we rename the variable\n",
|
||||||
|
"\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?com dct:subject dbc:Autonomous_communities_of_Spain .\n",
|
||||||
|
" ?localidad dbo:subdivision ?com ;\n",
|
||||||
|
" rdfs:label ?name .\n",
|
||||||
|
" FILTER (lang(?name)=\"es\")\n",
|
||||||
|
"}\n",
|
||||||
|
"\n",
|
||||||
|
"ORDER BY ?com\n",
|
||||||
|
"LIMIT 100"
|
||||||
|
]
|
||||||
|
},
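To make the aggregation more concrete, the sketch below mimics in plain Python what `GROUP_CONCAT` does conceptually: rows that share the same grouping key are collapsed into a single row whose value is the joined string. The sample rows and the helper name `group_concat` are hypothetical and are not taken from DBpedia. Note that a real endpoint does not guarantee any particular order of the concatenated values.

```python
from collections import defaultdict

# Hypothetical (community, place name) rows, shaped like the output of a
# query without GROUP_CONCAT: one row per place.
rows = [
    ("Community_of_Madrid", "Getafe"),
    ("Catalonia", "Girona"),
    ("Community_of_Madrid", "Leganés"),
]

def group_concat(rows, separator=","):
    """Collapse (key, value) rows into one joined string per key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: separator.join(values) for key, values in groups.items()}

places = group_concat(rows)
print(places["Community_of_Madrid"])  # Getafe,Leganés
```

Each key ends up with exactly one entry, which is why the result of such a query has one row per autonomous community instead of one row per place.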
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"editable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"checksum": "4779fb61645634308d0ed01e0c88e8a4",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "asdiopjasdoijasdoijasd",
|
||||||
|
"locked": true,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"Try it yourself, to get a list of works by each of the authors in this query:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"deletable": false,
|
||||||
|
"nbgrader": {
|
||||||
|
"cell_type": "code",
|
||||||
|
"checksum": "e5d87d1d8eba51c510241ba75981a597",
|
||||||
|
"grade": false,
|
||||||
|
"grade_id": "cell-2e3de17c75047652",
|
||||||
|
"locked": false,
|
||||||
|
"schema_version": 3,
|
||||||
|
"solution": true
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%sparql https://dbpedia.org/sparql\n",
|
||||||
|
"\n",
|
||||||
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
||||||
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
||||||
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
||||||
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
||||||
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
||||||
|
"\n",
|
||||||
|
"# YOUR ANSWER HERE\n",
|
||||||
|
"\n",
|
||||||
|
"WHERE {\n",
|
||||||
|
" ?escritor a dbo:Writer .\n",
|
||||||
|
" ?escritor rdfs:label ?nombre .\n",
|
||||||
|
" ?escritor dbo:birthDate ?fechaNac .\n",
|
||||||
|
" ?escritor dbo:birthPlace dbr:Madrid .\n",
|
||||||
|
" # YOUR ANSWER HERE\n",
|
||||||
|
" FILTER(lang(?nombre) = \"es\") .\n",
|
||||||
|
" FILTER(!bound(?titulo) || lang(?titulo) = \"en\") .\n",
|
||||||
|
"\n",
|
||||||
|
"}\n",
|
||||||
|
"ORDER BY ?nombre\n",
|
||||||
|
"LIMIT 100"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## References"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": []
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Licence\n",
|
||||||
|
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
||||||
|
"\n",
|
||||||
|
"© 2018 Universidad Politécnica de Madrid."
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.8.10"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 2
|
||||||
|
}
|
@@ -12,6 +12,7 @@ from urllib.request import Request, urlopen
|
|||||||
from urllib.parse import quote_plus, urlencode
|
from urllib.parse import quote_plus, urlencode
|
||||||
from urllib.error import HTTPError
|
from urllib.error import HTTPError
|
||||||
|
|
||||||
|
import ssl
|
||||||
import json
|
import json
|
||||||
import sys
|
import sys
|
||||||
|
|
||||||
@@ -20,7 +21,9 @@ display_javascript(js, raw=True)
|
|||||||
|
|
||||||
|
|
||||||
def send_query(query, endpoint):
|
def send_query(query, endpoint):
|
||||||
FORMATS = ",".join(["application/sparql-results+json", "text/javascript", "application/json"])
|
FORMATS = ",".join(["application/sparql-results+json",
|
||||||
|
"text/javascript",
|
||||||
|
"application/json"])
|
||||||
|
|
||||||
data = {'query': query}
|
data = {'query': query}
|
||||||
# b = quote_plus(query)
|
# b = quote_plus(query)
|
||||||
@@ -30,10 +33,18 @@ def send_query(query, endpoint):
|
|||||||
headers={'content-type': 'application/x-www-form-urlencoded',
|
headers={'content-type': 'application/x-www-form-urlencoded',
|
||||||
'accept': FORMATS},
|
'accept': FORMATS},
|
||||||
method='POST')
|
method='POST')
|
||||||
res = urlopen(r)
|
context = ssl.create_default_context()
|
||||||
|
context.check_hostname = False
|
||||||
|
context.verify_mode = ssl.CERT_NONE
|
||||||
|
|
||||||
|
res = urlopen(r, context=context, timeout=2)
|
||||||
data = res.read().decode('utf-8')
|
data = res.read().decode('utf-8')
|
||||||
if res.getcode() == 200:
|
if res.getcode() == 200:
|
||||||
|
try:
|
||||||
return json.loads(data)
|
return json.loads(data)
|
||||||
|
except Exception:
|
||||||
|
print('Got: ', data, file=sys.stderr)
|
||||||
|
raise
|
||||||
raise Exception('Error getting results: {}'.format(data))
|
raise Exception('Error getting results: {}'.format(data))
|
||||||
|
|
||||||
|
|
||||||
@@ -60,7 +71,7 @@ def solution():
|
|||||||
def query(query, endpoint=None, print_table=False):
|
def query(query, endpoint=None, print_table=False):
|
||||||
global LAST_QUERY
|
global LAST_QUERY
|
||||||
|
|
||||||
endpoint = endpoint or "http://fuseki.cluster.gsi.dit.upm.es/sitc/"
|
endpoint = endpoint or "http://fuseki.gsi.upm.es/sitc/"
|
||||||
results = send_query(query, endpoint)
|
results = send_query(query, endpoint)
|
||||||
tuples = to_table(results)
|
tuples = to_table(results)
|
||||||
|
|
||||||
|
0
lod/tests.py
Normal file
61
lod/tutorial/css/github.css
Normal file
@@ -0,0 +1,61 @@
|
|||||||
|
.highlight { padding-top: 0; margin: 0;}
|
||||||
|
.highlight .c { color: #999988; font-style: italic } /* Comment */
|
||||||
|
.highlight .err { color: #a61717; background-color: #e3d2d2 } /* Error */
|
||||||
|
.highlight .k { color: #000000; font-weight: bold } /* Keyword */
|
||||||
|
.highlight .o { color: #000000; font-weight: bold } /* Operator */
|
||||||
|
.highlight .cm { color: #999988; font-style: italic } /* Comment.Multiline */
|
||||||
|
.highlight .cp { color: #999999; font-weight: bold; font-style: italic } /* Comment.Preproc */
|
||||||
|
.highlight .c1 { color: #999988; font-style: italic } /* Comment.Single */
|
||||||
|
.highlight .cs { color: #999999; font-weight: bold; font-style: italic } /* Comment.Special */
|
||||||
|
.highlight .gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */
|
||||||
|
.highlight .ge { color: #000000; font-style: italic } /* Generic.Emph */
|
||||||
|
.highlight .gr { color: #aa0000 } /* Generic.Error */
|
||||||
|
.highlight .gh { color: #999999 } /* Generic.Heading */
|
||||||
|
.highlight .gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */
|
||||||
|
.highlight .go { color: #888888 } /* Generic.Output */
|
||||||
|
.highlight .gp { color: #555555 } /* Generic.Prompt */
|
||||||
|
.highlight .gs { font-weight: bold } /* Generic.Strong */
|
||||||
|
.highlight .gu { color: #aaaaaa } /* Generic.Subheading */
|
||||||
|
.highlight .gt { color: #aa0000 } /* Generic.Traceback */
|
||||||
|
.highlight .kc { color: #000000; font-weight: bold } /* Keyword.Constant */
|
||||||
|
.highlight .kd { color: #000000; font-weight: bold } /* Keyword.Declaration */
|
||||||
|
.highlight .kn { color: #000000; font-weight: bold } /* Keyword.Namespace */
|
||||||
|
.highlight .kp { color: #000000; font-weight: bold } /* Keyword.Pseudo */
|
||||||
|
.highlight .kr { color: #000000; font-weight: bold } /* Keyword.Reserved */
|
||||||
|
.highlight .kt { color: #445588; font-weight: bold } /* Keyword.Type */
|
||||||
|
.highlight .m { color: #009999 } /* Literal.Number */
|
||||||
|
.highlight .s { color: #d01040 } /* Literal.String */
|
||||||
|
.highlight .na { color: #008080 } /* Name.Attribute */
|
||||||
|
.highlight .nb { color: #0086B3 } /* Name.Builtin */
|
||||||
|
.highlight .nc { color: #445588; font-weight: bold } /* Name.Class */
|
||||||
|
.highlight .no { color: #008080 } /* Name.Constant */
|
||||||
|
.highlight .nd { color: #3c5d5d; font-weight: bold } /* Name.Decorator */
|
||||||
|
.highlight .ni { color: #800080 } /* Name.Entity */
|
||||||
|
.highlight .ne { color: #990000; font-weight: bold } /* Name.Exception */
|
||||||
|
.highlight .nf { color: #990000; font-weight: bold } /* Name.Function */
|
||||||
|
.highlight .nl { color: #990000; font-weight: bold } /* Name.Label */
|
||||||
|
.highlight .nn { color: #555555 } /* Name.Namespace */
|
||||||
|
.highlight .nt { color: #000080 } /* Name.Tag */
|
||||||
|
.highlight .nv { color: #008080 } /* Name.Variable */
|
||||||
|
.highlight .ow { color: #000000; font-weight: bold } /* Operator.Word */
|
||||||
|
.highlight .w { color: #bbbbbb } /* Text.Whitespace */
|
||||||
|
.highlight .mf { color: #009999 } /* Literal.Number.Float */
|
||||||
|
.highlight .mh { color: #009999 } /* Literal.Number.Hex */
|
||||||
|
.highlight .mi { color: #009999 } /* Literal.Number.Integer */
|
||||||
|
.highlight .mo { color: #009999 } /* Literal.Number.Oct */
|
||||||
|
.highlight .sb { color: #d01040 } /* Literal.String.Backtick */
|
||||||
|
.highlight .sc { color: #d01040 } /* Literal.String.Char */
|
||||||
|
.highlight .sd { color: #d01040 } /* Literal.String.Doc */
|
||||||
|
.highlight .s2 { color: #d01040 } /* Literal.String.Double */
|
||||||
|
.highlight .se { color: #d01040 } /* Literal.String.Escape */
|
||||||
|
.highlight .sh { color: #d01040 } /* Literal.String.Heredoc */
|
||||||
|
.highlight .si { color: #d01040 } /* Literal.String.Interpol */
|
||||||
|
.highlight .sx { color: #d01040 } /* Literal.String.Other */
|
||||||
|
.highlight .sr { color: #009926 } /* Literal.String.Regex */
|
||||||
|
.highlight .s1 { color: #d01040 } /* Literal.String.Single */
|
||||||
|
.highlight .ss { color: #990073 } /* Literal.String.Symbol */
|
||||||
|
.highlight .bp { color: #999999 } /* Name.Builtin.Pseudo */
|
||||||
|
.highlight .vc { color: #008080 } /* Name.Variable.Class */
|
||||||
|
.highlight .vg { color: #008080 } /* Name.Variable.Global */
|
||||||
|
.highlight .vi { color: #008080 } /* Name.Variable.Instance */
|
||||||
|
.highlight .il { color: #009999 } /* Literal.Number.Integer.Long */
|
968
lod/tutorial/css/style.css
Normal file
@@ -0,0 +1,968 @@
|
|||||||
|
@media screen {
|
||||||
|
|
||||||
|
body {
|
||||||
|
font-family: 'Quattrocento', Verdana, sans-serif;
|
||||||
|
font-size:16px;
|
||||||
|
background-color:#ffffff;
|
||||||
|
}
|
||||||
|
|
||||||
|
.container {
|
||||||
|
max-width: 48rem;
|
||||||
|
overflow: hidden;
|
||||||
|
text-overflow: ellipsis;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* =============================================================================
|
||||||
|
Helper classes
|
||||||
|
========================================================================== */
|
||||||
|
|
||||||
|
.noclear {
|
||||||
|
clear:none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.expanded {
|
||||||
|
max-width: 58rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.garnish {
|
||||||
|
width: 23%;
|
||||||
|
padding:0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.full-width {
|
||||||
|
width:80%;
|
||||||
|
margin: 0 auto;
|
||||||
|
text-align:center;
|
||||||
|
}
|
||||||
|
|
||||||
|
.float-right {
|
||||||
|
float:right;
|
||||||
|
margin-left: 1rem;
|
||||||
|
margin-bottom: 1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.float-left {
|
||||||
|
margin-right: 1rem;
|
||||||
|
margin-bottom: 1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* =============================================================================
|
||||||
|
Home Page
|
||||||
|
========================================================================== */
|
||||||
|
|
||||||
|
.home-block {
|
||||||
|
padding:3rem 0;
|
||||||
|
color:#666;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-block h2 {
|
||||||
|
margin:0;
|
||||||
|
font-size:2.8rem;
|
||||||
|
color:#333;
|
||||||
|
text-align:center;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-block p {
|
||||||
|
margin:0rem;
|
||||||
|
font-family:'Open Sans';
|
||||||
|
font-size:1.2rem;
|
||||||
|
padding-top:2rem;
|
||||||
|
text-align:justify;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-block a:visited {
|
||||||
|
color: #38c;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-stripe-1 {
|
||||||
|
color:#eee;
|
||||||
|
background:#27b;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-stripe-1 h2, .home-stripe-2 h2 {
|
||||||
|
color:#fff;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-stripe-1 a:visited, .home-stripe-1 a:link {
|
||||||
|
color:#6bf;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-stripe-2 {
|
||||||
|
color:#fff;
|
||||||
|
background:#289;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-stripe-2 a:visited, .home-stripe-2 a:link {
|
||||||
|
color:#6cd;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-image {
|
||||||
|
width: 75%;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-logo img {
|
||||||
|
width: 200px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-logo a h1 {
|
||||||
|
color: #fff;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-logo {
|
||||||
|
color: #fff;
|
||||||
|
}
|
||||||
|
|
||||||
|
.home-logo li {
|
||||||
|
font-size: 1.2rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.en-back {
|
||||||
|
background-color: #444444;
|
||||||
|
}
|
||||||
|
|
||||||
|
.es-back {
|
||||||
|
background-color: #535D7F;
|
||||||
|
}
|
||||||
|
|
||||||
|
.fr-back {
|
||||||
|
background-color: #3D7C81;
|
||||||
|
}
|
||||||
|
|
||||||
|
.pt-back {
|
||||||
|
background-color: #d6b664;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
.sitewide-alert {
|
||||||
|
position: relative;
|
||||||
|
margin-bottom: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* =============================================================================
|
||||||
|
Lesson Headers
|
||||||
|
========================================================================== */
|
||||||
|
|
||||||
|
header {
|
||||||
|
margin:-3rem 0 3rem 0;
|
||||||
|
padding:0;
|
||||||
|
font-family:'Roboto', sans-serif;
|
||||||
|
color:#ccc;
|
||||||
|
background: #efefef;
|
||||||
|
border-top:1px solid #333;
|
||||||
|
border-bottom:1px solid #333;
|
||||||
|
text-align:left;
|
||||||
|
}
|
||||||
|
|
||||||
|
header .container-fluid {
|
||||||
|
margin:0;
|
||||||
|
padding:1rem;
|
||||||
|
background: #f5f5f5;
|
||||||
|
}
|
||||||
|
|
||||||
|
header h1 {
|
||||||
|
margin:0;
|
||||||
|
padding:0;
|
||||||
|
font-size:1.8rem;
|
||||||
|
text-align:left;
|
||||||
|
}
|
||||||
|
|
||||||
|
header h2 {
|
||||||
|
font-family:'Roboto', sans-serif;
|
||||||
|
font-size:1.2rem;
|
||||||
|
color:#333;
|
||||||
|
margin: 1.5rem 0 1.5rem 0rem;
|
||||||
|
text-align:left;
|
||||||
|
}
|
||||||
|
|
||||||
|
header h3, header h4 {
|
||||||
|
font: .9rem/1.1rem 'Roboto Condensed', sans-serif;
|
||||||
|
text-transform:uppercase;
|
||||||
|
font-variant:small-caps;
|
||||||
|
letter-spacing:80%;
|
||||||
|
color:#666;
|
||||||
|
margin:.3rem 0 0 0;
|
||||||
|
padding:0;
|
||||||
|
}
|
||||||
|
|
||||||
|
header h4 {
|
||||||
|
display:inline;
|
||||||
|
margin:0;
|
||||||
|
line-height:1.3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
header .header-image {
|
||||||
|
float:left;
|
||||||
|
border:.2rem solid gray;
|
||||||
|
margin:0;
|
||||||
|
padding:0;
|
||||||
|
max-width: 200px;
|
||||||
|
}
|
||||||
|
|
||||||
|
header .header-abstract {
|
||||||
|
font: 1rem/1.4rem 'Roboto', sans-serif;
|
||||||
|
color:#666;
|
||||||
|
margin:1rem 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
header .header-helpers {
|
||||||
|
clear:both;
|
||||||
|
background:#ccc;
|
||||||
|
color:#fff;
|
||||||
|
border-top:1px solid #999;
|
||||||
|
border-bottom:1px solid #999;
|
||||||
|
}
|
||||||
|
|
||||||
|
header ul {
|
||||||
|
margin:0;
|
||||||
|
padding:0;
|
||||||
|
list-style-type: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
header li, header .metarow {
|
||||||
|
font: .9rem/1.1rem 'Roboto Condensed';
|
||||||
|
}
|
||||||
|
|
||||||
|
header .metarow {
|
||||||
|
color:#999;
|
||||||
|
}
|
||||||
|
|
||||||
|
header .peer-review, header .open-license {
|
||||||
|
font-size: 0.9rem;
|
||||||
|
color: #666;
|
||||||
|
margin: 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* =============================================================================
|
||||||
|
Lessons Index
|
||||||
|
========================================================================== */
|
||||||
|
|
||||||
|
/*****************
|
||||||
|
FILTER BUTTONS
|
||||||
|
******************/
|
||||||
|
ul.filter, ul.sort-by {
|
||||||
|
margin: 0 0 1rem 0;
|
||||||
|
padding: 0px;
|
||||||
|
text-align:center;
|
||||||
|
}
|
||||||
|
|
||||||
|
li.filter,
|
||||||
|
li.sort,
|
||||||
|
#filter-none {
|
||||||
|
font: .9rem/1.1rem 'Open Sans', sans-serif;
|
||||||
|
padding: .4rem .6rem;
|
||||||
|
border:none;
|
||||||
|
border-radius: 3px;
|
||||||
|
display:inline-block;
|
||||||
|
text-transform:uppercase;
|
||||||
|
text-decoration: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.filter li:hover,
|
||||||
|
.sort-by li:hover,
|
||||||
|
#filter-none:hover {
|
||||||
|
cursor: pointer;
|
||||||
|
}
|
||||||
|
|
||||||
|
.activities li.current:hover,
|
||||||
|
.filter li.current:hover,
|
||||||
|
.sort-by li.current:hover {
|
||||||
|
cursor:default;
|
||||||
|
}
|
||||||
|
|
||||||
|
.topic li a {
|
||||||
|
text-decoration: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.activities li {
|
||||||
|
background-color:#38c;
|
||||||
|
color:#fff;
|
||||||
|
}
|
||||||
|
|
||||||
|
.activities li:hover {
|
||||||
|
background-color:#16a;
|
||||||
|
}
|
||||||
|
|
||||||
|
.activities li.current {
|
||||||
|
background-color:#059;
|
||||||
|
}
|
||||||
|
|
||||||
|
.topics li {
|
||||||
|
background-color:#eee;
|
||||||
|
color: #38a;
|
||||||
|
}
|
||||||
|
|
||||||
|
.topics li:hover {
|
||||||
|
background-color:#ccc;
|
||||||
|
}
|
||||||
|
|
||||||
|
.topics li.current {
|
||||||
|
background-color:#aaa;
|
||||||
|
color: #333;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
#filter-none {
|
||||||
|
width:99.5%;
|
||||||
|
clear:both;
|
||||||
|
text-align:center;
|
||||||
|
margin-bottom:1rem;
|
||||||
|
background-color:#fefefe;
|
||||||
|
color:#666;
|
||||||
|
border:1px solid #999;
|
||||||
|
}
|
||||||
|
|
||||||
|
#filter-none:hover {
|
||||||
|
background-color:#ededed;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*****************
|
||||||
|
SEARCH
|
||||||
|
*****************/
|
||||||
|
|
||||||
|
.search-input {
|
||||||
|
width:55%;
|
||||||
|
clear:both;
|
||||||
|
margin-bottom:1rem;
|
||||||
|
background-color:#fefefe;
|
||||||
|
color:#666;
|
||||||
|
border:1px solid #999;
|
||||||
|
font: .9rem/1.1rem 'Open Sans',
|
||||||
|
sans-serif;
|
||||||
|
padding: .4rem .6rem;
|
||||||
|
border-radius: 3px;
|
||||||
|
display:inline-block;
|
||||||
|
text-transform:uppercase;
|
||||||
|
text-decoration: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
#search-button,
|
||||||
|
#enable-search-button {
|
||||||
|
background-color: #efefef;
|
||||||
|
color: rgb(153, 143, 143);
|
||||||
|
width: 35%;
|
||||||
|
font: .9rem/1.1rem 'Open Sans',
|
||||||
|
sans-serif;
|
||||||
|
padding: .4rem .6rem;
|
||||||
|
border: none;
|
||||||
|
border-radius: 3px;
|
||||||
|
display: inline-block;
|
||||||
|
text-transform: uppercase;
|
||||||
|
text-decoration: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
@media only screen and (max-width: 767px) {
|
||||||
|
/* phones */
|
||||||
|
#search-button,
|
||||||
|
#enable-search-button {
|
||||||
|
width: 80%;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
#search-info-button {
|
||||||
|
padding: 0.5rem;
|
||||||
|
color: rgb(153, 143, 143);
|
||||||
|
}
|
||||||
|
|
||||||
|
#search-info {
|
||||||
|
display: none;
|
||||||
|
height:0px;
|
||||||
|
background:#efefef;
|
||||||
|
overflow:hidden;
|
||||||
|
transition:0.5s;
|
||||||
|
-webkit-transition:0.5s;
|
||||||
|
width: 100%;
|
||||||
|
text-align: left;
|
||||||
|
box-sizing: border-box;
|
||||||
|
}
|
||||||
|
|
||||||
|
#search-info.visible {
|
||||||
|
display: block;
|
||||||
|
height: fit-content;
|
||||||
|
height: -moz-max-content;
|
||||||
|
padding: 10px;
|
||||||
|
margin-top: 10px;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*****************
|
||||||
|
SORT BUTTONS
|
||||||
|
*****************/
|
||||||
|
|
||||||
|
li.sort {
|
||||||
|
background-color: #efefef;
|
||||||
|
color:#666;
|
||||||
|
width:49.5%;
|
||||||
|
}
|
||||||
|
|
||||||
|
li.sort:hover {
|
||||||
|
text-decoration: none;
|
||||||
|
background-color:#cecece;
|
||||||
|
}
|
||||||
|
|
||||||
|
#current-sort {
|
||||||
|
font-size:75%;
|
||||||
|
}
|
||||||
|
|
||||||
|
.sort.my-desc:after, .sort-desc:after {
|
||||||
|
width: 0;
|
||||||
|
height: 0;
|
||||||
|
border-left: .4rem solid transparent;
|
||||||
|
border-right: .4rem solid transparent;
|
||||||
|
border-top: .4rem solid;
|
||||||
|
content:"";
|
||||||
|
position: relative;
|
||||||
|
top:.75rem;
|
||||||
|
right:-.3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.sort.my-asc:after, .sort-asc:after {
|
||||||
|
width: 0;
|
||||||
|
height: 0;
|
||||||
|
border-left: .4rem solid transparent;
|
||||||
|
border-right: .4rem solid transparent;
|
||||||
|
border-bottom: .4rem solid;
|
||||||
|
content:"";
|
||||||
|
position: relative;
|
||||||
|
bottom:.75rem;
|
||||||
|
right:-.3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.sort-desc:after {
|
||||||
|
top:1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.sort-asc:after {
|
||||||
|
bottom:1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*****************************
|
||||||
|
LESSON INDEX RESULTS LIST
|
||||||
|
*****************************/
|
||||||
|
|
||||||
|
h2.results-title {
|
||||||
|
margin:1rem 0;
|
||||||
|
font: 1.6rem/2rem 'Roboto Condensed';
|
||||||
|
color:#666;
|
||||||
|
text-transform:uppercase;
|
||||||
|
}
|
||||||
|
|
||||||
|
#results-value {
|
||||||
|
color:#000;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
#lesson-list .list ul {
|
||||||
|
margin:0;
|
||||||
|
padding:0;
|
||||||
|
}
|
||||||
|
|
||||||
|
#lesson-list .list li {
|
||||||
|
list-style-type:none;
|
||||||
|
margin:0;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
.lesson-description {
|
||||||
|
margin-bottom:2rem;
|
||||||
|
padding:0rem;
|
||||||
|
min-height:120px;
|
||||||
|
text-align:left;
|
||||||
|
}
|
||||||
|
|
||||||
|
.lesson-description img {
|
||||||
|
width:100%;
|
||||||
|
}
|
||||||
|
|
||||||
|
.lesson-image {
|
||||||
|
width:120px;
|
||||||
|
float:left;
|
||||||
|
margin-right:1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.above-title {
|
||||||
|
margin:0 0 .2rem 0;
|
||||||
|
font: .8rem/1rem 'Roboto Condensed';
|
||||||
|
color:#999;
|
||||||
|
text-transform:uppercase;
|
||||||
|
clear:none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.lesson-description h2.title {
|
||||||
|
font: 1.2rem/1.3rem 'Crete Round', serif;
|
||||||
|
margin:0 0 .8rem 0;
|
||||||
|
clear:none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.list .date,
|
||||||
|
.lesson-description .activity,
|
||||||
|
.lesson-description .topics,
|
||||||
|
.lesson-description .difficulty {
|
||||||
|
display: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
#pre-loader {
|
||||||
|
visibility: hidden;
|
||||||
|
display: flex;
|
||||||
|
justify-content: center;
|
||||||
|
align-items: center;
|
||||||
|
height: 100vh;
|
||||||
|
width: 100%;
|
||||||
|
position: fixed;
|
||||||
|
top: 0;
|
||||||
|
left: 0;
|
||||||
|
z-index: 9999;
|
||||||
|
transition: opacity 0.3s linear;
|
||||||
|
background: rgba(211, 211, 211, 0.8);
|
||||||
|
}
|
||||||
|
/* =============================================================================
|
||||||
|
Top Navigation Bar
|
||||||
|
========================================================================== */
|
||||||
|
|
||||||
|
.navbar {
|
||||||
|
padding: .6rem 1rem;
|
||||||
|
margin: 0 0 3rem 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.navbar-dark .navbar-nav .nav-link {
|
||||||
|
font-family:'Open Sans';
|
||||||
|
text-transform:uppercase;
|
||||||
|
color:#fff;
|
||||||
|
font-size:.9rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.btn-group > .btn-secondary {
|
||||||
|
border-color: #333333;
|
||||||
|
background-color: #888888;
|
||||||
|
}
|
||||||
|
|
||||||
|
.lang {
|
||||||
|
text-transform:lowercase !important;
|
||||||
|
}
|
||||||
|
|
||||||
|
.navbar-dark .navbar-nav .nav-link:hover, .navbar-dark .navbar-brand:hover {
|
||||||
|
color:#39a;
|
||||||
|
}
|
||||||
|
|
||||||
|
.navbar-toggler-icon {
|
||||||
|
background-image: url("data:image/svg+xml;charset=utf8,%3Csvg viewBox='0 0 32 32' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath stroke='rgba(255,255,255, 1)' stroke-width='2' stroke-linecap='round' stroke-miterlimit='10' d='M4 8h24M4 16h24M4 24h24'/%3E%3C/svg%3E");
|
||||||
|
}
|
||||||
|
|
||||||
|
.navbar-collapse {
|
||||||
|
text-align:center;
|
||||||
|
}
|
||||||
|
|
||||||
|
.navbar-dark .navbar-brand {
|
||||||
|
font-family:'Crete Round', serif;
|
||||||
|
color:#fff;
|
||||||
|
letter-spacing: .02em;
|
||||||
|
}
|
||||||
|
|
||||||
|
.btn-group > a.btn {
|
||||||
|
padding-left: 1rem;
|
||||||
|
padding-right: 1rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
a.dropdown-item {
|
||||||
|
border-bottom:1px solid #ccc;
|
||||||
|
font-family:'Roboto';
|
||||||
|
}
|
||||||
|
|
||||||
|
.dropdown-menu {
|
||||||
|
position: absolute;
|
||||||
|
background: #fff;
|
||||||
|
border: 1px solid #ccc;
|
||||||
|
margin:0;
|
||||||
|
padding:0;
|
||||||
|
}
|
||||||
|
|
||||||
|
.dropdown-menu a {
|
||||||
|
font-size:.8rem;
|
||||||
|
line-height:2rem;
|
||||||
|
text-transform:uppercase;
|
||||||
|
}
|
||||||
|
|
||||||
|
.dropdown-menu a:last-child {
|
||||||
|
border-bottom:none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.dropdown-menu:after, .dropdown-menu:before {
|
||||||
|
bottom: 100%;
|
||||||
|
left: 20%;
|
||||||
|
border: solid transparent;
|
||||||
|
content: " ";
|
||||||
|
height: 0;
|
||||||
|
width: 0;
|
||||||
|
position: absolute;
|
||||||
|
pointer-events: none;
|
||||||
|
}
|
||||||
|
|
||||||
|
.dropdown-menu:after {
|
||||||
|
border-color: rgba(255, 255, 255, 0);
|
||||||
|
border-bottom-color: #fff;
|
||||||
|
border-width: 12px;
|
||||||
|
margin-left: -12px;
|
||||||
|
}
|
||||||
|
.dropdown-menu:before {
|
||||||
|
border-color: rgba(51, 153, 170, 0);
|
||||||
|
border-bottom-color: #ccc;
|
||||||
|
border-width: 13px;
|
||||||
|
margin-left: -13px;
|
||||||
|
}
|
||||||
|
|
||||||
|
.navbar-dark .navbar-nav .nav-link:focus {
|
||||||
|
color: #ccc;
|
||||||
|
}
|
||||||
|
|
||||||
|
.header-link {
  position: absolute;
  right: 0.6em;
  opacity: 0;
  -webkit-transition: opacity 0.2s ease-in-out 0.1s;
  -moz-transition: opacity 0.2s ease-in-out 0.1s;
  -ms-transition: opacity 0.2s ease-in-out 0.1s;
  transition: opacity 0.2s ease-in-out 0.1s;
}

h2:hover .header-link,
h3:hover .header-link,
h4:hover .header-link,
h5:hover .header-link,
h6:hover .header-link {
  opacity: 1;
}

/* =============================================================================
   Lesson Typography
   ========================================================================== */

a {text-decoration:none;}

a:link {color: #38c;}
a:visited {color: #39a;}
a:hover {color: #555;}
a:active {color: #555;}

b, strong { font-weight: bold; }

blockquote { margin: 1em 2em; padding: 0 1em 0 1em; font-style: italic; border:1px solid #666; background: #eeeeee;}

hr {
  display: block; height: 1px; border: 0; border-top: 1px solid #ccc; margin: 2em 0; padding: 0; }

img {
  max-width:100%;
}

ins { background: #ff9; color: #000; text-decoration: none; }


h1,h2,h3,h4,h5 {
  font-family:'Crete Round', serif;
  font-weight:normal;
  clear:both;
}


h1 {
  font-size:2rem;
  margin-bottom:1.5rem;
  letter-spacing:-.03rem;
  text-align:center;
}

h2 {
  font-size:1.6rem;
  margin-top:3rem;
  letter-spacing:-.02rem;
}

h3 {
  font-size:1.4rem;
  margin-top:2.5rem;
}

h4 {
  font-size:1.2rem;
  margin-top:1.8rem;
}

h5 {
  font-size:1.0rem;
  margin-top:1.4rem;
}

h1 a, h2 a, h3 a, h4 a, h5 a {
  text-decoration:none;
}

h1 a:link { color: #38c; }
h1 a:visited {color: #39a; }


/* select button generated by codeblocks.js */
.fa-align-left {opacity: 0.2;}
.highlight:hover .fa-align-left {opacity: 1;}

q { quotes: none; }
q:before, q:after { content: ""; content: none; }

small { font-size: 85%; }

/* Position subscript and superscript content without affecting line-height: h5bp.com/k */
sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
sup { top: -0.5em; }
sub { bottom: -0.25em; }

li {
  margin-bottom:.5rem;
  line-height:1.4rem;
}

li.nav-item {
  margin-bottom:0;
}

.alert {
  font-family: 'Roboto';
}

.alert h2, .alert h3, .alert h4 {
  margin-top:0;
}


/* =============================================================================
   Code Highlighting
   ========================================================================== */

code {
  font-family: monospace, serif;
  font-size:.9rem;
}

.highlight {
  margin: 1rem 0 1rem 0;
  padding:.5rem .2rem;
  font-size:.9rem;
  white-space: pre;
  word-wrap: normal;
  overflow: auto;
  border: 1px solid #eee;
  background: #fafafa;
}


/* =============================================================================
   Figures
   ========================================================================== */

figure {
  margin: 0 auto .5rem;
  text-align: center;
  display:table;
}

figcaption {
  margin-top:.5rem;
  font-family:'Open Sans';
  font-size:0.8em;
  color: #666;
  display:block;
  caption-side: bottom;
}

.author-info, .citation-info {
  border-top:1px solid #333;
  padding-top:1rem;
  margin-top:2rem;
}

.author-name, .suggested-citation-header {
  font-family:'Roboto Condensed';
  font-weight: 600;
  font-size:1.2rem;
  color: #666;
  text-transform:uppercase;
}

.author-description p, .suggested-citation-text p {
  font-size:0.9rem;
  font-family:'Open Sans';
  color: #666;
}

/* =============================================================================
   Tables
   ========================================================================== */

table {
  width: 100%;
  margin-bottom: 1em;
}

th, td {
  padding: 10px;
  text-align: left;
  border-bottom: 1px solid #ddd;
}

thead {
  background-color: #535353;
  color: #fff;
  font-weight: bold;
}

tr:nth-child(even) {background-color: #f2f2f2}

/* =============================================================================
   Blog Index and Layout
   ========================================================================== */

.blog-header {
  text-align:center;
}

.blog-header h2 {
  margin:0;
  line-height: 2rem;
}

.blog-header h3 { /*author*/
  margin-top:.4rem;
  color: #666;
  font-size:1rem;
}

.blog-header h4{
  color: #999;
  font-size:1rem;
  margin-bottom:.2rem;
  font-family:'Roboto Condensed';
  text-transform:uppercase;
}

.blog-header figure {
  max-width:80%;
}

.blog-header figcaption {
  text-align: center;
}

.blog-page-header {
  margin-bottom:3rem;
}

/* =============================================================================
   Project Team
   ========================================================================== */

.contact-box {
  margin-bottom:3rem;
}

/* =============================================================================
   Footer
   ========================================================================== */

footer[role="contentinfo"] {
  margin-top: 2rem;
  padding: 2rem 0;
  font-family:'Open Sans';
  font-size:.9rem;
  color: #fff;
  background-color:#666;
  text-align:center;
}

footer a, footer a:link, footer a:visited {
  color: #fff;
  border-bottom:1px #eee dotted;
}

footer a:hover {
  text-decoration: none;
  border-bottom:1px #fff solid;
}

footer .fa {
  margin: 0 .2rem 0rem 0rem;
}

.footer-head {
  font-size:1.1rem;
  line-height:1.4rem;
  margin-bottom:1rem;
}


} /* end screen */

@media only screen and (max-width: 768px) {
  .garnish {
    display:none;
  }
  .dropdown-menu:after, .dropdown-menu:before {
    display:none;
  }
}

/* Print Styling */

@media screen {
  /* Class to hide elements only shown when printing */
  .hide-print {
    display: none !important;
  }
}

@media print {
  * { background: transparent !important; color: black !important; box-shadow:none !important; text-shadow: none !important; filter:none !important; -ms-filter: none !important; } /* Black prints faster: h5bp.com/s */
  a, a:visited { text-decoration: underline; }
  a[href]:after { content: " (" attr(href) ")"; }
  abbr[title]:after { content: " (" attr(title) ")"; }
  a[href^="javascript:"]:after, a[href^="#"]:after { content: ""; } /* Don't show links for images, or javascript/internal links */
  pre, blockquote {
    border: 1px solid #999;
    page-break-inside: avoid;
    margin: 0.5cm;
    padding: 0.5cm
  }
  thead { display: table-header-group; } /* h5bp.com/t */
  tr, img { page-break-inside: avoid; }
  img { max-width: 100% !important; }
  @page {
    margin: 1.5cm;
  }

  body { font-size: 0.85rem;}
  p, h2, h3 { orphans: 3; widows: 3; }
  h1, h2, h3 { page-break-after: avoid; }
  h1 { font-size: 1.4rem; }
  h2 { font-size: 1.1rem; }
  h3 { font-size: 1rem; }
  h4 { font-size: 0.9rem; }
  .header-bottom {
    margin-bottom: 2rem;
    page-break-after: always;
  }
  .hide-screen {
    /* Hide elements that only appear on screen */
    display: none !important;
  }

  .print-header {
    /* format navbar for print */
    display: block;
    z-index:1030;
    width: 100%;
    height: 3rem;
    padding: .6rem 1rem;
    margin-bottom: 1rem;
    color:#fff;
    white-space: nowrap;
    font-family: 'Crete Round', serif;
    border-bottom: 1px solid lightgrey;
  }
}
1527
lod/tutorial/en/lessons/retired/graph-databases-and-SPARQL.html
Normal file
BIN
lod/tutorial/gallery/graph-databases-and-SPARQL.png
Normal file
Size: 41 KiB
1527
lod/tutorial/graph-databases-and-SPARQL.html
Normal file
17
lod/tutorial/images/ORCIDiD_iconvector.svg
Normal file
@@ -0,0 +1,17 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 19.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
  viewBox="0 0 256 256" style="enable-background:new 0 0 256 256;" xml:space="preserve">
<style type="text/css">
  .st0{fill:#A6CE39;}
  .st1{fill:#FFFFFF;}
</style>
<path class="st0" d="M256,128c0,70.7-57.3,128-128,128C57.3,256,0,198.7,0,128C0,57.3,57.3,0,128,0C198.7,0,256,57.3,256,128z"/>
<g>
  <path class="st1" d="M86.3,186.2H70.9V79.1h15.4v48.4V186.2z"/>
  <path class="st1" d="M108.9,79.1h41.6c39.6,0,57,28.3,57,53.6c0,27.5-21.5,53.6-56.8,53.6h-41.8V79.1z M124.3,172.4h24.5
    c34.9,0,42.9-26.5,42.9-39.7c0-21.5-13.7-39.7-43.7-39.7h-23.7V172.4z"/>
  <path class="st1" d="M88.7,56.8c0,5.5-4.5,10.1-10.1,10.1c-5.6,0-10.1-4.6-10.1-10.1c0-5.6,4.5-10.1,10.1-10.1
    C84.2,46.7,88.7,51.3,88.7,56.8z"/>
</g>
</svg>
Size: 983 B
BIN
lod/tutorial/images/doi_icon.jpg
Normal file
Size: 65 KiB
BIN
lod/tutorial/images/favicons/en_favicon.ico
Normal file
Size: 318 B
BIN
lod/tutorial/images/favicons/es_favicon.ico
Normal file
Size: 318 B
BIN
lod/tutorial/images/graph-databases-and-SPARQL.png
Normal file
Size: 41 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-01.png
Normal file
Size: 19 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-02.png
Normal file
Size: 9.6 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-03.png
Normal file
Size: 21 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-04.png
Normal file
Size: 22 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-05.png
Normal file
Size: 34 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-06.png
Normal file
Size: 190 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-07.png
Normal file
Size: 46 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-08.png
Normal file
Size: 46 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-09.png
Normal file
Size: 112 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-10.png
Normal file
Size: 31 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-11.png
Normal file
Size: 132 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-12.png
Normal file
Size: 58 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-13.png
Normal file
Size: 27 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-14.png
Normal file
Size: 97 KiB
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-15.png
Normal file
Size: 351 KiB
@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
 "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: %3 Pages: 1 -->
<svg width="262pt" height="124pt"
 viewBox="0.00 0.00 261.53 124.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 120)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-120 257.533,-120 257.533,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="gray" cx="49.3505" cy="-98" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="49.3505" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- oil -->
<g id="node3" class="node"><title>oil</title>
<ellipse fill="none" stroke="gray" cx="117.35" cy="-18" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="117.35" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw->oil -->
<g id="edge1" class="edge"><title>nw->oil</title>
<path fill="none" stroke="gray" d="M63.7715,-80.4582C73.3018,-69.5265 85.9453,-55.0236 96.5567,-42.8517"/>
<polygon fill="gray" stroke="gray" points="99.2502,-45.0882 103.183,-35.2505 93.9738,-40.4882 99.2502,-45.0882"/>
<text text-anchor="middle" x="108.138" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- wb -->
<g id="node2" class="node"><title>wb</title>
<ellipse fill="none" stroke="gray" cx="185.35" cy="-98" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="185.35" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb->oil -->
<g id="edge2" class="edge"><title>wb->oil</title>
<path fill="none" stroke="gray" d="M170.595,-80.0752C161.138,-69.2266 148.718,-54.9801 138.252,-42.9755"/>
<polygon fill="gray" stroke="gray" points="140.589,-40.3299 131.38,-35.0922 135.313,-44.9299 140.589,-40.3299"/>
<text text-anchor="middle" x="176.138" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>
Size: 2.2 KiB
104
lod/tutorial/images/graph-databases-and-SPARQL/sparql01.svg
Normal file
@@ -0,0 +1,104 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
 "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: %3 Pages: 1 -->
<svg width="438pt" height="212pt"
 viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-186" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- rr -->
<g id="node2" class="node"><title>rr</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-98" rx="60.0217" ry="18"/>
<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
</g>
<!-- nw->rr -->
<g id="edge1" class="edge"><title>nw->rr</title>
<path fill="none" stroke="gray" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
<polygon fill="gray" stroke="gray" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- oil -->
<g id="node5" class="node"><title>oil</title>
<ellipse fill="none" stroke="gray" cx="253" cy="-98" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw->oil -->
<g id="edge3" class="edge"><title>nw->oil</title>
<path fill="none" stroke="gray" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
<polygon fill="gray" stroke="gray" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- nwd -->
<g id="node7" class="node"><title>nwd</title>
<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
</g>
<!-- nw->nwd -->
<g id="edge2" class="edge"><title>nw->nwd</title>
<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
</g>
<!-- d -->
<g id="node6" class="node"><title>d</title>
<ellipse fill="none" stroke="gray" cx="263" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
</g>
<!-- rr->d -->
<g id="edge5" class="edge"><title>rr->d</title>
<path fill="none" stroke="gray" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
<polygon fill="gray" stroke="gray" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- rrb -->
<g id="node8" class="node"><title>rrb</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
</g>
<!-- rr->rrb -->
<g id="edge4" class="edge"><title>rr->rrb</title>
<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
</g>
<!-- jv -->
<g id="node3" class="node"><title>jv</title>
<ellipse fill="none" stroke="gray" cx="372" cy="-98" rx="57.9076" ry="18"/>
<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
</g>
<!-- jv->d -->
<g id="edge6" class="edge"><title>jv->d</title>
<path fill="none" stroke="gray" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
<polygon fill="gray" stroke="gray" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- wb -->
<g id="node4" class="node"><title>wb</title>
<ellipse fill="none" stroke="gray" cx="277" cy="-186" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb->jv -->
<g id="edge7" class="edge"><title>wb->jv</title>
<path fill="none" stroke="gray" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
<polygon fill="gray" stroke="gray" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- wb->oil -->
<g id="edge8" class="edge"><title>wb->oil</title>
<path fill="none" stroke="gray" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
<polygon fill="gray" stroke="gray" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>
Size: 6.2 KiB
104
lod/tutorial/images/graph-databases-and-SPARQL/sparql02-1.svg
Normal file
@@ -0,0 +1,104 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
 "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: %3 Pages: 1 -->
<svg width="438pt" height="212pt"
 viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="red" cx="132" cy="-186" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- rr -->
<g id="node2" class="node"><title>rr</title>
<ellipse fill="none" stroke="red" cx="132" cy="-98" rx="60.0217" ry="18"/>
<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
</g>
<!-- nw->rr -->
<g id="edge1" class="edge"><title>nw->rr</title>
<path fill="none" stroke="orange" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
<polygon fill="orange" stroke="orange" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- oil -->
<g id="node5" class="node"><title>oil</title>
<ellipse fill="none" stroke="gray" cx="253" cy="-98" rx="42.8742" ry="18"/>
<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
</g>
<!-- nw->oil -->
<g id="edge3" class="edge"><title>nw->oil</title>
<path fill="none" stroke="gray" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
<polygon fill="gray" stroke="gray" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
<!-- nwd -->
<g id="node7" class="node"><title>nwd</title>
<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
</g>
<!-- nw->nwd -->
<g id="edge2" class="edge"><title>nw->nwd</title>
<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
</g>
<!-- d -->
<g id="node6" class="node"><title>d</title>
<ellipse fill="none" stroke="orange" cx="263" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
</g>
<!-- rr->d -->
<g id="edge5" class="edge"><title>rr->d</title>
<path fill="none" stroke="orange" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
<polygon fill="orange" stroke="orange" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- rrb -->
<g id="node8" class="node"><title>rrb</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
</g>
<!-- rr->rrb -->
<g id="edge4" class="edge"><title>rr->rrb</title>
<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
</g>
<!-- jv -->
<g id="node3" class="node"><title>jv</title>
<ellipse fill="none" stroke="red" cx="372" cy="-98" rx="57.9076" ry="18"/>
<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
</g>
<!-- jv->d -->
<g id="edge6" class="edge"><title>jv->d</title>
<path fill="none" stroke="orange" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
<polygon fill="orange" stroke="orange" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
</g>
<!-- wb -->
<g id="node4" class="node"><title>wb</title>
<ellipse fill="none" stroke="red" cx="277" cy="-186" rx="68.3645" ry="18"/>
<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
</g>
<!-- wb->jv -->
<g id="edge7" class="edge"><title>wb->jv</title>
<path fill="none" stroke="orange" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
<polygon fill="orange" stroke="orange" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- wb->oil -->
<g id="edge8" class="edge"><title>wb->oil</title>
<path fill="none" stroke="gray" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
<polygon fill="gray" stroke="gray" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
</g>
</g>
</svg>
Size: 6.2 KiB
104
lod/tutorial/images/graph-databases-and-SPARQL/sparql02.svg
Normal file
@@ -0,0 +1,104 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
 "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: %3 Pages: 1 -->
<svg width="438pt" height="212pt"
 viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
<!-- nw -->
<g id="node1" class="node"><title>nw</title>
<ellipse fill="none" stroke="red" cx="132" cy="-186" rx="49.2014" ry="18"/>
<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
</g>
<!-- rr -->
<g id="node2" class="node"><title>rr</title>
<ellipse fill="none" stroke="gray" cx="132" cy="-98" rx="60.0217" ry="18"/>
<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
</g>
<!-- nw->rr -->
<g id="edge1" class="edge"><title>nw->rr</title>
<path fill="none" stroke="gray" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
<polygon fill="gray" stroke="gray" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
</g>
<!-- oil -->
<g id="node5" class="node"><title>oil</title>
<ellipse fill="none" stroke="orange" cx="253" cy="-98" rx="42.8742" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
|
||||||
|
</g>
|
||||||
|
<!-- nw->oil -->
|
||||||
|
<g id="edge3" class="edge"><title>nw->oil</title>
|
||||||
|
<path fill="none" stroke="orange" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
|
||||||
|
<polygon fill="orange" stroke="orange" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
|
||||||
|
<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
|
||||||
|
</g>
|
||||||
|
<!-- nwd -->
|
||||||
|
<g id="node7" class="node"><title>nwd</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
|
||||||
|
</g>
|
||||||
|
<!-- nw->nwd -->
|
||||||
|
<g id="edge2" class="edge"><title>nw->nwd</title>
|
||||||
|
<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
|
||||||
|
<text text-anchor="middle" x="105.455" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
|
||||||
|
<text text-anchor="middle" x="105.455" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
|
||||||
|
</g>
|
||||||
|
<!-- d -->
|
||||||
|
<g id="node6" class="node"><title>d</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="263" cy="-18" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
|
||||||
|
</g>
|
||||||
|
<!-- rr->d -->
|
||||||
|
<g id="edge5" class="edge"><title>rr->d</title>
|
||||||
|
<path fill="none" stroke="gray" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
|
||||||
|
<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
|
||||||
|
</g>
|
||||||
|
<!-- rrb -->
|
||||||
|
<g id="node8" class="node"><title>rrb</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
|
||||||
|
</g>
|
||||||
|
<!-- rr->rrb -->
|
||||||
|
<g id="edge4" class="edge"><title>rr->rrb</title>
|
||||||
|
<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
|
||||||
|
<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
|
||||||
|
</g>
|
||||||
|
<!-- jv -->
|
||||||
|
<g id="node3" class="node"><title>jv</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="372" cy="-98" rx="57.9076" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
|
||||||
|
</g>
|
||||||
|
<!-- jv->d -->
|
||||||
|
<g id="edge6" class="edge"><title>jv->d</title>
|
||||||
|
<path fill="none" stroke="gray" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
|
||||||
|
<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
|
||||||
|
</g>
|
||||||
|
<!-- wb -->
|
||||||
|
<g id="node4" class="node"><title>wb</title>
|
||||||
|
<ellipse fill="none" stroke="red" cx="277" cy="-186" rx="68.3645" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
|
||||||
|
</g>
|
||||||
|
<!-- wb->jv -->
|
||||||
|
<g id="edge7" class="edge"><title>wb->jv</title>
|
||||||
|
<path fill="none" stroke="gray" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
|
||||||
|
<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
|
||||||
|
<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
|
||||||
|
</g>
|
||||||
|
<!-- wb->oil -->
|
||||||
|
<g id="edge8" class="edge"><title>wb->oil</title>
|
||||||
|
<path fill="none" stroke="orange" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
|
||||||
|
<polygon fill="orange" stroke="orange" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
|
||||||
|
<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
|
||||||
|
</g>
|
||||||
|
</g>
|
||||||
|
</svg>
|
After Width: | Height: | Size: 6.2 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql03.png
Normal file
After Width: | Height: | Size: 34 KiB |
127
lod/tutorial/images/graph-databases-and-SPARQL/sparql04-1.svg
Normal file
@@ -0,0 +1,127 @@
|
|||||||
|
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||||||
|
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
|
||||||
|
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
|
||||||
|
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
|
||||||
|
-->
|
||||||
|
<!-- Title: %3 Pages: 1 -->
|
||||||
|
<svg width="636pt" height="364pt"
|
||||||
|
viewBox="0.00 0.00 636.30 364.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
|
||||||
|
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 360)">
|
||||||
|
<title>%3</title>
|
||||||
|
<polygon fill="white" stroke="none" points="-4,4 -4,-360 632.301,-360 632.301,4 -4,4"/>
|
||||||
|
<!-- o -->
|
||||||
|
<g id="node1" class="node"><title>o</title>
|
||||||
|
<ellipse fill="none" stroke="red" cx="274.27" cy="-338" rx="53.3595" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="274.27" y="-335" font-family="Helvetica,sans-Serif" font-size="10.00">object/PPA82633</text>
|
||||||
|
</g>
|
||||||
|
<!-- th1 -->
|
||||||
|
<g id="node2" class="node"><title>th1</title>
|
||||||
|
<ellipse fill="none" stroke="red" cx="40.2702" cy="-258" rx="40.0417" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="40.2702" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">thes:x11409</text>
|
||||||
|
</g>
|
||||||
|
<!-- o->th1 -->
|
||||||
|
<g id="edge1" class="edge"><title>o->th1</title>
|
||||||
|
<path fill="none" stroke="red" d="M224.341,-331.342C190.446,-326.351 145.117,-317.397 107.454,-302 93.6639,-296.363 79.5997,-287.87 67.927,-279.917"/>
|
||||||
|
<polygon fill="red" stroke="red" points="69.704,-276.888 59.5111,-273.997 65.6765,-282.613 69.704,-276.888"/>
|
||||||
|
<text text-anchor="middle" x="147.178" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P45_consists_of</text>
|
||||||
|
</g>
|
||||||
|
<!-- dep -->
|
||||||
|
<g id="node4" class="node"><title>dep</title>
|
||||||
|
<ellipse fill="none" stroke="red" cx="172.27" cy="-258" rx="74.1479" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="172.27" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">person-institution/147800</text>
|
||||||
|
</g>
|
||||||
|
<!-- o->dep -->
|
||||||
|
<g id="edge3" class="edge"><title>o->dep</title>
|
||||||
|
<path fill="none" stroke="red" d="M235.65,-325.351C222.058,-319.869 207.403,-312.234 196.239,-302 191.195,-297.376 186.961,-291.439 183.525,-285.462"/>
|
||||||
|
<polygon fill="red" stroke="red" points="186.46,-283.516 178.779,-276.219 180.234,-286.714 186.46,-283.516"/>
|
||||||
|
<text text-anchor="middle" x="228.286" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P62_depicts</text>
|
||||||
|
</g>
|
||||||
|
<!-- etc -->
|
||||||
|
<g id="node6" class="node"><title>etc</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="274.27" cy="-18" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="274.27" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">etc...</text>
|
||||||
|
</g>
|
||||||
|
<!-- o->etc -->
|
||||||
|
<g id="edge10" class="edge"><title>o->etc</title>
|
||||||
|
<path fill="none" stroke="gray" d="M274.27,-319.958C274.27,-304.156 274.27,-279.99 274.27,-259 274.27,-259 274.27,-259 274.27,-97 274.27,-80.1099 274.27,-61.1626 274.27,-46.172"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="277.77,-46.0417 274.27,-36.0418 270.77,-46.0418 277.77,-46.0417"/>
|
||||||
|
</g>
|
||||||
|
<!-- own -->
|
||||||
|
<g id="node7" class="node"><title>own</title>
|
||||||
|
<ellipse fill="none" stroke="red" cx="395.27" cy="-258" rx="93.1176" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="395.27" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">thesIdentifier:the-british-museum</text>
|
||||||
|
</g>
|
||||||
|
<!-- o->own -->
|
||||||
|
<g id="edge5" class="edge"><title>o->own</title>
|
||||||
|
<path fill="none" stroke="red" d="M297.887,-321.776C315.86,-310.19 340.856,-294.077 361.023,-281.077"/>
|
||||||
|
<polygon fill="red" stroke="red" points="363.112,-283.894 369.621,-275.534 359.319,-278.011 363.112,-283.894"/>
|
||||||
|
<text text-anchor="middle" x="392.854" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P54_has_current_owner</text>
|
||||||
|
</g>
|
||||||
|
<!-- con -->
|
||||||
|
<g id="node8" class="node"><title>con</title>
|
||||||
|
<ellipse fill="none" stroke="red" cx="524.27" cy="-178" rx="80.1403" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="524.27" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">object/PPA82633/concept/1</text>
|
||||||
|
</g>
|
||||||
|
<!-- o->con -->
|
||||||
|
<g id="edge6" class="edge"><title>o->con</title>
|
||||||
|
<path fill="none" stroke="red" d="M322.863,-330.474C381.749,-321.472 475.782,-303.207 497.27,-276 513.047,-256.024 519.612,-227.389 522.339,-206.409"/>
|
||||||
|
<polygon fill="red" stroke="red" points="525.845,-206.541 523.445,-196.222 518.886,-205.786 525.845,-206.541"/>
|
||||||
|
<text text-anchor="middle" x="548.839" y="-255.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P128_carries</text>
|
||||||
|
</g>
|
||||||
|
<!-- th1lab -->
|
||||||
|
<g id="node3" class="node"><title>th1lab</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="40.2702" cy="-178" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="40.2702" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">paper</text>
|
||||||
|
</g>
|
||||||
|
<!-- th1->th1lab -->
|
||||||
|
<g id="edge2" class="edge"><title>th1->th1lab</title>
|
||||||
|
<path fill="none" stroke="gray" d="M40.2702,-239.689C40.2702,-229.894 40.2702,-217.422 40.2702,-206.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="43.7703,-206.262 40.2702,-196.262 36.7703,-206.262 43.7703,-206.262"/>
|
||||||
|
<text text-anchor="middle" x="66.2858" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
|
||||||
|
</g>
|
||||||
|
<!-- deplab -->
|
||||||
|
<g id="node5" class="node"><title>deplab</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="172.27" cy="-178" rx="66.8537" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="172.27" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">Julius Caesar Scaliger</text>
|
||||||
|
</g>
|
||||||
|
<!-- dep->deplab -->
|
||||||
|
<g id="edge4" class="edge"><title>dep->deplab</title>
|
||||||
|
<path fill="none" stroke="gray" d="M172.27,-239.689C172.27,-229.894 172.27,-217.422 172.27,-206.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="175.77,-206.262 172.27,-196.262 168.77,-206.262 175.77,-206.262"/>
|
||||||
|
<text text-anchor="middle" x="198.286" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
|
||||||
|
</g>
|
||||||
|
<!-- contype -->
|
||||||
|
<g id="node9" class="node"><title>contype</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="431.27" cy="-98" rx="85.8678" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="431.27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">ecrm:E73_Information_Object</text>
|
||||||
|
</g>
|
||||||
|
<!-- con->contype -->
|
||||||
|
<g id="edge7" class="edge"><title>con->contype</title>
|
||||||
|
<path fill="none" stroke="gray" d="M504.547,-160.458C491.321,-149.365 473.71,-134.595 459.068,-122.314"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="461.196,-119.531 451.285,-115.787 456.698,-124.895 461.196,-119.531"/>
|
||||||
|
<text text-anchor="middle" x="494.61" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">rdf:type</text>
|
||||||
|
</g>
|
||||||
|
<!-- concon -->
|
||||||
|
<g id="node10" class="node"><title>concon</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="576.27" cy="-98" rx="40.8927" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="576.27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">thes:x12440</text>
|
||||||
|
</g>
|
||||||
|
<!-- con->concon -->
|
||||||
|
<g id="edge8" class="edge"><title>con->concon</title>
|
||||||
|
<path fill="none" stroke="gray" d="M535.553,-160.075C542.599,-149.507 551.795,-135.713 559.664,-123.91"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="562.73,-125.619 565.365,-115.357 556.906,-121.736 562.73,-125.619"/>
|
||||||
|
<text text-anchor="middle" x="588.96" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P129_is_about</text>
|
||||||
|
</g>
|
||||||
|
<!-- conlab -->
|
||||||
|
<g id="node11" class="node"><title>conlab</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="576.27" cy="-18" rx="33.894" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="576.27" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">academic</text>
|
||||||
|
</g>
|
||||||
|
<!-- concon->conlab -->
|
||||||
|
<g id="edge9" class="edge"><title>concon->conlab</title>
|
||||||
|
<path fill="none" stroke="gray" d="M576.27,-79.6893C576.27,-69.8938 576.27,-57.4218 576.27,-46.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="579.77,-46.2623 576.27,-36.2623 572.77,-46.2624 579.77,-46.2623"/>
|
||||||
|
<text text-anchor="middle" x="602.286" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
|
||||||
|
</g>
|
||||||
|
</g>
|
||||||
|
</svg>
|
After Width: | Height: | Size: 7.8 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql04.png
Normal file
After Width: | Height: | Size: 190 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql05.png
Normal file
After Width: | Height: | Size: 46 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql06.png
Normal file
After Width: | Height: | Size: 112 KiB |
114
lod/tutorial/images/graph-databases-and-SPARQL/sparql07.svg
Normal file
@@ -0,0 +1,114 @@
|
|||||||
|
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||||||
|
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
|
||||||
|
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
|
||||||
|
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
|
||||||
|
-->
|
||||||
|
<!-- Title: %3 Pages: 1 -->
|
||||||
|
<svg width="367pt" height="364pt"
|
||||||
|
viewBox="0.00 0.00 367.21 364.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
|
||||||
|
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 360)">
|
||||||
|
<title>%3</title>
|
||||||
|
<polygon fill="white" stroke="none" points="-4,4 -4,-360 363.215,-360 363.215,4 -4,4"/>
|
||||||
|
<!-- obj -->
|
||||||
|
<g id="node1" class="node"><title>obj</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="148.735" cy="-338" rx="148.97" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="148.735" y="-335" font-family="Helvetica,sans-Serif" font-size="10.00">http://collection.britishmuseum.org/id/object/PPA82633</text>
|
||||||
|
</g>
|
||||||
|
<!-- object_type -->
|
||||||
|
<g id="node2" class="node"><title>object_type</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="51.7348" cy="-258" rx="38.5366" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="51.7348" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">object_type</text>
|
||||||
|
</g>
|
||||||
|
<!-- obj->object_type -->
|
||||||
|
<g id="edge1" class="edge"><title>obj->object_type</title>
|
||||||
|
<path fill="none" stroke="gray" d="M98.3951,-320.855C88.4182,-315.964 78.6502,-309.764 70.9106,-302 66.4004,-297.476 62.8568,-291.706 60.1099,-285.873"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="63.1969,-284.17 56.2095,-276.205 56.7054,-286.789 63.1969,-284.17"/>
|
||||||
|
<text text-anchor="middle" x="108.647" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">bmo:PX_object_type</text>
|
||||||
|
</g>
|
||||||
|
<!-- production -->
|
||||||
|
<g id="node4" class="node"><title>production</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="148.735" cy="-258" rx="36.3999" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="148.735" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">production</text>
|
||||||
|
</g>
|
||||||
|
<!-- obj->production -->
|
||||||
|
<g id="edge3" class="edge"><title>obj->production</title>
|
||||||
|
<path fill="none" stroke="gray" d="M148.735,-319.689C148.735,-309.894 148.735,-297.422 148.735,-286.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="152.235,-286.262 148.735,-276.262 145.235,-286.262 152.235,-286.262"/>
|
||||||
|
<text text-anchor="middle" x="203.657" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P108i_was_produced_by</text>
|
||||||
|
</g>
|
||||||
|
<!-- other -->
|
||||||
|
<g id="node8" class="node"><title>other</title>
|
||||||
|
<text text-anchor="middle" x="281.735" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">Other top-level object attributes</text>
|
||||||
|
</g>
|
||||||
|
<!-- obj->other -->
|
||||||
|
<g id="edge7" class="edge"><title>obj->other</title>
|
||||||
|
<path fill="none" stroke="gray" d="M223.709,-322.337C236.677,-317.399 249.318,-310.804 259.735,-302 264.947,-297.595 269.068,-291.646 272.265,-285.579"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="275.598,-286.706 276.554,-276.154 269.227,-283.807 275.598,-286.706"/>
|
||||||
|
<text text-anchor="middle" x="271.069" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
|
||||||
|
</g>
|
||||||
|
<!-- print -->
|
||||||
|
<g id="node3" class="node"><title>print</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="51.7348" cy="-178" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="51.7348" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">print</text>
|
||||||
|
</g>
|
||||||
|
<!-- object_type->print -->
|
||||||
|
<g id="edge2" class="edge"><title>object_type->print</title>
|
||||||
|
<path fill="none" stroke="gray" d="M51.7348,-239.689C51.7348,-229.894 51.7348,-217.422 51.7348,-206.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="55.2349,-206.262 51.7348,-196.262 48.2349,-206.262 55.2349,-206.262"/>
|
||||||
|
<text text-anchor="middle" x="77.7504" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
|
||||||
|
</g>
|
||||||
|
<!-- date -->
|
||||||
|
<g id="node5" class="node"><title>date</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="134.735" cy="-178" rx="27" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="134.735" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">date</text>
|
||||||
|
</g>
|
||||||
|
<!-- production->date -->
|
||||||
|
<g id="edge4" class="edge"><title>production->date</title>
|
||||||
|
<path fill="none" stroke="gray" d="M143.042,-240.075C141.328,-234.389 139.605,-227.974 138.481,-222 137.543,-217.015 136.839,-211.66 136.311,-206.48"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="139.778,-205.937 135.456,-196.264 132.803,-206.521 139.778,-205.937"/>
|
||||||
|
<text text-anchor="middle" x="175.862" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P9_consists_of</text>
|
||||||
|
</g>
|
||||||
|
<!-- other_prod -->
|
||||||
|
<g id="node9" class="node"><title>other_prod</title>
|
||||||
|
<text text-anchor="middle" x="234.735" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">Other production info</text>
|
||||||
|
</g>
|
||||||
|
<!-- production->other_prod -->
|
||||||
|
<g id="edge8" class="edge"><title>production->other_prod</title>
|
||||||
|
<path fill="none" stroke="gray" d="M176.351,-246.343C188.633,-240.562 202.579,-232.439 212.735,-222 217.34,-217.267 221.193,-211.376 224.322,-205.489"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="227.508,-206.94 228.646,-196.406 221.187,-203.931 227.508,-206.94"/>
|
||||||
|
<text text-anchor="middle" x="222.069" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
|
||||||
|
</g>
|
||||||
|
<!-- timespan -->
|
||||||
|
<g id="node6" class="node"><title>timespan</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="134.735" cy="-98" rx="32.8294" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="134.735" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">timespan</text>
|
||||||
|
</g>
|
||||||
|
<!-- date->timespan -->
|
||||||
|
<g id="edge5" class="edge"><title>date->timespan</title>
|
||||||
|
<path fill="none" stroke="gray" d="M134.735,-159.689C134.735,-149.894 134.735,-137.422 134.735,-126.335"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="138.235,-126.262 134.735,-116.262 131.235,-126.262 138.235,-126.262"/>
|
||||||
|
<text text-anchor="middle" x="178.088" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P4_has_time-span</text>
|
||||||
|
</g>
|
||||||
|
<!-- start_date -->
|
||||||
|
<g id="node7" class="node"><title>start_date</title>
|
||||||
|
<ellipse fill="none" stroke="gray" cx="66.7348" cy="-18" rx="34.828" ry="18"/>
|
||||||
|
<text text-anchor="middle" x="66.7348" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">start_date</text>
|
||||||
|
</g>
|
||||||
|
<!-- timespan->start_date -->
|
||||||
|
<g id="edge6" class="edge"><title>timespan->start_date</title>
|
||||||
|
<path fill="none" stroke="gray" d="M105.598,-89.5265C91.4682,-84.285 75.76,-75.6942 67.3129,-62 64.4467,-57.3534 63.1708,-51.8529 62.8105,-46.3654"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="66.3171,-46.1414 63.0686,-36.0569 59.3193,-45.9661 66.3171,-46.1414"/>
|
||||||
|
<text text-anchor="middle" x="124.446" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P82a_begin_of_the_begin</text>
|
||||||
|
</g>
|
||||||
|
<!-- other_date -->
|
||||||
|
<g id="node10" class="node"><title>other_date</title>
|
||||||
|
<text text-anchor="middle" x="202.735" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">End of begin, Start of end...</text>
|
||||||
|
</g>
|
||||||
|
<!-- timespan->other_date -->
|
||||||
|
<g id="edge9" class="edge"><title>timespan->other_date</title>
|
||||||
|
<path fill="none" stroke="gray" d="M155.643,-84.0943C164.163,-78.1192 173.652,-70.4665 180.735,-62 184.802,-57.138 188.386,-51.398 191.417,-45.7224"/>
|
||||||
|
<polygon fill="gray" stroke="gray" points="194.728,-46.9184 195.983,-36.3981 188.442,-43.8394 194.728,-46.9184"/>
|
||||||
|
<text text-anchor="middle" x="190.069" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
|
||||||
|
</g>
|
||||||
|
</g>
|
||||||
|
</svg>
|
After Width: | Height: | Size: 7.3 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql08.png
Normal file
After Width: | Height: | Size: 132 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql09-1.png
Normal file
After Width: | Height: | Size: 27 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql09.png
Normal file
After Width: | Height: | Size: 58 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql10.png
Normal file
After Width: | Height: | Size: 97 KiB |
BIN
lod/tutorial/images/graph-databases-and-SPARQL/sparql11.png
Normal file
After Width: | Height: | Size: 351 KiB |
34
lod/tutorial/js/bootstrap-4-navbar.js
vendored
Normal file
@@ -0,0 +1,34 @@

/*!
 * Bootstrap 4 multi dropdown navbar ( https://bootstrapthemes.co/demo/resource/bootstrap-4-multi-dropdown-navbar/ )
 * Copyright 2017.
 * Licensed under the GPL license
 */


$( document ).ready( function () {
    $( '.mobile-drop a.dropdown-toggle' ).on( 'click', function ( e ) {
        var $el = $( this );
        var $parent = $( this ).offsetParent( ".mobile-drop" );
        if ($('.show.mobile-drop').length > 0){
            $('.show.mobile-drop').each(function(item){
                $(this).toggleClass('show');
            });
        }

        var $subMenu = $( this ).next( ".mobile-drop" );
        $subMenu.toggleClass( 'show' );

        $( this ).parent( "li" ).toggleClass( 'show' );

        $( this ).parents( 'li.nav-item.dropdown.mobile-drop.show' ).on( 'click', function ( e ) {
            $( '.mobile-drop .show' ).removeClass( "show" );
        } );

        if ( !$parent.parent().hasClass( 'navbar-nav' ) ) {
            $el.next().css( { "top": $el[0].offsetTop, "left": $parent.outerWidth() - 4 } );
        }

        return false;
    } );
} );
8
lod/tutorial/js/ext_links.js
Normal file
@@ -0,0 +1,8 @@
|
$(document).ready(function() {
  $('a').each(function() {
    var a = new RegExp('/' + window.location.host + '/');
    if (!a.test(this.href)) {
      $(this).attr("target","_blank");
    }
  });
});
13
lod/tutorial/js/header_links.js
Normal file
@@ -0,0 +1,13 @@
|
// http://ben.balter.com/2014/03/13/pages-anchor-links/

$(function() {
  return $("h2, h3, h4, h5, h6").each(function(i, el) {
    var $el, icon, id;
    $el = $(el);
    id = $el.attr('id');
    icon = '<i class="fa fa-link" style="font-size: 0.8em"></i>';
    if (id) {
      return $el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon));
    }
  });
});
1377
lod/tutorial/sparql-datos-abiertos-enlazados.html
Normal file
4112
lod/tutorial/sparql-intro.ipynb
Normal file
2513
lod/tutorial/sparql-vanGogh.ipynb
Normal file
53
lod/upload.sh
Normal file
@@ -0,0 +1,53 @@
#!/bin/sh

# This is a bit messy
# Both arguments are optional: <endpoint> first, then <graph_base_uri>.
if [ "$#" -lt 1 ]; then
    graph_base="http://example.com/sitc/submission"
    endpoint="http://fuseki.gsi.upm.es/hotels/data"
elif [ "$#" -lt 2 ]; then
    endpoint=$1
    graph_base="http://example.com/sitc"
elif [ "$#" -lt 3 ]; then
    endpoint=$1
    graph_base=$2
else
    echo "Usage: $0 [<endpoint>] [<graph_base_uri>]"
    echo
    exit 1
fi


# Upload one Turtle file ($2) into the named graph $graph_base/$1.
upload(){
    name=$1
    file=$2
    echo '###'
    echo "Uploading: $file"
    echo "Graph: $graph_base/$name"
    echo "Endpoint: $endpoint"
    curl -X POST \
        --digest -u "admin:$PASSWORD" \
        -H Content-Type:text/turtle \
        -T "$file" \
        --data-urlencode "graph=$graph_base/$name" \
        -G "$endpoint"

}


total=0
printf 'Password: '
# "read -s" is not available in POSIX sh, so terminal echo is disabled with stty instead
stty -echo
read -r PASSWORD
stty echo
echo

echo "Uploading synthetic"
upload "synthetic" synthetic/reviews.ttl || exit 1

for i in *.ttl; do
    identifier=$(echo "${i%.ttl}" | md5sum | awk '{print $1}')
    echo "Uploading $i"
    upload "$identifier" "$i"
    total=$((total + 1))
done
echo "Uploaded $total"
@@ -71,8 +71,7 @@
 "source": [
 "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
 "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n",
-"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
-"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+"* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
 ]
 },
 {
@@ -88,7 +87,7 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
@@ -102,7 +101,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.7"
+"version": "3.8.12"
 },
 "latex_envs": {
 "LaTeX_envs_menu_present": true,
@@ -63,9 +63,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
|
"* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
|
||||||
"* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n",
|
"* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n"
|
||||||
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
|
|
||||||
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -81,7 +79,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -95,7 +93,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.7"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -87,10 +87,10 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Scikit-learn provides algorithms for solving the following problems:\n",
|
"Scikit-learn provides algorithms for solving the following problems:\n",
|
||||||
"* **Classification**: Identifying the category to which an object belongs. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, kNN, ...), SVM, Random forest, Perceptron, etc. \n",
|
"* **Classification**: Identifying the category to which an object belongs. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, C4.5, ...), kNN, SVM, Random forest, Perceptron, etc. \n",
|
||||||
"* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n",
|
"* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n",
|
||||||
"* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n",
|
"* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n",
|
||||||
"* ** Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
|
"* **Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
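The four problem families listed in this hunk all share the same estimator interface. Below is a minimal sketch, assuming scikit-learn is installed and using the bundled Iris data; the choice of estimators (`KNeighborsClassifier`, `KMeans`, `PCA`) and their parameters is illustrative and not taken from the notebook. Regression is omitted because Iris has no continuous target.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Iris: 150 samples, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Classification: learn from labelled samples, then predict labels
clf = KNeighborsClassifier(n_neighbors=15).fit(X, y)
labels = clf.predict(X[:5])

# Clustering: group samples without using the labels
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
clusters = km.labels_

# Dimensionality reduction: project the 4 features onto 2 components
X_2d = PCA(n_components=2).fit_transform(X)

print(labels.shape, clusters.shape, X_2d.shape)
```

Every estimator is constructed with its hyperparameters and then fitted with `fit`; what differs is whether labels are passed and which method (`predict`, `labels_`, `fit_transform`) exposes the result.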
|
@@ -36,7 +36,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"The goal of this notebook is to learn how to read and load a sample dataset.\n",
|
"The goal of this notebook is to learn how to read and load a sample dataset.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Scikit-learn comes with some bundled [datasets](http://scikit-learn.org/stable/datasets/): iris, digits, boston, etc.\n",
|
"Scikit-learn comes with some bundled [datasets](https://scikit-learn.org/stable/datasets.html): iris, digits, boston, etc.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"In this notebook we are going to use the Iris dataset."
|
"In this notebook we are going to use the Iris dataset."
|
||||||
]
|
]
|
||||||
@@ -54,7 +54,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"The [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), available at [UCI dataset repository](https://archive.ics.uci.edu/ml/datasets/Iris), is a classic dataset for classification.\n",
|
"The [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), available at [UCI dataset repository](https://archive.ics.uci.edu/ml/datasets/Iris), is a classic dataset for classification.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features.\n",
|
"The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, a machine learning model will learn to differentiate the species of Iris.\n",
|
||||||
"\n",
|
"\n",
|
||||||
""
|
""
|
||||||
]
|
]
|
||||||
@@ -63,7 +63,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"In ordert to read the dataset, we import the datasets bundle and then load the Iris dataset. "
|
"In order to read the dataset, we import the datasets bundle and then load the Iris dataset. "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
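The loading step described in this notebook ("import the datasets bundle and then load the Iris dataset") can be sketched as follows. This is a minimal example assuming scikit-learn is installed; the variable names `X` and `y` are illustrative and may differ from the notebook's own.

```python
from sklearn import datasets

# Load the bundled Iris dataset (shipped with scikit-learn, no download)
iris = datasets.load_iris()

X = iris.data    # feature matrix: one row per flower, one column per measurement
y = iris.target  # integer class label for each flower

print(iris.feature_names)  # sepal/petal length and width, in cm
print(iris.target_names)   # the three species
print(X.shape, y.shape)
```

The returned object behaves like a dictionary, so `iris['data']` and `iris.data` refer to the same array.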
|
@@ -228,7 +228,6 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
|
"* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
|
||||||
"* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
|
"* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
|
||||||
"* [Mastering Pandas](http://proquest.safaribooksonline.com/book/programming/python/9781783981960), Femi Anthony, Packt Publishing, 2015.\n",
|
|
||||||
"* [Matplotlib web page](http://matplotlib.org/index.html)\n",
|
"* [Matplotlib web page](http://matplotlib.org/index.html)\n",
|
||||||
"* [Using matplotlib in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
|
"* [Using matplotlib in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
|
||||||
"* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n",
|
"* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n",
|
||||||
|
@@ -163,7 +163,6 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
|
"* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
|
||||||
"* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
|
"* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
|
||||||
"* [Mastering Pandas](http://proquest.safaribooksonline.com/book/programming/python/9781783981960), Femi Anthony, Packt Publishing, 2015.\n",
|
|
||||||
"* [Matplotlib web page](http://matplotlib.org/index.html)\n",
|
"* [Matplotlib web page](http://matplotlib.org/index.html)\n",
|
||||||
"* [Using matplotlib in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
|
"* [Using matplotlib in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
|
||||||
"* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)"
|
"* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)"
|
||||||
|
@@ -154,7 +154,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [General concepts of machine learning with scikit-learn](http://www.astroml.org/sklearn_tutorial/general_concepts.html)\n",
|
"* [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/index.html)\n",
|
||||||
"* [A Tour of Machine Learning Algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/)"
|
"* [A Tour of Machine Learning Algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -177,7 +177,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -191,7 +191,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.5.6"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -130,12 +130,7 @@
|
|||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=3,\n",
|
"DecisionTreeClassifier(max_depth=3, random_state=1)"
|
||||||
" max_features=None, max_leaf_nodes=None,\n",
|
|
||||||
" min_impurity_decrease=0.0, min_impurity_split=None,\n",
|
|
||||||
" min_samples_leaf=1, min_samples_split=2,\n",
|
|
||||||
" min_weight_fraction_leaf=0.0, presort=False, random_state=1,\n",
|
|
||||||
" splitter='best')"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 2,
|
"execution_count": 2,
|
||||||
@@ -277,20 +272,23 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"ename": "ModuleNotFoundError",
|
"ename": "InvocationException",
|
||||||
"evalue": "No module named 'pydotplus'",
|
"evalue": "GraphViz's executables not found",
|
||||||
"output_type": "error",
|
"output_type": "error",
|
||||||
"traceback": [
|
"traceback": [
|
||||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||||
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
|
"\u001b[0;31mInvocationException\u001b[0m Traceback (most recent call last)",
|
||||||
"\u001b[0;32m<ipython-input-7-1bf5ec7fb043>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mIPython\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdisplay\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mImage\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0msklearn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexternals\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msix\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mpydotplus\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpydot\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mdot_data\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
"\u001b[0;32m/tmp/ipykernel_47326/3723147494.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 12\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0mgraph\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpydot\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraph_from_dot_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdot_data\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgetvalue\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 14\u001b[0;31m \u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite_png\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'iris-tree.png'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 15\u001b[0m \u001b[0mImage\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate_png\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||||
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pydotplus'"
|
"\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36m<lambda>\u001b[0;34m(path, f, prog)\u001b[0m\n\u001b[1;32m 1808\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mpath\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1809\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mfrmt\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1810\u001b[0;31m \u001b[0mprog\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprog\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1811\u001b[0m )\n\u001b[1;32m 1812\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||||
|
"\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36mwrite\u001b[0;34m(self, path, prog, format)\u001b[0m\n\u001b[1;32m 1916\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1917\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1918\u001b[0;31m \u001b[0mfobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1919\u001b[0m \u001b[0;32mfinally\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1920\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mclose\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||||
|
"\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36mcreate\u001b[0;34m(self, prog, format)\u001b[0m\n\u001b[1;32m 1957\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprogs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfind_graphviz\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1958\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprogs\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1959\u001b[0;31m raise InvocationException(\n\u001b[0m\u001b[1;32m 1960\u001b[0m 'GraphViz\\'s executables not found')\n\u001b[1;32m 1961\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||||
|
"\u001b[0;31mInvocationException\u001b[0m: GraphViz's executables not found"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"from IPython.display import Image \n",
|
"from IPython.display import Image \n",
|
||||||
"from sklearn.externals.six import StringIO\n",
|
"from six import StringIO\n",
|
||||||
"import pydotplus as pydot\n",
|
"import pydotplus as pydot\n",
|
||||||
"\n",
|
"\n",
|
||||||
"dot_data = StringIO() \n",
|
"dot_data = StringIO() \n",
|
||||||
@@ -510,10 +508,8 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n",
|
"* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
|
||||||
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
|
"* [Parameter estimation using grid search with cross-validation](https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)\n",
|
||||||
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
|
|
||||||
"* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
|
|
||||||
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
|
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -529,8 +525,17 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -544,7 +549,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
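The traceback recorded in this notebook ends with `GraphViz's executables not found`, which means `pydotplus` is installed but the GraphViz binaries are not. One way to inspect the tree without GraphViz is the text export in `sklearn.tree`. The sketch below is an alternative to the notebook's `pydotplus` cell, not what the notebook does; it assumes scikit-learn 0.21 or later and refits a tree with the `max_depth=3, random_state=1` settings shown in the cell output above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Same hyperparameters as the classifier shown in the notebook output
model = DecisionTreeClassifier(max_depth=3, random_state=1)
model.fit(iris.data, iris.target)

# Render the learned splits as indented plain text (no GraphViz required)
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```

If a rendered image is still wanted, installing the GraphViz system package (not only the Python bindings) is what the original cell needs.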
|
@@ -39,7 +39,7 @@
|
|||||||
"* [Train classifier](#Train-classifier)\n",
|
"* [Train classifier](#Train-classifier)\n",
|
||||||
"* [More about Pipelines](#More-about-Pipelines)\n",
|
"* [More about Pipelines](#More-about-Pipelines)\n",
|
||||||
"* [Tuning the algorithm](#Tuning-the-algorithm)\n",
|
"* [Tuning the algorithm](#Tuning-the-algorithm)\n",
|
||||||
"\t* [Grid Search for Parameter optimization](#Grid-Search-for-Parameter-optimization)\n",
|
"\t* [Grid Search for Hyperparameter optimization](#Grid-Search-for-Hyperparameter-optimization)\n",
|
||||||
"* [Evaluating the algorithm](#Evaluating-the-algorithm)\n",
|
"* [Evaluating the algorithm](#Evaluating-the-algorithm)\n",
|
||||||
"\t* [K-Fold validation](#K-Fold-validation)\n",
|
"\t* [K-Fold validation](#K-Fold-validation)\n",
|
||||||
"* [References](#References)\n"
|
"* [References](#References)\n"
|
||||||
@@ -56,9 +56,9 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 0.947. Could we get a better accuracy if we tune the parameters of the estimator?\n",
|
"In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 0.947. Could we get a better accuracy if we tune the hyperparameters of the estimator?\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The goal of this notebook is to learn how to tune an algorithm by optimizing its parameters using grid search."
|
"The goal of this notebook is to learn how to tune an algorithm by optimizing its hyperparameters using grid search."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -300,21 +300,21 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"You can try different values for these parameters and observe the results."
|
"You can try different values for these hyperparameters and observe the results."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Grid Search for Parameter optimization"
|
"### Grid Search for Hyperparameter optimization"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Manually changing the parameters to find their optimal values is not practical. Instead, we can treat finding the optimal values of the parameters as an *optimization problem*. \n",
|
"Manually changing the hyperparameters to find their optimal values is not practical. Instead, we can treat finding the optimal values of the hyperparameters as an *optimization problem*. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"Scikit-learn comes with several optimization techniques for this purpose, such as **grid search** and **randomized search**. In this notebook we are going to introduce the former."
|
"Scikit-learn comes with several optimization techniques for this purpose, such as **grid search** and **randomized search**. In this notebook we are going to introduce the former."
|
||||||
]
|
]
|
||||||
@@ -323,7 +323,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Scikit-learn provides an object (`GridSearchCV`) that, given data, fits an estimator for each combination in a parameter grid, scores each fit, and chooses the parameters that maximize the cross-validation score. "
|
"Scikit-learn provides an object (`GridSearchCV`) that, given data, fits an estimator for each combination in a hyperparameter grid, scores each fit, and chooses the hyperparameters that maximize the cross-validation score. "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -371,7 +371,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"We can now evaluate the KFold with this optimized parameter as follows."
|
"We can now evaluate the KFold with this optimized hyperparameter as follows."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -405,7 +405,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"We have got an *improvement* from 0.947 to 0.953 with k-fold.\n",
|
"We have got an *improvement* from 0.947 to 0.953 with k-fold.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"We will now search for the best combination of the parameters of the algorithm. This can take some time to compute."
|
"We will now search for the best combination of the hyperparameters of the algorithm. This can take some time to compute."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -414,12 +414,12 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Set the parameters by cross-validation\n",
|
"# Set the hyperparameters by cross-validation\n",
|
||||||
"\n",
|
"\n",
|
||||||
"from sklearn.metrics import classification_report\n",
|
"from sklearn.metrics import classification_report, recall_score, precision_score, make_scorer\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# set of parameters to test\n",
|
"# set of hyperparameters to test\n",
|
||||||
"tuned_parameters = [{'max_depth': np.arange(3, 10),\n",
|
"tuned_hyperparameters = [{'max_depth': np.arange(3, 10),\n",
|
||||||
"# 'max_weights': [1, 10, 100, 1000]},\n",
|
"# 'max_weights': [1, 10, 100, 1000]},\n",
|
||||||
" 'criterion': ['gini', 'entropy'], \n",
|
" 'criterion': ['gini', 'entropy'], \n",
|
||||||
" 'splitter': ['best', 'random'],\n",
|
" 'splitter': ['best', 'random'],\n",
|
||||||
@@ -431,14 +431,19 @@
|
|||||||
"scores = ['precision', 'recall']\n",
|
"scores = ['precision', 'recall']\n",
|
||||||
"\n",
|
"\n",
|
||||||
"for score in scores:\n",
|
"for score in scores:\n",
|
||||||
" print(\"# Tuning hyper-parameters for %s\" % score)\n",
|
" print(\"# Tuning hyperparameters for %s\" % score)\n",
|
||||||
" print()\n",
|
" print()\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
" if score == 'precision':\n",
|
||||||
|
" scorer = make_scorer(precision_score, average='weighted', zero_division=0)\n",
|
||||||
|
" elif score == 'recall':\n",
|
||||||
|
" scorer = make_scorer(recall_score, average='weighted', zero_division=0)\n",
|
||||||
|
" \n",
|
||||||
" # cv = number of cross-validation folds (default is 5)\n",
|
" # cv = number of cross-validation folds (default is 5)\n",
|
||||||
" gs = GridSearchCV(DecisionTreeClassifier(), tuned_parameters, cv=10, scoring='%s_weighted' % score)\n",
|
" gs = GridSearchCV(DecisionTreeClassifier(), tuned_hyperparameters, cv=10, scoring=scorer)\n",
|
||||||
" gs.fit(x_train, y_train)\n",
|
" gs.fit(x_train, y_train)\n",
|
||||||
"\n",
|
"\n",
|
||||||
" print(\"Best parameters set found on development set:\")\n",
|
" print(\"Best hyperparameters set found on development set:\")\n",
|
||||||
" print()\n",
|
" print()\n",
|
||||||
" print(gs.best_params_)\n",
|
" print(gs.best_params_)\n",
|
||||||
" print()\n",
|
" print()\n",
|
||||||
@@ -512,10 +517,8 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n",
|
"* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
|
||||||
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
|
"* [Hyperparameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
|
||||||
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
|
|
||||||
"* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
|
|
||||||
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
|
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -538,7 +541,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -552,7 +555,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.7"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
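The change in this notebook replaces the string scorer (`'%s_weighted' % score`) with an explicit `make_scorer(..., zero_division=0)`. A self-contained sketch of that pattern follows, assuming scikit-learn 0.22 or later (where `zero_division` is available). The train/test split, the reduced grid, and `cv=5` are illustrative choices made to keep the run short; they are not the notebook's exact settings.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=33)

# Hyperparameter grid: every combination is fitted and cross-validated
param_grid = {
    'max_depth': np.arange(3, 10),
    'criterion': ['gini', 'entropy'],
}

# Weighted precision; zero_division=0 scores a class with no predictions
# as 0 instead of emitting a warning
scorer = make_scorer(precision_score, average='weighted', zero_division=0)

gs = GridSearchCV(DecisionTreeClassifier(random_state=1), param_grid,
                  cv=5, scoring=scorer)
gs.fit(x_train, y_train)

print(gs.best_params_)           # best combination found on the training folds
print(gs.best_score_)            # its mean cross-validated score
print(gs.score(x_test, y_test))  # same scorer applied to the held-out split
```

Building the scorer explicitly makes the averaging mode and the zero-division behaviour visible at the call site, which the string shortcut hides.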
|
@@ -117,7 +117,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# save model\n",
|
"# save model\n",
|
||||||
"from sklearn.externals import joblib\n",
|
"import joblib\n",
|
||||||
"joblib.dump(model, 'filename.pkl') \n",
|
"joblib.dump(model, 'filename.pkl') \n",
|
||||||
"\n",
|
"\n",
|
||||||
"#load model\n",
|
"#load model\n",
|
||||||
@@ -136,7 +136,9 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Tutorial scikit-learn](http://scikit-learn.org/stable/tutorial/basic/tutorial.html)\n",
|
"* [Tutorial scikit-learn](http://scikit-learn.org/stable/tutorial/basic/tutorial.html)\n",
|
||||||
"* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)"
|
"* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)\n",
|
||||||
|
"* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
|
||||||
|
"* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -151,8 +153,17 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -166,7 +177,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.7"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -47,7 +47,7 @@ def get_code(tree, feature_names, target_names,
|
|||||||
|
|
||||||
recurse(left, right, threshold, features, 0, 0)
|
recurse(left, right, threshold, features, 0, 0)
|
||||||
|
|
||||||
# Taken from http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#example-tree-plot-iris-py
|
# Taken from https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html
|
||||||
import numpy as np
|
import numpy as np
|
||||||
import matplotlib.pyplot as plt
|
import matplotlib.pyplot as plt
|
||||||
|
|
||||||
|
@@ -2,6 +2,7 @@ import numpy as np
|
|||||||
import matplotlib.pyplot as plt
|
import matplotlib.pyplot as plt
|
||||||
from matplotlib.colors import ListedColormap
|
from matplotlib.colors import ListedColormap
|
||||||
from sklearn import neighbors, datasets
|
from sklearn import neighbors, datasets
|
||||||
|
import seaborn as sns
|
||||||
from sklearn.neighbors import KNeighborsClassifier
|
from sklearn.neighbors import KNeighborsClassifier
|
||||||
|
|
||||||
# Taken from http://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html
|
# Taken from http://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html
|
||||||
@@ -20,8 +21,8 @@ def plot_classification_iris():
|
|||||||
n_neighbors = 15
|
n_neighbors = 15
|
||||||
|
|
||||||
# Create color maps
|
# Create color maps
|
||||||
cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF'])
|
cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
|
||||||
cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])
|
cmap_bold = ['darkorange', 'c', 'darkblue']
|
||||||
|
|
||||||
for weights in ['uniform', 'distance']:
|
for weights in ['uniform', 'distance']:
|
||||||
# we create an instance of Neighbours Classifier and fit the data.
|
# we create an instance of Neighbours Classifier and fit the data.
|
||||||
@@ -29,7 +30,7 @@ def plot_classification_iris():
|
|||||||
clf.fit(X, y)
|
clf.fit(X, y)
|
||||||
|
|
||||||
# Plot the decision boundary. For that, we will assign a color to each
|
# Plot the decision boundary. For that, we will assign a color to each
|
||||||
# point in the mesh [x_min, m_max]x[y_min, y_max].
|
# point in the mesh [x_min, x_max]x[y_min, y_max].
|
||||||
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
|
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
|
||||||
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
|
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
|
||||||
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
|
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
|
||||||
@@ -38,14 +39,17 @@ def plot_classification_iris():
|
|||||||
|
|
||||||
# Put the result into a color plot
|
# Put the result into a color plot
|
||||||
Z = Z.reshape(xx.shape)
|
Z = Z.reshape(xx.shape)
|
||||||
plt.figure()
|
plt.figure(figsize=(8, 6))
|
||||||
plt.pcolormesh(xx, yy, Z, cmap=cmap_light)
|
plt.contourf(xx, yy, Z, cmap=cmap_light)
|
||||||
|
|
||||||
# Plot also the training points
|
# Plot also the training points
|
||||||
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold)
|
sns.scatterplot(x=X[:, 0], y=X[:, 1], hue=iris.target_names[y],
|
||||||
|
palette=cmap_bold, alpha=1.0, edgecolor="black")
|
||||||
plt.xlim(xx.min(), xx.max())
|
plt.xlim(xx.min(), xx.max())
|
||||||
plt.ylim(yy.min(), yy.max())
|
plt.ylim(yy.min(), yy.max())
|
||||||
plt.title("3-Class classification (k = %i, weights = '%s')"
|
plt.title("3-Class classification (k = %i, weights = '%s')"
|
||||||
% (n_neighbors, weights))
|
% (n_neighbors, weights))
|
||||||
|
plt.xlabel(iris.feature_names[0])
|
||||||
|
plt.ylabel(iris.feature_names[1])
|
||||||
|
|
||||||
plt.show()
|
plt.show()
|
@@ -74,9 +74,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
|
"* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
|
||||||
"* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n",
|
"* [Scikit-learn videos and notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n"
|
||||||
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
|
|
||||||
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -92,7 +90,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -106,7 +104,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -213,8 +213,7 @@
|
|||||||
"* [Pandas API input-output](http://pandas.pydata.org/pandas-docs/stable/api.html#input-output)\n",
|
"* [Pandas API input-output](http://pandas.pydata.org/pandas-docs/stable/api.html#input-output)\n",
|
||||||
"* [Pandas API - pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)\n",
|
"* [Pandas API - pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)\n",
|
||||||
"* [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html)\n",
|
"* [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html)\n",
|
||||||
"* [An introduction to NumPy and Scipy](http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf)\n",
|
"* [An introduction to NumPy and Scipy](https://sites.engineering.ucsb.edu/~shell/che210d/numpy.pdf)\n"
|
||||||
"* [NumPy tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
@@ -433,10 +433,9 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Pandas](http://pandas.pydata.org/)\n",
|
"* [Pandas](http://pandas.pydata.org/)\n",
|
||||||
"* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n",
|
"* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
|
||||||
"* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
|
|
||||||
"* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
|
"* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
|
||||||
"* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)"
|
"* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -458,7 +457,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -472,7 +471,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -373,8 +373,8 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"#Mean age of passengers per Passenger class\n",
|
"#Mean age of passengers per Passenger class\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#First we calculate the mean\n",
|
"#First we calculate the mean for the numeric columns\n",
|
||||||
"df.groupby('Pclass').mean()"
|
"df.select_dtypes(np.number).groupby('Pclass').mean()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -404,7 +404,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"#Mean Age and SibSp of passengers grouped by passenger class and sex\n",
|
"#Mean Age and SibSp of passengers grouped by passenger class and sex\n",
|
||||||
"df.groupby(['Pclass', 'Sex'])['Age','SibSp'].mean()"
|
"df.groupby(['Pclass', 'Sex'])[['Age','SibSp']].mean()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -414,7 +414,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"#Show mean Age and SibSp for passengers older than 25 grouped by Passenger Class and Sex\n",
|
"#Show mean Age and SibSp for passengers older than 25 grouped by Passenger Class and Sex\n",
|
||||||
"df[df.Age > 25].groupby(['Pclass', 'Sex'])['Age','SibSp'].mean()"
|
"df[df.Age > 25].groupby(['Pclass', 'Sex'])[['Age','SibSp']].mean()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -424,7 +424,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Mean age, SibSp , Survived of passengers older than 25 which survived, grouped by Passenger Class and Sex \n",
|
"# Mean age, SibSp , Survived of passengers older than 25 which survived, grouped by Passenger Class and Sex \n",
|
||||||
"df[(df.Age > 25) & (df.Survived == 1)].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].mean()"
|
"df[(df.Age > 25) & (df.Survived == 1)].groupby(['Pclass', 'Sex'])[['Age','SibSp','Survived']].mean()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
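A small sketch of the two idioms these hunks rely on: combining boolean conditions and selecting several columns after `groupby`. In Python, `&` binds more tightly than `>`, so each comparison needs its own parentheses, and several columns are selected with a list (double brackets). The miniature DataFrame below is made up for illustration and only mimics the Titanic column names.

```python
import pandas as pd

# Hypothetical miniature of the Titanic data; the values are invented.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Sex": ["female", "male", "female", "male", "male", "female"],
    "Age": [38.0, 54.0, 27.0, 22.0, 35.0, 30.0],
    "SibSp": [1, 0, 0, 1, 0, 2],
    "Survived": [1, 0, 1, 0, 1, 0],
})

# Each comparison is parenthesised before the two masks are combined with &.
mask = (df["Age"] > 25) & (df["Survived"] == 1)

# A list of column names selects several columns from the grouped data.
result = df[mask].groupby(["Pclass", "Sex"])[["Age", "SibSp"]].mean()
print(result)
```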
@@ -436,7 +436,7 @@
|
|||||||
"# We can also decide which function to apply to each column\n",
|
"# We can also decide which function to apply to each column\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#Show mean Age, mean SibSp, and number of passengers older than 25 that survived, grouped by Passenger Class and Sex\n",
|
"#Show mean Age, mean SibSp, and number of passengers older than 25 that survived, grouped by Passenger Class and Sex\n",
|
||||||
"df[(df.Age > 25) & (df.Survived == 1)].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].agg({'Age': np.mean, \n",
|
"df[(df.Age > 25) & (df.Survived == 1)].groupby(['Pclass', 'Sex'])[['Age','SibSp','Survived']].agg({'Age': np.mean, \n",
|
||||||
" 'SibSp': np.mean, 'Survived': np.sum})"
|
" 'SibSp': np.mean, 'Survived': np.sum})"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -600,8 +600,8 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Fill missing values with the median\n",
|
"# Fill missing values with the median, we avoid empty (None) values with numeric_only\n",
|
||||||
"df_filled = df.fillna(df.median())\n",
|
"df_filled = df.fillna(df.median(numeric_only=True))\n",
|
||||||
"df_filled[-5:]"
|
"df_filled[-5:]"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
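A short sketch of why `numeric_only=True` matters when filling with medians, assuming a DataFrame that mixes numeric and text columns. The column names mimic the Titanic data, but the values are invented.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Age": [22.0, np.nan, 30.0, 40.0],
    "Fare": [7.25, 71.28, np.nan, 8.05],
    "Cabin": ["C85", None, None, "E46"],
})

# Medians are computed for the numeric columns only; "Cabin" is skipped.
medians = df.median(numeric_only=True)

# fillna with a Series fills each column from the entry with the same label,
# so columns without a median (here "Cabin") keep their missing values.
df_filled = df.fillna(medians)
print(df_filled)
```

Text columns such as `Cabin` need a separate strategy, for example a placeholder category.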
@@ -685,7 +685,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# But we are working on a copy \n",
|
"# But we are working on a copy, so we get a warning\n",
|
||||||
"df.iloc[889]['Sex'] = np.nan"
|
"df.iloc[889]['Sex'] = np.nan"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -695,7 +695,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# If we want to change, we should not chain selections\n",
|
"# If we want to change it, we should not chain selections\n",
|
||||||
"# The selection can be done with the column name\n",
|
"# The selection can be done with the column name\n",
|
||||||
"df.loc[889, 'Sex']"
|
"df.loc[889, 'Sex']"
|
||||||
]
|
]
|
||||||
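A minimal sketch of assigning a value in a single indexing step, which avoids writing into a temporary copy. The three-row DataFrame is an assumption made for illustration; the notebook uses row label 889 of the Titanic data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Sex": ["male", "female", "male"],
                   "Age": [22.0, 38.0, 26.0]})

# Label-based: row label and column name in one .loc call.
df.loc[2, "Sex"] = np.nan

# Position-based: translate the column name into its position first.
df.iloc[0, df.columns.get_loc("Age")] = 23.0

print(df)
```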
@@ -932,11 +932,11 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Pandas](http://pandas.pydata.org/)\n",
|
"* [Pandas](http://pandas.pydata.org/)\n",
|
||||||
"* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n",
|
"* [Learning Pandas, Michael Heydt, Packt Publishing, 2017](https://learning.oreilly.com/library/view/learning-pandas/9781787123137/)\n",
|
||||||
"* [Useful Pandas Snippets](https://gist.github.com/bsweger/e5817488d161f37dcbd2)\n",
|
"* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
|
||||||
"* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
|
|
||||||
"* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
|
"* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
|
||||||
"* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)"
|
"* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)\n",
|
||||||
|
"* [Useful Pandas Snippets](https://gist.github.com/bsweger/e5817488d161f37dcbd2)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -958,7 +958,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -972,7 +972,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -220,7 +220,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Analyse distributon\n",
|
"# Analyse distribution\n",
|
||||||
"df.hist(figsize=(10,10))\n",
|
"df.hist(figsize=(10,10))\n",
|
||||||
"plt.show()"
|
"plt.show()"
|
||||||
]
|
]
|
||||||
@@ -233,7 +233,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"# We can see the pairwise correlation between variables. A value near 0 means low correlation\n",
|
"# We can see the pairwise correlation between variables. A value near 0 means low correlation\n",
|
||||||
"# while a value near -1 or 1 indicates strong correlation.\n",
|
"# while a value near -1 or 1 indicates strong correlation.\n",
|
||||||
"df.corr()"
|
"df.corr(numeric_only = True)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -249,11 +249,10 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# General description of relationship betweek variables using Seaborn PairGrid\n",
|
"# General description of relationship between variables using Seaborn PairGrid\n",
|
||||||
"# We use df_clean, since the null values of df would give us an error; you can check it.\n",
|
"# We use df_clean, since the null values of df would give us an error; you can check it.\n",
|
||||||
"g = sns.PairGrid(df_clean, hue=\"Survived\")\n",
|
"g = sns.PairGrid(df_clean, hue=\"Survived\")\n",
|
||||||
"g.map_diag(plt.hist)\n",
|
"g.map(sns.scatterplot)\n",
|
||||||
"g.map_offdiag(plt.scatter)\n",
|
|
||||||
"g.add_legend()"
|
"g.add_legend()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -367,7 +366,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Now we visualise age and survived to see if there is some relationship\n",
|
"# Now we visualise age and survived to see if there is some relationship\n",
|
||||||
"sns.FacetGrid(df, hue=\"Survived\", size=5).map(sns.kdeplot, \"Age\").add_legend()"
|
"sns.FacetGrid(df, hue=\"Survived\", height=5).map(sns.kdeplot, \"Age\").add_legend()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -567,7 +566,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Plot with seaborn\n",
|
"# Plot with seaborn\n",
|
||||||
"sns.countplot('Sex', data=df)"
|
"sns.countplot(x='Sex', data=df)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -683,16 +682,6 @@
|
|||||||
"df.groupby('Pclass').size()"
|
"df.groupby('Pclass').size()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Distribution\n",
|
|
||||||
"sns.countplot('Pclass', data=df)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -725,7 +714,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"sns.factorplot('Pclass',data=df,hue='Sex',kind='count')"
|
"sns.catplot(x='Pclass',data=df,hue='Sex',kind='count')"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -906,7 +895,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Distribution\n",
|
"# Distribution\n",
|
||||||
"sns.countplot('Embarked', data=df)"
|
"sns.countplot(x='Embarked', data=df)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -997,7 +986,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Distribution\n",
|
"# Distribution\n",
|
||||||
"sns.countplot('SibSp', data=df)"
|
"sns.countplot(x='SibSp', data=df)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1180,7 +1169,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Distribution\n",
|
"# Distribution\n",
|
||||||
"sns.countplot('Parch', data=df)"
|
"sns.countplot(x='Parch', data=df)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1233,7 +1222,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"df.groupby(['Pclass', 'Sex', 'Parch'])['Parch', 'SibSp', 'Survived'].agg({'Parch': np.size, 'SibSp': np.mean, 'Survived': np.mean})"
|
"df.groupby(['Pclass', 'Sex', 'Parch'])[['Parch', 'SibSp', 'Survived']].agg({'Parch': np.size, 'SibSp': np.mean, 'Survived': np.mean})"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -1576,7 +1565,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -1590,7 +1579,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -72,7 +72,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Assign the variable *df* a Dataframe with the Titanic Dataset from the URL https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv\"\n",
|
"Assign the variable *df* a Dataframe with the Titanic Dataset from the URL https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Print *df*."
|
"Print *df*."
|
||||||
]
|
]
|
||||||
@@ -214,7 +214,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"df['FamilySize'] = df['SibSp'] + df['Parch']\n",
|
"df['FamilySize'] = df['SibSp'] + df['Parch']\n",
|
||||||
"df.head()"
|
"df"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -377,8 +377,8 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Group ages to simplify machine learning algorithms. 0: 0-5, 1: 6-10, 2: 11-15, 3: 16-59 and 4: 60-80\n",
|
"# Group ages to simplify machine learning algorithms. 0: 0-5, 1: 6-10, 2: 11-15, 3: 16-59 and 4: 60-80\n",
|
||||||
"df['AgeGroup'] = 0\n",
|
"df['AgeGroup'] = np.nan\n",
|
||||||
"df.loc[(.Age<6),'AgeGroup'] = 0\n",
|
"df.loc[(df.Age<6),'AgeGroup'] = 0\n",
|
||||||
"df.loc[(df.Age>=6) & (df.Age < 11),'AgeGroup'] = 1\n",
|
"df.loc[(df.Age>=6) & (df.Age < 11),'AgeGroup'] = 1\n",
|
||||||
"df.loc[(df.Age>=11) & (df.Age < 16),'AgeGroup'] = 2\n",
|
"df.loc[(df.Age>=11) & (df.Age < 16),'AgeGroup'] = 2\n",
|
||||||
"df.loc[(df.Age>=16) & (df.Age < 60),'AgeGroup'] = 3\n",
|
"df.loc[(df.Age>=16) & (df.Age < 60),'AgeGroup'] = 3\n",
|
||||||
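The age grouping in this hunk can also be sketched with `pandas.cut`, which is a swapped-in alternative to the chain of `df.loc` assignments rather than the notebook's own method. The bin edges follow the comment in the cell (0-5, 6-10, 11-15, 16-59, 60 and over); the open upper edge and the sample ages are assumptions.

```python
import numpy as np
import pandas as pd

# Invented sample ages, including one missing value.
df = pd.DataFrame({"Age": [4.0, 6.0, 15.0, 16.0, 59.0, 60.0, 80.0, np.nan]})

# Left-closed bins: [0, 6), [6, 11), [11, 16), [16, 60), [60, inf).
bins = [0, 6, 11, 16, 60, np.inf]

# labels=False returns the bin index; missing ages stay NaN,
# matching the np.nan default used in the cell.
df["AgeGroup"] = pd.cut(df["Age"], bins=bins, right=False, labels=False)
print(df)
```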
@@ -404,8 +404,8 @@
|
|||||||
" if np.isnan(big_string):\n",
|
" if np.isnan(big_string):\n",
|
||||||
" return 'X'\n",
|
" return 'X'\n",
|
||||||
" for substring in substrings:\n",
|
" for substring in substrings:\n",
|
||||||
" if big_string.find(substring) != 1:\n",
|
" if substring in big_string:\n",
|
||||||
" return substring\n",
|
" return substring[0::]\n",
|
||||||
" print(big_string)\n",
|
" print(big_string)\n",
|
||||||
" return 'X'\n",
|
" return 'X'\n",
|
||||||
" \n",
|
" \n",
|
||||||
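The change from `find(...) != 1` to `in` matters because `str.find` returns `-1` when nothing is found and the match position otherwise, so comparing against `1` accepts almost every substring. A standalone sketch follows; the function name, the deck list and the `isinstance` check (used here instead of `np.isnan`) are illustrative assumptions.

```python
def first_matching_substring(big_string, substrings):
    """Return the first entry of substrings contained in big_string, else 'X'."""
    # Missing values usually arrive as float NaN, so anything that is not
    # a string is treated as missing.
    if not isinstance(big_string, str):
        return "X"
    for substring in substrings:
        if substring in big_string:
            return substring
    return "X"


decks = ["A", "B", "C", "D", "E", "F", "T", "G"]  # hypothetical deck letters

print(first_matching_substring("C85", decks))           # C
print(first_matching_substring(float("nan"), decks))    # X
print("C85".find("Z"))                                  # -1, not found
print("C85".find("8"))                                  # 1, found at index 1
```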
@@ -478,8 +478,17 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -493,7 +502,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -78,7 +78,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
|
"* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka and Vahid Mirjalili, Packt Publishing, 2019."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -100,7 +100,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -114,7 +114,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -222,7 +222,7 @@
|
|||||||
"kernel = types_of_kernels[0]\n",
|
"kernel = types_of_kernels[0]\n",
|
||||||
"gamma = 3.0\n",
|
"gamma = 3.0\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Create kNN model\n",
|
"# Create SVM model\n",
|
||||||
"model = SVC(kernel=kernel, probability=True, gamma=gamma)"
|
"model = SVC(kernel=kernel, probability=True, gamma=gamma)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -276,7 +276,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"We can evaluate the accuracy obtained if the model always predicts the most frequent class, following this [refeference](http://blog.kaggle.com/2015/10/23/scikit-learn-video-9-better-evaluation-of-classification-models/)."
|
"We can evaluate the accuracy obtained if the model always predicts the most frequent class, following this [reference](https://medium.com/analytics-vidhya/model-validation-for-classification-5ff4a0373090)."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -351,10 +351,10 @@
|
|||||||
"We can obtain more information from the confusion matrix and the F1-score metric.\n",
|
"We can obtain more information from the confusion matrix and the F1-score metric.\n",
|
||||||
"In a confusion matrix, we can see:\n",
|
"In a confusion matrix, we can see:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"||**Predicted: 0**| **Predicted: 1**|\n",
|
"| |**Predicted: 0**| **Predicted: 1**|\n",
|
||||||
"|---------------------------|\n",
|
"|-------------|----------------|-----------------|\n",
|
||||||
"|**Actual: 0**| TN | FP |\n",
|
"|**Actual: 0**| TN | FP |\n",
|
||||||
"|**Actual: 1**| FN|TP|\n",
|
"|**Actual: 1**| FN | TP |\n",
|
||||||
"\n",
|
"\n",
|
||||||
"* **True negatives (TN)**: actual negatives that were predicted as negatives\n",
|
"* **True negatives (TN)**: actual negatives that were predicted as negatives\n",
|
||||||
"* **False positives (FP)**: actual negatives that were predicted as positives\n",
|
"* **False positives (FP)**: actual negatives that were predicted as positives\n",
|
||||||
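The four cells of the confusion matrix and the metrics derived from them can be sketched in plain Python. The labels below are invented, and the helper name `confusion_counts` is hypothetical; scikit-learn provides its own `confusion_matrix` and `f1_score` for real use.

```python
def confusion_counts(y_true, y_pred):
    """Count TN, FP, FN, TP for binary labels 0/1."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tp += 1
    return tn, fp, fn, tp


y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1, 0, 0]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

print(tn, fp, fn, tp)        # 4 1 2 3
print(precision, recall)     # 0.75 0.6
```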
@@ -418,7 +418,7 @@
|
|||||||
"plt.ylim([0.0, 1.0])\n",
|
"plt.ylim([0.0, 1.0])\n",
|
||||||
"plt.title('ROC curve for Titanic')\n",
|
"plt.title('ROC curve for Titanic')\n",
|
||||||
"plt.xlabel('False Positive Rate (1 - Specificity)')\n",
|
"plt.xlabel('False Positive Rate (1 - Specificity)')\n",
|
||||||
"plt.xlabel('True Positive Rate (Sensitivity)')\n",
|
"plt.ylabel('True Positive Rate (Sensitivity)')\n",
|
||||||
"plt.grid(True)"
|
"plt.grid(True)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -535,13 +535,13 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"# This step will take some time\n",
|
"# This step will take some time\n",
|
||||||
"# Cross-validation\n",
|
"# Cross-validation\n",
|
||||||
"cv = KFold(n_splits=5, shuffle=False, random_state=33)\n",
|
"cv = KFold(n_splits=5, shuffle=True, random_state=33)\n",
|
||||||
"# StratifiedKFold is a variation of k-fold which returns stratified folds:\n",
|
"# StratifiedKFold is a variation of k-fold which returns stratified folds:\n",
|
||||||
"# each set contains approximately the same percentage of samples of each target class as the complete set.\n",
|
"# each set contains approximately the same percentage of samples of each target class as the complete set.\n",
|
||||||
"#cv = StratifiedKFold(y, n_folds=3, shuffle=False, random_state=33)\n",
|
"#cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=33)\n",
|
||||||
"scores = cross_val_score(model, X, y, cv=cv)\n",
|
"scores = cross_val_score(model, X, y, cv=cv)\n",
|
||||||
"print(\"Scores in every iteration\", scores)\n",
|
"print(\"Scores in every iteration\", scores)\n",
|
||||||
"print(\"Accuracy: %0.2f (+/- %0.2f)\" % (scores.mean(), scores.std() * 2))\n"
|
"print(\"Accuracy: %0.2f (+/- %0.2f)\" % (scores.mean(), scores.std() * 2))"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
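A brief sketch of the two splitters named in this cell, assuming a recent scikit-learn where both take `n_splits` in the constructor and receive the data in `split`. `random_state` only has an effect when `shuffle=True`, which is why the two options are set together. The toy arrays are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Ten toy samples, five per class.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 5 + [1] * 5)

# Shuffled k-fold: reproducible thanks to random_state.
cv = KFold(n_splits=5, shuffle=True, random_state=33)
kfold_tests = [test for _, test in cv.split(X)]

# Stratified k-fold keeps the class proportions in every fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=33)
stratified_tests = [test for _, test in skf.split(X, y)]

for test in stratified_tests:
    print(test, y[test])
```

Either object can be passed as the `cv` argument of `cross_val_score`.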
@@ -644,7 +644,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"* [Titanic Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
|
"* [Titanic Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
|
||||||
"* [API SVC scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)\n",
|
"* [API SVC scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)\n",
|
||||||
"* [Better evaluation of classification models](http://blog.kaggle.com/2015/10/23/scikit-learn-video-9-better-evaluation-of-classification-models/)"
|
"* [How to choose the right metric for evaluating an ML model](https://www.kaggle.com/vipulgandhi/how-to-choose-right-metric-for-evaluating-ml-model)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -666,7 +666,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -680,7 +680,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -39,7 +39,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"In this exercise we are going to put into practice what we have learnt in the notebooks of the session. \n",
|
"In this exercise, we are going to put into practice what we have learnt in the notebooks of the session. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"In the previous notebook we have been applying the SVM machine learning algorithm.\n",
|
"In the previous notebook we have been applying the SVM machine learning algorithm.\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -67,7 +67,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -81,7 +81,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.8.12"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -1,21 +1,21 @@
|
|||||||
"""
|
"""
|
||||||
Taken from http://scikit-learn.org/stable/auto_examples/model_selection/plot_learning_curve.html
|
|
||||||
|
|
||||||
========================
|
========================
|
||||||
Plotting Learning Curves
|
Plotting Learning Curves
|
||||||
========================
|
========================
|
||||||
|
In the first column, first row the learning curve of a naive Bayes classifier
|
||||||
|
is shown for the digits dataset. Note that the training score and the
|
||||||
|
cross-validation score are both not very good at the end. However, the shape
|
||||||
|
of the curve can be found in more complex datasets very often: the training
|
||||||
|
score is very high at the beginning and decreases and the cross-validation
|
||||||
|
score is very low at the beginning and increases. In the second column, first
|
||||||
|
row we see the learning curve of an SVM with RBF kernel. We can see clearly
|
||||||
|
that the training score is still around the maximum and the validation score
|
||||||
|
could be increased with more training samples. The plots in the second row
|
||||||
|
show the times required by the models to train with various sizes of training
|
||||||
|
dataset. The plots in the third row show how much time was required to train
|
||||||
|
the models for each training size.
|
||||||
|
|
||||||
On the left side the learning curve of a naive Bayes classifier is shown for
|
|
||||||
the digits dataset. Note that the training score and the cross-validation score
|
|
||||||
are both not very good at the end. However, the shape of the curve can be found
|
|
||||||
in more complex datasets very often: the training score is very high at the
|
|
||||||
beginning and decreases and the cross-validation score is very low at the
|
|
||||||
beginning and increases. On the right side we see the learning curve of an SVM
|
|
||||||
with RBF kernel. We can see clearly that the training score is still around
|
|
||||||
the maximum and the validation score could be increased with more training
|
|
||||||
samples.
|
|
||||||
"""
|
"""
|
||||||
#print(__doc__)
|
|
||||||
|
|
||||||
import numpy as np
|
import numpy as np
|
||||||
import matplotlib.pyplot as plt
|
import matplotlib.pyplot as plt
|
||||||
@@ -23,86 +23,181 @@ from sklearn.naive_bayes import GaussianNB
|
|||||||
from sklearn.svm import SVC
|
from sklearn.svm import SVC
|
||||||
from sklearn.datasets import load_digits
|
from sklearn.datasets import load_digits
|
||||||
from sklearn.model_selection import learning_curve
|
from sklearn.model_selection import learning_curve
|
||||||
|
from sklearn.model_selection import ShuffleSplit
|
||||||
|
|
||||||
|
|
||||||
def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
|
def plot_learning_curve(
|
||||||
n_jobs=1, train_sizes=np.linspace(.1, 1.0, 5)):
|
estimator,
|
||||||
|
title,
|
||||||
|
X,
|
||||||
|
y,
|
||||||
|
axes=None,
|
||||||
|
ylim=None,
|
||||||
|
cv=None,
|
||||||
|
n_jobs=None,
|
||||||
|
train_sizes=np.linspace(0.1, 1.0, 5),
|
||||||
|
):
|
||||||
"""
|
"""
|
||||||
Generate a simple plot of the test and training learning curve.
|
Generate 3 plots: the test and training learning curve, the training
|
||||||
|
samples vs fit times curve, the fit times vs score curve.
|
||||||
|
|
||||||
Parameters
|
Parameters
|
||||||
----------
|
----------
|
||||||
estimator : object type that implements the "fit" and "predict" methods
|
estimator : estimator instance
|
||||||
An object of that type which is cloned for each validation.
|
An estimator instance implementing `fit` and `predict` methods which
|
||||||
|
will be cloned for each validation.
|
||||||
|
|
||||||
title : string
|
title : str
|
||||||
Title for the chart.
|
Title for the chart.
|
||||||
|
|
||||||
X : array-like, shape (n_samples, n_features)
|
X : array-like of shape (n_samples, n_features)
|
||||||
Training vector, where n_samples is the number of samples and
|
Training vector, where ``n_samples`` is the number of samples and
|
||||||
n_features is the number of features.
|
``n_features`` is the number of features.
|
||||||
|
|
||||||
y : array-like, shape (n_samples) or (n_samples, n_features), optional
|
y : array-like of shape (n_samples) or (n_samples, n_features)
|
||||||
Target relative to X for classification or regression;
|
Target relative to ``X`` for classification or regression;
|
||||||
None for unsupervised learning.
|
None for unsupervised learning.
|
||||||
|
|
||||||
ylim : tuple, shape (ymin, ymax), optional
|
axes : array-like of shape (3,), default=None
|
||||||
Defines minimum and maximum yvalues plotted.
|
Axes to use for plotting the curves.
|
||||||
|
|
||||||
cv : integer, cross-validation generator, optional
|
ylim : tuple of shape (2,), default=None
|
||||||
If an integer is passed, it is the number of folds (defaults to 3).
|
Defines minimum and maximum y-values plotted, e.g. (ymin, ymax).
|
||||||
Specific cross-validation objects can be passed, see
|
|
||||||
sklearn.model_selection module for the list of possible objects
|
|
||||||
|
|
||||||
n_jobs : integer, optional
|
cv : int, cross-validation generator or an iterable, default=None
|
||||||
Number of jobs to run in parallel (default 1).
|
Determines the cross-validation splitting strategy.
|
||||||
|
Possible inputs for cv are:
|
||||||
|
|
||||||
|
- None, to use the default 5-fold cross-validation,
|
||||||
|
- integer, to specify the number of folds.
|
||||||
|
- :term:`CV splitter`,
|
||||||
|
- An iterable yielding (train, test) splits as arrays of indices.
|
||||||
|
|
||||||
|
For integer/None inputs, if ``y`` is binary or multiclass,
|
||||||
|
:class:`StratifiedKFold` is used. If the estimator is not a classifier
|
||||||
|
or if ``y`` is neither binary nor multiclass, :class:`KFold` is used.
|
||||||
|
|
||||||
|
Refer to the :ref:`User Guide <cross_validation>` for the various
|
||||||
|
cross-validators that can be used here.
|
||||||
|
|
||||||
|
n_jobs : int or None, default=None
|
||||||
|
Number of jobs to run in parallel.
|
||||||
|
``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
|
||||||
|
``-1`` means using all processors. See :term:`Glossary <n_jobs>`
|
||||||
|
for more details.
|
||||||
|
|
||||||
|
train_sizes : array-like of shape (n_ticks,)
|
||||||
|
Relative or absolute numbers of training examples that will be used to
|
||||||
|
generate the learning curve. If the ``dtype`` is float, it is regarded
|
||||||
|
as a fraction of the maximum size of the training set (that is
|
||||||
|
determined by the selected validation method), i.e. it has to be within
|
||||||
|
(0, 1]. Otherwise it is interpreted as absolute sizes of the training
|
||||||
|
sets. Note that for classification the number of samples usually has
|
||||||
|
to be big enough to contain at least one sample from each class.
|
||||||
|
(default: np.linspace(0.1, 1.0, 5))
|
||||||
"""
|
"""
|
||||||
plt.figure()
|
if axes is None:
|
||||||
plt.title(title)
|
_, axes = plt.subplots(1, 3, figsize=(20, 5))
|
||||||
|
|
||||||
|
axes[0].set_title(title)
|
||||||
if ylim is not None:
|
if ylim is not None:
|
||||||
plt.ylim(*ylim)
|
axes[0].set_ylim(*ylim)
|
||||||
plt.xlabel("Training examples")
|
axes[0].set_xlabel("Training examples")
|
||||||
plt.ylabel("Score")
|
axes[0].set_ylabel("Score")
|
||||||
train_sizes, train_scores, test_scores = learning_curve(
|
|
||||||
estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)
|
train_sizes, train_scores, test_scores, fit_times, _ = learning_curve(
|
||||||
|
estimator,
|
||||||
|
X,
|
||||||
|
y,
|
||||||
|
cv=cv,
|
||||||
|
n_jobs=n_jobs,
|
||||||
|
train_sizes=train_sizes,
|
||||||
|
return_times=True,
|
||||||
|
)
|
||||||
train_scores_mean = np.mean(train_scores, axis=1)
|
train_scores_mean = np.mean(train_scores, axis=1)
|
||||||
train_scores_std = np.std(train_scores, axis=1)
|
train_scores_std = np.std(train_scores, axis=1)
|
||||||
test_scores_mean = np.mean(test_scores, axis=1)
|
test_scores_mean = np.mean(test_scores, axis=1)
|
||||||
test_scores_std = np.std(test_scores, axis=1)
|
test_scores_std = np.std(test_scores, axis=1)
|
||||||
plt.grid()
|
fit_times_mean = np.mean(fit_times, axis=1)
|
||||||
|
fit_times_std = np.std(fit_times, axis=1)
|
||||||
|
|
||||||
plt.fill_between(train_sizes, train_scores_mean - train_scores_std,
|
# Plot learning curve
|
||||||
train_scores_mean + train_scores_std, alpha=0.1,
|
axes[0].grid()
|
||||||
color="r")
|
axes[0].fill_between(
|
||||||
plt.fill_between(train_sizes, test_scores_mean - test_scores_std,
|
train_sizes,
|
||||||
test_scores_mean + test_scores_std, alpha=0.1, color="g")
|
train_scores_mean - train_scores_std,
|
||||||
plt.plot(train_sizes, train_scores_mean, 'o-', color="r",
|
train_scores_mean + train_scores_std,
|
||||||
label="Training score")
|
alpha=0.1,
|
||||||
plt.plot(train_sizes, test_scores_mean, 'o-', color="g",
|
color="r",
|
||||||
label="Cross-validation score")
|
)
|
||||||
|
axes[0].fill_between(
|
||||||
|
train_sizes,
|
||||||
|
test_scores_mean - test_scores_std,
|
||||||
|
test_scores_mean + test_scores_std,
|
||||||
|
alpha=0.1,
|
||||||
|
color="g",
|
||||||
|
)
|
||||||
|
axes[0].plot(
|
||||||
|
train_sizes, train_scores_mean, "o-", color="r", label="Training score"
|
||||||
|
)
|
||||||
|
axes[0].plot(
|
||||||
|
train_sizes, test_scores_mean, "o-", color="g", label="Cross-validation score"
|
||||||
|
)
|
||||||
|
axes[0].legend(loc="best")
|
||||||
|
|
||||||
|
# Plot n_samples vs fit_times
|
||||||
|
axes[1].grid()
|
||||||
|
axes[1].plot(train_sizes, fit_times_mean, "o-")
|
||||||
|
axes[1].fill_between(
|
||||||
|
train_sizes,
|
||||||
|
fit_times_mean - fit_times_std,
|
||||||
|
fit_times_mean + fit_times_std,
|
||||||
|
alpha=0.1,
|
||||||
|
)
|
||||||
|
axes[1].set_xlabel("Training examples")
|
||||||
|
axes[1].set_ylabel("fit_times")
|
||||||
|
axes[1].set_title("Scalability of the model")
|
||||||
|
|
||||||
|
# Plot fit_time vs score
|
||||||
|
fit_time_argsort = fit_times_mean.argsort()
|
||||||
|
fit_time_sorted = fit_times_mean[fit_time_argsort]
|
||||||
|
test_scores_mean_sorted = test_scores_mean[fit_time_argsort]
|
||||||
|
test_scores_std_sorted = test_scores_std[fit_time_argsort]
|
||||||
|
axes[2].grid()
|
||||||
|
axes[2].plot(fit_time_sorted, test_scores_mean_sorted, "o-")
|
||||||
|
axes[2].fill_between(
|
||||||
|
fit_time_sorted,
|
||||||
|
test_scores_mean_sorted - test_scores_std_sorted,
|
||||||
|
test_scores_mean_sorted + test_scores_std_sorted,
|
||||||
|
alpha=0.1,
|
||||||
|
)
|
||||||
|
axes[2].set_xlabel("fit_times")
|
||||||
|
axes[2].set_ylabel("Score")
|
||||||
|
axes[2].set_title("Performance of the model")
|
||||||
|
|
||||||
plt.legend(loc="best")
|
|
||||||
return plt
|
return plt
|
||||||
|
|
||||||
|
|
||||||
#digits = load_digits()
|
fig, axes = plt.subplots(3, 2, figsize=(10, 15))
|
||||||
#X, y = digits.data, digits.target
|
|
||||||
|
|
||||||
|
X, y = load_digits(return_X_y=True)
|
||||||
|
|
||||||
#title = "Learning Curves (Naive Bayes)"
|
title = "Learning Curves (Naive Bayes)"
|
||||||
# Cross validation with 100 iterations to get smoother mean test and train
|
# Cross validation with 50 iterations to get smoother mean test and train
|
||||||
# score curves, each time with 20% data randomly selected as a validation set.
|
# score curves, each time with 20% data randomly selected as a validation set.
|
||||||
#cv = cross_validation.ShuffleSplit(digits.data.shape[0], n_iter=100,
|
cv = ShuffleSplit(n_splits=50, test_size=0.2, random_state=0)
|
||||||
# test_size=0.2, random_state=0)
|
|
||||||
|
|
||||||
#estimator = GaussianNB()
|
estimator = GaussianNB()
|
||||||
#plot_learning_curve(estimator, title, X, y, ylim=(0.7, 1.01), cv=cv, n_jobs=4)
|
plot_learning_curve(
|
||||||
|
estimator, title, X, y, axes=axes[:, 0], ylim=(0.7, 1.01), cv=cv, n_jobs=4
|
||||||
|
)
|
||||||
|
|
||||||
#title = "Learning Curves (SVM, RBF kernel, $\gamma=0.001$)"
|
title = r"Learning Curves (SVM, RBF kernel, $\gamma=0.001$)"
|
||||||
# SVC is more expensive so we do a lower number of CV iterations:
|
# SVC is more expensive so we do a lower number of CV iterations:
|
||||||
#cv = cross_validation.ShuffleSplit(digits.data.shape[0], n_iter=10,
|
cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
|
||||||
# test_size=0.2, random_state=0)
|
estimator = SVC(gamma=0.001)
|
||||||
#estimator = SVC(gamma=0.001)
|
plot_learning_curve(
|
||||||
#plot_learning_curve(estimator, title, X, y, (0.7, 1.01), cv=cv, n_jobs=4)
|
estimator, title, X, y, axes=axes[:, 1], ylim=(0.7, 1.01), cv=cv, n_jobs=4
|
||||||
|
)
|
||||||
|
|
||||||
#plt.show()
|
plt.show()
|
||||||
|
@@ -3,7 +3,7 @@ import matplotlib.pyplot as plt
|
|||||||
import numpy as np
|
import numpy as np
|
||||||
from sklearn import svm
|
from sklearn import svm
|
||||||
|
|
||||||
#Taken from http://nbviewer.jupyter.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb
|
# Taken from http://nbviewer.jupyter.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb
|
||||||
|
|
||||||
def plot_svm(df):
|
def plot_svm(df):
|
||||||
# set plotting parameters
|
# set plotting parameters
|
||||||
|
@@ -56,7 +56,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"# Genetic Algorithms\n",
|
"# Genetic Algorithms\n",
|
||||||
"In this section we are going to use the library [DEAP](#References) for implementing genetic algorithms.\n",
|
"In this section we are going to use the library DEAP [[References](#References)] for implementing genetic algorithms.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"We are going to implement the OneMax problem as seen in class.\n",
|
"We are going to implement the OneMax problem as seen in class.\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -200,11 +200,13 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"## Optimizing ML hyperparameters\n",
|
"## Optimizing ML hyperparameters\n",
|
||||||
"\n",
|
"\n",
|
||||||
"One of the applications of Genetic Algorithms is the optimization of ML hyperparameters. Previously we have used GridSearch from scikit-learn. Using (sklearn-deap)[#References], optimize the Titanic hyperparameters using both GridSearch and Genetic Algorithms. \n",
|
"One of the applications of Genetic Algorithms is the optimization of ML hyperparameters. Previously we have used GridSearch from scikit-learn. Using (sklearn-deap)[[References](#References)], optimize the Titanic hyperparameters using both GridSearch and Genetic Algorithms. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"The same exercise (using the digits dataset) can be found in this [notebook](https://github.com/rsteca/sklearn-deap/blob/master/test.ipynb).\n",
|
"The same exercise (using the digits dataset) can be found in this [notebook](https://github.com/rsteca/sklearn-deap/blob/master/test.ipynb).\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Submit a notebook where you include well-crafted conclusions about the exercises, discussing the pros and cons of using genetic algorithms for this purpose.\n"
|
"Submit a notebook where you include well-crafted conclusions about the exercises, discussing the pros and cons of using genetic algorithms for this purpose.\n",
|
||||||
|
"\n",
|
||||||
|
"Note: There is a problem with version 0.24 of scikit-learn. In that case, just comment on the different approaches."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -261,6 +263,15 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
@@ -276,7 +287,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.7.9"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -48,7 +48,9 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"1. [Q-Learning](2_6_1_Q-Learning.ipynb)"
|
"1. [Q-Learning](2_6_1_Q-Learning_Basic.ipynb)\n",
|
||||||
|
"1. [Visualization](2_6_1_Q-Learning_Visualization.ipynb)\n",
|
||||||
|
"1. [Exercises](2_6_1_Q-Learning_Exercises.ipynb)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -64,7 +66,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -78,7 +80,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.10.10"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -1,443 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Course Notes for Learning Intelligent Systems"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2018 Carlos A. Iglesias"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## [Introduction to Machine Learning V](2_6_0_Intro_RL.ipynb)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Table of Contents\n",
|
|
||||||
"\n",
|
|
||||||
"* [Introduction](#Introduction)\n",
|
|
||||||
"* [Getting started with OpenAI Gym](#Getting-started-with-OpenAI-Gym)\n",
|
|
||||||
"* [The Frozen Lake scenario](#The-Frozen-Lake-scenario)\n",
|
|
||||||
"* [Q-Learning with the Frozen Lake scenario](#Q-Learning-with-the-Frozen-Lake-scenario)\n",
|
|
||||||
"* [Exercises](#Exercises)\n",
|
|
||||||
"* [Optional exercises](#Optional-exercises)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Introduction\n",
|
|
||||||
"The purpose of this practice is to understand better Reinforcement Learning (RL) and, in particular, Q-Learning.\n",
|
|
||||||
"\n",
|
|
||||||
"We are going to use [OpenAI Gym](https://gym.openai.com/). OpenAI Gym is a toolkit for developing and comparing RL algorithms. Take a look at their [website](https://gym.openai.com/).\n",
|
|
||||||
"\n",
|
|
||||||
"It implements [algorithm imitation](http://gym.openai.com/envs/#algorithmic), [classic control problems](http://gym.openai.com/envs/#classic_control), [Atari games](http://gym.openai.com/envs/#atari), [Box2D continuous control](http://gym.openai.com/envs/#box2d), [robotics with MuJoCo, Multi-Joint dynamics with Contact](http://gym.openai.com/envs/#mujoco), and [simple text based environments](http://gym.openai.com/envs/#toy_text).\n",
|
|
||||||
"\n",
|
|
||||||
"This notebook is based on [Diving deeper into Reinforcement Learning with Q-Learning](https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe).\n",
|
|
||||||
"\n",
|
|
||||||
"First of all, install the OpenAI Gym library:\n",
|
|
||||||
"\n",
|
|
||||||
"```console\n",
|
|
||||||
"foo@bar:~$ pip install gym\n",
|
|
||||||
"```\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"If you get the error message 'NotImplementedError: abstract', [execute](https://github.com/openai/gym/issues/775) \n",
|
|
||||||
"```console\n",
|
|
||||||
"foo@bar:~$ pip install pyglet==1.2.4\n",
|
|
||||||
"```\n",
|
|
||||||
"\n",
|
|
||||||
"If you want to try the Atari environment, it is better to opt for the full installation from source. Follow the instructions at [OpenAI Gym](https://github.com/openai/gym#id15).\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Getting started with OpenAI Gym\n",
|
|
||||||
"\n",
|
|
||||||
"First of all, read the [introduction](http://gym.openai.com/docs/#getting-started-with-gym) of OpenAI Gym."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Environments\n",
|
|
||||||
"OpenAI Gym provides a number of problems called *environments*. \n",
|
|
||||||
"\n",
|
|
||||||
"Try 'CartPole-v0' (or 'MountainCar-v0')."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import gym\n",
|
|
||||||
"\n",
|
|
||||||
"env = gym.make('CartPole-v0')\n",
|
|
||||||
"#env = gym.make('MountainCar-v0')\n",
|
|
||||||
"#env = gym.make('Taxi-v2')\n",
|
|
||||||
"\n",
|
|
||||||
"#env = gym.make('Jamesbond-ram-v0')\n",
|
|
||||||
"\n",
|
|
||||||
"env.reset()\n",
|
|
||||||
"for _ in range(1000):\n",
|
|
||||||
" env.render()\n",
|
|
||||||
" env.step(env.action_space.sample()) # take a random action"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This will launch an external window with the game. If you cannot close that window, just execute in a code cell:\n",
|
|
||||||
"\n",
|
|
||||||
"```python\n",
|
|
||||||
"env.close()\n",
|
|
||||||
"```\n",
|
|
||||||
"\n",
|
|
||||||
"The full list of available environments can be found by printing the environment registry as follows."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from gym import envs\n",
|
|
||||||
"print(envs.registry.all())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"The environment’s **step** function returns four values. These are:\n",
|
|
||||||
"\n",
|
|
||||||
"* **observation (object):** an environment-specific object representing your observation of the environment. For example, pixel data from a camera, joint angles and joint velocities of a robot, or the board state in a board game.\n",
|
|
||||||
"* **reward (float):** amount of reward achieved by the previous action. The scale varies between environments, but the goal is always to increase your total reward.\n",
|
|
||||||
"* **done (boolean):** whether it’s time to reset the environment again. Most (but not all) tasks are divided up into well-defined episodes, and done being True indicates the episode has terminated. (For example, perhaps the pole tipped too far, or you lost your last life.)\n",
|
|
||||||
"* **info (dict):** diagnostic information useful for debugging. It can sometimes be useful for learning (for example, it might contain the raw probabilities behind the environment’s last state change). However, official evaluations of your agent are not allowed to use this for learning.\n",
|
|
||||||
"\n",
|
|
||||||
"The typical agent loop first calls the method *reset*, which provides an initial observation. Then the agent executes an action and receives the reward, the new observation, and whether the episode has finished (done is true). \n",
|
|
||||||
"\n",
|
|
||||||
"For example, analyze this sample agent loop, which runs each episode for at most 100 timesteps. The details of the previous variables for this game, as described [here](https://github.com/openai/gym/wiki/CartPole-v0), are:\n",
|
|
||||||
"* **observation**: Cart Position, Cart Velocity, Pole Angle, Pole Velocity.\n",
|
|
||||||
"* **action**: 0\t(Push cart to the left), 1\t(Push cart to the right).\n",
|
|
||||||
"* **reward**: 1 for every step taken, including the termination step."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import gym\n",
|
|
||||||
"env = gym.make('CartPole-v0')\n",
|
|
||||||
"for i_episode in range(20):\n",
|
|
||||||
" observation = env.reset()\n",
|
|
||||||
" for t in range(100):\n",
|
|
||||||
" env.render()\n",
|
|
||||||
" print(observation)\n",
|
|
||||||
" action = env.action_space.sample()\n",
|
|
||||||
" print(\"Action \", action)\n",
|
|
||||||
" observation, reward, done, info = env.step(action)\n",
|
|
||||||
" print(\"Observation \", observation, \", reward \", reward, \", done \", done, \", info \" , info)\n",
|
|
||||||
" if done:\n",
|
|
||||||
" print(\"Episode finished after {} timesteps\".format(t+1))\n",
|
|
||||||
" break"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# The Frozen Lake scenario\n",
|
|
||||||
"We are going to play the [Frozen Lake](http://gym.openai.com/envs/FrozenLake-v0/) game.\n",
|
|
||||||
"\n",
|
|
||||||
"The problem is a grid where you should go from the 'start' (S) position to the 'goal' position (G) (the pizza!). You can only walk through the 'frozen tiles' (F). Unfortunately, you can fall in a 'hole' (H).\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"The episode ends when you reach the goal or fall in a hole. You receive a reward of 1 if you reach the goal, and zero otherwise. The possible actions are going left, right, up or down. However, the ice is slippery, so you won't always move in the direction you intend.\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"Here you can see several episodes. A full recording is available at [Frozen World](http://gym.openai.com/envs/FrozenLake-v0/).\n",
|
|
||||||
"\n",
|
|
||||||
"\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Q-Learning with the Frozen Lake scenario\n",
|
|
||||||
"We are now going to apply Q-Learning for the Frozen Lake scenario. This part of the notebook is taken from [here](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Q%20learning/Q%20Learning%20with%20FrozenLake.ipynb).\n",
|
|
||||||
"\n",
|
|
||||||
"First we create the environment and a Q-table initialized with zeros to store the value of each action in a given state. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import numpy as np\n",
|
|
||||||
"import gym\n",
|
|
||||||
"import random\n",
|
|
||||||
"\n",
|
|
||||||
"env = gym.make(\"FrozenLake-v0\")\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"action_size = env.action_space.n\n",
|
|
||||||
"state_size = env.observation_space.n\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"qtable = np.zeros((state_size, action_size))\n",
|
|
||||||
"print(qtable)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Now we define the hyperparameters."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Q-Learning hyperparameters\n",
|
|
||||||
"total_episodes = 10000 # Total episodes\n",
|
|
||||||
"learning_rate = 0.8 # Learning rate\n",
|
|
||||||
"max_steps = 99 # Max steps per episode\n",
|
|
||||||
"gamma = 0.95 # Discounting rate\n",
|
|
||||||
"\n",
|
|
||||||
"# Exploration hyperparameters\n",
|
|
||||||
"epsilon = 1.0 # Exploration rate\n",
|
|
||||||
"max_epsilon = 1.0 # Exploration probability at start\n",
|
|
||||||
"min_epsilon = 0.01 # Minimum exploration probability \n",
|
|
||||||
"decay_rate = 0.01 # Exponential decay rate for exploration prob"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"And now we implement the Q-Learning algorithm.\n",
|
|
||||||
"\n",
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# List of rewards\n",
|
|
||||||
"rewards = []\n",
|
|
||||||
"\n",
|
|
||||||
"# 2 For life or until learning is stopped\n",
|
|
||||||
"for episode in range(total_episodes):\n",
|
|
||||||
" # Reset the environment\n",
|
|
||||||
" state = env.reset()\n",
|
|
||||||
" step = 0\n",
|
|
||||||
" done = False\n",
|
|
||||||
" total_rewards = 0\n",
|
|
||||||
" \n",
|
|
||||||
" for step in range(max_steps):\n",
|
|
||||||
" # 3. Choose an action a in the current world state (s)\n",
|
|
||||||
" ## First we randomize a number\n",
|
|
||||||
" exp_exp_tradeoff = random.uniform(0, 1)\n",
|
|
||||||
" \n",
|
|
||||||
" ## If this number is greater than epsilon --> exploitation (taking the biggest Q value for this state)\n",
|
|
||||||
" if exp_exp_tradeoff > epsilon:\n",
|
|
||||||
" action = np.argmax(qtable[state,:])\n",
|
|
||||||
"\n",
|
|
||||||
" # Else doing a random choice --> exploration\n",
|
|
||||||
" else:\n",
|
|
||||||
" action = env.action_space.sample()\n",
|
|
||||||
"\n",
|
|
||||||
" # Take the action (a) and observe the outcome state(s') and reward (r)\n",
|
|
||||||
" new_state, reward, done, info = env.step(action)\n",
|
|
||||||
"\n",
|
|
||||||
" # Update Q(s,a):= Q(s,a) + lr [R(s,a) + gamma * max Q(s',a') - Q(s,a)]\n",
|
|
||||||
" # qtable[new_state,:] : all the actions we can take from new state\n",
|
|
||||||
" qtable[state, action] = qtable[state, action] + learning_rate * (reward + gamma * np.max(qtable[new_state, :]) - qtable[state, action])\n",
|
|
||||||
" \n",
|
|
||||||
" total_rewards += reward\n",
|
|
||||||
" \n",
|
|
||||||
" # Our new state is state\n",
|
|
||||||
" state = new_state\n",
|
|
||||||
" \n",
|
|
||||||
" # If done (if we're dead) : finish episode\n",
|
|
||||||
" if done:\n",
|
|
||||||
" break\n",
|
|
||||||
" \n",
|
|
||||||
" episode += 1\n",
|
|
||||||
" # Reduce epsilon (because we need less and less exploration)\n",
|
|
||||||
" epsilon = min_epsilon + (max_epsilon - min_epsilon)*np.exp(-decay_rate*episode) \n",
|
|
||||||
" rewards.append(total_rewards)\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"Score over time: \" + str(sum(rewards)/total_episodes))\n",
|
|
||||||
"print(qtable)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Finally, we use the learnt Q-table for playing the Frozen Lake game."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"\n",
|
|
||||||
"env.reset()\n",
|
|
||||||
"\n",
|
|
||||||
"for episode in range(5):\n",
|
|
||||||
" state = env.reset()\n",
|
|
||||||
" step = 0\n",
|
|
||||||
" done = False\n",
|
|
||||||
" print(\"****************************************************\")\n",
|
|
||||||
" print(\"EPISODE \", episode)\n",
|
|
||||||
"\n",
|
|
||||||
" for step in range(max_steps):\n",
|
|
||||||
" env.render()\n",
|
|
||||||
" # Take the action (index) that has the maximum expected future reward given that state\n",
|
|
||||||
" action = np.argmax(qtable[state,:])\n",
|
|
||||||
" \n",
|
|
||||||
" new_state, reward, done, info = env.step(action)\n",
|
|
||||||
" \n",
|
|
||||||
" if done:\n",
|
|
||||||
" break\n",
|
|
||||||
" state = new_state\n",
|
|
||||||
"env.close()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Exercises\n",
|
|
||||||
"\n",
|
|
||||||
"## Taxi\n",
|
|
||||||
"Analyze the [Taxi problem](http://gym.openai.com/envs/Taxi-v2/) and solve it by applying Q-Learning. You can find a solution similar to the one previously presented [here](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym).\n",
|
|
||||||
"\n",
|
|
||||||
"Analyze the impact of not changing the learning rate (alpha or epsilon, depending on the book) or changing it in a different way."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Optional exercises\n",
|
|
||||||
"\n",
|
|
||||||
"## Doom\n",
|
|
||||||
"Read this [article](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8) and execute the companion [notebook](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Deep%20Q%20Learning/Doom/Deep%20Q%20learning%20with%20Doom.ipynb). Analyze the results and provide conclusions about DQN."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## References\n",
|
|
||||||
"* [Diving deeper into Reinforcement Learning with Q-Learning, Thomas Simonini](https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe).\n",
|
|
||||||
"* Illustrations by [Thomas Simonini](https://github.com/simoninithomas/Deep_reinforcement_learning_Course) and [Sung Kim](https://www.youtube.com/watch?v=xgoO54qN4lY).\n",
|
|
||||||
"* [Frozen Lake solution with TensorFlow](https://analyticsindiamag.com/openai-gym-frozen-lake-beginners-guide-reinforcement-learning/)\n",
|
|
||||||
"* [Deep Q-Learning for Doom](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8)\n",
|
|
||||||
"* [Intro OpenAI Gym with Random Search and the Cart Pole scenario](http://www.pinchofintelligence.com/getting-started-openai-gym/)\n",
|
|
||||||
"* [Q-Learning for the Taxi scenario](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Licence"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
|
||||||
"\n",
|
|
||||||
"© 2018 Carlos A. Iglesias, Universidad Politécnica de Madrid."
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python3"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.5.5"
|
|
||||||
},
|
|
||||||
"latex_envs": {
|
|
||||||
"LaTeX_envs_menu_present": true,
|
|
||||||
"autocomplete": true,
|
|
||||||
"bibliofile": "biblio.bib",
|
|
||||||
"cite_by": "apalike",
|
|
||||||
"current_citInitial": 1,
|
|
||||||
"eqLabelWithNumbers": true,
|
|
||||||
"eqNumInitial": 1,
|
|
||||||
"hotkeys": {
|
|
||||||
"equation": "Ctrl-E",
|
|
||||||
"itemize": "Ctrl-I"
|
|
||||||
},
|
|
||||||
"labels_anchors": false,
|
|
||||||
"latex_user_defs": false,
|
|
||||||
"report_style_numbering": false,
|
|
||||||
"user_envs_cfg": false
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 1
|
|
||||||
}
|
|
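The training loop in the notebook above combines three pieces: an epsilon-greedy action choice, the Q-table update, and an exponential decay of epsilon. The following is a minimal sketch of those pieces using only NumPy. It replaces FrozenLake with a hypothetical one-dimensional corridor so that it runs without Gym; the environment, the helper names (`corridor_step`, `decayed_epsilon`, `train`) and the hyperparameter values are assumptions made for illustration and are not taken from the notebook.

```python
import numpy as np

N_STATES, N_ACTIONS = 5, 2  # hypothetical corridor: action 0 = left, 1 = right
GOAL = N_STATES - 1


def corridor_step(state, action):
    """Hypothetical deterministic environment: reward 1 on reaching the goal."""
    new_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = new_state == GOAL
    return new_state, float(done), done


def decayed_epsilon(episode, min_epsilon=0.01, max_epsilon=1.0, decay_rate=0.01):
    """Exponential decay from max_epsilon towards min_epsilon."""
    return min_epsilon + (max_epsilon - min_epsilon) * np.exp(-decay_rate * episode)


def train(total_episodes=500, max_steps=50, learning_rate=0.8, gamma=0.95, seed=0):
    rng = np.random.default_rng(seed)
    qtable = np.zeros((N_STATES, N_ACTIONS))
    for episode in range(total_episodes):
        epsilon = decayed_epsilon(episode)
        state = 0
        for _ in range(max_steps):
            if rng.uniform(0, 1) > epsilon:
                action = int(np.argmax(qtable[state, :]))  # exploitation
            else:
                action = int(rng.integers(N_ACTIONS))  # exploration
            new_state, reward, done = corridor_step(state, action)
            # Q(s,a) := Q(s,a) + lr * [r + gamma * max_a' Q(s',a') - Q(s,a)]
            qtable[state, action] += learning_rate * (
                reward + gamma * np.max(qtable[new_state, :]) - qtable[state, action]
            )
            state = new_state
            if done:
                break
    return qtable


qtable = train()
# Best action in each non-terminal state
greedy_policy = np.argmax(qtable[:GOAL], axis=1)
print(greedy_policy)
```

Keeping epsilon fixed instead of calling `decayed_epsilon` is one way to explore the question posed in the exercise about how the schedule affects learning.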
1384
ml5/2_6_1_Q-Learning_Basic.ipynb
Normal file
138
ml5/2_6_1_Q-Learning_Exercises.ipynb
Normal file
@@ -0,0 +1,138 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Course Notes for Learning Intelligent Systems"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos Á. Iglesias"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## [Introduction to Machine Learning V](2_6_0_Intro_RL.ipynb)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Exercises\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"## Taxi\n",
|
||||||
|
"Analyze the [Taxi problem](https://gymnasium.farama.org/environments/toy_text/taxi/) and solve it by applying Q-Learning. You can find a solution similar to the one previously presented [here](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym), and the notebook is [here](https://github.com/wagonhelm/Reinforcement-Learning-Introduction/blob/master/Reinforcement%20Learning%20Introduction.ipynb). Note that the Gymnasium API differs from the older Gym API used there, so you will have to adapt the code.\n",
|
||||||
|
"\n",
|
||||||
|
"Analyze the impact of not changing the learning rate or changing it in a different way. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Optional exercises\n",
|
||||||
|
"Select one of the following exercises.\n",
|
||||||
|
"\n",
|
||||||
|
"## Blackjack\n",
|
||||||
|
"Analyze how to apply Q-Learning to solve Blackjack.\n",
|
||||||
|
"You can find information in this [article](https://gymnasium.farama.org/tutorials/training_agents/blackjack_tutorial/).\n",
|
||||||
|
"\n",
|
||||||
|
"## Doom\n",
|
||||||
|
"Read this [article](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8) and execute the companion [notebook](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Deep%20Q%20Learning/Doom/Deep%20Q%20learning%20with%20Doom.ipynb). Analyze the results and provide conclusions about DQN.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## References\n",
|
||||||
|
"* [Gymnasium documentation](https://gymnasium.farama.org/).\n",
|
||||||
|
"* [Diving deeper into Reinforcement Learning with Q-Learning, Thomas Simonini](https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe).\n",
|
||||||
|
"* Illustrations by [Thomas Simonini](https://github.com/simoninithomas/Deep_reinforcement_learning_Course) and [Sung Kim](https://www.youtube.com/watch?v=xgoO54qN4lY).\n",
|
||||||
|
"* [Frozen Lake solution with TensorFlow](https://analyticsindiamag.com/openai-gym-frozen-lake-beginners-guide-reinforcement-learning/)\n",
|
||||||
|
"* [Deep Q-Learning for Doom](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8)\n",
|
||||||
|
"* [Intro OpenAI Gym with Random Search and the Cart Pole scenario](http://www.pinchofintelligence.com/getting-started-openai-gym/)\n",
|
||||||
|
"* [Q-Learning for the Taxi scenario](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Licence"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
||||||
|
"\n",
|
||||||
|
"© Carlos Á. Iglesias, Universidad Politécnica de Madrid."
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.10.10"
|
||||||
|
},
|
||||||
|
"latex_envs": {
|
||||||
|
"LaTeX_envs_menu_present": true,
|
||||||
|
"autocomplete": true,
|
||||||
|
"bibliofile": "biblio.bib",
|
||||||
|
"cite_by": "apalike",
|
||||||
|
"current_citInitial": 1,
|
||||||
|
"eqLabelWithNumbers": true,
|
||||||
|
"eqNumInitial": 1,
|
||||||
|
"hotkeys": {
|
||||||
|
"equation": "Ctrl-E",
|
||||||
|
"itemize": "Ctrl-I"
|
||||||
|
},
|
||||||
|
"labels_anchors": false,
|
||||||
|
"latex_user_defs": false,
|
||||||
|
"report_style_numbering": false,
|
||||||
|
"user_envs_cfg": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 1
|
||||||
|
}
|
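The Taxi exercise asks for older Gym code to be adapted to Gymnasium. Both call styles appear in this change set: the older notebook unpacks four values from `env.step` and uses `env.reset()` directly as the state, while `qlearning.py` unpacks five values and takes element `[0]` of `env.reset(...)`. The sketch below shows one possible way to bridge the two. The helper names `reset_compat` and `step_compat` and the two stand-in environment classes are hypothetical and exist only to make the example runnable without installing either library.

```python
def reset_compat(env, **kwargs):
    """Return only the initial observation, for either call style."""
    result = env.reset(**kwargs)
    # Newer style returns (observation, info); older style returns the observation alone.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    return result


def step_compat(env, action):
    """Return (new_state, reward, done, info) for either call style."""
    result = env.step(action)
    if len(result) == 5:
        new_state, reward, terminated, truncated, info = result
        return new_state, reward, terminated or truncated, info
    new_state, reward, done, info = result
    return new_state, reward, done, info


class OldStyleEnv:
    """Hypothetical stand-in mimicking the four-value step signature."""

    def reset(self):
        return 0

    def step(self, action):
        return 1, 0.0, False, {}


class NewStyleEnv:
    """Hypothetical stand-in mimicking the five-value step signature."""

    def reset(self, seed=None):
        return 0, {}

    def step(self, action):
        return 1, 1.0, True, False, {}


for env in (OldStyleEnv(), NewStyleEnv()):
    state = reset_compat(env)
    new_state, reward, done, info = step_compat(env, action=0)
    print(type(env).__name__, state, new_state, reward, done)
```

In a real solution it is usually simpler to rewrite the loop directly in the five-value style; the wrapper only makes the difference between the two styles explicit.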
368
ml5/2_6_1_Q-Learning_Visualization.ipynb
Normal file
274
ml5/qlearning.py
Normal file
@@ -0,0 +1,274 @@
# Class definition of QLearning

from pathlib import Path
from typing import NamedTuple

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from tqdm import tqdm

import gymnasium as gym
from gymnasium.envs.toy_text.frozen_lake import generate_random_map

# Params

class Params(NamedTuple):
    total_episodes: int  # Total episodes
    learning_rate: float  # Learning rate
    gamma: float  # Discounting rate
    epsilon: float  # Exploration probability
    map_size: int  # Number of tiles of one side of the squared environment
    seed: int  # Define a seed so that we get reproducible results
    is_slippery: bool  # If true, the player moves in the intended direction with probability 1/3 and in each of the two perpendicular directions with probability 1/3
    n_runs: int  # Number of runs
    action_size: int  # Number of possible actions
    state_size: int  # Number of possible states
    proba_frozen: float  # Probability that a tile is frozen
    savefig_folder: Path  # Root folder where plots are saved


class Qlearning:
    def __init__(self, learning_rate, gamma, state_size, action_size):
        self.state_size = state_size
        self.action_size = action_size
        self.learning_rate = learning_rate
        self.gamma = gamma
        self.reset_qtable()

    def update(self, state, action, reward, new_state):
        """Update Q(s,a):= Q(s,a) + lr [R(s,a) + gamma * max Q(s',a') - Q(s,a)]"""
        delta = (
            reward
            + self.gamma * np.max(self.qtable[new_state][:])
            - self.qtable[state][action]
        )
        q_update = self.qtable[state][action] + self.learning_rate * delta
        return q_update

    def reset_qtable(self):
        """Reset the Q-table."""
        self.qtable = np.zeros((self.state_size, self.action_size))


class EpsilonGreedy:
    def __init__(self, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng

    def choose_action(self, action_space, state, qtable):
        """Choose an action `a` in the current world state (s)."""
        # First we randomize a number
        explor_exploit_tradeoff = self.rng.uniform(0, 1)

        # Exploration
        if explor_exploit_tradeoff < self.epsilon:
            action = action_space.sample()

        # Exploitation (taking the biggest Q-value for this state)
        else:
            # Break ties randomly
            # If all actions have the same Q-value for this state we choose a random one
            # (otherwise `np.argmax()` would always take the first one)
            if np.all(qtable[state][:] == qtable[state][0]):
                action = action_space.sample()
            else:
                action = np.argmax(qtable[state][:])
        return action


def run_frozen_maps(maps, params, rng):
    """Run FrozenLake in maps and plot results"""
    map_sizes = maps
    res_all = pd.DataFrame()
    st_all = pd.DataFrame()

    for map_size in map_sizes:
        env = gym.make(
            "FrozenLake-v1",
            is_slippery=params.is_slippery,
            render_mode="rgb_array",
            desc=generate_random_map(
                size=map_size, p=params.proba_frozen, seed=params.seed
            ),
        )

        params = params._replace(action_size=env.action_space.n)
        params = params._replace(state_size=env.observation_space.n)
        env.action_space.seed(
            params.seed
        )  # Set the seed to get reproducible results when sampling the action space
        learner = Qlearning(
            learning_rate=params.learning_rate,
            gamma=params.gamma,
            state_size=params.state_size,
            action_size=params.action_size,
        )
        explorer = EpsilonGreedy(
            epsilon=params.epsilon,
            rng=rng
        )
        print(f"Map size: {map_size}x{map_size}")
        rewards, steps, episodes, qtables, all_states, all_actions = run_env(env, params, learner, explorer)

        # Save the results in dataframes
        res, st = postprocess(episodes, params, rewards, steps, map_size)
        res_all = pd.concat([res_all, res])
        st_all = pd.concat([st_all, st])
        qtable = qtables.mean(axis=0)  # Average the Q-table between runs

        plot_states_actions_distribution(
            states=all_states, actions=all_actions, map_size=map_size, params=params
        )  # Sanity check
        plot_q_values_map(qtable, env, map_size, params)

        env.close()
    return res_all, st_all

def run_env(env, params, learner, explorer):
    """Train the learner for `n_runs` independent runs and log rewards, steps and Q-tables."""
    rewards = np.zeros((params.total_episodes, params.n_runs))
    steps = np.zeros((params.total_episodes, params.n_runs))
    episodes = np.arange(params.total_episodes)
    qtables = np.zeros((params.n_runs, params.state_size, params.action_size))
    all_states = []
    all_actions = []

    for run in range(params.n_runs):  # Run several times to account for stochasticity
        learner.reset_qtable()  # Reset the Q-table between runs

        for episode in tqdm(
            episodes, desc=f"Run {run}/{params.n_runs} - Episodes", leave=False
        ):
            state = env.reset(seed=params.seed)[0]  # Reset the environment
            step = 0
            done = False
            total_rewards = 0

            while not done:
                action = explorer.choose_action(
                    action_space=env.action_space, state=state, qtable=learner.qtable
                )

                # Log all states and actions
                all_states.append(state)
                all_actions.append(action)

                # Take the action (a) and observe the outcome state (s') and reward (r)
                new_state, reward, terminated, truncated, info = env.step(action)

                done = terminated or truncated

                learner.qtable[state, action] = learner.update(
                    state, action, reward, new_state
                )

                total_rewards += reward
                step += 1

                # The new state becomes the current state
                state = new_state

            # Log all rewards and steps
            rewards[episode, run] = total_rewards
            steps[episode, run] = step
        qtables[run, :, :] = learner.qtable

    return rewards, steps, episodes, qtables, all_states, all_actions

def postprocess(episodes, params, rewards, steps, map_size):
    """Convert the results of the simulation into dataframes."""
    res = pd.DataFrame(
        data={
            "Episodes": np.tile(episodes, reps=params.n_runs),
            # Column-major order so the values line up with the tiled episode index
            "Rewards": rewards.flatten(order="F"),
            "Steps": steps.flatten(order="F"),
        }
    )
    res["cum_rewards"] = rewards.cumsum(axis=0).flatten(order="F")
    res["map_size"] = np.repeat(f"{map_size}x{map_size}", res.shape[0])

    st = pd.DataFrame(data={"Episodes": episodes, "Steps": steps.mean(axis=1)})
    st["map_size"] = np.repeat(f"{map_size}x{map_size}", st.shape[0])
    return res, st

def qtable_directions_map(qtable, map_size):
    """Get the best learned action & map it to arrows."""
    qtable_val_max = qtable.max(axis=1).reshape(map_size, map_size)
    qtable_best_action = np.argmax(qtable, axis=1).reshape(map_size, map_size)
    directions = {0: "←", 1: "↓", 2: "→", 3: "↑"}
    qtable_directions = np.empty(qtable_best_action.flatten().shape, dtype=str)
    eps = np.finfo(float).eps  # Machine epsilon, used as the threshold for a learned Q-value
    for idx, val in enumerate(qtable_best_action.flatten()):
        if qtable_val_max.flatten()[idx] > eps:
            # Assign an arrow only if a minimal Q-value has been learned as best action
            # otherwise since 0 is a direction, it also gets mapped on the tiles where
            # it didn't actually learn anything
            qtable_directions[idx] = directions[val]
    qtable_directions = qtable_directions.reshape(map_size, map_size)
    return qtable_val_max, qtable_directions

def plot_q_values_map(qtable, env, map_size, params):
    """Plot the last frame of the simulation and the policy learned."""
    qtable_val_max, qtable_directions = qtable_directions_map(qtable, map_size)

    # Plot the last frame
    fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
    ax[0].imshow(env.render())
    ax[0].axis("off")
    ax[0].set_title("Last frame")

    # Plot the policy
    sns.heatmap(
        qtable_val_max,
        annot=qtable_directions,
        fmt="",
        ax=ax[1],
        cmap=sns.color_palette("Blues", as_cmap=True),
        linewidths=0.7,
        linecolor="black",
        xticklabels=[],
        yticklabels=[],
        annot_kws={"fontsize": "xx-large"},
    ).set(title="Learned Q-values\nArrows represent best action")
    for _, spine in ax[1].spines.items():
        spine.set_visible(True)
        spine.set_linewidth(0.7)
        spine.set_color("black")
    img_title = f"frozenlake_q_values_{map_size}x{map_size}.png"
    fig.savefig(params.savefig_folder / img_title, bbox_inches="tight")
    plt.show()

def plot_states_actions_distribution(states, actions, map_size, params):
    """Plot the distributions of states and actions."""
    labels = {"LEFT": 0, "DOWN": 1, "RIGHT": 2, "UP": 3}

    fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
    sns.histplot(data=states, ax=ax[0], kde=True)
    ax[0].set_title("States")
    sns.histplot(data=actions, ax=ax[1])
    ax[1].set_xticks(list(labels.values()), labels=labels.keys())
    ax[1].set_title("Actions")
    fig.tight_layout()
    img_title = f"frozenlake_states_actions_distrib_{map_size}x{map_size}.png"
    fig.savefig(params.savefig_folder / img_title, bbox_inches="tight")
    plt.show()

def plot_steps_and_rewards(rewards_df, steps_df, params):
    """Plot the steps and rewards from dataframes."""
    fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
    sns.lineplot(
        data=rewards_df, x="Episodes", y="cum_rewards", hue="map_size", ax=ax[0]
    )
    ax[0].set(ylabel="Cumulated rewards")

    sns.lineplot(data=steps_df, x="Episodes", y="Steps", hue="map_size", ax=ax[1])
    ax[1].set(ylabel="Averaged steps number")

    for axi in ax:
        axi.legend(title="map size")
    fig.tight_layout()
    img_title = "frozenlake_steps_and_rewards.png"
    fig.savefig(params.savefig_folder / img_title, bbox_inches="tight")
    plt.show()
|
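The comments in `EpsilonGreedy.choose_action` describe breaking ties randomly when all Q-values of a state are equal, because `np.argmax()` always returns the first maximal index. The following minimal sketch shows a slightly more general variant that picks at random among all maximal actions, whether or not every value is equal. The helper `argmax_random_tie` is hypothetical and not part of `qlearning.py`; it assumes a NumPy `Generator` such as the `rng` already passed to `EpsilonGreedy`.

```python
import numpy as np


def argmax_random_tie(values, rng):
    """Return the index of a maximal entry, chosen uniformly among ties."""
    values = np.asarray(values)
    best = np.flatnonzero(values == values.max())  # indices of all maximal entries
    return int(rng.choice(best))


rng = np.random.default_rng(0)

# All Q-values equal: every action should be picked at some point
untrained_row = np.zeros(4)
picks = {argmax_random_tie(untrained_row, rng) for _ in range(200)}

# A unique maximum: the result matches np.argmax
trained_row = np.array([0.1, 0.7, 0.3, 0.2])
single = argmax_random_tie(trained_row, rng)

# Element-wise comparison inside np.all tests whether all entries are equal
all_equal = bool(np.all(untrained_row == untrained_row[0]))

print(sorted(picks), single, all_equal)
```

Drawing from the seeded generator rather than from `action_space.sample()` keeps the tie-breaking tied to a single source of randomness, which may make runs easier to reproduce.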
2538
nlp/0_1_NLP_Slides.ipynb
Normal file
333
nlp/0_2_NLP_Assignment.ipynb
Normal file
@@ -0,0 +1,333 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Course Notes for Learning Intelligent Systems"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Table of Contents\n",
|
||||||
|
"* [First steps](#First-steps)\n",
|
||||||
|
"* [Movie review](#Movie-review)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# First steps\n",
|
||||||
|
"Consider the following text, taken from https://www.romania-insider.com/baneasa-airport-reopening-date-jul-2022:\n",
|
||||||
|
"\n",
|
||||||
|
"The Aurel Vlaicu Băneasa Airport will reopen on August 1, with scheduled commercial flights resuming after a nine-year hiatus, George Dorobanțu, the director of the Bucharest National Airports Company (CNAB), announced in an interview with the public radio. Three companies are already ready to start scheduled and charter flights on Băneasa, namely Ryanair, Air Connect, and Fly One, the director said.\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 4,
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"text = \"The Aurel Vlaicu Băneasa Airport will reopen on August 1, with scheduled commercial flights resuming after a nine-year hiatus, George Dorobanțu, the director of the Bucharest National Airports Company (CNAB), announced in an interview with the public radio. Three companies are already ready to start scheduled and charter flights on Băneasa, namely Ryanair, Air Connect, and Fly One, the director said.\""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "subslide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 1. List the first 10 tokens of the doc"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 2. Number of tokens of the text."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 3. List the Noun chunks\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 4. Print the sentences of the text"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 5. Print the number of sentences of the text\n",
|
||||||
|
"Hint: build a list first"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 6. Print the second sentence. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 7. Visualize the dependency grammar analysis of the second sentence"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 8. Listing lemmas and deps\n",
|
||||||
|
"For every token in the second sentence, print the token text, the grammatical category, the lemma, and the dependency label in four columns.\n",
|
||||||
|
"\n",
|
||||||
|
"Example:\n",
|
||||||
|
"\n",
|
||||||
|
"you  PRON  you  nsubj\n",
|
||||||
|
"\n",
|
||||||
|
"Hint: format the columns. You can use expandtabs."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 9. List frequencies of POS in the document in a table "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 10. Preprocessing\n",
|
||||||
|
"\n",
|
||||||
|
"Remove from the doc stopwords, digits and punctuation.\n",
|
||||||
|
"\n",
|
||||||
|
"Hint: check the token api https://spacy.io/api/token\n",
|
||||||
|
"\n",
|
||||||
|
"Print the number of tokens before and after preprocessing."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 11. Entities of the document\n",
|
||||||
|
"Print the entities of the document, the type of each entity, and the explanation of that entity type in a table with three columns.\n",
|
||||||
|
"\n",
|
||||||
|
"Example:\n",
|
||||||
|
"\n",
|
||||||
|
"Ubuntu    ORG    Companies, agencies, institutions, etc."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"### 12. Visualize the entities\n",
|
||||||
|
"Show the entities in a graph."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "slide"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"# Movie review\n",
|
||||||
|
"\n",
|
||||||
|
"Classify the movie reviews from the following dataset https://data.world/rajeevsharma993/movie-reviews"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## References\n",
|
||||||
|
"\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"* [Spacy](https://spacy.io/usage/spacy-101/#annotations) \n",
|
||||||
|
"* [NLTK stemmer](https://www.nltk.org/howto/stem.html)\n",
|
||||||
|
"* [NLTK Book. Natural Language Processing with Python. Steven Bird, Ewan Klein, and Edward Loper. O'Reilly Media, 2009 ](http://www.nltk.org/book_1ed/)\n",
|
||||||
|
"* [NLTK Essentials, Nitin Hardeniya, Packt Publishing, 2015](http://proquest.safaribooksonline.com/search?q=NLTK%20Essentials)\n",
|
||||||
|
"* Natural Language Processing with Python, José Portilla, 2019."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"## Licence"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"slideshow": {
|
||||||
|
"slide_type": "skip"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
||||||
|
"\n",
|
||||||
|
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"celltoolbar": "Slideshow",
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3 (ipykernel)",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.10.10"
|
||||||
|
},
|
||||||
|
"latex_envs": {
|
||||||
|
"LaTeX_envs_menu_present": true,
|
||||||
|
"autocomplete": true,
|
||||||
|
"bibliofile": "biblio.bib",
|
||||||
|
"cite_by": "apalike",
|
||||||
|
"current_citInitial": 1,
|
||||||
|
"eqLabelWithNumbers": true,
|
||||||
|
"eqNumInitial": 1,
|
||||||
|
"hotkeys": {
|
||||||
|
"equation": "Ctrl-E",
|
||||||
|
"itemize": "Ctrl-I"
|
||||||
|
},
|
||||||
|
"labels_anchors": false,
|
||||||
|
"latex_user_defs": false,
|
||||||
|
"report_style_numbering": false,
|
||||||
|
"user_envs_cfg": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 1
|
||||||
|
}
|
@@ -105,9 +105,23 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 2,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/html": [
|
||||||
|
"<style>#sk-container-id-1 {color: black;background-color: white;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>CountVectorizer(max_features=5000)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">CountVectorizer</label><div class=\"sk-toggleable__content\"><pre>CountVectorizer(max_features=5000)</pre></div></div></div></div></div>"
|
||||||
|
],
|
||||||
|
"text/plain": [
|
||||||
|
"CountVectorizer(max_features=5000)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 2,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"from sklearn.feature_extraction.text import CountVectorizer\n",
|
"from sklearn.feature_extraction.text import CountVectorizer\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -128,9 +142,21 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 3,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"<3x10 sparse matrix of type '<class 'numpy.int64'>'\n",
|
||||||
|
"\twith 15 stored elements in Compressed Sparse Row format>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 3,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"vectors = vectorizer.fit_transform(documents)\n",
|
"vectors = vectorizer.fit_transform(documents)\n",
|
||||||
"vectors"
|
"vectors"
|
||||||
@@ -146,12 +172,24 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 4,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"[[0 1 1 2 0 0 1 2 0 0]\n",
|
||||||
|
" [1 0 0 0 2 0 0 1 2 1]\n",
|
||||||
|
" [1 0 0 0 2 1 0 0 1 1]]\n",
|
||||||
|
"['and' 'but' 'coming' 'is' 'like' 'sandwiches' 'short' 'summer' 'the'\n",
|
||||||
|
" 'winter']\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"print(vectors.toarray())\n",
|
"print(vectors.toarray())\n",
|
||||||
"print(vectorizer.get_feature_names())"
|
"print(vectorizer.get_feature_names_out())"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -164,13 +202,25 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 5,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"array(['and', 'but', 'coming', 'i', 'is', 'like', 'sandwiches', 'short',\n",
|
||||||
|
" 'summer', 'the', 'winter'], dtype=object)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 5,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"vectorizer = CountVectorizer(analyzer=\"word\", stop_words=None, token_pattern='(?u)\\\\b\\\\w+\\\\b') \n",
|
"vectorizer = CountVectorizer(analyzer=\"word\", stop_words=None, token_pattern='(?u)\\\\b\\\\w+\\\\b') \n",
|
||||||
"vectors = vectorizer.fit_transform(documents)\n",
|
"vectors = vectorizer.fit_transform(documents)\n",
|
||||||
"vectorizer.get_feature_names()"
|
"vectorizer.get_feature_names_out()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
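Note on the hunk above: the custom `token_pattern` (`(?u)\b\w+\b` once the JSON escaping is removed) is what lets the one-letter token `i` into the vocabulary, whereas scikit-learn's documented default pattern, `(?u)\b\w\w+\b`, only keeps tokens of two or more word characters. The following is a minimal pure-Python sketch of that difference, not the library's implementation; the helper `vocabulary` and the two sample sentences are hypothetical and are not the notebook's `documents`.

```python
import re

# scikit-learn's documented default: tokens of two or more word characters.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"
# Pattern passed in the notebook cell: one-character tokens are kept too.
SINGLE_CHAR_PATTERN = r"(?u)\b\w+\b"


def vocabulary(docs, token_pattern):
    """Return the sorted set of lowercased tokens matched by token_pattern.

    Hypothetical helper for illustration only.
    """
    tokens = set()
    for doc in docs:
        tokens.update(re.findall(token_pattern, doc.lower()))
    return sorted(tokens)


# Hypothetical sample sentences (assumption, not taken from the notebook).
sample_docs = ["I like the winter", "Summer is short"]

default_vocab = vocabulary(sample_docs, DEFAULT_PATTERN)
single_vocab = vocabulary(sample_docs, SINGLE_CHAR_PATTERN)

print(default_vocab)
print(single_vocab)
```

Only the second vocabulary contains `i`, which matches the extra feature visible in the cell output above.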
@@ -182,20 +232,47 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 6,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stderr",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"/home/cif/anaconda3/lib/python3.10/site-packages/sklearn/utils/deprecation.py:87: FutureWarning: Function get_feature_names is deprecated; get_feature_names is deprecated in 1.0 and will be removed in 1.2. Please use get_feature_names_out instead.\n",
|
||||||
|
" warnings.warn(msg, category=FutureWarning)\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"['coming', 'like', 'sandwiches', 'short', 'summer', 'winter']"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 6,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"vectorizer = CountVectorizer(analyzer=\"word\", stop_words='english', token_pattern='(?u)\\\\b\\\\w+\\\\b') \n",
|
"vectorizer = CountVectorizer(analyzer=\"word\", stop_words='english', token_pattern='(?u)\\\\b\\\\w+\\\\b') \n",
|
||||||
"vectors = vectorizer.fit_transform(documents)\n",
|
"vectors = vectorizer.fit_transform(documents)\n",
|
||||||
"vectorizer.get_feature_names()"
|
"vectorizer.get_feature_names_out()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 7,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"frozenset({'or', 'be', 'least', 'ours', 'very', 'noone', 'more', 'can', 'front', 'last', 'co', 'where', 'beyond', 'you', 'was', 'to', 'nine', 'here', 'describe', 'than', 'rather', 'therefore', 'except', 'at', 'again', 'ourselves', 'most', 'anyway', 'thick', 'whither', 'thereupon', 'someone', 'hereupon', 'besides', 'among', 'hasnt', 'across', 'namely', 'because', 'is', 'out', 'same', 'yourself', 'somehow', 'sincere', 'con', 'hereby', 'towards', 'interest', 'much', 'up', 'why', 'myself', 'all', 'nobody', 'though', 'every', 'show', 'not', 'there', 'whether', 'still', 'name', 'when', 'the', 'each', 'six', 'nor', 'and', 'under', 'thereby', 'less', 'either', 'thence', 'into', 'seemed', 'something', 'four', 'sometimes', 'himself', 'those', 'nowhere', 'almost', 'are', 'empty', 'must', 'while', 'afterwards', 'perhaps', 'from', 'detail', 'through', 'any', 'have', 'may', 'he', 'anywhere', 'alone', 'without', 'beforehand', 'had', 'too', 'yourselves', 'our', 'see', 'how', 'please', 'what', 'am', 'do', 'it', 'serious', 'yet', 'down', 'top', 'amount', 'then', 'both', 'fire', 'been', 'wherein', 'done', 'etc', 'whose', 'whereafter', 'who', 'ltd', 'meanwhile', 'further', 'few', 'first', 'behind', 'made', 'yours', 'until', 'toward', 'amoungst', 'anyhow', 'we', 'with', 'give', 'go', 'no', 'back', 'else', 'becomes', 'your', 'fill', 'together', 'another', 'throughout', 'onto', 'de', 'me', 'ten', 'system', 'became', 'per', 'therein', 'everyone', 'often', 'ie', 'put', 'hers', 'herself', 'nevertheless', 'itself', 'eg', 'herein', 'his', 'this', 'cry', 'due', 'bill', 'one', 'on', 'being', 'themselves', 'of', 'some', 'their', 'neither', 'elsewhere', 'since', 'whole', 'eight', 'i', 'a', 'whoever', 'own', 'call', 'them', 'mostly', 'she', 'my', 'cannot', 'us', 'never', 'as', 'thin', 'upon', 'cant', 'un', 'before', 'her', 'otherwise', 'full', 'these', 'next', 'they', 'side', 'somewhere', 'fifty', 'hence', 'so', 'along', 'already', 'three', 'latter', 'anything', 'whom', 'could', 'indeed', 
'nothing', 'whereby', 'which', 'sometime', 'become', 'ever', 'amongst', 'by', 'in', 'five', 'after', 'mine', 'fifteen', 'wherever', 'found', 'thereafter', 'third', 'keep', 'anyone', 'will', 'bottom', 'off', 'seem', 'none', 'an', 'whatever', 'over', 'during', 'also', 'latterly', 'via', 'take', 'former', 'above', 'now', 'becoming', 'hereafter', 'such', 'two', 'only', 'about', 'sixty', 're', 'everything', 'others', 'hundred', 'twelve', 'thus', 'even', 'well', 'always', 'once', 'beside', 'get', 'mill', 'seems', 'if', 'whereupon', 'find', 'forty', 'inc', 'whenever', 'around', 'other', 'should', 'many', 'enough', 'however', 'move', 'against', 'several', 'everywhere', 'has', 'whereas', 'that', 'whence', 'eleven', 'its', 'within', 'twenty', 'part', 'although', 'thru', 'couldnt', 'moreover', 'him', 'formerly', 'might', 'seeming', 'but', 'below', 'would', 'between', 'were', 'for'})\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"#stop words in scikit-learn for English\n",
|
"#stop words in scikit-learn for English\n",
|
||||||
"print(vectorizer.get_stop_words())"
|
"print(vectorizer.get_stop_words())"
|
||||||
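Note on the hunk above: with `stop_words='english'` the vectorizer drops every token that appears in the frozenset printed by `get_stop_words()`. A rough pure-Python sketch of that filtering step follows. `STOP_WORDS_SUBSET` is a hand-picked subset of the printed set, and `content_tokens` and the sample sentence are hypothetical names and data used only for illustration.

```python
import re

# Hand-picked subset of the frozenset printed above (assumption: the real
# scikit-learn English list is much longer).
STOP_WORDS_SUBSET = frozenset({"i", "is", "the", "and", "but"})


def content_tokens(text, stop_words=STOP_WORDS_SUBSET):
    """Lowercase the text, tokenize on word characters, drop stop words."""
    tokens = re.findall(r"(?u)\b\w+\b", text.lower())
    return [t for t in tokens if t not in stop_words]


# Hypothetical sentence, not one of the notebook's documents.
print(content_tokens("I like the winter but summer is short"))
```

The surviving tokens are the content words, which is the same effect seen in the reduced feature list of the cell output above.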
@@ -442,7 +519,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -456,7 +533,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.10.10"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -74,9 +74,17 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 1,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"['alt.atheism', 'comp.graphics', 'comp.os.ms-windows.misc', 'comp.sys.ibm.pc.hardware', 'comp.sys.mac.hardware', 'comp.windows.x', 'misc.forsale', 'rec.autos', 'rec.motorcycles', 'rec.sport.baseball', 'rec.sport.hockey', 'sci.crypt', 'sci.electronics', 'sci.med', 'sci.space', 'soc.religion.christian', 'talk.politics.guns', 'talk.politics.mideast', 'talk.politics.misc', 'talk.religion.misc']\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"from sklearn.datasets import fetch_20newsgroups\n",
|
"from sklearn.datasets import fetch_20newsgroups\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -90,9 +98,17 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 2,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"20\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"#Number of categories\n",
|
"#Number of categories\n",
|
||||||
"print(len(newsgroups_train.target_names))"
|
"print(len(newsgroups_train.target_names))"
|
||||||
@@ -100,9 +116,26 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 3,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"Category id 4 comp.sys.mac.hardware\n",
|
||||||
|
"Doc A fair number of brave souls who upgraded their SI clock oscillator have\n",
|
||||||
|
"shared their experiences for this poll. Please send a brief message detailing\n",
|
||||||
|
"your experiences with the procedure. Top speed attained, CPU rated speed,\n",
|
||||||
|
"add on cards and adapters, heat sinks, hour of usage per day, floppy disk\n",
|
||||||
|
"functionality with 800 and 1.4 m floppies are especially requested.\n",
|
||||||
|
"\n",
|
||||||
|
"I will be summarizing in the next two days, so please add to the network\n",
|
||||||
|
"knowledge base if you have done the clock upgrade and haven't answered this\n",
|
||||||
|
"poll. Thanks.\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"# Show a document\n",
|
"# Show a document\n",
|
||||||
"docid = 1\n",
|
"docid = 1\n",
|
||||||
@@ -115,9 +148,20 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 4,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"(11314,)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 4,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"#Number of files\n",
|
"#Number of files\n",
|
||||||
"newsgroups_train.filenames.shape"
|
"newsgroups_train.filenames.shape"
|
||||||
@@ -125,9 +169,20 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 5,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"(11314, 101322)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 5,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"# Obtain a vector\n",
|
"# Obtain a vector\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -141,9 +196,20 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 6,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"66.802987449178"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 6,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"# The tf-idf vectors are very sparse, with an average of about 66 nonzero components out of 101,322 dimensions (about 0.07%)\n",
|
"# The tf-idf vectors are very sparse, with an average of about 66 nonzero components out of 101,322 dimensions (about 0.07%)\n",
|
||||||
"vectors_train.nnz / float(vectors_train.shape[0])"
|
"vectors_train.nnz / float(vectors_train.shape[0])"
|
||||||
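Note on the hunk above: the sparsity figure in the comment can be checked from the two outputs shown in this diff, the matrix shape `(11314, 101322)` and the average of `66.802987449178` nonzero components per document. The sketch below only redoes that arithmetic; the variable names are illustrative, and the total number of stored entries is inferred from the average because `nnz` itself is not printed.

```python
# Values copied from the cell outputs shown in this diff.
n_docs, n_features = 11314, 101322
avg_nonzero_per_doc = 66.802987449178  # vectors_train.nnz / n_docs

# Fraction of nonzero entries in an average row of the tf-idf matrix.
density = avg_nonzero_per_doc / n_features

# Stored entries implied by the average (inferred, not printed in the diff).
approx_nnz = round(avg_nonzero_per_doc * n_docs)

print(f"density: {density:.4%}")
print(f"approx. stored entries: {approx_nnz}")
```

The density comes out well below a tenth of a percent, which is why a sparse matrix format is used for these vectors.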
@@ -165,9 +231,20 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 7,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"0.695453607190013"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 7,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"from sklearn.naive_bayes import MultinomialNB\n",
|
"from sklearn.naive_bayes import MultinomialNB\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -195,29 +272,44 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 9,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
"source": [
|
|
||||||
"from sklearn.utils.extmath import density\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"dimensionality: %d\" % model.coef_.shape[1])\n",
|
|
||||||
"print(\"density: %f\" % density(model.coef_))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"name": "stdout",
|
||||||
"execution_count": null,
|
"output_type": "stream",
|
||||||
"metadata": {},
|
"text": [
|
||||||
"outputs": [],
|
"alt.atheism: islam atheists say just religion atheism think don people god\n",
|
||||||
|
"comp.graphics: looking format 3d know program file files thanks image graphics\n",
|
||||||
|
"comp.os.ms-windows.misc: card problem thanks driver drivers use files dos file windows\n",
|
||||||
|
"comp.sys.ibm.pc.hardware: monitor disk thanks pc ide controller bus card scsi drive\n",
|
||||||
|
"comp.sys.mac.hardware: know monitor does quadra simms thanks problem drive apple mac\n",
|
||||||
|
"comp.windows.x: using windows x11r5 use application thanks widget server motif window\n",
|
||||||
|
"misc.forsale: asking email sell price condition new shipping offer 00 sale\n",
|
||||||
|
"rec.autos: don ford new good dealer just engine like cars car\n",
|
||||||
|
"rec.motorcycles: don just helmet riding like motorcycle ride bikes dod bike\n",
|
||||||
|
"rec.sport.baseball: braves players pitching hit runs games game baseball team year\n",
|
||||||
|
"rec.sport.hockey: league year nhl games season players play hockey team game\n",
|
||||||
|
"sci.crypt: people use escrow nsa keys government chip clipper encryption key\n",
|
||||||
|
"sci.electronics: don thanks voltage used know does like circuit power use\n",
|
||||||
|
"sci.med: skepticism cadre dsl banks chastity n3jxp pitt gordon geb msg\n",
|
||||||
|
"sci.space: just lunar earth shuttle like moon launch orbit nasa space\n",
|
||||||
|
"soc.religion.christian: believe faith christian christ bible people christians church jesus god\n",
|
||||||
|
"talk.politics.guns: just law firearms government fbi don weapons people guns gun\n",
|
||||||
|
"talk.politics.mideast: said arabs arab turkish people armenians armenian jews israeli israel\n",
|
||||||
|
"talk.politics.misc: know state clinton president just think tax don government people\n",
|
||||||
|
"talk.religion.misc: think don koresh objective christians bible people christian jesus god\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"# We can review the top features per topic in Bayes (attribute coef_)\n",
|
"# We can review the top features per topic in Bayes (attribute feature_log_prob_)\n",
|
||||||
"import numpy as np\n",
|
"import numpy as np\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def show_top10(classifier, vectorizer, categories):\n",
|
"def show_top10(classifier, vectorizer, categories):\n",
|
||||||
" feature_names = np.asarray(vectorizer.get_feature_names())\n",
|
" feature_names = np.asarray(vectorizer.get_feature_names_out())\n",
|
||||||
" for i, category in enumerate(categories):\n",
|
" for i, category in enumerate(categories):\n",
|
||||||
" top10 = np.argsort(classifier.coef_[i])[-10:]\n",
|
" top10 = np.argsort(classifier.feature_log_prob_[i, :])[-10:]\n",
|
||||||
" print(\"%s: %s\" % (category, \" \".join(feature_names[top10])))\n",
|
" print(\"%s: %s\" % (category, \" \".join(feature_names[top10])))\n",
|
||||||
"\n",
|
"\n",
|
||||||
" \n",
|
" \n",
|
||||||
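Note on the hunk above: `np.argsort(...)[-10:]` returns indices in ascending order of score, so the last word printed for each category is the most probable one. The sketch below mirrors that selection in pure Python for a tiny example. `top_k_indices`, the five-word vocabulary and the log-probabilities are all hypothetical, and tie-breaking may differ from NumPy's.

```python
def top_k_indices(scores, k):
    """Indices of the k largest scores, ordered from smallest to largest.

    Mirrors np.argsort(scores)[-k:] when there are no ties.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[-k:]


# Hypothetical vocabulary and per-class log-probabilities (assumption).
vocab = ["bike", "car", "engine", "god", "space"]
log_probs = [-2.3, -0.7, -1.1, -5.0, -4.2]

top3 = top_k_indices(log_probs, 3)
print([vocab[i] for i in top3])
```

Reversing the slice (`[::-1]`) would list the strongest feature first, if that ordering is preferred for display.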
@@ -226,9 +318,18 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 10,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"[ 2 15]\n",
|
||||||
|
"['comp.os.ms-windows.misc', 'soc.religion.christian']\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"# We try the classifier on two new docs\n",
|
"# We try the classifier on two new docs\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -275,7 +376,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -289,7 +390,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.10.10"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -126,20 +126,49 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"Although scikit-learn provides an LDA implementation, the *gensim* package is more popular; it also provides an LSI implementation, as well as other functionality. Fortunately, scikit-learn sparse matrices can be used in Gensim through the function *matutils.Sparse2Corpus()*. However, if you use LDA intensively, it can be convenient to create the corpus with Gensim's own functions.\n",
|
"Although scikit-learn provides an LDA implementation, the *gensim* package is more popular; it also provides an LSI implementation, as well as other functionality. Fortunately, scikit-learn sparse matrices can be used in Gensim through the function *matutils.Sparse2Corpus()*. However, if you use LDA intensively, it can be convenient to create the corpus with Gensim's own functions.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"You should first install *gensim*. Run 'conda install -c anaconda gensim=0.12.4' in a terminal."
|
"You should first install:\n",
|
||||||
|
"\n",
|
||||||
|
"* *gensim*. Run 'conda install gensim' in a terminal.\n",
|
||||||
|
"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 2,
|
"execution_count": 24,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"Requirement already satisfied: gensim in /home/cif/anaconda3/lib/python3.10/site-packages (4.3.1)\n",
|
||||||
|
"Requirement already satisfied: scipy>=1.7.0 in /home/cif/anaconda3/lib/python3.10/site-packages (from gensim) (1.10.1)\n",
|
||||||
|
"Requirement already satisfied: smart-open>=1.8.1 in /home/cif/anaconda3/lib/python3.10/site-packages (from gensim) (6.3.0)\n",
|
||||||
|
"Requirement already satisfied: numpy>=1.18.5 in /home/cif/anaconda3/lib/python3.10/site-packages (from gensim) (1.24.2)\n",
|
||||||
|
"Note: you may need to restart the kernel to use updated packages.\n",
|
||||||
|
"Requirement already satisfied: python-Levenshtein in /home/cif/anaconda3/lib/python3.10/site-packages (0.21.0)\n",
|
||||||
|
"Requirement already satisfied: Levenshtein==0.21.0 in /home/cif/anaconda3/lib/python3.10/site-packages (from python-Levenshtein) (0.21.0)\n",
|
||||||
|
"Requirement already satisfied: rapidfuzz<4.0.0,>=2.3.0 in /home/cif/anaconda3/lib/python3.10/site-packages (from Levenshtein==0.21.0->python-Levenshtein) (3.0.0)\n",
|
||||||
|
"Note: you may need to restart the kernel to use updated packages.\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"%pip install gensim\n",
|
||||||
|
"%pip install python-Levenshtein"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 23,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from gensim import matutils\n",
|
"from gensim import matutils\n",
|
||||||
"\n",
|
"\n",
|
||||||
"vocab = vectorizer.get_feature_names()\n",
|
"vocab = vectorizer.get_feature_names_out()\n",
|
||||||
"\n",
|
"\n",
|
||||||
"dictionary = dict([(i, s) for i, s in enumerate(vectorizer.get_feature_names())])\n",
|
"dictionary = dict([(i, s) for i, s in enumerate(vectorizer.get_feature_names_out())])\n",
|
||||||
"corpus_tfidf = matutils.Sparse2Corpus(vectors_train)"
|
"corpus_tfidf = matutils.Sparse2Corpus(vectors_train)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
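Note on the hunk above: the id-to-token mapping built with a list comprehension can be written more directly with `dict(enumerate(...))`. The sketch below compares both forms on a hypothetical three-word stand-in for `vectorizer.get_feature_names_out()`; the name `id2word` is an illustrative choice here, not something taken from the notebook.

```python
# Hypothetical stand-in for vectorizer.get_feature_names_out().
feature_names = ["car", "god", "space"]

# Form used in the notebook cell.
dictionary = dict([(i, s) for i, s in enumerate(feature_names)])

# Equivalent, shorter form.
id2word = dict(enumerate(feature_names))

print(id2word)
```

Both forms give the same column-index-to-token mapping. Separately, since scikit-learn stores one document per row, it may be worth checking the `documents_columns` argument of `matutils.Sparse2Corpus` against the gensim documentation before relying on the corpus orientation.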
@@ -180,13 +209,13 @@
|
|||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"[(0,\n",
|
"[(0,\n",
|
||||||
" '0.007*\"car\" + 0.006*\"increased\" + 0.006*\"closely\" + 0.006*\"groups\" + 0.006*\"center\" + 0.006*\"88\" + 0.006*\"offer\" + 0.005*\"archie\" + 0.005*\"beginning\" + 0.005*\"comets\"'),\n",
|
" '0.004*\"central\" + 0.004*\"assumptions\" + 0.004*\"matthew\" + 0.004*\"define\" + 0.004*\"holes\" + 0.003*\"killing\" + 0.003*\"item\" + 0.003*\"curious\" + 0.003*\"going\" + 0.003*\"presentations\"'),\n",
|
||||||
" (1,\n",
|
" (1,\n",
|
||||||
" '0.005*\"allow\" + 0.005*\"discuss\" + 0.005*\"condition\" + 0.004*\"certain\" + 0.004*\"member\" + 0.004*\"manipulation\" + 0.004*\"little\" + 0.003*\"proposal\" + 0.003*\"heavily\" + 0.003*\"obvious\"'),\n",
|
" '0.002*\"mechanism\" + 0.002*\"led\" + 0.002*\"apple\" + 0.002*\"color\" + 0.002*\"mormons\" + 0.002*\"activity\" + 0.002*\"concepts\" + 0.002*\"frank\" + 0.002*\"platform\" + 0.002*\"fault\"'),\n",
|
||||||
" (2,\n",
|
" (2,\n",
|
||||||
" '0.002*\"led\" + 0.002*\"mechanism\" + 0.002*\"frank\" + 0.002*\"platform\" + 0.002*\"mormons\" + 0.002*\"concepts\" + 0.002*\"proton\" + 0.002*\"aeronautics\" + 0.002*\"header\" + 0.002*\"foreign\"'),\n",
|
" '0.005*\"objects\" + 0.005*\"obtained\" + 0.003*\"manhattan\" + 0.003*\"capability\" + 0.003*\"education\" + 0.003*\"men\" + 0.003*\"photo\" + 0.003*\"decent\" + 0.003*\"environmental\" + 0.003*\"pain\"'),\n",
|
||||||
" (3,\n",
|
" (3,\n",
|
||||||
" '0.004*\"objects\" + 0.003*\"activity\" + 0.003*\"manhattan\" + 0.003*\"obtained\" + 0.003*\"eyes\" + 0.003*\"education\" + 0.003*\"netters\" + 0.003*\"complex\" + 0.003*\"europe\" + 0.002*\"missions\"')]"
|
" '0.004*\"car\" + 0.004*\"contain\" + 0.004*\"groups\" + 0.004*\"center\" + 0.004*\"evil\" + 0.004*\"maintain\" + 0.004*\"comets\" + 0.004*\"88\" + 0.004*\"density\" + 0.003*\"company\"')]"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 4,
|
"execution_count": 4,
|
||||||
@@ -247,13 +276,13 @@
|
|||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"Dictionary(10913 unique tokens: ['cel', 'ds', 'hi', 'nothing', 'prj']...)\n"
|
"Dictionary<10913 unique tokens: ['cel', 'ds', 'hi', 'nothing', 'prj']...>\n"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"# You can save the dictionary\n",
|
"# You can save the dictionary\n",
|
||||||
"dictionary.save('newsgroup.dict')\n",
|
"dictionary.save('newsgroup.dict.texts')\n",
|
||||||
"\n",
|
"\n",
|
||||||
"print(dictionary)"
|
"print(dictionary)"
|
||||||
]
|
]
|
||||||
@@ -283,35 +312,14 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 15,
|
"execution_count": 9,
|
||||||
"metadata": {},
|
|
||||||
"outputs": [
|
|
||||||
{
|
|
||||||
"name": "stderr",
|
|
||||||
"output_type": "stream",
|
|
||||||
"text": [
|
|
||||||
"WARNING:root:random_state not set so using default value\n",
|
|
||||||
"WARNING:root:failed to load state from newsgroups.dict.state: [Errno 2] No such file or directory: 'newsgroups.dict.state'\n"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"source": [
|
|
||||||
"# You can optionally save the dictionary \n",
|
|
||||||
"\n",
|
|
||||||
"dictionary.save('newsgroups.dict')\n",
|
|
||||||
"lda = LdaModel.load('newsgroups.dict')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": 16,
|
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"Dictionary(10913 unique tokens: ['cel', 'ds', 'hi', 'nothing', 'prj']...)\n"
|
"Dictionary<10913 unique tokens: ['cel', 'ds', 'hi', 'nothing', 'prj']...>\n"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
@@ -323,7 +331,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 17,
|
"execution_count": 10,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@@ -333,7 +341,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 18,
|
"execution_count": 11,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@@ -346,7 +354,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 19,
|
"execution_count": 12,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@@ -364,7 +372,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 20,
|
"execution_count": 13,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@@ -377,23 +385,23 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 21,
|
"execution_count": 14,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"[(0,\n",
|
"[(0,\n",
|
||||||
" '0.011*\"thanks\" + 0.010*\"targa\" + 0.008*\"mary\" + 0.008*\"western\" + 0.007*\"craig\" + 0.007*\"jeff\" + 0.006*\"yayayay\" + 0.006*\"phobos\" + 0.005*\"unfortunately\" + 0.005*\"martian\"'),\n",
|
" '0.011*\"mary\" + 0.007*\"ns\" + 0.006*\"joseph\" + 0.006*\"lucky\" + 0.006*\"ssrt\" + 0.005*\"god\" + 0.005*\"unfortunately\" + 0.004*\"rayshade\" + 0.004*\"phil\" + 0.004*\"nasa\"'),\n",
|
||||||
" (1,\n",
|
" (1,\n",
|
||||||
" '0.007*\"islam\" + 0.006*\"koresh\" + 0.006*\"moon\" + 0.006*\"bible\" + 0.006*\"plane\" + 0.006*\"ns\" + 0.005*\"zoroastrians\" + 0.005*\"joy\" + 0.005*\"lucky\" + 0.005*\"ssrt\"'),\n",
|
" '0.009*\"thanks\" + 0.009*\"targa\" + 0.008*\"whatever\" + 0.008*\"baptist\" + 0.007*\"islam\" + 0.006*\"cheers\" + 0.006*\"kent\" + 0.006*\"zoroastrians\" + 0.006*\"joy\" + 0.006*\"lot\"'),\n",
|
||||||
" (2,\n",
|
" (2,\n",
|
||||||
" '0.009*\"whatever\" + 0.009*\"baptist\" + 0.007*\"cheers\" + 0.007*\"kent\" + 0.006*\"khomeini\" + 0.006*\"davidian\" + 0.005*\"gerald\" + 0.005*\"bull\" + 0.005*\"sorry\" + 0.005*\"jesus\"'),\n",
|
" '0.008*\"moon\" + 0.008*\"really\" + 0.008*\"western\" + 0.007*\"plane\" + 0.006*\"samaritan\" + 0.006*\"crusades\" + 0.006*\"baltimore\" + 0.005*\"bob\" + 0.005*\"septuagint\" + 0.005*\"virtual\"'),\n",
|
||||||
" (3,\n",
|
" (3,\n",
|
||||||
" '0.005*\"pd\" + 0.004*\"baltimore\" + 0.004*\"also\" + 0.003*\"ipx\" + 0.003*\"dam\" + 0.003*\"feiner\" + 0.003*\"foley\" + 0.003*\"ideally\" + 0.003*\"srgp\" + 0.003*\"thank\"')]"
|
" '0.009*\"koresh\" + 0.008*\"bible\" + 0.008*\"jeff\" + 0.007*\"basically\" + 0.006*\"gerald\" + 0.006*\"bull\" + 0.005*\"pd\" + 0.004*\"also\" + 0.003*\"dam\" + 0.003*\"feiner\"')]"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 21,
|
"execution_count": 14,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@@ -405,14 +413,14 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 22,
|
"execution_count": 15,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"[(0, 0.09401487), (1, 0.08991001), (2, 0.08514047), (3, 0.7309346)]\n"
|
"[(0, 0.09161347), (1, 0.1133858), (2, 0.103424065), (3, 0.69157666)]\n"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
@@ -424,7 +432,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 24,
|
"execution_count": 16,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@@ -445,14 +453,14 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 25,
|
"execution_count": 17,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"[(0, 0.06678458), (1, 0.8006135), (2, 0.06974816), (3, 0.062853776)]\n"
|
"[(0, 0.066217005), (1, 0.8084562), (2, 0.062542014), (3, 0.0627848)]\n"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
@@ -464,14 +472,14 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 26,
|
"execution_count": 18,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"0.007*\"islam\" + 0.006*\"koresh\" + 0.006*\"moon\" + 0.006*\"bible\" + 0.006*\"plane\" + 0.006*\"ns\" + 0.005*\"zoroastrians\" + 0.005*\"joy\" + 0.005*\"lucky\" + 0.005*\"ssrt\"\n"
|
"0.009*\"thanks\" + 0.009*\"targa\" + 0.008*\"whatever\" + 0.008*\"baptist\" + 0.007*\"islam\" + 0.006*\"cheers\" + 0.006*\"kent\" + 0.006*\"zoroastrians\" + 0.006*\"joy\" + 0.006*\"lot\"\n"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
@@ -482,15 +490,15 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 27,
|
"execution_count": 19,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"name": "stdout",
|
"name": "stdout",
|
||||||
"output_type": "stream",
|
"output_type": "stream",
|
||||||
"text": [
|
"text": [
|
||||||
"[(0, 0.110989906), (1, 0.670005), (2, 0.11422917), (3, 0.10477593)]\n",
|
"[(0, 0.11006463), (1, 0.6813435), (2, 0.10399808), (3, 0.10459379)]\n",
|
||||||
"0.007*\"islam\" + 0.006*\"koresh\" + 0.006*\"moon\" + 0.006*\"bible\" + 0.006*\"plane\" + 0.006*\"ns\" + 0.005*\"zoroastrians\" + 0.005*\"joy\" + 0.005*\"lucky\" + 0.005*\"ssrt\"\n"
|
"0.009*\"thanks\" + 0.009*\"targa\" + 0.008*\"whatever\" + 0.008*\"baptist\" + 0.007*\"islam\" + 0.006*\"cheers\" + 0.006*\"kent\" + 0.006*\"zoroastrians\" + 0.006*\"joy\" + 0.006*\"lot\"\n"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
@@ -510,7 +518,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 28,
|
"execution_count": 20,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
@@ -526,23 +534,23 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 29,
|
"execution_count": 21,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
"data": {
|
"data": {
|
||||||
"text/plain": [
|
"text/plain": [
|
||||||
"[(0,\n",
|
"[(0,\n",
|
||||||
" '0.769*\"god\" + 0.345*\"jesus\" + 0.235*\"bible\" + 0.203*\"christian\" + 0.149*\"christians\" + 0.108*\"christ\" + 0.089*\"well\" + 0.085*\"koresh\" + 0.081*\"kent\" + 0.080*\"christianity\"'),\n",
|
" '-0.769*\"god\" + -0.345*\"jesus\" + -0.235*\"bible\" + -0.203*\"christian\" + -0.149*\"christians\" + -0.107*\"christ\" + -0.089*\"well\" + -0.085*\"koresh\" + -0.082*\"kent\" + -0.081*\"christianity\"'),\n",
|
||||||
" (1,\n",
|
" (1,\n",
|
||||||
" '-0.863*\"thanks\" + -0.255*\"please\" + -0.160*\"hello\" + -0.153*\"hi\" + 0.123*\"god\" + -0.112*\"sorry\" + -0.088*\"could\" + -0.075*\"windows\" + -0.068*\"jpeg\" + -0.062*\"gif\"'),\n",
|
" '-0.863*\"thanks\" + -0.255*\"please\" + -0.159*\"hello\" + -0.152*\"hi\" + 0.123*\"god\" + -0.112*\"sorry\" + -0.088*\"could\" + -0.074*\"windows\" + -0.067*\"jpeg\" + -0.063*\"gif\"'),\n",
|
||||||
" (2,\n",
|
" (2,\n",
|
||||||
" '-0.779*\"well\" + 0.229*\"god\" + -0.164*\"yes\" + 0.153*\"thanks\" + -0.135*\"ico\" + -0.135*\"tek\" + -0.132*\"beauchaine\" + -0.132*\"queens\" + -0.132*\"bronx\" + -0.131*\"manhattan\"'),\n",
|
" '0.779*\"well\" + -0.229*\"god\" + 0.165*\"yes\" + -0.154*\"thanks\" + 0.135*\"ico\" + 0.134*\"tek\" + 0.131*\"queens\" + 0.131*\"bronx\" + 0.131*\"beauchaine\" + 0.131*\"manhattan\"'),\n",
|
||||||
" (3,\n",
|
" (3,\n",
|
||||||
" '0.343*\"well\" + -0.335*\"ico\" + -0.334*\"tek\" + -0.328*\"bronx\" + -0.328*\"beauchaine\" + -0.328*\"queens\" + -0.325*\"manhattan\" + -0.305*\"com\" + -0.303*\"bob\" + -0.073*\"god\"')]"
|
" '-0.342*\"well\" + 0.335*\"ico\" + 0.333*\"tek\" + 0.327*\"bronx\" + 0.327*\"queens\" + 0.327*\"beauchaine\" + 0.325*\"manhattan\" + 0.305*\"bob\" + 0.304*\"com\" + 0.073*\"god\"')]"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"execution_count": 29,
|
"execution_count": 21,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"output_type": "execute_result"
|
"output_type": "execute_result"
|
||||||
}
|
}
|
||||||
@@ -554,7 +562,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": 30,
|
"execution_count": 22,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [
|
"outputs": [
|
||||||
{
|
{
|
||||||
@@ -603,8 +611,17 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
|
"datacleaner": {
|
||||||
|
"position": {
|
||||||
|
"top": "50px"
|
||||||
|
},
|
||||||
|
"python": {
|
||||||
|
"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"
|
||||||
|
},
|
||||||
|
"window_display": false
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3",
|
"display_name": "Python 3 (ipykernel)",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python3"
|
"name": "python3"
|
||||||
},
|
},
|
||||||
@@ -618,7 +635,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.7.1"
|
"version": "3.10.10"
|
||||||
},
|
},
|
||||||
"latex_envs": {
|
"latex_envs": {
|
||||||
"LaTeX_envs_menu_present": true,
|
"LaTeX_envs_menu_present": true,
|
||||||
|
@@ -35,7 +35,7 @@
|
|||||||
"# Table of Contents\n",
|
"# Table of Contents\n",
|
||||||
"\n",
|
"\n",
|
||||||
"* [Exercises](#Exercises)\n",
|
"* [Exercises](#Exercises)\n",
|
||||||
"\t* [Exercise 1 - Sentiment classification for Twitter](#Exercise-1---Sentiment-classification-for-Twitter)\n",
|
"\t* [Exercise 1 - Sentiment Analysis on Movie Reviews](#Exercise-1---Sentiment-Analysis-on-Movie-Reviews)\n",
|
||||||
"\t* [Exercise 2 - Spam classification](#Exercise-2---Spam-classification)\n",
|
"\t* [Exercise 2 - Spam classification](#Exercise-2---Spam-classification)\n",
|
||||||
"\t* [Exercise 3 - Automatic essay classification](#Exercise-3---Automatic-essay-classification)"
|
"\t* [Exercise 3 - Automatic essay classification](#Exercise-3---Automatic-essay-classification)"
|
||||||
]
|
]
|
||||||
@@ -58,21 +58,15 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Exercise 1 - Sentiment classification for Twitter"
|
"## Exercise 1 - Sentiment Analysis on Movie Reviews"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"The purpose of this exercise is:\n",
|
"You can try Exercise 2 (Sentiment Analysis on movie reviews) from the Scikit-Learn tutorial at https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html \n",
|
||||||
"* Collect geolocated tweets\n",
|
"Before starting, follow the installation instructions in the Tutorial Setup section."
|
||||||
"* Analyse their sentiment\n",
|
|
||||||
"* Represent the result in a map, so that one can understand the sentiment in a geographic region.\n",
|
|
||||||
"\n",
|
|
||||||
"The steps (and most of the code) can be found [here](http://pybonacci.org/2015/11/24/como-hacer-analisis-de-sentimiento-en-espanol-2/). \n",
|
|
||||||
"\n",
|
|
||||||
"You can select the tweets in any language."
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
64313
nlp/moviereviews.tsv
Normal file
@@ -74,7 +74,6 @@
|
|||||||
"* [The Python tutorial](https://docs.python.org/3/tutorial/)\n",
|
"* [The Python tutorial](https://docs.python.org/3/tutorial/)\n",
|
||||||
"* [Object-Oriented Programming in Python](http://python-textbok.readthedocs.org/en/latest/index.html)\n",
|
"* [Object-Oriented Programming in Python](http://python-textbok.readthedocs.org/en/latest/index.html)\n",
|
||||||
"* [Python3 tutorial](http://www.python-course.eu/python3_course.php)\n",
|
"* [Python3 tutorial](http://www.python-course.eu/python3_course.php)\n",
|
||||||
"* [Python for the Busy Java Developer, Deepak Sarda, 2014](http://antrix.net/static/pages/python-for-java/online/)\n",
|
|
||||||
"* [Style Guide for Python Code (PEP-0008)](https://www.python.org/dev/peps/pep-0008/)\n",
|
"* [Style Guide for Python Code (PEP-0008)](https://www.python.org/dev/peps/pep-0008/)\n",
|
||||||
"* [Python Slides](http://tdc-www.harvard.edu/Python.pdf)\n",
|
"* [Python Slides](http://tdc-www.harvard.edu/Python.pdf)\n",
|
||||||
"* [Python for Programmers - 1 day course](http://www.ucs.cam.ac.uk/docs/course-notes/unix-courses/archived/archived-python-courses/PythonProgIntro/files/notes.pdf)\n",
|
"* [Python for Programmers - 1 day course](http://www.ucs.cam.ac.uk/docs/course-notes/unix-courses/archived/archived-python-courses/PythonProgIntro/files/notes.pdf)\n",
|
||||||
|
@@ -47,7 +47,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"1. Install Anaconda \n",
|
"1. Install Anaconda \n",
|
||||||
"* Download the suitable version for your operating system of Python 3 at https://www.continuum.io/downloads\n",
|
"* Download the suitable version for your operating system of Python 3 at https://www.anaconda.com/products/individual\n",
|
||||||
"* Follow the installation instructions. In Linux/Mac, launch a terminal and execute the installer script.\n",
|
"* Follow the installation instructions. In Linux/Mac, launch a terminal and execute the installer script.\n",
|
||||||
"2. [Optional, anaconda already installs jupyter] Install Jupyter http://jupyter.readthedocs.org/en/latest/install.html\n",
|
"2. [Optional, anaconda already installs jupyter] Install Jupyter http://jupyter.readthedocs.org/en/latest/install.html\n",
|
||||||
"* Launch a terminal and execute 'conda install jupyter'"
|
"* Launch a terminal and execute 'conda install jupyter'"
|
||||||
|
@@ -85,7 +85,7 @@
|
|||||||
"In Python3, there are the following [numeric types](https://docs.python.org/3/library/stdtypes.html#typesnumeric):\n",
|
"In Python3, there are the following [numeric types](https://docs.python.org/3/library/stdtypes.html#typesnumeric):\n",
|
||||||
"* integers (int): 1, -1, ...\n",
|
"* integers (int): 1, -1, ...\n",
|
||||||
"* floating point numbers (float): 0.1, 1E2\n",
|
"* floating point numbers (float): 0.1, 1E2\n",
|
||||||
"* complex numbers (complex): 2 + 3j\n",
|
"* complex numbers (complex): 2 + 3j.\n",
|
||||||
"Let's play a bit"
|
"Let's play a bit"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
@@ -377,7 +377,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"Tuples are generally faster than lists. They are mainly used when the collection is constant or should not be modified (write-protected). \n",
|
"Tuples are generally faster than lists. They are mainly used when the collection is constant or should not be modified (write-protected). \n",
|
||||||
"\n",
|
"\n",
|
||||||
"Tuples can be converted into lists and vice versa with the built-in functions list() and tuple()."
|
"Tuples can be converted into lists and vice versa with the built-in functions *list()* and *tuple()*."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
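The conversion between tuples and lists described in this cell can be sketched as follows. This is a minimal illustration and not part of the notebook; the variable names are made up for the example.

```python
# Each conversion builds a new object; the original is left untouched.
point = (1, 2, 3)           # tuple: cannot be modified in place
editable = list(point)      # list copy that can be changed
editable.append(4)
frozen = tuple(editable)    # back to a write-protected tuple

try:
    frozen[0] = 99          # tuples reject item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(point, editable, frozen, assignment_failed)
```

Because `list(point)` copies the elements, appending to `editable` does not affect `point`.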
@@ -37,7 +37,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"A set object is an unordered collection of distinct objects. There are two built-in set types: **set** (mutable) and **frozenset** (immutable).\n",
|
"A set object is an unordered collection of distinct objects. There are two built-in set types: **set** (mutable) and **frozenset** (immutable).\n",
|
||||||
"\n",
|
"\n",
|
||||||
"A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is only one bultin mapping type: **dictionary**."
|
"A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is only one builtin mapping type: **dictionary**."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
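The difference between the two set types and the hashability requirement for dictionary keys can be sketched as below. This is only an illustrative example; the names `colors`, `locked` and `palette_sizes` are hypothetical and do not come from the notebook.

```python
colors = {"red", "green", "red"}   # duplicates collapse into one element
colors.add("blue")                 # set is mutable
locked = frozenset(colors)         # immutable snapshot of the set

try:
    locked.add("black")            # frozenset has no add() method
    frozen_rejected_add = False
except AttributeError:
    frozen_rejected_add = True

# Dictionary keys must be hashable: a frozenset qualifies, a set does not.
palette_sizes = {locked: len(locked)}
try:
    {colors: 1}
    set_usable_as_key = True
except TypeError:
    set_usable_as_key = False

print(len(colors), frozen_rejected_add, set_usable_as_key)
```

A frozenset can serve as a dictionary key or as a member of another set precisely because it cannot change after creation.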