mirror of https://github.com/gsi-upm/sitc synced 2024-12-22 11:48:12 +00:00

Minor changes LOD 01 and 03

J. Fernando Sánchez 2022-02-15 20:48:49 +01:00
parent 95cd25aef4
commit 66ed4ba258
4 changed files with 112 additions and 74 deletions

@ -790,11 +790,12 @@
"\n",
"SELECT *\n",
"WHERE { ... }\n",
"ORDER BY <variable> <variable> ... DESC(<variable>) ASC(<variable>)\n",
"ORDER BY <variable> <variable> ... \n",
"... other statements like LIMIT ...\n",
"```\n",
"\n",
"The results can be sorted in ascending or descending order, and using several variables."
"The results can be sorted in ascending or descending order, and using several variables.\n",
"By default the results are ordered in ascending order, but you can indicate the order using an optional modifier (`ASC(<variable>)`, or `DESC(<variable>)`). \n"
]
},
{
@ -880,7 +881,7 @@
" rdfs:label \"Ringo Starr\" .\n",
"```\n",
"\n",
"Using this structure, and the SPARQL statements you already know, to get the **names** of all musicians that collaborated in at least one song.\n"
"Using this structure, and the SPARQL statements you already know, get the **names** of all musicians that collaborated in at least one song.\n"
]
},
{
@ -954,13 +955,13 @@
"\n",
"Results can be aggregated using different functions.\n",
"One of the simplest functions is `COUNT`.\n",
"The syntax for COUNT is:\n",
"The syntax for `COUNT` is:\n",
" \n",
"```sparql\n",
"SELECT (COUNT(?variable) as ?count_name)\n",
"```\n",
"\n",
"Use `COUNT` to get the number of songs in which Ringo collaborated."
"Use `COUNT` to get the number of songs in which Ringo collaborated. Your query should return a column named `number`."
]
},
{
@ -1143,7 +1144,9 @@
"Now, use the same principle to get the count of **different** instruments in each song.\n",
"Some songs have several musicians playing the same instrument, but we only care about *different* instruments in each song.\n",
"\n",
"Use `?number` for the count."
"Use `?song` for the song and `?number` for the count.\n",
"\n",
"Take into consideration that instruments are entities of type `i:Instrument`."
]
},
{
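The distinct-count exercise in the hunk above combines grouping with `COUNT(DISTINCT ...)`. As a rough illustration of those semantics only (not the exercise's answer and not code from the notebook), the sketch below mimics them in plain Python; the rows and all names are made up.

```python
from collections import defaultdict

# Made-up (song, instrument, musician) rows standing in for query matches.
rows = [
    ("Song A", "guitar", "Musician 1"),
    ("Song A", "guitar", "Musician 2"),
    ("Song A", "drums", "Musician 3"),
    ("Song B", "piano", "Musician 1"),
]

# Grouping by song and counting distinct instruments:
# a set per song drops repeated instruments.
instruments_by_song = defaultdict(set)
for song, instrument, _musician in rows:
    instruments_by_song[song].add(instrument)

# Sort by the count in descending order, like ORDER BY DESC(?number).
counts = sorted(
    ((song, len(instruments)) for song, instruments in instruments_by_song.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
```

Two musicians playing guitar on the same song contribute only one distinct instrument, which is the distinction the exercise text is drawing.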
@ -1153,7 +1156,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "2d0633303eedd0655e9b64bb00317dba",
"checksum": "3139d9b7e620266946ffe1ae0cf67581",
"grade": false,
"grade_id": "cell-ee208c762d00da9c",
"locked": false,
@ -1173,6 +1176,8 @@
" [] a s:Song ;\n",
" rdfs:label ?song ;\n",
" ?instrument ?musician .\n",
" \n",
"?instrument a s:Instrument .\n",
"}\n",
"# YOUR ANSWER HERE\n",
"ORDER BY DESC(?number)"
@ -1186,7 +1191,7 @@
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "301aa479241fa02534ee047cf7577eee",
"checksum": "5abf6eb7a67ebc9f7612b876105c1960",
"grade": true,
"grade_id": "cell-ddeec32b8ac3d894",
"locked": true,
@ -1198,7 +1203,7 @@
"outputs": [],
"source": [
"s = solution()\n",
"assert s['columns']['number'][0] == '27'"
"assert s['columns']['number'][0] == '25'"
]
},
{
@ -1243,10 +1248,10 @@
"metadata": {},
"source": [
"However, there are some songs that do not have a vocalist (at least, in the dataset).\n",
"Those songs will not appear in the list above, because we they do not match part of the `WHERE` clause.\n",
"Those songs will not appear in the list above, because they do not match part of the `WHERE` clause.\n",
"\n",
"In these cases, we can specify optional values in a query using the `OPTIONAL` keyword.\n",
"When a set of clauses are inside an OPTIONAL group, the SPARQL endpoint will try to use them in the query.\n",
"When a set of clauses are inside an `OPTIONAL` group, the SPARQL endpoint will try to use them in the query.\n",
"If there are no results for that part of the query, the variables it specifies will not be bound (i.e. they will be empty).\n",
"\n",
"To exemplify this, we can use a property that **does not exist in the dataset**:"
@ -1504,7 +1509,9 @@
"source": [
"Now, count how many instruments each musician have played in a song.\n",
"\n",
"**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**."
"**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**.\n",
"\n",
"Use `?musician` for the musician and `?number` for the count."
]
},
{
@ -1770,7 +1777,9 @@
"\n",
"Using `GROUP_CONCAT`, get a list of the instruments that each musician could play.\n",
"\n",
"You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/)."
"You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/).\n",
"\n",
"Use `?musician` for the musician and `?instruments` for the list of instruments."
]
},
{
@ -1815,7 +1824,9 @@
"\n",
"You can check if a string or URI matches a regular expression with `regex(?variable, \"<regex>\", \"i\")`.\n",
"\n",
"The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/)."
"The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/).\n",
"\n",
"Use `?instrument` for the instrument and `?ins` for the url of the type."
]
},
{
@ -1873,7 +1884,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -1887,7 +1898,20 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
"version": "3.8.10"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,

@ -455,7 +455,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
"version": "3.9.1"
}
},
"nbformat": 4,

@ -189,8 +189,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start with a simple query. We will get a list of cities and towns in Madrid.\n",
"If we take a look at the DBpedia ontology or the page of any town we already know, we discover that the property that links towns to their community is [`isPartOf`](http://dbpedia.org/ontology/isPartOf), and [the Community of Madrid is also a resource in DBpedia](http://dbpedia.org/resource/Community_of_Madrid)\n",
"Let's start with a simple query. We will get a list of towns and other populated areas within the Community of Madrid.\n",
"If we take a look at the DBpedia ontology, or the page of any town we already know, we discover that the property that links towns to their community is [`subdivision`](http://dbpedia.org/ontology/subdivision), and [the Community of Madrid is also a resource in DBpedia](http://dbpedia.org/resource/Community_of_Madrid)\n",
"\n",
"Since there are potentially many cities to get, we will limit our results to the first 10 results:"
]
@ -201,11 +201,11 @@
"metadata": {},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"SELECT ?localidad\n",
"WHERE {\n",
" ?localidad <http://dbpedia.org/ontology/isPartOf> <http://dbpedia.org/resource/Community_of_Madrid>\n",
" ?localidad <http://dbpedia.org/ontology/subdivision> <http://dbpedia.org/resource/Community_of_Madrid>\n",
"}\n",
"LIMIT 10"
]
@ -224,14 +224,14 @@
"metadata": {},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
" \n",
"SELECT ?localidad\n",
"WHERE {\n",
" ?localidad dbo:isPartOf dbr:Community_of_Madrid.\n",
" ?localidad dbo:subdivision dbr:Community_of_Madrid.\n",
"}\n",
"LIMIT 10"
]
@ -272,7 +272,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "eef1c62e2797bd3ef01f2061da6f83c4",
"checksum": "4f753f7c895d2f65fa9fcda462f8adda",
"grade": false,
"grade_id": "cell-7a9509ff3c34127e",
"locked": false,
@ -282,7 +282,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
@ -324,7 +324,7 @@
"source": [
"### Using more criteria\n",
"\n",
"We can get more than one property in the same query. Let us modify our query to get the population of the cities as well."
"We can get more than one property in the same query. Let us modify our query to get the total area of the towns we found before."
]
},
{
@ -333,22 +333,21 @@
"metadata": {},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
"PREFIX dbp: <http://dbpedia.org/property/>\n",
" \n",
"SELECT ?localidad ?pop ?when\n",
"SELECT ?localidad ?area\n",
"\n",
"WHERE {\n",
" ?localidad dbo:populationTotal ?pop .\n",
" ?localidad dbo:isPartOf dbr:Community_of_Madrid.\n",
" ?localidad dbp:populationAsOf ?when .\n",
" ?localidad dbo:areaTotal ?area .\n",
" ?localidad dbo:subdivision dbr:Community_of_Madrid .\n",
"}\n",
"\n",
"LIMIT 100"
"LIMIT 1000"
]
},
{
@ -358,8 +357,7 @@
"outputs": [],
"source": [
"assert 'localidad' in solution()['columns']\n",
"assert 'http://dbpedia.org/resource/Parla' in solution()['columns']['localidad']\n",
"assert ('http://dbpedia.org/resource/San_Sebastián_de_los_Reyes', '75912', '2009') in solution()['tuples']"
"assert ('http://dbpedia.org/resource/Lozoya', '5.794e+07') in solution()['tuples']"
]
},
{
@ -378,7 +376,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "9d4193612dea95da2d91762b638ad5e6",
"checksum": "2ebdc8d3f3420bb961e2c8c77d027c3b",
"grade": false,
"grade_id": "cell-83dcaae0d09657b5",
"locked": false,
@ -388,7 +386,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -399,7 +397,7 @@
"WHERE {\n",
"# YOUR ANSWER HERE\n",
"}\n",
"LIMIT 10"
"LIMIT 100"
]
},
{
@ -410,7 +408,7 @@
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "86115c2a8982ad12b7250cf4341ae9c3",
"checksum": "b6bcf798e1d8784ecfe11ccb3d621bdd",
"grade": true,
"grade_id": "cell-8afd28aada7a896c",
"locked": true,
@ -422,8 +420,8 @@
"outputs": [],
"source": [
"assert 'escritor' in solution()['columns']\n",
"assert 'http://dbpedia.org/resource/Eduardo_Mendoza_Garriga' in solution()['columns']['escritor']\n",
"assert ('http://dbpedia.org/resource/Eduardo_Mendoza_Garriga', 'Eduardo Mendoza') in solution()['tuples']"
"assert 'http://dbpedia.org/resource/Luis_Coloma' in solution()['columns']['escritor']\n",
"assert ('http://dbpedia.org/resource/Luis_Coloma', 'Luis Coloma') in solution()['tuples']"
]
},
{
@ -444,7 +442,7 @@
"\n",
"We can also decide the order in which our results are shown.\n",
"\n",
"For instance, this is how we could use filtering to get only large cities in our example, ordered by population:"
"For instance, this is how we could use filtering to get only large areas in our example, in descending order:"
]
},
{
@ -453,19 +451,18 @@
"metadata": {},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
" \n",
"SELECT ?localidad ?pop ?when\n",
"SELECT ?localidad ?area\n",
"\n",
"WHERE {\n",
" ?localidad dbo:populationTotal ?pop .\n",
" ?localidad dbo:isPartOf dbr:Community_of_Madrid.\n",
" ?localidad dbp:populationAsOf ?when .\n",
" FILTER(?pop > 100000)\n",
" ?localidad dbo:areaTotal ?area .\n",
" ?localidad dbo:type dbr:Municipalities_of_Spain .\n",
" FILTER(?area > 100000)\n",
"}\n",
"ORDER BY ?pop\n",
"LIMIT 100"
@ -486,7 +483,7 @@
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "a38cb1aea7b1f01f6b37c088384e0a3d",
"checksum": "9485c62a83314be3c6e0c2ce0ab1ff2e",
"grade": true,
"grade_id": "cell-cb7b8283568cd349",
"locked": true,
@ -498,10 +495,9 @@
"outputs": [],
"source": [
"# We still have the biggest city\n",
"assert ('http://dbpedia.org/resource/Madrid', '3141991', '2014') in solution()['tuples']\n",
"assert ('http://dbpedia.org/resource/Orcasur', '14264', '2020') in solution()['tuples']\n",
"# But the smaller ones are gone\n",
"assert 'http://dbpedia.org/resource/Tres_Cantos' not in solution()['columns']['localidad']\n",
"assert 'http://dbpedia.org/resource/San_Sebastián_de_los_Reyes' not in solution()['columns']['localidad']"
"assert 'http://dbpedia.org/resource/El_Cañaveral' not in solution()['columns']['localidad']"
]
},
{
@ -518,7 +514,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "b6aaac8ab30d52a042c1efefbbff7550",
"checksum": "0545d89afc6fbb3001240d50c74dce77",
"grade": false,
"grade_id": "cell-ff3d611cb0304b01",
"locked": false,
@ -528,7 +524,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -647,7 +643,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f4170cbbf042644e394d1eb9acf12ce3",
"checksum": "e8dcfee69cc653b9e09c2b6ba2e2fa97",
"grade": false,
"grade_id": "cell-254a18dd973e82ed",
"locked": false,
@ -657,7 +653,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -670,7 +666,7 @@
"# YOUR ANSWER HERE\n",
"}\n",
"# YOUR ANSWER HERE\n",
"LIMIT 200"
"LIMIT 2000"
]
},
{
@ -681,7 +677,7 @@
"editable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "29c6362adbdb5606e158f696594e1052",
"checksum": "5c01a467c0e3f3fa8c13dc7648696858",
"grade": true,
"grade_id": "cell-4d6a64dde67f0e11",
"locked": true,
@ -692,8 +688,8 @@
},
"outputs": [],
"source": [
"assert 'Wenceslao Fernández Flórez' in solution()['columns']['nombre']\n",
"assert '1879-2-11' in solution()['columns']['fechaNac']\n",
"assert 'Carmen Laforet' in solution()['columns']['nombre']\n",
"# assert '1879-2-11' in solution()['columns']['fechaNac']\n",
"assert '' in solution()['columns']['fechaNac'] # Not all birthdates are defined\n",
"assert '' in solution()['columns']['fechaDef'] # Some deathdates are not defined"
]
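The two empty-string checks in the hunk above rely on unbound `OPTIONAL` variables showing up as `''` in the result columns. The helper behind `solution()` is not shown in this diff, so the following is only a sketch of how that could work, assuming the endpoint returns the standard SPARQL 1.1 JSON results format; `to_columns` and the sample data are hypothetical.

```python
# Hypothetical helper: not the code used by the notebook's `solution()`.
def to_columns(results):
    """Turn SPARQL JSON results into {variable: [values...]} columns."""
    variables = results["head"]["vars"]
    columns = {var: [] for var in variables}
    for binding in results["results"]["bindings"]:
        for var in variables:
            # An unbound variable is simply absent from the binding,
            # so we fall back to an empty string.
            columns[var].append(binding.get(var, {}).get("value", ""))
    return columns


# Made-up sample in the SPARQL 1.1 JSON results layout.
sample = {
    "head": {"vars": ["nombre", "fechaDef"]},
    "results": {
        "bindings": [
            {"nombre": {"type": "literal", "value": "Writer A"},
             "fechaDef": {"type": "literal", "value": "1950-01-01"}},
            {"nombre": {"type": "literal", "value": "Writer B"}},
        ]
    },
}

columns = to_columns(sample)
```

Under this assumption, a row whose `OPTIONAL` part did not match still appears in the output, with an empty string in the unbound column.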
@ -733,7 +729,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "f3c11121eb0d1328d2f5da3580f8d648",
"checksum": "2508e2f85ece2e717aa1348db290e449",
"grade": false,
"grade_id": "cell-474b1a72dec6827c",
"locked": false,
@ -743,7 +739,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://live.dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -804,7 +800,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "ed34857649c9a6926eb0a3a0e1d8198d",
"checksum": "2e608b808ceceb2c8515f892a6b98d06",
"grade": false,
"grade_id": "cell-ceefd3c8fbd39d79",
"locked": false,
@ -814,7 +810,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -887,7 +883,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "34163ddb0400cd8ddd2c2e2cdf29c20b",
"checksum": "3d647ccd0f3e861b843af0ec4a33098b",
"grade": false,
"grade_id": "cell-2a39adc71d26ae73",
"locked": false,
@ -897,7 +893,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -968,7 +964,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "25c8edcee216d536aac98fc9aa2b6422",
"checksum": "f067a70a247b62d7eb5cc526efdc53c4",
"grade": false,
"grade_id": "cell-d175e41da57c889b",
"locked": false,
@ -978,7 +974,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -1042,7 +1038,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "c1f22b82c4d0bd4102a6c38f7f933dc6",
"checksum": "5b9d1561a5c9786c5a803b9aaa259441",
"grade": false,
"grade_id": "cell-e4b99af9ef91ff6f",
"locked": false,
@ -1052,7 +1048,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -1289,7 +1285,7 @@
"deletable": false,
"nbgrader": {
"cell_type": "code",
"checksum": "cd7ce9212f587afe311c7631b3908de2",
"checksum": "f8cca6da3b6830a5474eac28c3c8ebde",
"grade": false,
"grade_id": "cell-e35414e191c5bf16",
"locked": false,
@ -1299,7 +1295,7 @@
},
"outputs": [],
"source": [
"%%sparql http://dbpedia.org/sparql\n",
"%%sparql https://dbpedia.org/sparql\n",
"\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX dct:<http://purl.org/dc/terms/>\n",
@ -1395,7 +1391,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -1409,7 +1405,20 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
"version": "3.8.10"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,

@ -12,6 +12,7 @@ from urllib.request import Request, urlopen
from urllib.parse import quote_plus, urlencode
from urllib.error import HTTPError
import ssl
import json
import sys
@ -32,7 +33,11 @@ def send_query(query, endpoint):
headers={'content-type': 'application/x-www-form-urlencoded',
'accept': FORMATS},
method='POST')
res = urlopen(r)
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE
res = urlopen(r, context=context)
data = res.read().decode('utf-8')
if res.getcode() == 200:
try:
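
The change above disables certificate verification for every request. As a rough sketch (not the repository's code), the same `ssl` calls can be isolated in a helper so that verification stays on by default and is only turned off explicitly; `make_context` and its `verify` parameter are hypothetical names.

```python
import ssl


def make_context(verify=True):
    """Build an SSL context; `verify=False` mirrors the diff above."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled first: setting CERT_NONE while
        # it is still enabled raises ValueError.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


strict = make_context()
relaxed = make_context(verify=False)
```

The resulting object would be passed as `urlopen(r, context=...)`, as in the hunk above. Skipping verification removes protection against man-in-the-middle attacks, so it is usually reserved for endpoints with known certificate problems.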