mirror of
https://github.com/gsi-upm/sitc
synced 2024-11-17 12:02:28 +00:00
1430 lines
38 KiB
Plaintext
1430 lines
38 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "markdown",
|
|
"checksum": "7276f055a8c504d3c80098c62ed41a4f",
|
|
"grade": false,
|
|
"grade_id": "cell-0bfe38f97f6ab2d2",
|
|
"locked": true,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"source": [
|
|
"<header style=\"width:100%;position:relative\">\n",
|
|
" <div style=\"width:80%;float:right;\">\n",
|
|
" <h1>Course Notes for Learning Intelligent Systems</h1>\n",
|
|
" <h3>Department of Telematic Engineering Systems</h3>\n",
|
|
" <h5>Universidad Politécnica de Madrid</h5>\n",
|
|
" </div>\n",
|
|
" <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
|
|
"</header>"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "markdown",
|
|
"checksum": "2387a77db61721ecc375a8275111ecaf",
|
|
"grade": false,
|
|
"grade_id": "cell-0cd673883ee592d1",
|
|
"locked": true,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"source": [
|
|
"## Introduction to Linked Open Data\n",
|
|
"\n",
|
|
"In this lecture, we will apply the same SPARQL concepts as in previous notebooks.\n",
|
|
"This time, instead of using a database specifically built for the exercise, we will be using DBpedia.\n",
|
|
"DBpedia is a semantic version of Wikipedia.\n",
|
|
"\n",
|
|
"The language we will use to query DBpedia is SPARQL, a semantic query language inspired by SQL.\n",
|
|
"For convenience, the examples in the notebook are executable, and they are accompanied by some code to test the results.\n",
|
|
"If the tests pass, you probably got the answer right.\n",
|
|
"\n",
|
|
"However, you can also use any other method to write and send your queries.\n",
|
|
"You may find online query editors particularly useful.\n",
|
|
"In addition to running queries from your browser, they provide useful features such as syntax highlighting and autocompletion.\n",
|
|
"Some examples are:\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "markdown",
|
|
"checksum": "bc0ca2e21254707344c60f895cb204b4",
|
|
"grade": false,
|
|
"grade_id": "cell-10264483046abcc4",
|
|
"locked": true,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"source": [
|
|
"## Objectives\n",
|
|
"\n",
|
|
"* Learning SPARQL and the Linked Data principles by defining queries to answer a set of problems of increasing difficulty\n",
|
|
"* Learning how to use integrated SPARQL editors and programming interfaces to SPARQL."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "markdown",
|
|
"checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
|
|
"grade": false,
|
|
"grade_id": "cell-4f8492996e74bf20",
|
|
"locked": true,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"source": [
|
|
"## Tools\n",
|
|
"\n",
|
|
"See [the SPARQL notebook](./01_SPARQL_Introduction.ipynb#Tools)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Instructions\n",
|
|
"\n",
|
|
"As in previous notebooks, the exercises can be done in the notebook, using the `%%sparql` magic, and the set of tests.\n",
|
|
"\n",
|
|
"\n",
|
|
"After every query, you will find some python code to test the results of the query.\n",
|
|
"**Make sure you've run the tests before moving to the next exercise**.\n",
|
|
"If the test gives you an error, you've probably done something wrong.\n",
|
|
"You **do not need to understand or modify the test code**.\n",
|
|
"\n",
|
|
"If you prefer to edit your queries in a different editor, here are some options:\n",
|
|
"\n",
|
|
"* DBpedia's virtuoso query editor https://dbpedia.org/sparql\n",
|
|
"* A javascript based client hosted at GSI: http://yasgui.gsi.upm.es/\n",
|
|
"\n",
|
|
"If you use an editor, make sure to copy it to the notebook and run the tests, once you are getting the expected results."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "markdown",
|
|
"checksum": "c5f8646518bd832a47d71f9d3218237a",
|
|
"grade": false,
|
|
"grade_id": "cell-eb13908482825e42",
|
|
"locked": true,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"source": [
|
|
"Run this line to enable the `%%sparql` magic command."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from helpers import sparql, solution, show_photos"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"The `%%sparql` magic command will allow us to use SPARQL inside normal jupyter cells.\n",
|
|
"\n",
|
|
"For instance, the following code:\n",
|
|
"\n",
|
|
"```python\n",
|
|
"%%sparql\n",
|
|
"\n",
|
|
"<MY QUERY>\n",
|
|
"``` \n",
|
|
"\n",
|
|
"Is the same as `run_query('<MY QUERY>', endpoint='http://dbpedia.org/sparql')` plus some additional steps, such as saving the results in a nice table format so that they can be used later and storing the results in a variable (`solution()`), which we will use in our tests.\n",
|
|
"\n",
|
|
"You do not need to worry about it, and **you can always use one of the suggested online editors if you wish**."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Exercises"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### First Select"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's start with a simple query. We will get a list of towns and other populated areas within the Community of Madrid.\n",
|
|
"If we take a look at the DBpedia ontology, or the page of any town we already know, we discover that the property that links towns to their community is [`subdivision`](http://dbpedia.org/ontology/subdivision), and [the Community of Madrid is also a resource in DBpedia](http://dbpedia.org/resource/Community_of_Madrid)\n",
|
|
"\n",
|
|
"Since there are potentially many cities to get, we will limit our results to the first 10 results:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"SELECT ?localidad\n",
|
|
"WHERE {\n",
|
|
" ?localidad <http://dbpedia.org/ontology/subdivision> <http://dbpedia.org/resource/Community_of_Madrid>\n",
|
|
"}\n",
|
|
"LIMIT 10"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"However, that query is very verbose because we are using full URIs.\n",
|
|
"To simplify it, we will make use of SPARQL prefixes:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
|
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
|
" \n",
|
|
"SELECT ?localidad\n",
|
|
"WHERE {\n",
|
|
" ?localidad dbo:subdivision dbr:Community_of_Madrid.\n",
|
|
"}\n",
|
|
"LIMIT 10"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"To make sure that the query returned something sensible, we can test it with some python code:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'localidad' in solution()['columns']\n",
|
|
"assert len(solution()['tuples']) == 10"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now that you have some experience under your belt, it is time to design your own query.\n",
|
|
"\n",
|
|
"Your first task it to get a list of writers, using the skeleton below and the previous query to guide you.\n",
|
|
"\n",
|
|
"The DBpedia vocabulary has a special class for writers: `<http://dbpedia.org/ontology/Writer>`.\n",
|
|
"\n",
|
|
"In other words, the difference from the previous query will be using `a` instead of `dbo:isPartOf`, and `dbo:Writer` instead of `dbr:Community_of_Madrid`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "2a5c55e8bca983aca6cc2293f4560f31",
|
|
"grade": false,
|
|
"grade_id": "cell-7a9509ff3c34127e",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT ?escritor\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"LIMIT 10"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "5a57d16cb2b53925f6e39fba429b7ef2",
|
|
"grade": true,
|
|
"grade_id": "cell-91240ded2cac7b6d",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert len(solution()['columns']) == 1 # We only use one variable, ?escritor\n",
|
|
"assert len(solution()['tuples']) == 10 # There should be 10 results"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Using more criteria\n",
|
|
"\n",
|
|
"We can get more than one property in the same query. Let us modify our query to get the total area of the towns we found before."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
|
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
|
"PREFIX dbp: <http://dbpedia.org/property/>\n",
|
|
" \n",
|
|
"SELECT ?localidad ?area\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
" ?localidad dbo:areaTotal ?area .\n",
|
|
" ?localidad dbo:subdivision dbr:Community_of_Madrid .\n",
|
|
"}\n",
|
|
"\n",
|
|
"LIMIT 1000"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'localidad' in solution()['columns']\n",
|
|
"assert ('http://dbpedia.org/resource/Lozoya', '5.794e+07') in solution()['tuples']"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Time to try it yourself.\n",
|
|
"\n",
|
|
"Get the list of writers AND their name (using rdfs:label)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "2ebdc8d3f3420bb961e2c8c77d027c3b",
|
|
"grade": false,
|
|
"grade_id": "cell-83dcaae0d09657b5",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"\n",
|
|
"SELECT ?escritor ?name\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "d779d690d5d1865973fdcf113b74c221",
|
|
"grade": true,
|
|
"grade_id": "cell-8afd28aada7a896c",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'escritor' in solution()['columns']\n",
|
|
"assert 'http://dbpedia.org/resource/Alison_Stine' in solution()['columns']['escritor']\n",
|
|
"assert ('http://dbpedia.org/resource/Alistair_MacLeod', 'Alistair MacLeod') in solution()['tuples']"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Filtering and ordering"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"In the previous example, we saw that we got what seemed to be duplicated answers.\n",
|
|
"\n",
|
|
"This happens because entities can have labels in different languages (e.g. English, Spanish).\n",
|
|
"We can filter results using the `FILTER` keyword.\n",
|
|
"\n",
|
|
"We can also decide the order in which our results are shown using the `ORDER BY` sentence.\n",
|
|
"We can order in ascending (`ASC`) or descending (`DESC`) order.\n",
|
|
"\n",
|
|
"For instance, this is how we could use filtering to get only large areas in our example, in descending order:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dbo: <http://dbpedia.org/ontology/>\n",
|
|
"PREFIX dbr: <http://dbpedia.org/resource/>\n",
|
|
" \n",
|
|
"SELECT ?localidad ?area\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
" ?localidad dbo:areaTotal ?area .\n",
|
|
" ?localidad dbo:type dbr:Municipalities_of_Spain .\n",
|
|
" FILTER(?area > 100000)\n",
|
|
"}\n",
|
|
"ORDER BY DESC(?area)\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Note that ordering happens before limits."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "1e09f3c1749dd3c9256a1d0bbc14ff2d",
|
|
"grade": true,
|
|
"grade_id": "cell-cb7b8283568cd349",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# We still have the biggest city\n",
|
|
"assert 'http://dbpedia.org/resource/Úbeda' in solution()['columns']['localidad']\n",
|
|
"# But the smaller ones are gone\n",
|
|
"assert 'http://dbpedia.org/resource/El_Cañaveral' not in solution()['columns']['localidad']"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now, try filtering to get a list of novelists and their name in Spanish, ordered by name `(FILTER (LANG(?nombre) = \"es\") y ORDER BY`"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "b200ff7d97fe03bab726040d16b636fe",
|
|
"grade": false,
|
|
"grade_id": "cell-ff3d611cb0304b01",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT ?escritor ?nombre\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "637f8a2e0eb286f968f22b0e0fa2215a",
|
|
"grade": true,
|
|
"grade_id": "cell-d70cc6ea394741bc",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert len(solution()['tuples']) >= 50\n",
|
|
"assert 'Abraham Abulafia' in solution()['columns']['nombre']\n",
|
|
"assert sum(1 for k in solution()['columns']['escritor'] if k == 'http://dbpedia.org/resource/Abraham_Abulafia') == 1"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Optional"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"From now on, we will focus on our Writers example.\n",
|
|
"More specifically, we will be interested in writers born in the XX century.\n",
|
|
"\n",
|
|
"To do that, we will filter our novelists to only those born (`dbo:birthDate`) in the 20th century (after 1900)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "e3ff089c983be1ae937f254b8d9d229a",
|
|
"grade": false,
|
|
"grade_id": "cell-ab7755944d46f9ca",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# YOUR ANSWER HERE"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "e896e64c21f317aeacf82ccd46811059",
|
|
"grade": true,
|
|
"grade_id": "cell-cf3821f2d33fb0f6",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'Kiku Amino' in solution()['columns']['nombre']\n",
|
|
"assert 'Albert Hackett' in solution()['columns']['nombre']\n",
|
|
"assert all(x > '1900-01-01' and x < '2001-01-01' for x in solution()['columns']['nac'])"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"In our last example, we were missing many novelists that are do not have birth information in DBpedia.\n",
|
|
"\n",
|
|
"We can specify optional values in a query using the `OPTIONAL` keyword.\n",
|
|
"When a set of clauses are inside an OPTIONAL group, the SPARQL endpoint will try to use them in the query.\n",
|
|
"If there are no results for that part of the query, the variables it specifies will not be bound (i.e. they will be empty).\n",
|
|
"\n",
|
|
"Using that, let us retrieve all the novelists, their birth and death date (if they are available)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "df4364d90fd37ec886bec8f39f6df8ee",
|
|
"grade": false,
|
|
"grade_id": "cell-254a18dd973e82ed",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT ?escritor ?nombre ?fechaNac ?fechaDef\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "26d08d050ac6963b20595f52b5d14781",
|
|
"grade": true,
|
|
"grade_id": "cell-4d6a64dde67f0e11",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'Alister McGrath' in solution()['columns']['nombre']\n",
|
|
"# assert '1879-2-11' in solution()['columns']['fechaNac']\n",
|
|
"assert '' in solution()['columns']['fechaNac'] # Not all birthdates are defined\n",
|
|
"assert '' in solution()['columns']['fechaDef'] # Some deathdates are not defined"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Bound"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We can check whether the optional value for a key was bound in a SPARQL query using `BOUND(?key)`.\n",
|
|
"\n",
|
|
"This is very useful for two purposes.\n",
|
|
"First, it allows us to look for patterns that **do not occur** in the graph, such as missing properties.\n",
|
|
"For instance, we could search for the authors with missing birth information so we can add it.\n",
|
|
"Secondly, we can use bound in filters to get conditional filters.\n",
|
|
"We will explore both uses in this exercise."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Get the list of writers that are still alive.\n",
|
|
"A person is alive if their death date is not defined and the were born less than 100 years ago"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "7527bd597f9550ec14d454732f6b2183",
|
|
"grade": false,
|
|
"grade_id": "cell-474b1a72dec6827c",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
" \n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 1000"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "8f8c783af97cd3024b90a8f5b7fd7027",
|
|
"grade": true,
|
|
"grade_id": "cell-46b62dd2856bc919",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'Fernando Arrabal' in solution()['columns']['nombre']\n",
|
|
"assert 'Javier Sierra' in solution()['columns']['nombre']\n",
|
|
"for year in solution()['columns']['nac']:\n",
|
|
" assert int(year) >= 1918"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Now, get the list of writers that died before their fifties (i.e. younger than 50 years old), or that aren't 50 years old yet.\n",
|
|
"\n",
|
|
"Hint: you can use boolean logic in your filters (e.g. `&&` and `||`).\n",
|
|
"\n",
|
|
"Hint 2: Some dates are not formatted properly, which makes some queries fail when they shouldn't. You might need to convert between different types as a workaround. For instance, you could get the year from a date like this: `year(xsd:dateTime(str(?date)))`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "2e608b808ceceb2c8515f892a6b98d06",
|
|
"grade": false,
|
|
"grade_id": "cell-ceefd3c8fbd39d79",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT ?escritor, ?nombre, YEAR(?fechaNac) as ?nac, ?fechaDef\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "ec821397f67619e5bfa02a19bdd597fc",
|
|
"grade": true,
|
|
"grade_id": "cell-461cd6ccc6c2dc79",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'Wang Ruowang' in solution()['columns']['nombre']\n",
|
|
"assert 'http://dbpedia.org/resource/Manuel_de_Pedrolo' in solution()['columns']['escritor']"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Finding unique elements"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"In our last example, our results show some authors more than once.\n",
|
|
"This is because some properties are defined more than once.\n",
|
|
"For instance, birth date is giving using different formats.\n",
|
|
"Even if we exclude that property from our results by not adding it in our `SELECT`, we will get duplicated lines.\n",
|
|
"\n",
|
|
"To solve this, we can use the `DISTINCT` keyword."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Modify your last query to remove duplicated lines.\n",
|
|
"In other words, authors should only appear once."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "3d647ccd0f3e861b843af0ec4a33098b",
|
|
"grade": false,
|
|
"grade_id": "cell-2a39adc71d26ae73",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT DISTINCT ?escritor, ?nombre, year(?fechaNac) as ?nac\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "524d152d46d3c1166052b6d5871c6aa5",
|
|
"grade": true,
|
|
"grade_id": "cell-542e0e36347fd5d1",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'Anna Langfus' in solution()['columns']['nombre']\n",
|
|
"assert 'http://dbpedia.org/resource/Paul_Celan' in solution()['columns']['escritor']\n",
|
|
"\n",
|
|
"from collections import Counter\n",
|
|
"c = Counter(solution()['columns']['nombre'])\n",
|
|
"for count in c.values():\n",
|
|
" assert count == 1\n",
|
|
" \n",
|
|
"c1 = Counter(solution()['columns']['escritor'])\n",
|
|
"assert all(count==1 for count in c1.values())\n",
|
|
"# c = Counter(solution()['columns']['nombre'])"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Using other resources"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Get the list of living novelists born in Madrid.\n",
|
|
"\n",
|
|
"Hint: use `dbr:Madrid` and `dbo:birthPlace`"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "f067a70a247b62d7eb5cc526efdc53c4",
|
|
"grade": false,
|
|
"grade_id": "cell-d175e41da57c889b",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT DISTINCT ?escritor, ?nombre, ?lugarNac, year(?fechaNac) as ?nac\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "443608e177f514f2cddafa6c1d1e3cc7",
|
|
"grade": true,
|
|
"grade_id": "cell-fadd095862db6bc8",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'José Ángel Mañas' in solution()['columns']['nombre']\n",
|
|
"assert 'http://dbpedia.org/resource/Madrid' in solution()['columns']['lugarNac']\n",
|
|
"MADRID_QUERY = solution()['columns'].copy()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Traversing the graph"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Get the list of works of the authors in the previous query (i.e. authors born in Madrid), if they have any.\n",
|
|
"\n",
|
|
"Hint: use `dbo:author`, which is a **property of a literary work** that points to the author."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "64ea2ef341901ce486bb1dcbed6c3785",
|
|
"grade": false,
|
|
"grade_id": "cell-e4b99af9ef91ff6f",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT DISTINCT ?escritor, ?nombre, ?obra\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 1000"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "fe47b48969b20b50a16a4ce4ad75e97d",
|
|
"grade": true,
|
|
"grade_id": "cell-68661b73c2140e4f",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'http://dbpedia.org/resource/Cristina_Guzmán_(novel)' in solution()['columns']['obra']\n",
|
|
"assert 'http://dbpedia.org/resource/Life_Is_a_Dream' in solution()['columns']['obra']\n",
|
|
"assert '' in solution()['columns']['obra'] # Some authors don't have works in dbpedia"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Traversing the graph II"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Get a list of writers born in Madrid, their name in Spanish, a link to their foto and a website (if they have one).\n",
|
|
"\n",
|
|
"If the query is right, you should see a list of writers after running the test code.\n",
|
|
"\n",
|
|
"Hint: `foaf:depiction` and `foaf: homepage`"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "d3636d90f8d6a3c824b17ce87ba6c423",
|
|
"grade": false,
|
|
"grade_id": "cell-b1f71c67dd71dad4",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT ?escritor ?web ?foto\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"ORDER BY ?nombre\n",
|
|
"LIMIT 5"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "2d40e7ceb7774b29a709092ee8dfa9f5",
|
|
"grade": true,
|
|
"grade_id": "cell-8b8ba7cca701c652",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"fotos = set(filter(lambda x: x != '', solution()['columns']['foto']))\n",
|
|
"assert len(fotos) > 2\n",
|
|
"show_photos(fotos) #show the pictures of the writers!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Union"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We can merge the results of several queries, just like using `JOIN` in SQL.\n",
|
|
"The keyword in SPARQL is `UNION`, because we are merging graphs.\n",
|
|
"\n",
|
|
"`UNION` is useful in many situations.\n",
|
|
"For instance, when there are equivalent properties, or when you want to use two search terms and FILTER would be too inefficient.\n",
|
|
"\n",
|
|
"The syntax is as follows:\n",
|
|
"\n",
|
|
"```sparql\n",
|
|
"SELECT ?title\n",
|
|
"WHERE {\n",
|
|
" { ?book dc10:title ?title }\n",
|
|
" UNION\n",
|
|
" { ?book dc11:title ?title }\n",
|
|
" \n",
|
|
" ... REST OF YOUR QUERY ...\n",
|
|
"\n",
|
|
"}\n",
|
|
"```\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Using UNION, get a list of distinct spanish novelists AND poets.\n",
|
|
"\n",
|
|
"In this query, instead of looking for writers, try to find the right entities by looking at the `dct:subject` property.\n",
|
|
"The entities we are looking after should be in the `Spanish_poets` and `Spanish_novelists` categories."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "2547e55ac68b37687efddd50c768eb5b",
|
|
"grade": false,
|
|
"grade_id": "cell-21eb6323b6d0011d",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"SELECT DISTINCT ?escritor, ?nombre\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"LIMIT 100"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "565dac8ae632765bc3f128f830e70993",
|
|
"grade": true,
|
|
"grade_id": "cell-004e021e877c6ace",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert 'Antonio Gala' in solution()['columns']['nombre']"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can also get the count of results either by inspecting the result (we will not cover this) or by aggregating the results using the `COUNT` operation.\n",
|
|
"\n",
|
|
"The syntax is:\n",
|
|
" \n",
|
|
"```sparql\n",
|
|
"SELECT COUNT(?variable) as ?count_name\n",
|
|
"```\n",
|
|
"\n",
|
|
"Try it yourself with our previous example:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "f8cca6da3b6830a5474eac28c3c8ebde",
|
|
"grade": false,
|
|
"grade_id": "cell-e35414e191c5bf16",
|
|
"locked": false,
|
|
"schema_version": 3,
|
|
"solution": true
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%sparql https://dbpedia.org/sparql\n",
|
|
"\n",
|
|
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
|
|
"PREFIX dct:<http://purl.org/dc/terms/>\n",
|
|
"PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
|
|
"PREFIX dbr:<http://dbpedia.org/resource/>\n",
|
|
"PREFIX dbo:<http://dbpedia.org/ontology/>\n",
|
|
"\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"\n",
|
|
"WHERE {\n",
|
|
"# YOUR ANSWER HERE\n",
|
|
"}\n",
|
|
"LIMIT 10000"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"deletable": false,
|
|
"editable": false,
|
|
"nbgrader": {
|
|
"cell_type": "code",
|
|
"checksum": "68609fa02dcf7480e16f0e5eb7849e65",
|
|
"grade": true,
|
|
"grade_id": "cell-7a7ef8255a5662e2",
|
|
"locked": true,
|
|
"points": 0,
|
|
"schema_version": 3,
|
|
"solution": false
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"assert len(solution()['columns']) == 1\n",
|
|
"column_name = list(solution()['columns'].keys())[0]\n",
|
|
"assert int(solution()['columns'][column_name][0]) > 200"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Additional exercises"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Find out if there are more dbpedia entries for writers (dbo:Writer) than for football players (dbo:SoccerPlayers)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Get a list of European countries with a population higher than 20 million, in decreasing order of population, including their URI, name in English and population."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Find the country in the world that speaks the most languages. Show its name in Spanish, if available."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## References"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"* [RDFLib documentation](https://rdflib.readthedocs.io/en/stable/).\n",
|
|
"* [Wikidata Query Service query examples](https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Licence\n",
|
|
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
|
"\n",
|
|
"© 2023 Universidad Politécnica de Madrid."
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.8.10"
|
|
},
|
|
"toc": {
|
|
"base_numbering": 1,
|
|
"nav_menu": {},
|
|
"number_sections": true,
|
|
"sideBar": true,
|
|
"skip_h1_title": false,
|
|
"title_cell": "Table of Contents",
|
|
"title_sidebar": "Contents",
|
|
"toc_cell": false,
|
|
"toc_position": {},
|
|
"toc_section_display": true,
|
|
"toc_window_display": false
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
}
|