Mirror of https://github.com/gsi-upm/sitc, synced 2024-11-22 06:22:29 +00:00
Fix SPARQL regex exercise
parent e5fa77a128
commit bb2e3c2fe4
@@ -1381,6 +1381,56 @@
 "### Regular expressions"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"The last SPARQL concept we will cover is [regular expressions](https://www.w3.org/TR/rdf-sparql-query/#funcex-regex) (`regex`).\n",
+"Regular expressions are a very powerful tool, but we will only cover the basics in this exercise.\n",
+"\n",
+"In essence, regular expressions match strings against patterns.\n",
+"In their simplest form, they can be used to find substrings within a variable.\n",
+"For instance, `regex(?label, \"substring\")` matches if and only if the value of `?label` contains `substring`.\n",
+"But regular expressions can be more complex than that.\n",
+"For instance, we can match patterns such as a 10-digit number, a 5-character string, or a value without any whitespace.\n",
+"\n",
+"The syntax of the regex function is the following:\n",
+"\n",
+"```\n",
+"regex(?variable, \"pattern\", \"flags\")\n",
+"```\n",
+"\n",
+"Flags are optional configuration options for the regular expression, such as *do not care about case* (`i` flag).\n",
+"\n",
+"As an example, let us find the localities in the Community of Madrid whose name contains \"de\"."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"%%sparql\n",
+"\n",
+"SELECT ?localidad\n",
+"WHERE {\n",
+" ?localidad <http://dbpedia.org/ontology/isPartOf> <http://dbpedia.org/resource/Community_of_Madrid> .\n",
+" ?localidad rdfs:label ?nombre .\n",
+" FILTER (lang(?nombre) = \"es\" ).\n",
+" FILTER regex(?nombre, \"de\", \"i\")\n",
+"}\n",
+"LIMIT 10"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Now, use regular expressions to find Spanish novelists whose **first name** is Juan.\n",
+"In other words, their name **starts with** \"Juan\"."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
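Note on the hunk above: the new markdown cell mentions substring matches, the `i` flag, and patterns such as a 10-digit number, a 5-character string, or a value without whitespace. The sketch below illustrates those ideas with Python's `re` module. It is only an approximation: SPARQL's `regex` follows XPath/XQuery regular expression syntax, which is similar to but not identical to Python's, and the helper `sparql_like_regex` and the sample strings are hypothetical names introduced here, not part of the commit.

```python
import re


def sparql_like_regex(text: str, pattern: str, flags: str = "") -> bool:
    """Rough stand-in for SPARQL's regex(?variable, "pattern", "flags").

    Only the "i" (case-insensitive) flag is handled in this sketch.
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    # re.search looks for the pattern anywhere in the text,
    # so an unanchored pattern behaves like a substring test.
    return re.search(pattern, text, re_flags) is not None


# Patterns for the examples named in the markdown cell (assumed spellings).
TEN_DIGITS = r"^[0-9]{10}$"   # exactly ten digits
FIVE_CHARS = r"^.{5}$"        # exactly five characters
NO_WHITESPACE = r"^\S+$"      # one or more characters, none of them whitespace

if __name__ == "__main__":
    print(sparql_like_regex("Alcalá de Henares", "de"))
    print(sparql_like_regex("ALCALÁ DE HENARES", "de", "i"))
    print(sparql_like_regex("0123456789", TEN_DIGITS))
```

The `^` and `$` anchors are what turn a substring test into a whole-string test; without them, every pattern here would also match longer strings that merely contain a matching part.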
@@ -1421,7 +1471,7 @@
 "deletable": false,
 "editable": false,
 "nbgrader": {
-"checksum": "71b5b187bb147c0e7444b29a4f413720",
+"checksum": "6632242d1d5055e12c3df37941b9e434",
 "grade": true,
 "grade_id": "cell-c149fe65008f39a9",
 "locked": true,
@@ -1434,7 +1484,8 @@
 "source": [
 "assert len(LAST_QUERY['columns']['nombre']) > 15\n",
 "for i in LAST_QUERY['columns']['nombre']:\n",
-" assert 'Juan' in i"
+" assert 'Juan' in i\n",
+"assert \"Robert Juan-Cantavella\" not in LAST_QUERY['columns']['nombre']"
 ]
 },
 {
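Note on the hunk above: the original check `'Juan' in i` is also satisfied by a query that only tests for a substring, which is presumably why the commit adds a known counterexample. The sketch below shows the difference between the two checks on a small sample. The list `names` is illustrative data made up for this note (only "Robert Juan-Cantavella" comes from the diff), and the anchored check is one possible stricter test, not what the commit implements.

```python
import re

# Illustrative sample only; these are not actual query results.
names = ["Juan Goytisolo", "Juan Benet", "Robert Juan-Cantavella"]

# Substring check, equivalent to the `'Juan' in i` assertion.
contains_juan = [name for name in names if "Juan" in name]

# Anchored check: "^" requires the match to begin at the start of the string.
starts_with_juan = [name for name in names if re.search(r"^Juan", name)]

if __name__ == "__main__":
    print(contains_juan)
    print(starts_with_juan)
```

In the SPARQL query itself, the same idea would be expressed by anchoring the pattern passed to `regex`, for example with a leading `^`.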
@@ -1507,6 +1558,10 @@
 "Querying the manually annotated dataset is slightly different from querying DBpedia.\n",
 "The main difference is that this dataset uses different graphs to separate the annotations from different students.\n",
 "\n",
+"**Each graph is a separate set of triples**.\n",
+"For this exercise, you could think of graphs as individual endpoints.\n",
+"\n",
+"\n",
 "First, let us get a list of graphs available:"
 ]
 },
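Note on the hunk above: the statement that each graph is a separate set of triples can be pictured with a plain Python data structure. The sketch below models a dataset as a mapping from graph name to a set of triples. All graph URIs, subjects, predicates, and values are hypothetical placeholders, and the helper functions are illustrative only; a real SPARQL endpoint would be queried with the `GRAPH` keyword instead.

```python
from collections import defaultdict

# Hypothetical dataset: graph name -> set of (subject, predicate, object) triples.
dataset = defaultdict(set)
dataset["http://example.org/graphs/student1"].add(
    ("ex:item1", "ex:name", "First annotation")
)
dataset["http://example.org/graphs/student2"].add(
    ("ex:item2", "ex:name", "Second annotation")
)


def list_graphs(ds):
    """Return the names of all graphs, like asking which graphs are available."""
    return sorted(ds)


def objects_in_graph(ds, graph, predicate):
    """Return the objects of matching triples, looking only inside one graph."""
    return sorted(o for (_, p, o) in ds.get(graph, set()) if p == predicate)


if __name__ == "__main__":
    print(list_graphs(dataset))
    print(objects_in_graph(dataset, "http://example.org/graphs/student1", "ex:name"))
```

Querying one graph never returns triples stored in another, which is why the notebook suggests thinking of graphs as individual endpoints.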