diff --git a/lod/02_SPARQL_Custom_Endpoint.ipynb b/lod/02_SPARQL_Custom_Endpoint.ipynb
new file mode 100644
index 0000000..a4726d0
--- /dev/null
+++ b/lod/02_SPARQL_Custom_Endpoint.ipynb
@@ -0,0 +1,459 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "deletable": false,
+ "editable": false,
+ "nbgrader": {
+ "checksum": "7276f055a8c504d3c80098c62ed41a4f",
+ "grade": false,
+ "grade_id": "cell-0bfe38f97f6ab2d2",
+ "locked": true,
+ "schema_version": 1,
+ "solution": false
+ }
+ },
+ "source": [
+ "Course Notes for Learning Intelligent Systems\n",
+ "\n",
+ "Department of Telematic Engineering Systems\n",
+ "\n",
+ "Universidad Politécnica de Madrid"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "deletable": false,
+ "editable": false,
+ "nbgrader": {
+ "checksum": "42642609861283bc33914d16750b7efa",
+ "grade": false,
+ "grade_id": "cell-0cd673883ee592d1",
+ "locked": true,
+ "schema_version": 1,
+ "solution": false
+ }
+ },
+ "source": [
+ "## Introduction\n",
+ "\n",
+ "In the previous notebook, we learnt how to use SPARQL by querying DBpedia.\n",
+ "\n",
+ "In this notebook, we will use SPARQL on manually annotated data. The data was collected as part of a [previous exercise](../lod/).\n",
+ "\n",
+ "The goal is to try SPARQL on data annotated by users with limited knowledge of vocabularies and semantics, and to compare that experience with running similar queries against a more structured dataset.\n",
+ "\n",
+ "Hence, there are two parts.\n",
+ "First, you will query a set of graphs annotated by students of this course.\n",
+ "Then, you will query a synthetic dataset that contains similar information."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "deletable": false,
+ "editable": false,
+ "nbgrader": {
+ "checksum": "a3ecb4b300a5ab82376a4a8cb01f7e6b",
+ "grade": false,
+ "grade_id": "cell-10264483046abcc4",
+ "locked": true,
+ "schema_version": 1,
+ "solution": false
+ }
+ },
+ "source": [
+ "## Objectives\n",
+ "\n",
+ "* Experience the usefulness of the Linked Open Data initiative by querying data from different RDF graphs and endpoints.\n",
+ "* Understand the challenges of querying multiple sources annotated by different people.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "deletable": false,
+ "editable": false,
+ "nbgrader": {
+ "checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
+ "grade": false,
+ "grade_id": "cell-4f8492996e74bf20",
+ "locked": true,
+ "schema_version": 1,
+ "solution": false
+ }
+ },
+ "source": [
+ "## Tools\n",
+ "\n",
+ "See the [Tools section of the SPARQL introduction notebook](./01_SPARQL_Introduction.ipynb#Tools)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "deletable": false,
+ "editable": false,
+ "nbgrader": {
+ "checksum": "c5f8646518bd832a47d71f9d3218237a",
+ "grade": false,
+ "grade_id": "cell-eb13908482825e42",
+ "locked": true,
+ "schema_version": 1,
+ "solution": false
+ }
+ },
+ "source": [
+ "Run the following cell to enable the `%%sparql` magic command."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from helpers import *"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Exercises\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Querying the manually annotated dataset will be slightly different from querying DBpedia.\n",
+ "The main difference is that this dataset uses different graphs to separate the annotations from different students.\n",
+ "\n",
+ "**Each graph is a separate set of triples**.\n",
+ "For this exercise, you can think of each graph as an individual endpoint.\n",
+ "\n",
+ "\n",
+ "First, let us list the available graphs, together with the number of triples in each:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+ " \n",
+ "SELECT ?g (COUNT(?s) as ?count) WHERE {\n",
+ " GRAPH ?g {\n",
+ " ?s ?p ?o\n",
+ " }\n",
+ "}\n",
+ "GROUP BY ?g\n",
+ "ORDER BY desc(?count)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "You should see many graphs, with different triple counts.\n",
+ "\n",
+ "The biggest one should be `http://fuseki.cluster.gsi.dit.upm.es/synthetic`."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once you have this list, you can query specific graphs like so:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+ " \n",
+ "SELECT *\n",
+ "WHERE {\n",
+ " GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic> {\n",
+ " ?s ?p ?o .\n",
+ " }\n",
+ "}\n",
+ "LIMIT 10"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "There are two exercises in this notebook.\n",
+ "\n",
+ "In each of them, you are asked to run five queries, to answer the following questions:\n",
+ "\n",
+ "* Number of hotels (or entities) with reviews\n",
+ "* Number of reviews\n",
+ "* The hotel with the lowest average score\n",
+ "* The hotel with the highest average score\n",
+ "* A list of hotels with their addresses and telephone numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Manually annotated data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Your task is to design five queries to answer the questions in the description, and run each of them in at least three graphs, other than the `synthetic` graph.\n",
+ "\n",
+ "To design the queries, you can either use what you know about the schema.org vocabularies, or explore subjects, predicates and objects in each of the graphs.\n",
+ "\n",
+ "You will get a better understanding if you follow the exploratory path.\n",
+ "\n",
+ "Here's an example to get the entities and their types in a graph:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+ "\n",
+ "PREFIX schema: <http://schema.org/>\n",
+ " \n",
+ "SELECT ?s ?o\n",
+ "WHERE {\n",
+ " # Replace <GRAPH_IRI> with the IRI of one of the graphs listed above\n",
+ " GRAPH <GRAPH_IRI> {\n",
+ " ?s a ?o .\n",
+ " }\n",
+ "\n",
+ "}\n",
+ "LIMIT 40"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Synthetic dataset\n",
+ "\n",
+ "Now, run the same queries in the synthetic dataset.\n",
+ "\n",
+ "The query below should get you started:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+ " \n",
+ "SELECT *\n",
+ "WHERE {\n",
+ " GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic> {\n",
+ " ?s ?p ?o .\n",
+ " }\n",
+ "}\n",
+ "LIMIT 10"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Optional exercise\n",
+ "\n",
+ "\n",
+ "Explore the graphs and find the most common mistakes (e.g. using `http://schema.org/Hotel/Hotel` instead of `http://schema.org/Hotel`).\n",
+ "\n",
+ "Tip: you can use regular SPARQL queries with `BOUND` and `REGEX` filters to check whether the annotations are correct.\n",
+ "\n",
+ "You can also query all the graphs at the same time, e.g. to get all the types used in the manually annotated graphs:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+ "\n",
+ "PREFIX schema: <http://schema.org/>\n",
+ " \n",
+ "SELECT DISTINCT ?o\n",
+ "WHERE {\n",
+ " GRAPH ?g {\n",
+ " ?s a ?o .\n",
+ " }\n",
+ " {\n",
+ " SELECT ?g\n",
+ " WHERE {\n",
+ " GRAPH ?g {}\n",
+ " FILTER (str(?g) != 'http://fuseki.cluster.gsi.dit.upm.es/synthetic')\n",
+ " }\n",
+ " }\n",
+ "\n",
+ "\n",
+ "}\n",
+ "LIMIT 50"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Discussion\n",
+ "\n",
+ "Compare the results for the synthetic and the manually annotated datasets, and answer these questions:"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Both datasets should use the same schema. Are there any differences when it comes to using them?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "deletable": false,
+ "nbgrader": {
+ "checksum": "860c3977cd06736f1342d535944dbb63",
+ "grade": true,
+ "grade_id": "cell-9bd08e4f5842cb89",
+ "locked": false,
+ "points": 0,
+ "schema_version": 1,
+ "solution": true
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# YOUR ANSWER HERE"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Are the annotations used correctly in every graph?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "deletable": false,
+ "nbgrader": {
+ "checksum": "1946a7ed4aba8d168bb3fad898c05651",
+ "grade": true,
+ "grade_id": "cell-9dc1c9033198bb18",
+ "locked": false,
+ "points": 0,
+ "schema_version": 1,
+ "solution": true
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# YOUR ANSWER HERE"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Was either of the datasets harder to query? If so, why?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "deletable": false,
+ "nbgrader": {
+ "checksum": "6714abc5226618b76dc4c1aaed6d1a49",
+ "grade": true,
+ "grade_id": "cell-6c18003ced54be23",
+ "locked": false,
+ "points": 0,
+ "schema_version": 1,
+ "solution": true
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# YOUR ANSWER HERE"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## References"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "* [RDFLib documentation](https://rdflib.readthedocs.io/en/stable/)\n",
+ "* [Wikidata Query Service query examples](https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Licence\n",
+ "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+ "\n",
+ "© 2018 Universidad Politécnica de Madrid."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.7.2"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}