Delete sna/t.txt

Uploaded SNA files
Add files via upload
2026-06-24 15:31:59 +00:00 · 2024-04-17 17:24:12 +02:00 · 2024-04-17 17:23:28 +02:00 · 2024-04-17 17:22:36 +02:00 · 2024-04-17 17:22:20 +02:00 · 2024-04-17 17:21:21 +02:00
143 changed files with 113180 additions and 1600 deletions
--- a/README.md
+++ b/README.md
@@ -1,19 +1,21 @@
 # sitc
 Exercises for Intelligent Systems Course at Universidad Politécnica de Madrid, Telecommunication Engineering School. This material is used in the subjects
- SITC (Sistemas Inteligentes y Tecnologías del Conocimiento) - Master Universitario de Ingeniería de Telecomunicación (MUIT)
- TIAD (Tecnologías Inteligentes de Análisis de Datos) - Master Universitario en Ingeniera de Redes y Servicios Telemáticos)
+- CDAW (Ciencia de datos y aprendizaje automático en la web de datos) - Master Universitario de Ingeniería de Telecomunicación (MUIT)
+- ABID (Analítica de Big Data) - Master Universitario en Ingeniería de Redes y Servicios Telemáticos

 For following this course:
 - Follow the instructions to install the environment: https://github.com/gsi-upm/sitc/blob/master/python/1_1_Notebooks.ipynb (Just install 'conda')
- Download the course: use 'https://github.com/gsi-upm/sitc'
- Run in a terminal in the foloder sitc: jupyter notebook (and enjoy)
+- Download the course: use 'https://github.com/gsi-upm/sitc' (or clone the repository to receive updates).
+- Run in a terminal in the folder sitc: jupyter notebook (and enjoy)

 Topics
-* Python: quick introduction to Python
+* Python: a quick introduction to Python
 * ML-1: introduction to machine learning with scikit-learn
 * ML-2: introduction to machine learning with pandas and scikit-learn
+* ML-21: preprocessing and visualization
 * ML-3: introduction to machine learning. Neural Computing
 * ML-4: introduction to Evolutionary Computing
 * ML-5: introduction to Reinforcement Learning
 * NLP: introduction to NLP
 * LOD: Linked Open Data, exercises and example code
+* SNA: Social Network Analysis
--- a/lod/00_SPARQL_Tutorial.ipynb
+++ b/lod/00_SPARQL_Tutorial.ipynb
@@ -0,0 +1,484 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](files/images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<header style=\"width:100%;position:relative\">\n",
+    "  <div style=\"width:80%;float:right;\">\n",
+    "    <h1>Course Notes for Learning Intelligent Systems</h1>\n",
+    "    <h3>Department of Telematic Engineering Systems</h3>\n",
+    "    <h5>Universidad Politécnica de Madrid. © Carlos A. Iglesias </h5>\n",
+    "  </div>\n",
+    "        <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
+    "</header>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Introduction\n",
+    "\n",
+    "This lecture provides an introduction to RDF and the SPARQL query language.\n",
+    "\n",
+    "This is the first in a series of notebooks about SPARQL, which consists of:\n",
+    "\n",
+    "* This notebook, which explains basic concepts of RDF and SPARQL\n",
+    "* [A notebook](01_SPARQL_Introduction.ipynb) that provides an introduction to SPARQL through a collection of  exercises of increasing difficulty.\n",
+    "* [An optional notebook](02_SPARQL_Custom_Endpoint.ipynb) with queries to a custom dataset.\n",
+    "This optional notebook is meant to be done after the [RDF exercises](../rdf/RDF.ipynb), and it is out of the scope of this course.\n",
+    "You can consult it if you are interested."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# RDF basics\n",
+    "This section is taken from [[1](#1), [2](#2)].\n",
+    "\n",
+    "RDF allows us to make statements about resources. The format of these statements is simple. A statement always has the following structure:\n",
+    "\n",
+    "      <subject> <predicate> <object>\n",
+    "    \n",
+    "An RDF statement expresses a relationship between two resources. The **subject** and the **object** represent the two resources being related; the **predicate** represents the nature of their relationship.\n",
+    "The relationship is phrased in a directional way (from subject to object).\n",
+    "In RDF this relationship is known as a **property**.\n",
+    "Because RDF statements consist of three elements they are called **triples**.\n",
+    "\n",
+    "Here are some examples of RDF triples (informally expressed in pseudocode):\n",
+    "\n",
+    "      <Bob> <is a> <person>.\n",
+    "      <Bob> <is a friend of> <Alice>.\n",
+    "      \n",
+    "Resources are identified by [IRIs](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier), which can appear in all three positions of a triple. For example, the IRI for Leonardo da Vinci in DBpedia is:\n",
+    "\n",
+    "      <http://dbpedia.org/resource/Leonardo_da_Vinci>\n",
+    "\n",
+    "IRIs can be abbreviated as *prefixed names*. For example, after declaring\n",
+    "`PREFIX dbr: <http://dbpedia.org/resource/>`,\n",
+    "the IRI above can be written as `dbr:Leonardo_da_Vinci` (without angle brackets).\n",
+    "     \n",
+    "Objects can be literals: \n",
+    " * strings (e.g., \"plain string\" or \"string with language\"@en)\n",
+    " * numbers (e.g., \"13.4\"^^xsd:float)\n",
+    " * dates (e.g., \"2024-01-31\"^^xsd:date)\n",
+    " * booleans\n",
+    " * etc.\n",
+    " \n",
+    "RDF data is stored in RDF repositories that expose SPARQL endpoints.\n",
+    "Let's query one of the most famous RDF repositories: [DBpedia](https://wiki.dbpedia.org/).\n",
+    "First, we should learn how to execute SPARQL in a notebook."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Executing SPARQL in a notebook\n",
+    "There are several ways to execute SPARQL in a notebook.\n",
+    "Some of the most popular are:\n",
+    "\n",
+    "* using libraries such as [sparql-client](https://pypi.org/project/sparql-client/) or [SPARQLWrapper](https://rdflib.dev/sparqlwrapper/) that enable executing SPARQL within a Python3 kernel\n",
+    "* using other libraries. In our case, a lightweight library (the file `helpers.py`) has been developed for accessing SPARQL endpoints over an HTTP connection.\n",
+    "* using the [graph notebook package](https://pypi.org/project/graph-notebook/)\n",
+    "* using a [SPARQL kernel](https://github.com/paulovn/sparql-kernel) instead of the Python3 kernel\n",
+    "\n",
+    "\n",
+    "We are going to use the second option to avoid installing new packages.\n",
+    "\n",
+    "To use the library, you need to:\n",
+    "\n",
+    "1. Import `sparql` from helpers (i.e., `helpers.py`, a file that is available in the GitHub repository)\n",
+    "2. Use the `%%sparql` magic command to indicate the SPARQL endpoint and then the SPARQL code.\n",
+    "\n",
+    "Let's try it!\n",
+    "\n",
+    "# Queries against DBpedia\n",
+    "\n",
+    "We are going to execute a SPARQL query against DBpedia. This section is based on [[8](#8)].\n",
+    "\n",
+    "First, we just create a query to retrieve arbitrary triples (subject, predicate, object) without any restriction (besides limiting the result to 10 triples)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from helpers import sparql"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "SELECT ?s ?p ?o\n",
+    "WHERE {\n",
+    "    ?s ?p ?o\n",
+    "}\n",
+    "LIMIT 10"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Well, it worked, but the results are not particularly interesting. \n",
+    "Let's search for a famous football player, Fernando Torres.\n",
+    "\n",
+    "To do so, we will search for entities whose English \"human-readable representation\" (i.e., label) matches \"Fernando Torres\":"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?athlete rdfs:label \"Fernando Torres\"@en \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Great, we found the IRI of the node: `http://dbpedia.org/resource/Fernando_Torres`\n",
+    "\n",
+    "Now we can start asking for more properties.\n",
+    "\n",
+    "To do so, go to http://dbpedia.org/resource/Fernando_Torres and you will see all the information available about Fernando Torres. Pay attention to the names of predicates to be able to create new queries. For example, we are interested in knowing where Fernando Torres was born (`dbo:birthPlace`).\n",
+    "\n",
+    "Let's go!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?athlete rdfs:label \"Fernando Torres\"@en ;\n",
+    "                 dbo:birthPlace ?birthPlace .       \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If we examine the SPARQL query, we find three blocks:\n",
+    "\n",
+    "* **PREFIX** section: IRIs of vocabularies and the prefixes used below, to avoid long IRIs. For example, by defining the `dbo` prefix in our example, the `dbo:birthPlace` below expands to `http://dbpedia.org/ontology/birthPlace`.\n",
+    "* **SELECT** section: variables we want to return (`*` is an abbreviation that selects all of the variables in a query)\n",
+    "* **WHERE** clause: triple patterns in which some elements are variables. These variables are bound during query processing, and the bound variables are returned.\n",
+    "\n",
+    "Now take a closer look at the **WHERE** section.\n",
+    "We said earlier that triples are made of three elements, and each triple pattern should finish with a period (`.`) (although the last pattern can omit this).\n",
+    "However, when two or more triple patterns share the same subject, we can omit it in all but the first one and use `;` as a separator.\n",
+    "If both the subject and the predicate are the same, we can use a comma (`,`) instead.\n",
+    "This allows us to avoid repetition and make queries more readable.\n",
+    "But don't forget the space before your separators (`;` and `.`).\n",
+    "\n",
+    "The result is interesting: we know he was born in Fuenlabrada, but we also see an additional (wrong) value, the Spanish national football team. The conversion process from Wikipedia to DBpedia should still be tuned :).\n",
+    "\n",
+    "We can *fix* it by adding some more constraints.\n",
+    "In our case, we only want a birth place that is also a municipality (i.e., its type is `http://dbpedia.org/resource/Municipalities_of_Spain`).\n",
+    "Let's see!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?athlete rdfs:label \"Fernando Torres\"@en ;\n",
+    "                 dbo:birthPlace ?birthPlace .\n",
+    "        ?birthPlace dbo:type dbr:Municipalities_of_Spain \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Great. Now it looks better.\n",
+    "Notice that we added a new prefix.\n",
+    "\n",
+    "Now, is Fuenlabrada a big city?\n",
+    "Let's find out.\n",
+    "\n",
+    "**Hint**: you can find more subject / object / predicate nodes related to [Fuenlabrada](http://dbpedia.org/resource/Fuenlabrada) in the RDF graph just as we did before.\n",
+    "That is how we found the `dbo:areaTotal` property."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        dbr:Fuenlabrada dbo:areaTotal ?area \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Well, it shows 39.1 km$^2$.\n",
+    "\n",
+    "Let's go back to our Fernando Torres.\n",
+    "What we are really interested in is the name of the city he was born in, not its IRI.\n",
+    "As we saw before, the human-readable name is provided by the `rdfs:label` property:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?player rdfs:label \"Fernando Torres\"@en ;\n",
+    "                 dbo:birthPlace ?birthPlace .\n",
+    "        ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
+    "                    rdfs:label ?placeName        \n",
+    "                 \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Well, we are almost there. We see that we receive the city name in many languages. We want just the English name. Let's filter!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?player rdfs:label \"Fernando Torres\"@en ;\n",
+    "                 dbo:birthPlace ?birthPlace .\n",
+    "        ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
+    "                    rdfs:label ?placeName .\n",
+    "         FILTER ( LANG ( ?placeName ) = 'en' )\n",
+    "                 \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Awesome!\n",
+    "\n",
+    "But we said we don't care about the IRI of the place. We only want two pieces of data: Fernando's birth date and the name of his birthplace.\n",
+    "\n",
+    "Let's tune our query a bit more."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "\n",
+    "SELECT ?birthDate ?placeName\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?player rdfs:label \"Fernando Torres\"@en ;\n",
+    "                 dbo:birthDate ?birthDate ;\n",
+    "                 dbo:birthPlace ?birthPlace .\n",
+    "        ?birthPlace dbo:type dbr:Municipalities_of_Spain ;\n",
+    "                    rdfs:label ?placeName .\n",
+    "         FILTER ( LANG ( ?placeName ) = 'en' )\n",
+    "                 \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Great 😃\n",
+    "\n",
+    "Are there many football players born in Fuenlabrada? Let's find out!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "\n",
+    "SELECT *\n",
+    "WHERE\n",
+    "     {\n",
+    "        ?player a dbo:SoccerPlayer ;  \n",
+    "                  dbo:birthPlace dbr:Fuenlabrada .         \n",
+    "     }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Well, not that many. Notice that we have used `a`.\n",
+    "It is just an abbreviation for `rdf:type`; both can be used interchangeably.\n",
+    "\n",
+    "If you want additional examples, you can follow the notebook by [Shawn Graham](https://github.com/o-date/sparql-and-lod/blob/master/sparql-intro.ipynb), which is based on the SPARQL tutorial by Matthew Lincoln, available [here in English](https://programminghistorian.org/en/lessons/retired/graph-databases-and-SPARQL) and [here in Spanish](https://programminghistorian.org/es/lecciones/retirada/sparql-datos-abiertos-enlazados). You also have a local copy of these tutorials together with this notebook [here in English](https://htmlpreview.github.io/?https://github.com/gsi-upm/sitc/blob/master/lod/tutorial/graph-databases-and-SPARQL.html) and [here in Spanish](https://htmlpreview.github.io/?https://github.com/gsi-upm/sitc/blob/master/lod/tutorial/sparql-datos-abiertos-enlazados.html). \n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "* <a id=\"1\">[1]</a> [SPARQL by Example. A Tutorial. Lee Feigenbaum. W3C, 2009](https://www.w3.org/2009/Talks/0615-qbe/#q1)\n",
+    "* <a id=\"2\">[2]</a> [RDF Primer W3C](https://www.w3.org/TR/rdf11-primer/)\n",
+    "* <a id=\"3\">[3]</a> [SPARQL queries of Beatles recording sessions](http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html)\n",
+    "* <a id=\"4\">[4]</a> [RDFLib documentation](https://rdflib.readthedocs.io/en/stable/).\n",
+    "* <a id=\"5\">[5]</a> [Wikidata Query Service query examples](https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples)\n",
+    "* <a id=\"6\">[6]</a> [RDF Graph Data Model. Learn about the RDF graph model used by Stardog.](https://www.stardog.com/tutorials/data-model)\n",
+    "* <a id=\"7\">[7]</a> [Learn SPARQL Write Knowledge Graph queries using SPARQL with step-by-step examples.](https://www.stardog.com/tutorials/sparql/)\n",
+    "* <a id=\"8\">[8]</a> [Running Basic SPARQL Queries Against DBpedia.](https://medium.com/virtuoso-blog/dbpedia-basic-queries-bc1ac172cc09)\n",
+    "* <a id=\"9\">[9]</a> [Intro SPARQL based on painters.](https://github.com/o-date/sparql-and-lod/blob/master/sparql-intro.ipynb)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "©  Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.10"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
--- a/lod/01_SPARQL_Introduction.ipynb
+++ b/lod/01_SPARQL_Introduction.ipynb
@@ -6,11 +6,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "7276f055a8c504d3c80098c62ed41a4f",
     "grade": false,
     "grade_id": "cell-0bfe38f97f6ab2d2",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -56,11 +57,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "99aecbad8f94966d92d72dc911d3ff99",
+     "cell_type": "markdown",
+     "checksum": "40ccd05ad0704781327031a84dfb9939",
     "grade": false,
     "grade_id": "cell-4f8492996e74bf20",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -69,7 +71,7 @@
    "\n",
    "* This notebook\n",
    "* External SPARQL editors (optional)\n",
-    "    * YASGUI-GSI http://yasgui.cluster.gsi.dit.upm.es\n",
+    "    * YASGUI-GSI http://yasgui.gsi.upm.es\n",
    "    * DBpedia virtuoso http://dbpedia.org/sparql\n",
    "\n",
    "Using the YASGUI-GSI editor has several advantages over other options.\n",
@@ -93,18 +95,19 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "99e3107f9987cdddae7866dded27f165",
+     "cell_type": "markdown",
+     "checksum": "81894e9d65e5dd9f3b6e1c5f66804bf6",
     "grade": false,
     "grade_id": "cell-70ac24910356c3cf",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
   "source": [
    "## Instructions\n",
    "\n",
-    "We will be using a semantic server, available at: http://fuseki.cluster.gsi.dit.upm.es/sitc.\n",
+    "We will be using a semantic server, available at: http://fuseki.gsi.upm.es/sitc.\n",
    "\n",
    "This server contains a dataset about [Beatles songs](http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html), which we will query with SPARQL.\n",
    "\n",
@@ -122,11 +125,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "1d332d3d11fd6b57f0ec0ac3c358c6cb",
     "grade": false,
     "grade_id": "cell-eb13908482825e42",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -144,11 +148,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "code",
     "checksum": "aca7c5538b8fc53e99c92e94e6818c83",
     "grade": false,
     "grade_id": "cell-b3f3d92fa2100c3d",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -163,11 +168,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "e896b6560e45d5c385a43aa85e3523c7",
     "grade": false,
     "grade_id": "cell-04410e75828c388d",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -193,11 +199,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "96ca90572d6b275fa515c6b976115257",
+     "cell_type": "markdown",
+     "checksum": "34710d3bb8e2cf826833a43adb7fb448",
     "grade": false,
     "grade_id": "cell-2a44c0da2c206d01",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -210,7 +217,7 @@
    "Some examples are:\n",
    "\n",
    "* DBpedia's virtuoso query editor https://dbpedia.org/sparql\n",
-    "* A javascript based client hosted at GSI: http://yasgui.cluster.gsi.dit.upm.es/\n",
+    "* A javascript based client hosted at GSI: http://yasgui.gsi.upm.es/\n",
    "\n",
    "[^1]: http://www.snee.com/bobdc.blog/2017/11/sparql-queries-of-beatles-reco.html"
   ]
@@ -221,11 +228,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "79c60bd3d4c13f380aae5778c5ce7245",
     "grade": false,
     "grade_id": "cell-d645128d3af18117",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -241,11 +249,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "f7428fe79cd33383dfd3b09a0d951b6e",
     "grade": false,
     "grade_id": "cell-8391a5322a9ad4a7",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -260,11 +269,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "f6b5da583694dd5cc9326c670830875d",
     "grade": false,
     "grade_id": "cell-4f56a152e4d70c02",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -313,17 +323,18 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "7a9dc62ab639143c9fc13593e50500d4",
+     "cell_type": "code",
+     "checksum": "3bc71f851a33fa401d18ea3ab02cf61f",
     "grade": false,
     "grade_id": "cell-8ce8c954513f17e7",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "SELECT ?entity ?type\n",
    "WHERE {\n",
@@ -338,11 +349,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "d6a79c2f5fd005a9e15a8f67dcfd4784",
     "grade": false,
     "grade_id": "cell-3d6d622c717c3950",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -375,17 +387,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "69e016b0224f410f03f6217ac30c03a8",
+     "cell_type": "code",
+     "checksum": "65be7168bedb4f6dc2f19e2138bab232",
     "grade": false,
     "grade_id": "cell-6e904d692b5facad",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "SELECT ?entity ?prop\n",
    "WHERE {\n",
@@ -401,12 +414,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "97bd5d5383bd94a72c7452bc33e4b0f9",
+     "cell_type": "code",
+     "checksum": "e78b57fa9baab578f5a4bd22dc499fca",
     "grade": true,
     "grade_id": "cell-3fc0d3c43dfd04a3",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -440,7 +454,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "SELECT ?type\n",
    "WHERE {\n",
@@ -465,7 +479,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "SELECT DISTINCT ?type\n",
    "WHERE {\n",
@@ -507,17 +521,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "47c4f68e342ffe59a3804de7b6a3909b",
+     "cell_type": "code",
+     "checksum": "35563ff455c7e8b1c91f61db97b2011b",
     "grade": false,
     "grade_id": "cell-e615f9a77c4bc9a5",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "SELECT DISTINCT ?property\n",
    "WHERE {\n",
@@ -532,12 +547,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "c9ffeba2d4ffc3e0b95f15a0ec6012c5",
+     "cell_type": "code",
+     "checksum": "7603c90d8c177e2e6678baa2f1b6af36",
     "grade": true,
     "grade_id": "cell-9168718938ab7347",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -569,7 +585,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -638,17 +654,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "8b0faf938efc1a64a70515da3c132605",
+     "cell_type": "code",
+     "checksum": "069811507dbac4b86dc5d3adc82ba4ec",
     "grade": false,
     "grade_id": "cell-0223a51f609edcf9",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -667,12 +684,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "e93d7336fd125d95996e60fd312a4e4d",
+     "cell_type": "code",
+     "checksum": "9833a3efa75c7e2784ef5d60aae2a13e",
     "grade": true,
     "grade_id": "cell-3c7943c6382c62f5",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -706,17 +724,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "271f2b194c2db4c558a46e8312b593e6",
+     "cell_type": "code",
+     "checksum": "b68a279085a1ed087f5e474a6602299e",
     "grade": false,
     "grade_id": "cell-8f43547dd788bb33",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -735,12 +754,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "9f1f7cec8ce4674971543728ada86674",
+     "cell_type": "code",
+     "checksum": "b4461d243cc058b1828769cc906d4947",
     "grade": true,
     "grade_id": "cell-e13a1c921af2f6eb",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -770,11 +790,12 @@
    "\n",
    "SELECT *\n",
    "WHERE { ... }\n",
-    "ORDER BY <variable> <variable> ... DESC(<variable>) ASC(<variable>)\n",
+    "ORDER BY <variable> <variable> ... \n",
    "... other statements like LIMIT ...\n",
    "```\n",
    "\n",
-    "The results can be sorted in ascending or descending order, and using several variables."
+    "The results can be sorted in ascending or descending order, and using several variables.\n",
+    "By default the results are ordered in ascending order, but you can indicate the order using an optional modifier (`ASC(<variable>)`, or `DESC(<variable>)`). \n"
   ]
  },
  {
@@ -790,17 +811,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "9dcd9c6d51a61ac129cffa06e1463c66",
+     "cell_type": "code",
+     "checksum": "335403f01e484ce5563ff059e9764ff4",
     "grade": false,
     "grade_id": "cell-a0f0b9d9b05c9631",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -820,12 +842,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "a044b3fd6b8bd4e098bbe4d818cb4e9f",
+     "cell_type": "code",
+     "checksum": "45530eb91cbc5b3fddcc93d96f07e579",
     "grade": true,
     "grade_id": "cell-bc012ca9d7ad2867",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -858,7 +881,7 @@
    "                    rdfs:label \"Ringo Starr\" .\n",
    "```\n",
    "\n",
-    "Using this structure, and the SPARQL statements you already know, to get the **names** of all musicians that collaborated in at least one song.\n"
+    "Using this structure, and the SPARQL statements you already know, get the **names** of all musicians that collaborated in at least one song.\n"
   ]
  },
  {
@@ -867,17 +890,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "9da7a62b6237078f5eab7e593a8eb590",
+     "cell_type": "code",
+     "checksum": "8fb253675d2e8510e2c6780b960721e5",
     "grade": false,
     "grade_id": "cell-523b963fa4e288d0",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -898,12 +922,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "c8e3a929faf2afa72207c6921382654c",
+     "cell_type": "code",
+     "checksum": "f4474b302bc2f634b3b2ee6e1c7e7257",
     "grade": true,
     "grade_id": "cell-aa9a4e18d6fda225",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -930,13 +955,13 @@
    "\n",
    "Results can be aggregated using different functions.\n",
    "One of the simplest functions is `COUNT`.\n",
-    "The syntax for COUNT is:\n",
+    "The syntax for `COUNT` is:\n",
    "    \n",
    "```sparql\n",
    "SELECT (COUNT(?variable) as ?count_name)\n",
    "```\n",
    "\n",
-    "Use `COUNT` to get the number of songs in which Ringo collaborated."
+    "Use `COUNT` to get the number of songs in which Ringo collaborated. Your query should return a column named `number`."
   ]
  },
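+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "A minimal sketch of `COUNT` inside a complete query, assuming the same `s:` prefix used elsewhere in this notebook. It counts every song in the dataset (not only Ringo's), and the column name `?total` is an arbitrary choice for this example.\n",
+    "\n",
+    "```sparql\n",
+    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
+    "\n",
+    "# One row with a single column, ?total\n",
+    "SELECT (COUNT(?song) AS ?total)\n",
+    "WHERE {\n",
+    "    ?song a s:Song .\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "In standard SPARQL 1.1, the parentheses around `COUNT(...) AS ?total` are required when an expression is projected in `SELECT`."
+   ]
+  },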
  {
@@ -945,17 +970,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "d8419711d2db43ad657e2658a1ea86c4",
+     "cell_type": "code",
+     "checksum": "c7b6620f5ba28b482197ab693cb7142a",
     "grade": false,
     "grade_id": "cell-e89d08031e30b299",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX m: <http://learningsparql.com/ns/musician/>\n",
@@ -975,12 +1001,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "29404e07edf639cdc0ce0d82e654ec31",
+     "cell_type": "code",
+     "checksum": "c90e1427d7e48d9ae8abab40ff92e3b0",
     "grade": true,
     "grade_id": "cell-903d2be00885e1d2",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -1012,7 +1039,7 @@
    "\n",
    "Once results are grouped, they can be aggregated using any aggregation function, such as `COUNT`.\n",
    "\n",
-    "Using `GROUP BY` and `COUNT`, get the count of songs that use each instrument:"
+    "Using `GROUP BY` and `COUNT`, get the count of songs in which Ringo Starr has played each of the instruments:"
   ]
  },
  {
@@ -1021,17 +1048,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "7a0a7206384e7e1d9eb4450dd9e9871f",
+     "cell_type": "code",
+     "checksum": "7556bacb20c1fbd059dec165c982908d",
     "grade": false,
     "grade_id": "cell-1429e4eb5400dbc7",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX m: <http://learningsparql.com/ns/musician/>\n",
@@ -1053,12 +1081,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "bd4dc379fea969d513be0ea97ee75922",
+     "cell_type": "code",
+     "checksum": "34a8432e8d4cea70994c8214ed0e5eb6",
     "grade": true,
     "grade_id": "cell-907aaf6001e27e50",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -1094,7 +1123,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -1115,7 +1144,9 @@
    "Now, use the same principle to get the count of **different** instruments in each song.\n",
    "Some songs have several musicians playing the same instrument, but we only care about *different* instruments in each song.\n",
    "\n",
-    "Use `?number` for the count."
+    "Use `?song` for the song and `?number` for the count.\n",
+    "\n",
+    "Take into consideration that instruments are entities of type `s:Instrument`."
   ]
  },
  {
@@ -1124,17 +1155,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "4a231b4d6874dad435512b988c17c39e",
+     "cell_type": "code",
+     "checksum": "3139d9b7e620266946ffe1ae0cf67581",
     "grade": false,
     "grade_id": "cell-ee208c762d00da9c",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
@@ -1144,6 +1176,8 @@
    "    [] a           s:Song ;\n",
    "       rdfs:label  ?song ;\n",
    "       ?instrument ?musician .\n",
+    "    \n",
+    "?instrument a s:Instrument .\n",
    "}\n",
    "# YOUR ANSWER HERE\n",
    "ORDER BY DESC(?number)"
@@ -1156,19 +1190,20 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "8118099bf14d9f0eb241c4d93ea6f0b9",
+     "cell_type": "code",
+     "checksum": "5abf6eb7a67ebc9f7612b876105c1960",
     "grade": true,
     "grade_id": "cell-ddeec32b8ac3d894",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
   "outputs": [],
   "source": [
    "s = solution()\n",
-    "assert s['columns']['number'][0] == '27'"
+    "assert s['columns']['number'][0] == '25'"
   ]
  },
  {
@@ -1193,7 +1228,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1213,10 +1248,10 @@
   "metadata": {},
   "source": [
    "However, there are some songs that do not have a vocalist (at least, in the dataset).\n",
-    "Those songs will not appear in the list above, because we they do not match part of the `WHERE` clause.\n",
+    "Those songs will not appear in the list above, because they do not match part of the `WHERE` clause.\n",
    "\n",
    "In these cases, we can specify optional values in a query using the `OPTIONAL` keyword.\n",
-    "When a set of clauses are inside an OPTIONAL group, the SPARQL endpoint will try to use them in the query.\n",
+    "When a set of clauses is inside an `OPTIONAL` group, the SPARQL endpoint will try to match them in the query.\n",
    "If there are no results for that part of the query, the variables it specifies will not be bound (i.e. they will be empty).\n",
    "\n",
    "To exemplify this, we can use a property that **does not exist in the dataset**:"
@@ -1228,7 +1263,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1261,17 +1296,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "4b0a0854457c37640aad67f375ed3a17",
+     "cell_type": "code",
+     "checksum": "3bc508872193750d57d07efbf334c212",
     "grade": false,
     "grade_id": "cell-dcd68c45c1608a28",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1294,12 +1330,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "f7122b2284b5d59d59ce4a2925f0bb21",
+     "cell_type": "code",
+     "checksum": "69edef3121b8dfab385a00cd181c956f",
     "grade": true,
     "grade_id": "cell-1e706b9c1c1331bc",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -1343,17 +1380,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "09621e7af911faf39a834e8281bc6d1f",
+     "cell_type": "code",
+     "checksum": "300df0a3cf9729dd4814b3153b2fedb4",
     "grade": false,
     "grade_id": "cell-0c7cc924a13d792a",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1379,12 +1417,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "cebff8ce42f3f36923e81e083a23d24c",
+     "cell_type": "code",
+     "checksum": "22d6fcdb72a8b2c5ab496cdbb5e2740a",
     "grade": true,
     "grade_id": "cell-2541abc93ab4d506",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -1416,17 +1455,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "ea9797f3b2d001ea41d7fa7a5170d5fb",
+     "cell_type": "code",
+     "checksum": "e4e898c8a16b8aa5865dfde2f6e68ec6",
     "grade": false,
     "grade_id": "cell-d750b6d64c6aa0a7",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#> \n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
@@ -1469,7 +1509,9 @@
   "source": [
    "Now, count how many instruments each musician has played in a song.\n",
    "\n",
-    "**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**."
+    "**Do not count lead (`i:vocals`) or backing vocals (`i:backingvocals`) as instruments**.\n",
+    "\n",
+    "Use `?musician` for the musician and `?number` for the count."
   ]
  },
  {
@@ -1478,17 +1520,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "2d82df272d43f678d3b19bf0b41530c1",
+     "cell_type": "code",
+     "checksum": "fade6ab714376e0eabfa595dd6bd6a8b",
     "grade": false,
     "grade_id": "cell-2f5aa516f8191787",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#> \n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
@@ -1513,12 +1556,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "bc83dd9577c9111b1f0ef5bd40c4ec08",
+     "cell_type": "code",
+     "checksum": "33e93ec2a3d1f9eb4b0310d4651b74c2",
     "grade": true,
     "grade_id": "cell-bcd0f7e26b6c11c2",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -1533,7 +1577,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Which songs had Ringo in dums OR Lennon in lead vocals? (UNION)"
+    "### Which songs had Ringo in drums OR Lennon in lead vocals? (UNION)"
   ]
  },
  {
@@ -1567,17 +1611,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "a1e20e2be817a592683dea89eed0120e",
+     "cell_type": "code",
+     "checksum": "09262d81449c498c37e4b9d9b1dcdfed",
     "grade": false,
     "grade_id": "cell-d3a742bd87d9c793",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#> \n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
@@ -1597,18 +1642,19 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "087630476d73bb415b065fafbd6024f0",
+     "cell_type": "code",
+     "checksum": "11061e79ec06ccb3a9c496319a528366",
     "grade": true,
     "grade_id": "cell-409402df0e801d09",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
   "outputs": [],
   "source": [
-    "assert len(solution()['tuples']) == 246"
+    "assert len(solution()['tuples']) == 209"
   ]
  },
  {
@@ -1648,17 +1694,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "1d2cb88412c89c35861a4f9fccea3bf2",
+     "cell_type": "code",
+     "checksum": "9ddd2d1f50f841b889bfd29b175d06da",
     "grade": false,
     "grade_id": "cell-9d1ec854eb530235",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "\n",
    "PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#> \n",
    "\n",
@@ -1680,12 +1727,13 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
-     "checksum": "aa20aa4d11632ea5bd6004df3187d979",
+     "cell_type": "code",
+     "checksum": "0ea5496acd1c3edd9e188b351690a533",
     "grade": true,
     "grade_id": "cell-a79c688b4566dbe8",
     "locked": true,
-     "points": 0,
-     "schema_version": 1,
+     "points": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -1729,7 +1777,9 @@
    "\n",
    "Using `GROUP_CONCAT`, get a list of the instruments that each musician could play.\n",
    "\n",
-    "You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/)."
+    "You can consult how to use GROUP_CONCAT [here](https://www.w3.org/TR/sparql11-query/).\n",
+    "\n",
+    "Use `?musician` for the musician and `?instruments` for the list of instruments."
   ]
  },
  {
@@ -1738,17 +1788,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "508b7f8656e849838aa93cd38f1c6635",
+     "cell_type": "code",
+     "checksum": "d18e8b6e1d32aed395a533febb29fcb5",
     "grade": false,
     "grade_id": "cell-7ea1f5154cdd8324",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#> \n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1773,7 +1824,9 @@
    "\n",
    "You can check if a string or URI matches a regular expression with `regex(?variable, \"<regex>\", \"i\")`.\n",
    "\n",
-    "The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/)."
+    "The documentation for regular expressions in SPARQL is [here](https://www.w3.org/TR/rdf-sparql-query/).\n",
+    "\n",
+    "Use `?instrument` for the instrument and `?ins` for the URL of the type."
   ]
  },
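+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a small sketch of `regex` in a filter, assuming the same `s:` and `rdfs:` prefixes used in this notebook, the query below keeps only the songs whose title contains the word *love*, ignoring case. The search term is an arbitrary choice for this example.\n",
+    "\n",
+    "```sparql\n",
+    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "\n",
+    "SELECT ?song\n",
+    "WHERE {\n",
+    "    [] a s:Song ;\n",
+    "       rdfs:label ?song .\n",
+    "    # The third argument, \"i\", makes the match case-insensitive\n",
+    "    FILTER regex(?song, \"love\", \"i\")\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "Note that `regex` operates on string literals. To match against a URI, convert it first, e.g. `regex(str(?uri), \"<regex>\")`."
+   ]
+  },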
  {
@@ -1782,17 +1835,18 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
-     "checksum": "cff1f9c034393f8af055e1f930d5fe32",
+     "cell_type": "code",
+     "checksum": "f926fa3a3568d122454a12312859cda1",
     "grade": false,
     "grade_id": "cell-b6bee887a1b1fc60",
     "locked": false,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/sitc/\n",
+    "%%sparql http://fuseki.gsi.upm.es/sitc/\n",
    "PREFIX  rdfs:  <http://www.w3.org/2000/01/rdf-schema#> \n",
    "PREFIX s: <http://learningsparql.com/ns/schema/>\n",
    "PREFIX i: <http://learningsparql.com/ns/instrument/>\n",
@@ -1830,7 +1884,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -1844,9 +1898,22 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.2"
+   "version": "3.8.10"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": false
  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
--- a/lod/02_SPARQL_Custom_Endpoint.ipynb
+++ b/lod/02_SPARQL_Custom_Endpoint.ipynb
@@ -6,11 +6,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "7276f055a8c504d3c80098c62ed41a4f",
     "grade": false,
     "grade_id": "cell-0bfe38f97f6ab2d2",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -31,11 +32,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "42642609861283bc33914d16750b7efa",
     "grade": false,
     "grade_id": "cell-0cd673883ee592d1",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -59,11 +61,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "a3ecb4b300a5ab82376a4a8cb01f7e6b",
     "grade": false,
     "grade_id": "cell-10264483046abcc4",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -80,11 +83,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
     "grade": false,
     "grade_id": "cell-4f8492996e74bf20",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -100,11 +104,12 @@
    "deletable": false,
    "editable": false,
    "nbgrader": {
+     "cell_type": "markdown",
     "checksum": "c5f8646518bd832a47d71f9d3218237a",
     "grade": false,
     "grade_id": "cell-eb13908482825e42",
     "locked": true,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": false
    }
   },
@@ -148,7 +153,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+    "%%sparql http://fuseki.gsi.upm.es/hotels\n",
    "    \n",
    "SELECT ?g (COUNT(?s) as ?count) WHERE {\n",
    "    GRAPH ?g {\n",
@@ -160,14 +165,12 @@
   ]
  },
  {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
   "metadata": {},
-   "outputs": [],
   "source": [
    "You should see many graphs, with different triple counts.\n",
    "\n",
-    "The biggest one should be http://fuseki.cluster.gsi.dit.upm.es/synthetic"
+    "The biggest one should be http://fuseki.gsi.upm.es/synthetic"
   ]
  },
  {
@@ -183,11 +186,11 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+    "%%sparql http://fuseki.gsi.upm.es/hotels\n",
    "    \n",
    "SELECT *\n",
    "WHERE {\n",
-    "    GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic>{\n",
+    "    GRAPH <http://fuseki.gsi.upm.es/synthetic>{\n",
    "    ?s ?p ?o .\n",
    "    }\n",
    "}\n",
@@ -233,13 +236,13 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+    "%%sparql http://fuseki.gsi.upm.es/hotels\n",
    "\n",
    "PREFIX schema: <http://schema.org/>\n",
    "    \n",
    "SELECT ?s ?o\n",
    "WHERE {\n",
-    "    GRAPH <http://fuseki.cluster.gsi.dit.upm.es/35c20a49f8c6581be1cf7bd56d12d131>{\n",
+    "    GRAPH <http://fuseki.gsi.upm.es/35c20a49f8c6581be1cf7bd56d12d131>{\n",
    "        ?s a ?o .\n",
    "    }\n",
    "\n",
@@ -264,11 +267,11 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+    "%%sparql http://fuseki.gsi.upm.es/hotels\n",
    "    \n",
    "SELECT *\n",
    "WHERE {\n",
-    "    GRAPH <http://fuseki.cluster.gsi.dit.upm.es/synthetic>{\n",
+    "    GRAPH <http://fuseki.gsi.upm.es/synthetic>{\n",
    "    ?s ?p ?o .\n",
    "    }\n",
    "}\n",
@@ -295,7 +298,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%%sparql http://fuseki.cluster.gsi.dit.upm.es/hotels\n",
+    "%%sparql http://fuseki.gsi.upm.es/hotels\n",
    "\n",
    "PREFIX schema: <http://schema.org/>\n",
    "    \n",
@@ -308,7 +311,7 @@
    "        SELECT ?g\n",
    "        WHERE {\n",
    "           GRAPH ?g {}\n",
-    "           FILTER (str(?g) != 'http://fuseki.cluster.gsi.dit.upm.es/synthetic')\n",
+    "           FILTER (str(?g) != 'http://fuseki.gsi.upm.es/synthetic')\n",
    "        }\n",
    "    }\n",
    "\n",
@@ -339,12 +342,13 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
+     "cell_type": "code",
     "checksum": "860c3977cd06736f1342d535944dbb63",
     "grade": true,
     "grade_id": "cell-9bd08e4f5842cb89",
     "locked": false,
     "points": 0,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
@@ -366,12 +370,13 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
+     "cell_type": "code",
     "checksum": "1946a7ed4aba8d168bb3fad898c05651",
     "grade": true,
     "grade_id": "cell-9dc1c9033198bb18",
     "locked": false,
     "points": 0,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
@@ -393,12 +398,13 @@
   "metadata": {
    "deletable": false,
    "nbgrader": {
+     "cell_type": "code",
     "checksum": "6714abc5226618b76dc4c1aaed6d1a49",
     "grade": true,
     "grade_id": "cell-6c18003ced54be23",
     "locked": false,
     "points": 0,
-     "schema_version": 1,
+     "schema_version": 3,
     "solution": true
    }
   },
@@ -435,7 +441,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -449,7 +455,20 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.2"
+   "version": "3.8.10"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": false,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": false,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": false
  }
 },
 "nbformat": 4,
--- a/lod/03_SPARQL_Writers.ipynb
+++ b/lod/03_SPARQL_Writers.ipynb
--- a/lod/04_SPARQL_Advanced.ipynb
+++ b/lod/04_SPARQL_Advanced.ipynb
@@ -0,0 +1,652 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "7276f055a8c504d3c80098c62ed41a4f",
+     "grade": false,
+     "grade_id": "cell-0bfe38f97f6ab2d2",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "<header style=\"width:100%;position:relative\">\n",
+    "  <div style=\"width:80%;float:right;\">\n",
+    "    <h1>Course Notes for Learning Intelligent Systems</h1>\n",
+    "    <h3>Department of Telematic Engineering Systems</h3>\n",
+    "    <h5>Universidad Politécnica de Madrid</h5>\n",
+    "  </div>\n",
+    "        <img style=\"width:15%;\" src=\"../logo.jpg\" alt=\"UPM\" />\n",
+    "</header>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "bd478e6253226d24ba7f33cb9f6ba706",
+     "grade": false,
+     "grade_id": "cell-0cd673883ee592d1",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "## Advanced SPARQL\n",
+    "\n",
+    "This notebook complements [the SPARQL notebook](./01_SPARQL_Introduction.ipynb) with some advanced commands.\n",
+    "\n",
+    "If you have not completed the exercises in the previous notebook, please do so before continuing.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "9ea4fd529653214745b937d5fc4559e5",
+     "grade": false,
+     "grade_id": "cell-10264483046abcc4",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "## Objectives\n",
+    "\n",
+    "* To cover some SPARQL concepts that are less frequently used "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "2fedf0d73fc90104d1ab72c3413dfc83",
+     "grade": false,
+     "grade_id": "cell-4f8492996e74bf20",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "## Tools\n",
+    "\n",
+    "See [the SPARQL notebook](./01_SPARQL_Introduction.ipynb#Tools)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "c5f8646518bd832a47d71f9d3218237a",
+     "grade": false,
+     "grade_id": "cell-eb13908482825e42",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "Run this line to enable the `%%sparql` magic command."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from helpers import *"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Exercises"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Working with dates"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To explore dates, we will focus on our Writers example.\n",
+    "\n",
+    "First, search for writers born in the 20th century.\n",
+    "You can use a special filter, knowing that `\"2000\"^^xsd:date` is the first day of the year 2000."
+   ]
+  },
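+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The general shape of a date filter can be sketched as follows. This example assumes a hypothetical `ex:` vocabulary (`ex:published`) whose values are typed as `xsd:date`; it is not taken from DBpedia and only illustrates the comparison syntax.\n",
+    "\n",
+    "```sparql\n",
+    "# Hypothetical vocabulary, used only for illustration\n",
+    "PREFIX ex: <http://example.org/ns/>\n",
+    "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n",
+    "\n",
+    "SELECT ?book ?published\n",
+    "WHERE {\n",
+    "    ?book ex:published ?published .\n",
+    "    # Keep only the books published during the 1990s\n",
+    "    FILTER(?published >= \"1990-01-01\"^^xsd:date && ?published < \"2000-01-01\"^^xsd:date)\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "Typed literals are compared as dates, not as strings, so both sides of the comparison should carry a date datatype."
+   ]
+  },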
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "1a23c8b9a53f7ae28f28b1c23b9706b5",
+     "grade": false,
+     "grade_id": "cell-ab7755944d46f9ca",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "PREFIX dct: <http://purl.org/dc/terms/>\n",
+    "PREFIX dbc: <http://dbpedia.org/resource/Category:>\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n",
+    "\n",
+    "SELECT ?escritor ?nombre (year(?fechaNac) as ?nac)\n",
+    "WHERE {\n",
+    "    ?escritor dct:subject dbc:Spanish_novelists ;\n",
+    "              rdfs:label ?nombre ;\n",
+    "              dbo:birthDate ?fechaNac .\n",
+    "    FILTER(lang(?nombre) = \"es\") .\n",
+    "    # YOUR ANSWER HERE\n",
+    "}\n",
+    "# YOUR ANSWER HERE\n",
+    "LIMIT 1000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "e261d808f509c1e29227db94d1eab784",
+     "grade": true,
+     "grade_id": "cell-cf3821f2d33fb0f6",
+     "locked": true,
+     "points": 0,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "assert 'Ramiro Ledesma' in solution()['columns']['nombre']\n",
+    "assert 'Ray Loriga' in solution()['columns']['nombre']\n",
+    "assert all(int(x) > 1899 and int(x) < 2001 for x in solution()['columns']['nac'])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, get the list of Spanish novelists that are still alive.\n",
+    "\n",
+    "A person is alive if their death date is not defined and they were born less than 100 years ago.\n",
+    "\n",
+    "Remember, we can check whether the optional value for a key was bound in a SPARQL query using `BOUND(?key)`."
+   ]
+  },
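+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "A minimal sketch of how `OPTIONAL` and `BOUND` combine to find resources that *lack* a property. It assumes a hypothetical `ex:` vocabulary (`ex:title`, `ex:translator`) that is not part of DBpedia.\n",
+    "\n",
+    "```sparql\n",
+    "# Hypothetical vocabulary, used only for illustration\n",
+    "PREFIX ex: <http://example.org/ns/>\n",
+    "\n",
+    "SELECT ?book ?title\n",
+    "WHERE {\n",
+    "    ?book ex:title ?title .\n",
+    "    # ?translator stays unbound for books without a translator\n",
+    "    OPTIONAL { ?book ex:translator ?translator . }\n",
+    "    # Keep only the rows where the optional part did not match\n",
+    "    FILTER(!BOUND(?translator))\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "Without the `OPTIONAL` group, books with no translator would be dropped before the filter could test them."
+   ]
+  },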
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "e4579d551790c33ba4662562c6a05d99",
+     "grade": false,
+     "grade_id": "cell-474b1a72dec6827c",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%%sparql\n",
+    "\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "PREFIX dct:<http://purl.org/dc/terms/>\n",
+    "PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
+    "PREFIX dbo:<http://dbpedia.org/ontology/>\n",
+    "\n",
+    "SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac\n",
+    "\n",
+    "WHERE {\n",
+    "    ?escritor dct:subject dbc:Spanish_novelists .\n",
+    "    ?escritor rdfs:label ?nombre .\n",
+    "    ?escritor dbo:birthDate ?fechaNac .\n",
+    "# YOUR ANSWER HERE\n",
+    "    FILTER(lang(?nombre) = \"es\") .\n",
+    "}\n",
+    "# YOUR ANSWER HERE\n",
+    "LIMIT 1000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "770bbddef5210c28486a1929e4513ada",
+     "grade": true,
+     "grade_id": "cell-46b62dd2856bc919",
+     "locked": true,
+     "points": 0,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "assert 'Fernando Arrabal' in solution()['columns']['nombre']\n",
+    "assert 'Albert Espinosa' in solution()['columns']['nombre']\n",
+    "for year in solution()['columns']['nac']:\n",
+    "    assert int(year) >= 1918"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Working with badly formatted dates (OPTIONAL!)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, get the list of Spanish novelists that died before their fifties (i.e. younger than 50 years old), or that aren't 50 years old yet.\n",
+    "\n",
+    "For the sake of simplicity, you can use the `year(<date>)` function.\n",
+    "\n",
+    "Hint: you can use boolean logic in your filters (e.g. `&&` and `||`).\n",
+    "\n",
+    "Hint 2: Some dates are not formatted properly, which makes some queries fail when they shouldn't. As a workaround, you could convert the date to string, and back to date again: `xsd:dateTime(str(?date))`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "e55173801ab36337ad356a1bc286dbd1",
+     "grade": false,
+     "grade_id": "cell-ceefd3c8fbd39d79",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "PREFIX dct:<http://purl.org/dc/terms/>\n",
+    "PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
+    "PREFIX dbo:<http://dbpedia.org/ontology/>\n",
+    "\n",
+    "SELECT ?escritor, ?nombre, year(?fechaNac) as ?nac, ?fechaDef\n",
+    "\n",
+    "WHERE {\n",
+    "    ?escritor dct:subject dbc:Spanish_novelists .\n",
+    "    ?escritor rdfs:label ?nombre .\n",
+    "    ?escritor dbo:birthDate ?fechaNac .\n",
+    "    # YOUR ANSWER HERE\n",
+    "}\n",
+    "# YOUR ANSWER HERE\n",
+    "LIMIT 100"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "1b77cfaefb8b2ec286ce7b0c70804fe0",
+     "grade": true,
+     "grade_id": "cell-461cd6ccc6c2dc79",
+     "locked": true,
+     "points": 0,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "assert 'Javier Sierra' in solution()['columns']['nombre']\n",
+    "assert 'http://dbpedia.org/resource/José_Ángel_Mañas' in solution()['columns']['escritor']"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Regular expressions"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "[Regular expressions](https://www.w3.org/TR/rdf-sparql-query/#funcex-regex) are a very powerful tool, but we will only cover the basics in this exercise.\n",
+    "\n",
+    "In essence, regular expressions match strings against patterns.\n",
+    "In their simplest form, they can be used to find substrings within a variable.\n",
+    "For instance, `regex(?label, \"substring\")` matches if and only if the `?label` variable contains `substring`.\n",
+    "But regular expressions can be more complex than that.\n",
+    "For instance, we can find patterns such as: a 10-digit number, a 5-character string, or values without whitespace.\n",
+    "\n",
+    "The syntax of the regex function is the following:\n",
+    "\n",
+    "```\n",
+    "regex(?variable, \"pattern\", \"flags\")\n",
+    "```\n",
+    "\n",
+    "Flags are optional configuration options for the regular expression, such as *do not care about case* (`i` flag).\n",
+    "\n",
+    "As an example, let us find the cities in Madrid that contain \"de\" in their name."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "SELECT ?localidad\n",
+    "WHERE {\n",
+    "    ?localidad <http://dbpedia.org/ontology/isPartOf> <http://dbpedia.org/resource/Community_of_Madrid> .\n",
+    "    ?localidad rdfs:label ?nombre .\n",
+    "    FILTER (lang(?nombre) = \"es\" ).\n",
+    "    FILTER regex(?nombre, \"de\", \"i\")\n",
+    "}\n",
+    "LIMIT 10"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, use regular expressions to find Spanish novelists or poets whose **first name** is Juan.\n",
+    "In other words, their name **starts with** \"Juan\"."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "b70a9a4f102c253e864d2e8aec79ce81",
+     "grade": false,
+     "grade_id": "cell-a57d3546a812f689",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "PREFIX dct:<http://purl.org/dc/terms/>\n",
+    "PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
+    "PREFIX dbr:<http://dbpedia.org/resource/>\n",
+    "PREFIX dbo:<http://dbpedia.org/ontology/>\n",
+    "\n",
+    "# YOUR ANSWER HERE\n",
+    "\n",
+    "WHERE {\n",
+    "    {\n",
+    "        ?escritor dct:subject dbc:Spanish_poets .\n",
+    "    }\n",
+    "    UNION {\n",
+    "        ?escritor dct:subject dbc:Spanish_novelists .\n",
+    "    }\n",
+    "    ?escritor rdfs:label ?nombre\n",
+    "    FILTER(lang(?nombre) = \"es\") .\n",
+    "# YOUR ANSWER HERE\n",
+    "}\n",
+    "ORDER BY ?nombre\n",
+    "LIMIT 1000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "66db9abddfafa91c2dc25577457f71fb",
+     "grade": true,
+     "grade_id": "cell-c149fe65008f39a9",
+     "locked": true,
+     "points": 0,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "assert len(solution()['columns']['nombre']) > 15\n",
+    "for i in solution()['columns']['nombre']:\n",
+    "    assert 'Juan' in i\n",
+    "assert \"Robert Juan-Cantavella\" not in solution()['columns']['nombre']"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "1be6d6e4d8e74240ef07deffcbe5e71a",
+     "grade": false,
+     "grade_id": "cell-0c2f0113d97dc9de",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "## Group concat"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "c8dbb73a781bd24080804f289a1cea0b",
+     "grade": false,
+     "grade_id": "asdasdasdddddddddddasdasdsad",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "Sometimes, it is useful to aggregate results from different rows.\n",
+    "For instance, we might want to get a comma-separated list of the place names in each autonomous community in Spain.\n",
+    "\n",
+    "In those cases, we can use the `GROUP_CONCAT` function."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "PREFIX dbo: <http://dbpedia.org/ontology/>\n",
+    "PREFIX dbr: <http://dbpedia.org/resource/>\n",
+    "        \n",
+    "SELECT ?com, GROUP_CONCAT(?name, \",\") as ?places  # notice how we rename the variable\n",
+    "\n",
+    "WHERE {\n",
+    "    ?com dct:subject dbc:Autonomous_communities_of_Spain .\n",
+    "    ?localidad dbo:subdivision ?com ;\n",
+    "             rdfs:label ?name .\n",
+    "    FILTER (lang(?name)=\"es\")\n",
+    "}\n",
+    "\n",
+    "ORDER BY ?com\n",
+    "LIMIT 100"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false,
+    "editable": false,
+    "nbgrader": {
+     "cell_type": "markdown",
+     "checksum": "4779fb61645634308d0ed01e0c88e8a4",
+     "grade": false,
+     "grade_id": "asdiopjasdoijasdoijasd",
+     "locked": true,
+     "schema_version": 3,
+     "solution": false
+    }
+   },
+   "source": [
+    "Try it yourself, to get a list of works by each of the authors in this query:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "e5d87d1d8eba51c510241ba75981a597",
+     "grade": false,
+     "grade_id": "cell-2e3de17c75047652",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%%sparql https://dbpedia.org/sparql\n",
+    "\n",
+    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
+    "PREFIX dct:<http://purl.org/dc/terms/>\n",
+    "PREFIX dbc:<http://dbpedia.org/resource/Category:>\n",
+    "PREFIX dbr:<http://dbpedia.org/resource/>\n",
+    "PREFIX dbo:<http://dbpedia.org/ontology/>\n",
+    "\n",
+    "# YOUR ANSWER HERE\n",
+    "\n",
+    "WHERE {\n",
+    "    ?escritor a dbo:Writer .\n",
+    "    ?escritor rdfs:label ?nombre .\n",
+    "    ?escritor dbo:birthDate ?fechaNac .\n",
+    "    ?escritor dbo:birthPlace dbr:Madrid .\n",
+    "    # YOUR ANSWER HERE\n",
+    "    FILTER(lang(?nombre) = \"es\") .\n",
+    "    FILTER(!bound(?titulo) || lang(?titulo) = \"en\") .\n",
+    "\n",
+    "}\n",
+    "ORDER BY ?nombre\n",
+    "LIMIT 100"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© 2018 Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
--- a/lod/helpers.py
+++ b/lod/helpers.py
@@ -12,6 +12,7 @@ from urllib.request import Request, urlopen
 from urllib.parse import quote_plus, urlencode
 from urllib.error import HTTPError

+import ssl
 import json
 import sys

@@ -20,7 +21,9 @@ display_javascript(js, raw=True)


 def send_query(query, endpoint):
-    FORMATS = ",".join(["application/sparql-results+json", "text/javascript", "application/json"])
+    FORMATS = ",".join(["application/sparql-results+json",
+                        "text/javascript",
+                        "application/json"])

    data = {'query': query}
    # b = quote_plus(query)
@@ -30,10 +33,18 @@ def send_query(query, endpoint):
                headers={'content-type': 'application/x-www-form-urlencoded',
                         'accept': FORMATS},
                method='POST')
-    res = urlopen(r)
+    context = ssl.create_default_context()
+    context.check_hostname = False
+    context.verify_mode = ssl.CERT_NONE
+
+    res = urlopen(r, context=context, timeout=2)
    data = res.read().decode('utf-8')
    if res.getcode() == 200:
-        return json.loads(data)
+        try:
+            return json.loads(data)
+        except Exception:
+            print('Got: ', data, file=sys.stderr)
+            raise
    raise Exception('Error getting results: {}'.format(data))


@@ -60,7 +71,7 @@ def solution():
 def query(query, endpoint=None, print_table=False):
    global LAST_QUERY

-    endpoint = endpoint or "http://fuseki.cluster.gsi.dit.upm.es/sitc/"
+    endpoint = endpoint or "http://fuseki.gsi.upm.es/sitc/"
    results = send_query(query, endpoint)
    tuples = to_table(results)

--- a/lod/tests.py
+++ b/lod/tests.py
--- a/lod/tutorial/css/github.css
+++ b/lod/tutorial/css/github.css
@@ -0,0 +1,61 @@
+.highlight { padding-top: 0; margin: 0;}
+.highlight .c { color: #999988; font-style: italic } /* Comment */
+.highlight .err { color: #a61717; background-color: #e3d2d2 } /* Error */
+.highlight .k { color: #000000; font-weight: bold } /* Keyword */
+.highlight .o { color: #000000; font-weight: bold } /* Operator */
+.highlight .cm { color: #999988; font-style: italic } /* Comment.Multiline */
+.highlight .cp { color: #999999; font-weight: bold; font-style: italic } /* Comment.Preproc */
+.highlight .c1 { color: #999988; font-style: italic } /* Comment.Single */
+.highlight .cs { color: #999999; font-weight: bold; font-style: italic } /* Comment.Special */
+.highlight .gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */
+.highlight .ge { color: #000000; font-style: italic } /* Generic.Emph */
+.highlight .gr { color: #aa0000 } /* Generic.Error */
+.highlight .gh { color: #999999 } /* Generic.Heading */
+.highlight .gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */
+.highlight .go { color: #888888 } /* Generic.Output */
+.highlight .gp { color: #555555 } /* Generic.Prompt */
+.highlight .gs { font-weight: bold } /* Generic.Strong */
+.highlight .gu { color: #aaaaaa } /* Generic.Subheading */
+.highlight .gt { color: #aa0000 } /* Generic.Traceback */
+.highlight .kc { color: #000000; font-weight: bold } /* Keyword.Constant */
+.highlight .kd { color: #000000; font-weight: bold } /* Keyword.Declaration */
+.highlight .kn { color: #000000; font-weight: bold } /* Keyword.Namespace */
+.highlight .kp { color: #000000; font-weight: bold } /* Keyword.Pseudo */
+.highlight .kr { color: #000000; font-weight: bold } /* Keyword.Reserved */
+.highlight .kt { color: #445588; font-weight: bold } /* Keyword.Type */
+.highlight .m { color: #009999 } /* Literal.Number */
+.highlight .s { color: #d01040 } /* Literal.String */
+.highlight .na { color: #008080 } /* Name.Attribute */
+.highlight .nb { color: #0086B3 } /* Name.Builtin */
+.highlight .nc { color: #445588; font-weight: bold } /* Name.Class */
+.highlight .no { color: #008080 } /* Name.Constant */
+.highlight .nd { color: #3c5d5d; font-weight: bold } /* Name.Decorator */
+.highlight .ni { color: #800080 } /* Name.Entity */
+.highlight .ne { color: #990000; font-weight: bold } /* Name.Exception */
+.highlight .nf { color: #990000; font-weight: bold } /* Name.Function */
+.highlight .nl { color: #990000; font-weight: bold } /* Name.Label */
+.highlight .nn { color: #555555 } /* Name.Namespace */
+.highlight .nt { color: #000080 } /* Name.Tag */
+.highlight .nv { color: #008080 } /* Name.Variable */
+.highlight .ow { color: #000000; font-weight: bold } /* Operator.Word */
+.highlight .w { color: #bbbbbb } /* Text.Whitespace */
+.highlight .mf { color: #009999 } /* Literal.Number.Float */
+.highlight .mh { color: #009999 } /* Literal.Number.Hex */
+.highlight .mi { color: #009999 } /* Literal.Number.Integer */
+.highlight .mo { color: #009999 } /* Literal.Number.Oct */
+.highlight .sb { color: #d01040 } /* Literal.String.Backtick */
+.highlight .sc { color: #d01040 } /* Literal.String.Char */
+.highlight .sd { color: #d01040 } /* Literal.String.Doc */
+.highlight .s2 { color: #d01040 } /* Literal.String.Double */
+.highlight .se { color: #d01040 } /* Literal.String.Escape */
+.highlight .sh { color: #d01040 } /* Literal.String.Heredoc */
+.highlight .si { color: #d01040 } /* Literal.String.Interpol */
+.highlight .sx { color: #d01040 } /* Literal.String.Other */
+.highlight .sr { color: #009926 } /* Literal.String.Regex */
+.highlight .s1 { color: #d01040 } /* Literal.String.Single */
+.highlight .ss { color: #990073 } /* Literal.String.Symbol */
+.highlight .bp { color: #999999 } /* Name.Builtin.Pseudo */
+.highlight .vc { color: #008080 } /* Name.Variable.Class */
+.highlight .vg { color: #008080 } /* Name.Variable.Global */
+.highlight .vi { color: #008080 } /* Name.Variable.Instance */
+.highlight .il { color: #009999 } /* Literal.Number.Integer.Long */
--- a/lod/tutorial/css/style.css
+++ b/lod/tutorial/css/style.css
@@ -0,0 +1,968 @@
+@media screen {
+
+  body {
+    font-family: 'Quattrocento', Verdana, sans-serif;
+    font-size:16px;
+    background-color:#ffffff;
+  }
+
+  .container {
+    max-width: 48rem;
+    overflow: hidden;
+    text-overflow: ellipsis;
+  }
+
+
+/* =============================================================================
+Helper classes
+========================================================================== */
+
+  .noclear {
+    clear:none;
+  }
+
+  .expanded {
+    max-width: 58rem;
+  }
+
+  .garnish {
+    width: 23%;
+    padding:0;
+  }
+
+  .full-width {
+    width:80%;
+    margin: 0 auto;
+    text-align:center;
+  }
+
+  .float-right {
+    float:right;
+    margin-left: 1rem;
+    margin-bottom: 1rem;
+  }
+
+  .float-left {
+    margin-right: 1rem;
+    margin-bottom: 1rem;
+  }
+
+
+/* =============================================================================
+Home Page
+========================================================================== */
+
+  .home-block {
+    padding:3rem 0;
+    color:#666;
+  }
+
+  .home-block h2 {
+    margin:0;
+    font-size:2.8rem;
+    color:#333;
+    text-align:center;
+  }
+
+  .home-block p {
+    margin:0rem;
+    font-family:'Open Sans';
+    font-size:1.2rem;
+    padding-top:2rem;
+    text-align:justify;
+  }
+
+  .home-block a:visited {
+    color: #38c;
+  }
+
+  .home-stripe-1 {
+    color:#eee;
+    background:#27b;
+  }
+
+  .home-stripe-1 h2, .home-stripe-2 h2 {
+    color:#fff;
+  }
+
+  .home-stripe-1 a:visited, .home-stripe-1 a:link {
+    color:#6bf;
+  }
+
+  .home-stripe-2 {
+    color:#fff;
+    background:#289;
+  }
+
+  .home-stripe-2 a:visited, .home-stripe-2 a:link {
+    color:#6cd;
+  }
+
+  .home-image {
+    width: 75%;
+  }
+
+  .home-logo img {
+    width: 200px;
+  }
+
+  .home-logo a h1 {
+    color: #fff;
+  }
+
+  .home-logo {
+    color: #fff;
+  }
+
+  .home-logo li {
+    font-size: 1.2rem;
+  }
+
+  .en-back {
+    background-color: #444444;
+  }
+
+  .es-back {
+    background-color: #535D7F;
+  }
+
+  .fr-back {
+    background-color: #3D7C81;
+  }
+  
+  .pt-back {
+    background-color: #d6b664;
+  }
+
+
+  .sitewide-alert {
+    position: relative;
+    margin-bottom: 0;
+  }
+
+/* =============================================================================
+Lesson Headers
+========================================================================== */
+
+  header {
+    margin:-3rem 0 3rem 0;
+    padding:0;
+    font-family:'Roboto', sans-serif;
+    color:#ccc;
+    background: #efefef;
+    border-top:1px solid #333;
+    border-bottom:1px solid #333;
+    text-align:left;
+  }
+
+  header .container-fluid {
+    margin:0;
+    padding:1rem;
+    background: #f5f5f5;
+  }
+
+  header h1 {
+    margin:0;
+    padding:0;
+    font-size:1.8rem;
+    text-align:left;
+  }
+
+  header h2 {
+    font-family:'Roboto', sans-serif;
+    font-size:1.2rem;
+    color:#333;
+    margin: 1.5rem 0 1.5rem 0rem;
+    text-align:left;
+  }
+
+  header h3, header h4 {
+    font: .9rem/1.1rem 'Roboto Condensed', sans-serif;
+    text-transform:uppercase;
+    font-variant:small-caps;
+    letter-spacing:80%;
+    color:#666;
+    margin:.3rem 0 0 0;
+    padding:0;
+  }
+
+  header h4 {
+    display:inline;
+    margin:0;
+    line-height:1.3rem;
+  }
+
+  header .header-image {
+    float:left;
+    border:.2rem solid gray;
+    margin:0;
+    padding:0;
+    max-width: 200px;
+  }
+
+  header .header-abstract {
+    font: 1rem/1.4rem 'Roboto', sans-serif;
+    color:#666;
+    margin:1rem 0;
+  }
+
+  header .header-helpers {
+    clear:both;
+    background:#ccc;
+    color:#fff;
+    border-top:1px solid #999;
+    border-bottom:1px solid #999;
+  }
+
+  header ul {
+    margin:0;
+    padding:0;
+    list-style-type: none;
+  }
+
+  header li, header .metarow {
+    font: .9rem/1.1rem 'Roboto Condensed';
+  }
+
+  header .metarow {
+    color:#999;
+  }
+
+  header .peer-review, header .open-license {
+    font-size: 0.9rem;
+    color: #666;
+    margin: 0;
+  }
+
+/* =============================================================================
+Lessons Index
+========================================================================== */
+
+/*****************
+  FILTER BUTTONS
+******************/
+  ul.filter, ul.sort-by {
+    margin: 0 0 1rem 0;
+    padding: 0px;
+    text-align:center;
+  }
+
+  li.filter,
+  li.sort,
+  #filter-none {
+    font: .9rem/1.1rem 'Open Sans', sans-serif;
+    padding: .4rem .6rem;
+    border:none;
+    border-radius: 3px;
+    display:inline-block;
+    text-transform:uppercase;
+    text-decoration: none;
+  }
+
+  .filter li:hover,
+  .sort-by li:hover,
+  #filter-none:hover {
+    cursor: pointer;
+  }
+
+  .activities li.current:hover,
+  .filter li.current:hover,
+  .sort-by li.current:hover {
+    cursor:default;
+  }
+
+  .topic li a {
+    text-decoration: none;
+  }
+
+  .activities li {
+    background-color:#38c;
+    color:#fff;
+  }
+
+  .activities li:hover {
+    background-color:#16a;
+  }
+
+  .activities li.current {
+    background-color:#059;
+  }
+
+  .topics li {
+    background-color:#eee;
+    color: #38a;
+  }
+
+  .topics li:hover {
+    background-color:#ccc;
+  }
+
+  .topics li.current {
+    background-color:#aaa;
+    color: #333;
+  }
+
+
+  #filter-none {
+    width:99.5%;
+    clear:both;
+    text-align:center;
+    margin-bottom:1rem;
+    background-color:#fefefe;
+    color:#666;
+    border:1px solid #999;
+  }
+
+  #filter-none:hover {
+    background-color:#ededed;
+  }
+
+  /*****************
+    SEARCH
+  *****************/
+
+  .search-input {
+    width:55%;
+    clear:both;
+    margin-bottom:1rem;
+    background-color:#fefefe;
+    color:#666;
+    border:1px solid #999;
+    font: .9rem/1.1rem 'Open Sans',
+    sans-serif;
+    padding: .4rem .6rem;
+    border-radius: 3px;
+    display:inline-block;
+    text-transform:uppercase;
+    text-decoration: none;
+  }
+
+  #search-button,
+  #enable-search-button {
+    background-color: #efefef;
+    color: rgb(153, 143, 143);
+    width: 35%;
+    font: .9rem/1.1rem 'Open Sans',
+    sans-serif;
+    padding: .4rem .6rem;
+    border: none;
+    border-radius: 3px;
+    display: inline-block;
+    text-transform: uppercase;
+    text-decoration: none;
+  }
+
+  @media only screen and (max-width: 767px) {
+    /* phones */
+    #search-button,
+    #enable-search-button {
+      width: 80%;
+    }
+  }
+
+
+  #search-info-button {
+    padding: 0.5rem;
+    color: rgb(153, 143, 143);
+  }
+
+  #search-info {
+    display: none;
+    height:0px;
+    background:#efefef;
+    overflow:hidden;
+    transition:0.5s;
+    -webkit-transition:0.5s;
+    width: 100%;
+    text-align: left;
+    box-sizing: border-box;
+  }
+
+  #search-info.visible {
+    display: block;
+    height: fit-content;
+    height: -moz-max-content;
+    padding: 10px;
+    margin-top: 10px;
+  }
+
+  /*****************
+    SORT BUTTONS
+  *****************/
+
+  li.sort {
+    background-color: #efefef;
+    color:#666;
+    width:49.5%;
+  }
+
+  li.sort:hover {
+    text-decoration: none;
+    background-color:#cecece;
+  }
+
+  #current-sort {
+    font-size:75%;
+  }
+
+  .sort.my-desc:after, .sort-desc:after {
+    width: 0;
+    height: 0;
+    border-left: .4rem solid transparent;
+    border-right: .4rem solid transparent;
+    border-top: .4rem solid;
+    content:"";
+    position: relative;
+    top:.75rem;
+    right:-.3rem;
+  }
+
+  .sort.my-asc:after, .sort-asc:after {
+    width: 0;
+    height: 0;
+    border-left: .4rem solid transparent;
+    border-right: .4rem solid transparent;
+    border-bottom: .4rem solid;
+    content:"";
+    position: relative;
+    bottom:.75rem;
+    right:-.3rem;
+  }
+
+  .sort-desc:after {
+    top:1rem;
+  }
+
+  .sort-asc:after {
+    bottom:1rem;
+  }
+
+  /*****************************
+    LESSON INDEX RESULTS LIST
+  *****************************/
+
+   h2.results-title {
+     margin:1rem 0;
+     font: 1.6rem/2rem 'Roboto Condensed';
+     color:#666;
+     text-transform:uppercase;
+   }
+
+   #results-value {
+     color:#000;
+   }
+
+
+  #lesson-list .list ul {
+    margin:0;
+    padding:0;
+  }
+
+  #lesson-list .list li {
+    list-style-type:none;
+    margin:0;
+  }
+
+
+  .lesson-description {
+    margin-bottom:2rem;
+    padding:0rem;
+    min-height:120px;
+    text-align:left;
+  }
+
+  .lesson-description img {
+      width:100%;
+  }
+
+  .lesson-image {
+    width:120px;
+    float:left;
+    margin-right:1rem;
+  }
+
+  .above-title {
+    margin:0 0 .2rem 0;
+    font: .8rem/1rem 'Roboto Condensed';
+    color:#999;
+    text-transform:uppercase;
+    clear:none;
+  }
+
+  .lesson-description h2.title {
+    font: 1.2rem/1.3rem 'Crete Round', serif;
+    margin:0 0 .8rem 0;
+    clear:none;
+  }
+
+  .list .date,
+  .lesson-description .activity,
+  .lesson-description .topics,
+  .lesson-description .difficulty {
+    display: none;
+  }
+
+  #pre-loader {
+    visibility: hidden;
+    display: flex;
+    justify-content: center;
+    align-items: center;
+    height: 100vh;
+    width: 100%;
+    position: fixed;
+    top: 0;
+    left: 0;
+    z-index: 9999;
+    transition: opacity 0.3s linear;
+    background: rgba(211, 211, 211, 0.8);
+  }
+/* =============================================================================
+Top Navigation Bar
+========================================================================== */
+
+  .navbar {
+    padding: .6rem 1rem;
+    margin: 0 0 3rem 0;
+  }
+
+  .navbar-dark .navbar-nav .nav-link {
+    font-family:'Open Sans';
+    text-transform:uppercase;
+    color:#fff;
+    font-size:.9rem;
+  }
+
+  .btn-group > .btn-secondary {
+    border-color: #333333;
+    background-color: #888888;
+  }
+
+  .lang {
+    text-transform:lowercase !important;
+  }
+
+  .navbar-dark .navbar-nav .nav-link:hover, .navbar-dark .navbar-brand:hover {
+    color:#39a;
+  }
+
+  .navbar-toggler-icon {
+        background-image: url("data:image/svg+xml;charset=utf8,%3Csvg viewBox='0 0 32 32' xmlns='http://www.w3.org/2000/svg'%3E%3Cpath stroke='rgba(255,255,255, 1)' stroke-width='2' stroke-linecap='round' stroke-miterlimit='10' d='M4 8h24M4 16h24M4 24h24'/%3E%3C/svg%3E");
+  }
+
+  .navbar-collapse {
+    text-align:center;
+  }
+
+  .navbar-dark .navbar-brand {
+    font-family:'Crete Round', serif;
+    color:#fff;
+    letter-spacing: .02em;
+  }
+
+  .btn-group > a.btn {
+    padding-left: 1rem;
+    padding-right: 1rem;
+  }
+
+  a.dropdown-item {
+    border-bottom:1px solid #ccc;
+    font-family:'Roboto';
+  }
+
+  .dropdown-menu {
+  	position: absolute;
+  	background: #fff;
+  	border: 1px solid #ccc;
+    margin:0;
+    padding:0;
+  }
+
+  .dropdown-menu a {
+  	font-size:.8rem;
+    line-height:2rem;
+    text-transform:uppercase;
+  }
+
+  .dropdown-menu a:last-child {
+    border-bottom:none;
+  }
+
+  .dropdown-menu:after, .dropdown-menu:before {
+  	bottom: 100%;
+  	left: 20%;
+  	border: solid transparent;
+  	content: " ";
+  	height: 0;
+  	width: 0;
+  	position: absolute;
+  	pointer-events: none;
+  }
+
+  .dropdown-menu:after {
+  	border-color: rgba(255, 255, 255, 0);
+  	border-bottom-color: #fff;
+  	border-width: 12px;
+  	margin-left: -12px;
+  }
+  .dropdown-menu:before {
+  	border-color: rgba(51, 153, 170, 0);
+  	border-bottom-color: #ccc;
+  	border-width: 13px;
+  	margin-left: -13px;
+  }
+
+  .navbar-dark .navbar-nav .nav-link:focus {
+    color: #ccc;
+  }
+
+  .header-link {
+    position: absolute;
+    right: 0.6em;
+    opacity: 0;
+    -webkit-transition: opacity 0.2s ease-in-out 0.1s;
+    -moz-transition: opacity 0.2s ease-in-out 0.1s;
+    -ms-transition: opacity 0.2s ease-in-out 0.1s;
+  }
+
+  h2:hover .header-link,
+  h3:hover .header-link,
+  h4:hover .header-link,
+  h5:hover .header-link,
+  h6:hover .header-link {
+    opacity: 1;
+  }
+
+/* =============================================================================
+Lesson Typography
+========================================================================== */
+
+  a {text-decoration:none;}
+
+  a:link {color: #38c;}
+  a:visited {color: #39a;}
+  a:hover {color: #555;}
+  a:active {color: #555;}
+
+  b, strong { font-weight: bold; }
+
+  blockquote { margin: 1em 2em; padding: 0 1em 0 1em; font-style: italic; border:1px solid #666; background: #eeeeee;}
+
+  hr {
+    display: block; height: 1px; border: 0; border-top: 1px solid #ccc; margin: 2em 0; padding: 0; }
+
+  img {
+    max-width:100%;
+  }
+
+  ins { background: #ff9; color: #000; text-decoration: none; }
+
+
+  h1,h2,h3,h4,h5 {
+    font-family:'Crete Round', serif;
+    font-weight:normal;
+    clear:both;
+  }
+
+
+  h1 {
+    font-size:2rem;
+    margin-bottom:1.5rem;
+    letter-spacing:-.03rem;
+    text-align:center;
+  }
+
+  h2 {
+    font-size:1.6rem;
+    margin-top:3rem;
+    letter-spacing:-.02rem;
+  }
+
+  h3 {
+    font-size:1.4rem;
+    margin-top:2.5rem;
+  }
+
+  h4 {
+    font-size:1.2rem;
+    margin-top:1.8rem;
+  }
+
+  h5 {
+    font-size:1.0rem;
+    margin-top:1.4rem;
+  }
+
+  h1 a, h2 a, h3 a, h4 a, h5 a {
+    text-decoration:none;
+  }
+
+  h1 a:link { color: #38c; }
+  h1 a:visited {color: #39a; }
+
+
+  /* select button generated by codeblocks.js */
+  .fa-align-left {opacity: 0.2;}
+  .highlight:hover .fa-align-left {opacity: 1;}
+
+  q { quotes: none; }
+  q:before, q:after { content: ""; content: none; }
+
+  small { font-size: 85%; }
+
+  /* Position subscript and superscript content without affecting line-height: h5bp.com/k */
+  sub, sup { font-size: 75%; line-height: 0; position: relative; vertical-align: baseline; }
+  sup { top: -0.5em; }
+  sub { bottom: -0.25em; }
+
+  li {
+    margin-bottom:.5rem;
+    line-height:1.4rem;
+  }
+
+  li.nav-item {
+    margin-bottom:0;
+  }
+
+  .alert {
+    font-family: 'Roboto';
+  }
+
+  .alert h2, .alert h3, .alert h4 {
+    margin-top:0;
+  }
+
+
+/* =============================================================================
+Code Highlighting
+========================================================================== */
+
+  code {
+    font-family: monospace, serif;
+    font-size:.9rem;
+  }
+
+  .highlight {
+    margin: 1rem 0 1rem 0;
+    padding:.5rem .2rem;
+    font-size:.9rem;
+    white-space: pre;
+    word-wrap: normal;
+    overflow: auto;
+    border: 1px solid #eee;
+    background: #fafafa;
+  }
+
+
+/* =============================================================================
+Figures
+========================================================================== */
+
+  figure {
+    margin: 0 auto .5rem;
+    text-align: center;
+    display:table;
+  }
+
+  figcaption {
+    margin-top:.5rem;
+    font-family:'Open Sans';
+    font-size:0.8em;
+    color: #666;
+    display:block;
+    caption-side: bottom;
+  }
+
+  .author-info, .citation-info {
+    border-top:1px solid #333;
+    padding-top:1rem;
+    margin-top:2rem;
+  }
+
+  .author-name, .suggested-citation-header {
+    font-family:'Roboto Condensed';
+    font-weight: 600;
+    font-size:1.2rem;
+    color: #666;
+    text-transform:uppercase;
+  }
+
+  .author-description p, .suggested-citation-text p {
+    font-size:0.9rem;
+    font-family:'Open Sans';
+    color: #666;
+  }
+
+  /* =============================================================================
+  Tables
+  ========================================================================== */
+
+  table {
+    width: 100%;
+    margin-bottom: 1em;
+  }
+
+  th, td {
+    padding: 10px;
+    text-align: left;
+    border-bottom: 1px solid #ddd;
+  }
+
+  thead {
+    background-color: #535353;
+    color: #fff;
+    font-weight: bold;
+  }
+
+  tr:nth-child(even) { background-color: #f2f2f2; }
+
+/* =============================================================================
+Blog Index and Layout
+========================================================================== */
+
+  .blog-header {
+    text-align:center;
+  }
+
+  .blog-header h2 {
+    margin:0;
+    line-height: 2rem;
+  }
+
+  .blog-header h3 { /*author*/
+    margin-top:.4rem;
+    color: #666;
+    font-size:1rem;
+  }
+
+  .blog-header h4{
+    color: #999;
+    font-size:1rem;
+    margin-bottom:.2rem;
+    font-family:'Roboto Condensed';
+    text-transform:uppercase;
+  }
+
+  .blog-header figure {
+    max-width:80%;
+  }
+
+  .blog-header figcaption {
+    text-align: center;
+  }
+
+  .blog-page-header {
+    margin-bottom:3rem;
+  }
+
+/* =============================================================================
+Project Team
+========================================================================== */
+
+  .contact-box {
+    margin-bottom:3rem;
+  }
+
+/* =============================================================================
+Footer
+========================================================================== */
+
+  footer[role="contentinfo"] {
+    margin-top: 2rem;
+    padding: 2rem 0;
+    font-family:'Open Sans';
+    font-size:.9rem;
+    color: #fff;
+    background-color:#666;
+    text-align:center;
+  }
+
+  footer a, footer a:link, footer a:visited {
+    color: #fff;
+    border-bottom:1px #eee dotted;
+  }
+
+  footer a:hover {
+    text-decoration: none;
+    border-bottom:1px #fff solid;
+  }
+
+  footer .fa {
+    margin: 0 .2rem 0rem 0rem;
+  }
+
+  .footer-head {
+    font-size:1.1rem;
+    line-height:1.4rem;
+    margin-bottom:1rem;
+  }
+
+
+} /* end screen */
+
+@media only screen and (max-width: 768px) {
+  .garnish {
+    display:none;
+  }
+  .dropdown-menu:after, .dropdown-menu:before {
+    display:none;
+  }
+}
+
+/* Print Styling */
+
+@media screen {
+  /* Class to hide elements only shown when printing */
+  .hide-print {
+    display: none !important;
+  }
+}
+
+@media print {
+  * { background: transparent !important; color: black !important; box-shadow:none !important; text-shadow: none !important; filter:none !important; -ms-filter: none !important; } /* Black prints faster: h5bp.com/s */
+  a, a:visited { text-decoration: underline; }
+  a[href]:after { content: " (" attr(href) ")"; }
+  abbr[title]:after { content: " (" attr(title) ")"; }
+  a[href^="javascript:"]:after, a[href^="#"]:after { content: ""; }  /* Don't show links for images, or javascript/internal links */
+  pre, blockquote {
+    border: 1px solid #999;
+    page-break-inside: avoid;
+    margin: 0.5cm;
+    padding: 0.5cm;
+  }
+  thead { display: table-header-group; } /* h5bp.com/t */
+  tr, img { page-break-inside: avoid; }
+  img { max-width: 100% !important; }
+  @page {
+    margin: 1.5cm;
+  }
+
+  body { font-size: 0.85rem;}
+  p, h2, h3 { orphans: 3; widows: 3; }
+  h1, h2, h3 { page-break-after: avoid; }
+  h1 { font-size: 1.4rem; }
+  h2 { font-size: 1.1rem; }
+  h3 { font-size: 1rem; }
+  h4 { font-size: 0.9rem; }
+  .header-bottom {
+    margin-bottom: 2rem;
+    page-break-after: always;
+  }
+  .hide-screen {
+    /* Hide elements that only appear on screen */
+    display: none !important;
+  }
+
+  .print-header {
+    /* format navbar for print */
+    display: block;
+    z-index:1030;
+    width: 100%;
+    height: 3rem;
+    padding: .6rem 1rem;
+    margin-bottom: 1rem;
+    color:#fff;
+    white-space: nowrap;
+    font-family: 'Crete Round', serif;
+    border-bottom: 1px solid lightgrey;
+  }
+}
--- a/lod/tutorial/en/lessons/retired/graph-databases-and-SPARQL.html
+++ b/lod/tutorial/en/lessons/retired/graph-databases-and-SPARQL.html
--- a/lod/tutorial/es/lecciones/retirada/sparql-datos-abiertos-enlazados.html
+++ b/lod/tutorial/es/lecciones/retirada/sparql-datos-abiertos-enlazados.html
--- a/lod/tutorial/gallery/graph-databases-and-SPARQL.png
+++ b/lod/tutorial/gallery/graph-databases-and-SPARQL.png
--- a/lod/tutorial/graph-databases-and-SPARQL.html
+++ b/lod/tutorial/graph-databases-and-SPARQL.html
--- a/lod/tutorial/images/ORCIDiD_iconvector.svg
+++ b/lod/tutorial/images/ORCIDiD_iconvector.svg
@@ -0,0 +1,17 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!-- Generator: Adobe Illustrator 19.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)  -->
+<svg version="1.1" id="Layer_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px"
+	 viewBox="0 0 256 256" style="enable-background:new 0 0 256 256;" xml:space="preserve">
+<style type="text/css">
+	.st0{fill:#A6CE39;}
+	.st1{fill:#FFFFFF;}
+</style>
+<path class="st0" d="M256,128c0,70.7-57.3,128-128,128C57.3,256,0,198.7,0,128C0,57.3,57.3,0,128,0C198.7,0,256,57.3,256,128z"/>
+<g>
+	<path class="st1" d="M86.3,186.2H70.9V79.1h15.4v48.4V186.2z"/>
+	<path class="st1" d="M108.9,79.1h41.6c39.6,0,57,28.3,57,53.6c0,27.5-21.5,53.6-56.8,53.6h-41.8V79.1z M124.3,172.4h24.5
+		c34.9,0,42.9-26.5,42.9-39.7c0-21.5-13.7-39.7-43.7-39.7h-23.7V172.4z"/>
+	<path class="st1" d="M88.7,56.8c0,5.5-4.5,10.1-10.1,10.1c-5.6,0-10.1-4.6-10.1-10.1c0-5.6,4.5-10.1,10.1-10.1
+		C84.2,46.7,88.7,51.3,88.7,56.8z"/>
+</g>
+</svg>
--- a/lod/tutorial/images/doi_icon.jpg
+++ b/lod/tutorial/images/doi_icon.jpg
--- a/lod/tutorial/images/favicons/en_favicon.ico
+++ b/lod/tutorial/images/favicons/en_favicon.ico
--- a/lod/tutorial/images/favicons/es_favicon.ico
+++ b/lod/tutorial/images/favicons/es_favicon.ico
--- a/lod/tutorial/images/graph-databases-and-SPARQL.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-01.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-01.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-02.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-02.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-03.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-03.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-04.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-04.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-05.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-05.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-06.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-06.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-07.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-07.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-08.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-08.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-09.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-09.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-10.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-10.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-11.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-11.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-12.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-12.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-13.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-13.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-14.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-14.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-15.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql-lod-15.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql01-1.svg
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql01-1.svg
@@ -0,0 +1,40 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: %3 Pages: 1 -->
+<svg width="262pt" height="124pt"
+ viewBox="0.00 0.00 261.53 124.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 120)">
+<title>%3</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-120 257.533,-120 257.533,4 -4,4"/>
+<!-- nw -->
+<g id="node1" class="node"><title>nw</title>
+<ellipse fill="none" stroke="gray" cx="49.3505" cy="-98" rx="49.2014" ry="18"/>
+<text text-anchor="middle" x="49.3505" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
+</g>
+<!-- oil -->
+<g id="node3" class="node"><title>oil</title>
+<ellipse fill="none" stroke="gray" cx="117.35" cy="-18" rx="42.8742" ry="18"/>
+<text text-anchor="middle" x="117.35" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
+</g>
+<!-- nw&#45;&gt;oil -->
+<g id="edge1" class="edge"><title>nw&#45;&gt;oil</title>
+<path fill="none" stroke="gray" d="M63.7715,-80.4582C73.3018,-69.5265 85.9453,-55.0236 96.5567,-42.8517"/>
+<polygon fill="gray" stroke="gray" points="99.2502,-45.0882 103.183,-35.2505 93.9738,-40.4882 99.2502,-45.0882"/>
+<text text-anchor="middle" x="108.138" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+<!-- wb -->
+<g id="node2" class="node"><title>wb</title>
+<ellipse fill="none" stroke="gray" cx="185.35" cy="-98" rx="68.3645" ry="18"/>
+<text text-anchor="middle" x="185.35" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
+</g>
+<!-- wb&#45;&gt;oil -->
+<g id="edge2" class="edge"><title>wb&#45;&gt;oil</title>
+<path fill="none" stroke="gray" d="M170.595,-80.0752C161.138,-69.2266 148.718,-54.9801 138.252,-42.9755"/>
+<polygon fill="gray" stroke="gray" points="140.589,-40.3299 131.38,-35.0922 135.313,-44.9299 140.589,-40.3299"/>
+<text text-anchor="middle" x="176.138" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+</g>
+</svg>
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql01.svg
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql01.svg
@@ -0,0 +1,104 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: %3 Pages: 1 -->
+<svg width="438pt" height="212pt"
+ viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
+<title>%3</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
+<!-- nw -->
+<g id="node1" class="node"><title>nw</title>
+<ellipse fill="none" stroke="gray" cx="132" cy="-186" rx="49.2014" ry="18"/>
+<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
+</g>
+<!-- rr -->
+<g id="node2" class="node"><title>rr</title>
+<ellipse fill="none" stroke="gray" cx="132" cy="-98" rx="60.0217" ry="18"/>
+<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
+</g>
+<!-- nw&#45;&gt;rr -->
+<g id="edge1" class="edge"><title>nw&#45;&gt;rr</title>
+<path fill="none" stroke="gray" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
+<polygon fill="gray" stroke="gray" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
+<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
+</g>
+<!-- oil -->
+<g id="node5" class="node"><title>oil</title>
+<ellipse fill="none" stroke="gray" cx="253" cy="-98" rx="42.8742" ry="18"/>
+<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
+</g>
+<!-- nw&#45;&gt;oil -->
+<g id="edge3" class="edge"><title>nw&#45;&gt;oil</title>
+<path fill="none" stroke="gray" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
+<polygon fill="gray" stroke="gray" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
+<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+<!-- nwd -->
+<g id="node7" class="node"><title>nwd</title>
+<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
+<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
+</g>
+<!-- nw&#45;&gt;nwd -->
+<g id="edge2" class="edge"><title>nw&#45;&gt;nwd</title>
+<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
+<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
+<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
+</g>
+<!-- d -->
+<g id="node6" class="node"><title>d</title>
+<ellipse fill="none" stroke="gray" cx="263" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
+</g>
+<!-- rr&#45;&gt;d -->
+<g id="edge5" class="edge"><title>rr&#45;&gt;d</title>
+<path fill="none" stroke="gray" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
+<polygon fill="gray" stroke="gray" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
+<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
+</g>
+<!-- rrb -->
+<g id="node8" class="node"><title>rrb</title>
+<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
+</g>
+<!-- rr&#45;&gt;rrb -->
+<g id="edge4" class="edge"><title>rr&#45;&gt;rrb</title>
+<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
+<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
+<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
+</g>
+<!-- jv -->
+<g id="node3" class="node"><title>jv</title>
+<ellipse fill="none" stroke="gray" cx="372" cy="-98" rx="57.9076" ry="18"/>
+<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
+</g>
+<!-- jv&#45;&gt;d -->
+<g id="edge6" class="edge"><title>jv&#45;&gt;d</title>
+<path fill="none" stroke="gray" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
+<polygon fill="gray" stroke="gray" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
+<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
+</g>
+<!-- wb -->
+<g id="node4" class="node"><title>wb</title>
+<ellipse fill="none" stroke="gray" cx="277" cy="-186" rx="68.3645" ry="18"/>
+<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
+</g>
+<!-- wb&#45;&gt;jv -->
+<g id="edge7" class="edge"><title>wb&#45;&gt;jv</title>
+<path fill="none" stroke="gray" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
+<polygon fill="gray" stroke="gray" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
+<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
+</g>
+<!-- wb&#45;&gt;oil -->
+<g id="edge8" class="edge"><title>wb&#45;&gt;oil</title>
+<path fill="none" stroke="gray" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
+<polygon fill="gray" stroke="gray" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
+<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+</g>
+</svg>
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql02-1.svg
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql02-1.svg
@@ -0,0 +1,104 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: %3 Pages: 1 -->
+<svg width="438pt" height="212pt"
+ viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
+<title>%3</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
+<!-- nw -->
+<g id="node1" class="node"><title>nw</title>
+<ellipse fill="none" stroke="red" cx="132" cy="-186" rx="49.2014" ry="18"/>
+<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
+</g>
+<!-- rr -->
+<g id="node2" class="node"><title>rr</title>
+<ellipse fill="none" stroke="red" cx="132" cy="-98" rx="60.0217" ry="18"/>
+<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
+</g>
+<!-- nw&#45;&gt;rr -->
+<g id="edge1" class="edge"><title>nw&#45;&gt;rr</title>
+<path fill="none" stroke="orange" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
+<polygon fill="orange" stroke="orange" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
+<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
+</g>
+<!-- oil -->
+<g id="node5" class="node"><title>oil</title>
+<ellipse fill="none" stroke="gray" cx="253" cy="-98" rx="42.8742" ry="18"/>
+<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
+</g>
+<!-- nw&#45;&gt;oil -->
+<g id="edge3" class="edge"><title>nw&#45;&gt;oil</title>
+<path fill="none" stroke="gray" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
+<polygon fill="gray" stroke="gray" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
+<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+<!-- nwd -->
+<g id="node7" class="node"><title>nwd</title>
+<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
+<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
+</g>
+<!-- nw&#45;&gt;nwd -->
+<g id="edge2" class="edge"><title>nw&#45;&gt;nwd</title>
+<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
+<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
+<text text-anchor="middle" x="106.566" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="106.566" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
+</g>
+<!-- d -->
+<g id="node6" class="node"><title>d</title>
+<ellipse fill="none" stroke="orange" cx="263" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
+</g>
+<!-- rr&#45;&gt;d -->
+<g id="edge5" class="edge"><title>rr&#45;&gt;d</title>
+<path fill="none" stroke="orange" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
+<polygon fill="orange" stroke="orange" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
+<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
+</g>
+<!-- rrb -->
+<g id="node8" class="node"><title>rrb</title>
+<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
+</g>
+<!-- rr&#45;&gt;rrb -->
+<g id="edge4" class="edge"><title>rr&#45;&gt;rrb</title>
+<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
+<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
+<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
+</g>
+<!-- jv -->
+<g id="node3" class="node"><title>jv</title>
+<ellipse fill="none" stroke="red" cx="372" cy="-98" rx="57.9076" ry="18"/>
+<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
+</g>
+<!-- jv&#45;&gt;d -->
+<g id="edge6" class="edge"><title>jv&#45;&gt;d</title>
+<path fill="none" stroke="orange" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
+<polygon fill="orange" stroke="orange" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
+<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
+</g>
+<!-- wb -->
+<g id="node4" class="node"><title>wb</title>
+<ellipse fill="none" stroke="red" cx="277" cy="-186" rx="68.3645" ry="18"/>
+<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
+</g>
+<!-- wb&#45;&gt;jv -->
+<g id="edge7" class="edge"><title>wb&#45;&gt;jv</title>
+<path fill="none" stroke="orange" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
+<polygon fill="orange" stroke="orange" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
+<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
+</g>
+<!-- wb&#45;&gt;oil -->
+<g id="edge8" class="edge"><title>wb&#45;&gt;oil</title>
+<path fill="none" stroke="gray" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
+<polygon fill="gray" stroke="gray" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
+<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+</g>
+</svg>
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql02.svg
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql02.svg
@@ -0,0 +1,104 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: %3 Pages: 1 -->
+<svg width="438pt" height="212pt"
+ viewBox="0.00 0.00 437.70 212.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 208)">
+<title>%3</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-208 433.703,-208 433.703,4 -4,4"/>
+<!-- nw -->
+<g id="node1" class="node"><title>nw</title>
+<ellipse fill="none" stroke="red" cx="132" cy="-186" rx="49.2014" ry="18"/>
+<text text-anchor="middle" x="132" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">The Nightwatch</text>
+</g>
+<!-- rr -->
+<g id="node2" class="node"><title>rr</title>
+<ellipse fill="none" stroke="gray" cx="132" cy="-98" rx="60.0217" ry="18"/>
+<text text-anchor="middle" x="132" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Rembrandt van Rijn</text>
+</g>
+<!-- nw&#45;&gt;rr -->
+<g id="edge1" class="edge"><title>nw&#45;&gt;rr</title>
+<path fill="none" stroke="gray" d="M132,-167.597C132,-155.746 132,-139.817 132,-126.292"/>
+<polygon fill="gray" stroke="gray" points="135.5,-126.084 132,-116.084 128.5,-126.084 135.5,-126.084"/>
+<text text-anchor="middle" x="150.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="150.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
+</g>
+<!-- oil -->
+<g id="node5" class="node"><title>oil</title>
+<ellipse fill="none" stroke="orange" cx="253" cy="-98" rx="42.8742" ry="18"/>
+<text text-anchor="middle" x="253" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">oil on canvas</text>
+</g>
+<!-- nw&#45;&gt;oil -->
+<g id="edge3" class="edge"><title>nw&#45;&gt;oil</title>
+<path fill="none" stroke="orange" d="M153.632,-169.625C173.196,-155.72 202.162,-135.133 223.779,-119.768"/>
+<polygon fill="orange" stroke="orange" points="225.837,-122.6 231.96,-113.954 221.781,-116.894 225.837,-122.6"/>
+<text text-anchor="middle" x="225.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+<!-- nwd -->
+<g id="node7" class="node"><title>nwd</title>
+<ellipse fill="none" stroke="gray" cx="27" cy="-98" rx="27" ry="18"/>
+<text text-anchor="middle" x="27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">1642</text>
+</g>
+<!-- nw&#45;&gt;nwd -->
+<g id="edge2" class="edge"><title>nw&#45;&gt;nwd</title>
+<path fill="none" stroke="gray" d="M112.741,-169.226C95.413,-155.034 69.877,-134.118 51.1799,-118.804"/>
+<polygon fill="gray" stroke="gray" points="53.3315,-116.043 43.3774,-112.414 48.896,-121.458 53.3315,-116.043"/>
+<text text-anchor="middle" x="105.455" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="105.455" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created in</text>
+</g>
+<!-- d -->
+<g id="node6" class="node"><title>d</title>
+<ellipse fill="none" stroke="gray" cx="263" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="263" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">Dutch</text>
+</g>
+<!-- rr&#45;&gt;d -->
+<g id="edge5" class="edge"><title>rr&#45;&gt;d</title>
+<path fill="none" stroke="gray" d="M157.881,-81.5897C180.033,-68.4005 211.867,-49.4456 234.691,-35.8559"/>
+<polygon fill="gray" stroke="gray" points="236.761,-38.6968 243.563,-30.5735 233.18,-32.6822 236.761,-38.6968"/>
+<text text-anchor="middle" x="227.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
+</g>
+<!-- rrb -->
+<g id="node8" class="node"><title>rrb</title>
+<ellipse fill="none" stroke="gray" cx="132" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="132" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">1606</text>
+</g>
+<!-- rr&#45;&gt;rrb -->
+<g id="edge4" class="edge"><title>rr&#45;&gt;rrb</title>
+<path fill="none" stroke="gray" d="M132,-79.6893C132,-69.8938 132,-57.4218 132,-46.335"/>
+<polygon fill="gray" stroke="gray" points="135.5,-46.2623 132,-36.2623 128.5,-46.2624 135.5,-46.2623"/>
+<text text-anchor="middle" x="152.455" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">was born in</text>
+</g>
+<!-- jv -->
+<g id="node3" class="node"><title>jv</title>
+<ellipse fill="none" stroke="gray" cx="372" cy="-98" rx="57.9076" ry="18"/>
+<text text-anchor="middle" x="372" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">Johannes Vermeer</text>
+</g>
+<!-- jv&#45;&gt;d -->
+<g id="edge6" class="edge"><title>jv&#45;&gt;d</title>
+<path fill="none" stroke="gray" d="M349.942,-81.2155C332.328,-68.6108 307.605,-50.9191 289.023,-37.6222"/>
+<polygon fill="gray" stroke="gray" points="290.879,-34.6462 280.71,-31.673 286.805,-40.3388 290.879,-34.6462"/>
+<text text-anchor="middle" x="345.572" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">has nationality</text>
+</g>
+<!-- wb -->
+<g id="node4" class="node"><title>wb</title>
+<ellipse fill="none" stroke="red" cx="277" cy="-186" rx="68.3645" ry="18"/>
+<text text-anchor="middle" x="277" y="-183" font-family="Helvetica,sans-Serif" font-size="10.00">Woman with a Balance</text>
+</g>
+<!-- wb&#45;&gt;jv -->
+<g id="edge7" class="edge"><title>wb&#45;&gt;jv</title>
+<path fill="none" stroke="gray" d="M295.278,-168.646C301.824,-162.776 309.251,-156.101 316,-150 325.983,-140.975 336.934,-131.015 346.484,-122.31"/>
+<polygon fill="gray" stroke="gray" points="349.095,-124.666 354.124,-115.341 344.377,-119.494 349.095,-124.666"/>
+<text text-anchor="middle" x="350.678" y="-143.6" font-family="Helvetica,sans-Serif" font-size="8.00">was</text>
+<text text-anchor="middle" x="350.678" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">created by</text>
+</g>
+<!-- wb&#45;&gt;oil -->
+<g id="edge8" class="edge"><title>wb&#45;&gt;oil</title>
+<path fill="none" stroke="orange" d="M272.258,-168.009C268.905,-155.995 264.342,-139.641 260.498,-125.869"/>
+<polygon fill="orange" stroke="orange" points="263.791,-124.647 257.732,-115.956 257.049,-126.528 263.791,-124.647"/>
+<text text-anchor="middle" x="289.787" y="-139.6" font-family="Helvetica,sans-Serif" font-size="8.00">has medium</text>
+</g>
+</g>
+</svg>
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql03.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql03.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql04-1.svg
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql04-1.svg
@@ -0,0 +1,127 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: %3 Pages: 1 -->
+<svg width="636pt" height="364pt"
+ viewBox="0.00 0.00 636.30 364.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 360)">
+<title>%3</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-360 632.301,-360 632.301,4 -4,4"/>
+<!-- o -->
+<g id="node1" class="node"><title>o</title>
+<ellipse fill="none" stroke="red" cx="274.27" cy="-338" rx="53.3595" ry="18"/>
+<text text-anchor="middle" x="274.27" y="-335" font-family="Helvetica,sans-Serif" font-size="10.00">object/PPA82633</text>
+</g>
+<!-- th1 -->
+<g id="node2" class="node"><title>th1</title>
+<ellipse fill="none" stroke="red" cx="40.2702" cy="-258" rx="40.0417" ry="18"/>
+<text text-anchor="middle" x="40.2702" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">thes:x11409</text>
+</g>
+<!-- o&#45;&gt;th1 -->
+<g id="edge1" class="edge"><title>o&#45;&gt;th1</title>
+<path fill="none" stroke="red" d="M224.341,-331.342C190.446,-326.351 145.117,-317.397 107.454,-302 93.6639,-296.363 79.5997,-287.87 67.927,-279.917"/>
+<polygon fill="red" stroke="red" points="69.704,-276.888 59.5111,-273.997 65.6765,-282.613 69.704,-276.888"/>
+<text text-anchor="middle" x="147.178" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:p45_consists_of</text>
+</g>
+<!-- dep -->
+<g id="node4" class="node"><title>dep</title>
+<ellipse fill="none" stroke="red" cx="172.27" cy="-258" rx="74.1479" ry="18"/>
+<text text-anchor="middle" x="172.27" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">person&#45;institution/147800</text>
+</g>
+<!-- o&#45;&gt;dep -->
+<g id="edge3" class="edge"><title>o&#45;&gt;dep</title>
+<path fill="none" stroke="red" d="M235.65,-325.351C222.058,-319.869 207.403,-312.234 196.239,-302 191.195,-297.376 186.961,-291.439 183.525,-285.462"/>
+<polygon fill="red" stroke="red" points="186.46,-283.516 178.779,-276.219 180.234,-286.714 186.46,-283.516"/>
+<text text-anchor="middle" x="228.286" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P62_depicts</text>
+</g>
+<!-- etc -->
+<g id="node6" class="node"><title>etc</title>
+<ellipse fill="none" stroke="gray" cx="274.27" cy="-18" rx="27" ry="18"/>
+<text text-anchor="middle" x="274.27" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">etc...</text>
+</g>
+<!-- o&#45;&gt;etc -->
+<g id="edge10" class="edge"><title>o&#45;&gt;etc</title>
+<path fill="none" stroke="gray" d="M274.27,-319.958C274.27,-304.156 274.27,-279.99 274.27,-259 274.27,-259 274.27,-259 274.27,-97 274.27,-80.1099 274.27,-61.1626 274.27,-46.172"/>
+<polygon fill="gray" stroke="gray" points="277.77,-46.0417 274.27,-36.0418 270.77,-46.0418 277.77,-46.0417"/>
+</g>
+<!-- own -->
+<g id="node7" class="node"><title>own</title>
+<ellipse fill="none" stroke="red" cx="395.27" cy="-258" rx="93.1176" ry="18"/>
+<text text-anchor="middle" x="395.27" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">thesIdentifier:the&#45;british&#45;museum</text>
+</g>
+<!-- o&#45;&gt;own -->
+<g id="edge5" class="edge"><title>o&#45;&gt;own</title>
+<path fill="none" stroke="red" d="M297.887,-321.776C315.86,-310.19 340.856,-294.077 361.023,-281.077"/>
+<polygon fill="red" stroke="red" points="363.112,-283.894 369.621,-275.534 359.319,-278.011 363.112,-283.894"/>
+<text text-anchor="middle" x="392.854" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P54_has_current_owner</text>
+</g>
+<!-- con -->
+<g id="node8" class="node"><title>con</title>
+<ellipse fill="none" stroke="red" cx="524.27" cy="-178" rx="80.1403" ry="18"/>
+<text text-anchor="middle" x="524.27" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">object/PPA82633/concept/1</text>
+</g>
+<!-- o&#45;&gt;con -->
+<g id="edge6" class="edge"><title>o&#45;&gt;con</title>
+<path fill="none" stroke="red" d="M322.863,-330.474C381.749,-321.472 475.782,-303.207 497.27,-276 513.047,-256.024 519.612,-227.389 522.339,-206.409"/>
+<polygon fill="red" stroke="red" points="525.845,-206.541 523.445,-196.222 518.886,-205.786 525.845,-206.541"/>
+<text text-anchor="middle" x="548.839" y="-255.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P128_carries</text>
+</g>
+<!-- th1lab -->
+<g id="node3" class="node"><title>th1lab</title>
+<ellipse fill="none" stroke="gray" cx="40.2702" cy="-178" rx="27" ry="18"/>
+<text text-anchor="middle" x="40.2702" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">paper</text>
+</g>
+<!-- th1&#45;&gt;th1lab -->
+<g id="edge2" class="edge"><title>th1&#45;&gt;th1lab</title>
+<path fill="none" stroke="gray" d="M40.2702,-239.689C40.2702,-229.894 40.2702,-217.422 40.2702,-206.335"/>
+<polygon fill="gray" stroke="gray" points="43.7703,-206.262 40.2702,-196.262 36.7703,-206.262 43.7703,-206.262"/>
+<text text-anchor="middle" x="66.2858" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
+</g>
+<!-- deplab -->
+<g id="node5" class="node"><title>deplab</title>
+<ellipse fill="none" stroke="gray" cx="172.27" cy="-178" rx="66.8537" ry="18"/>
+<text text-anchor="middle" x="172.27" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">Julius Caesar Scaliger</text>
+</g>
+<!-- dep&#45;&gt;deplab -->
+<g id="edge4" class="edge"><title>dep&#45;&gt;deplab</title>
+<path fill="none" stroke="gray" d="M172.27,-239.689C172.27,-229.894 172.27,-217.422 172.27,-206.335"/>
+<polygon fill="gray" stroke="gray" points="175.77,-206.262 172.27,-196.262 168.77,-206.262 175.77,-206.262"/>
+<text text-anchor="middle" x="198.286" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
+</g>
+<!-- contype -->
+<g id="node9" class="node"><title>contype</title>
+<ellipse fill="none" stroke="gray" cx="431.27" cy="-98" rx="85.8678" ry="18"/>
+<text text-anchor="middle" x="431.27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">ecrm:E73_Information_Object</text>
+</g>
+<!-- con&#45;&gt;contype -->
+<g id="edge7" class="edge"><title>con&#45;&gt;contype</title>
+<path fill="none" stroke="gray" d="M504.547,-160.458C491.321,-149.365 473.71,-134.595 459.068,-122.314"/>
+<polygon fill="gray" stroke="gray" points="461.196,-119.531 451.285,-115.787 456.698,-124.895 461.196,-119.531"/>
+<text text-anchor="middle" x="494.61" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">rdf:type</text>
+</g>
+<!-- concon -->
+<g id="node10" class="node"><title>concon</title>
+<ellipse fill="none" stroke="gray" cx="576.27" cy="-98" rx="40.8927" ry="18"/>
+<text text-anchor="middle" x="576.27" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">thes:x12440</text>
+</g>
+<!-- con&#45;&gt;concon -->
+<g id="edge8" class="edge"><title>con&#45;&gt;concon</title>
+<path fill="none" stroke="gray" d="M535.553,-160.075C542.599,-149.507 551.795,-135.713 559.664,-123.91"/>
+<polygon fill="gray" stroke="gray" points="562.73,-125.619 565.365,-115.357 556.906,-121.736 562.73,-125.619"/>
+<text text-anchor="middle" x="588.96" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P129_is_about</text>
+</g>
+<!-- conlab -->
+<g id="node11" class="node"><title>conlab</title>
+<ellipse fill="none" stroke="gray" cx="576.27" cy="-18" rx="33.894" ry="18"/>
+<text text-anchor="middle" x="576.27" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">academic</text>
+</g>
+<!-- concon&#45;&gt;conlab -->
+<g id="edge9" class="edge"><title>concon&#45;&gt;conlab</title>
+<path fill="none" stroke="gray" d="M576.27,-79.6893C576.27,-69.8938 576.27,-57.4218 576.27,-46.335"/>
+<polygon fill="gray" stroke="gray" points="579.77,-46.2623 576.27,-36.2623 572.77,-46.2624 579.77,-46.2623"/>
+<text text-anchor="middle" x="602.286" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
+</g>
+</g>
+</svg>
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql04.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql04.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql05.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql05.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql06.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql06.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql07.svg
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql07.svg
@@ -0,0 +1,114 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 2.38.0 (20140413.2041)
+ -->
+<!-- Title: %3 Pages: 1 -->
+<svg width="367pt" height="364pt"
+ viewBox="0.00 0.00 367.21 364.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 360)">
+<title>%3</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-360 363.215,-360 363.215,4 -4,4"/>
+<!-- obj -->
+<g id="node1" class="node"><title>obj</title>
+<ellipse fill="none" stroke="gray" cx="148.735" cy="-338" rx="148.97" ry="18"/>
+<text text-anchor="middle" x="148.735" y="-335" font-family="Helvetica,sans-Serif" font-size="10.00">http://collection.britishmuseum.org/id/object/PPA82633</text>
+</g>
+<!-- object_type -->
+<g id="node2" class="node"><title>object_type</title>
+<ellipse fill="none" stroke="gray" cx="51.7348" cy="-258" rx="38.5366" ry="18"/>
+<text text-anchor="middle" x="51.7348" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">object_type</text>
+</g>
+<!-- obj&#45;&gt;object_type -->
+<g id="edge1" class="edge"><title>obj&#45;&gt;object_type</title>
+<path fill="none" stroke="gray" d="M98.3951,-320.855C88.4182,-315.964 78.6502,-309.764 70.9106,-302 66.4004,-297.476 62.8568,-291.706 60.1099,-285.873"/>
+<polygon fill="gray" stroke="gray" points="63.1969,-284.17 56.2095,-276.205 56.7054,-286.789 63.1969,-284.17"/>
+<text text-anchor="middle" x="108.647" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">bmo:PX_object_type</text>
+</g>
+<!-- production -->
+<g id="node4" class="node"><title>production</title>
+<ellipse fill="none" stroke="gray" cx="148.735" cy="-258" rx="36.3999" ry="18"/>
+<text text-anchor="middle" x="148.735" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">production</text>
+</g>
+<!-- obj&#45;&gt;production -->
+<g id="edge3" class="edge"><title>obj&#45;&gt;production</title>
+<path fill="none" stroke="gray" d="M148.735,-319.689C148.735,-309.894 148.735,-297.422 148.735,-286.335"/>
+<polygon fill="gray" stroke="gray" points="152.235,-286.262 148.735,-276.262 145.235,-286.262 152.235,-286.262"/>
+<text text-anchor="middle" x="203.657" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P108i_was_produced_by</text>
+</g>
+<!-- other -->
+<g id="node8" class="node"><title>other</title>
+<text text-anchor="middle" x="281.735" y="-255" font-family="Helvetica,sans-Serif" font-size="10.00">Other top&#45;level object attributes</text>
+</g>
+<!-- obj&#45;&gt;other -->
+<g id="edge7" class="edge"><title>obj&#45;&gt;other</title>
+<path fill="none" stroke="gray" d="M223.709,-322.337C236.677,-317.399 249.318,-310.804 259.735,-302 264.947,-297.595 269.068,-291.646 272.265,-285.579"/>
+<polygon fill="gray" stroke="gray" points="275.598,-286.706 276.554,-276.154 269.227,-283.807 275.598,-286.706"/>
+<text text-anchor="middle" x="271.069" y="-295.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
+</g>
+<!-- print -->
+<g id="node3" class="node"><title>print</title>
+<ellipse fill="none" stroke="gray" cx="51.7348" cy="-178" rx="27" ry="18"/>
+<text text-anchor="middle" x="51.7348" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">print</text>
+</g>
+<!-- object_type&#45;&gt;print -->
+<g id="edge2" class="edge"><title>object_type&#45;&gt;print</title>
+<path fill="none" stroke="gray" d="M51.7348,-239.689C51.7348,-229.894 51.7348,-217.422 51.7348,-206.335"/>
+<polygon fill="gray" stroke="gray" points="55.2349,-206.262 51.7348,-196.262 48.2349,-206.262 55.2349,-206.262"/>
+<text text-anchor="middle" x="77.7504" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">skos:prefLabel</text>
+</g>
+<!-- date -->
+<g id="node5" class="node"><title>date</title>
+<ellipse fill="none" stroke="gray" cx="134.735" cy="-178" rx="27" ry="18"/>
+<text text-anchor="middle" x="134.735" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">date</text>
+</g>
+<!-- production&#45;&gt;date -->
+<g id="edge4" class="edge"><title>production&#45;&gt;date</title>
+<path fill="none" stroke="gray" d="M143.042,-240.075C141.328,-234.389 139.605,-227.974 138.481,-222 137.543,-217.015 136.839,-211.66 136.311,-206.48"/>
+<polygon fill="gray" stroke="gray" points="139.778,-205.937 135.456,-196.264 132.803,-206.521 139.778,-205.937"/>
+<text text-anchor="middle" x="175.862" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P9_consists_of</text>
+</g>
+<!-- other_prod -->
+<g id="node9" class="node"><title>other_prod</title>
+<text text-anchor="middle" x="234.735" y="-175" font-family="Helvetica,sans-Serif" font-size="10.00">Other production info</text>
+</g>
+<!-- production&#45;&gt;other_prod -->
+<g id="edge8" class="edge"><title>production&#45;&gt;other_prod</title>
+<path fill="none" stroke="gray" d="M176.351,-246.343C188.633,-240.562 202.579,-232.439 212.735,-222 217.34,-217.267 221.193,-211.376 224.322,-205.489"/>
+<polygon fill="gray" stroke="gray" points="227.508,-206.94 228.646,-196.406 221.187,-203.931 227.508,-206.94"/>
+<text text-anchor="middle" x="222.069" y="-215.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
+</g>
+<!-- timespan -->
+<g id="node6" class="node"><title>timespan</title>
+<ellipse fill="none" stroke="gray" cx="134.735" cy="-98" rx="32.8294" ry="18"/>
+<text text-anchor="middle" x="134.735" y="-95" font-family="Helvetica,sans-Serif" font-size="10.00">timespan</text>
+</g>
+<!-- date&#45;&gt;timespan -->
+<g id="edge5" class="edge"><title>date&#45;&gt;timespan</title>
+<path fill="none" stroke="gray" d="M134.735,-159.689C134.735,-149.894 134.735,-137.422 134.735,-126.335"/>
+<polygon fill="gray" stroke="gray" points="138.235,-126.262 134.735,-116.262 131.235,-126.262 138.235,-126.262"/>
+<text text-anchor="middle" x="178.088" y="-135.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P4_has_time&#45;span</text>
+</g>
+<!-- start_date -->
+<g id="node7" class="node"><title>start_date</title>
+<ellipse fill="none" stroke="gray" cx="66.7348" cy="-18" rx="34.828" ry="18"/>
+<text text-anchor="middle" x="66.7348" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">start_date</text>
+</g>
+<!-- timespan&#45;&gt;start_date -->
+<g id="edge6" class="edge"><title>timespan&#45;&gt;start_date</title>
+<path fill="none" stroke="gray" d="M105.598,-89.5265C91.4682,-84.285 75.76,-75.6942 67.3129,-62 64.4467,-57.3534 63.1708,-51.8529 62.8105,-46.3654"/>
+<polygon fill="gray" stroke="gray" points="66.3171,-46.1414 63.0686,-36.0569 59.3193,-45.9661 66.3171,-46.1414"/>
+<text text-anchor="middle" x="124.446" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">ecrm:P82a_begin_of_the_begin</text>
+</g>
+<!-- other_date -->
+<g id="node10" class="node"><title>other_date</title>
+<text text-anchor="middle" x="202.735" y="-15" font-family="Helvetica,sans-Serif" font-size="10.00">End of begin, Start of end...</text>
+</g>
+<!-- timespan&#45;&gt;other_date -->
+<g id="edge9" class="edge"><title>timespan&#45;&gt;other_date</title>
+<path fill="none" stroke="gray" d="M155.643,-84.0943C164.163,-78.1192 173.652,-70.4665 180.735,-62 184.802,-57.138 188.386,-51.398 191.417,-45.7224"/>
+<polygon fill="gray" stroke="gray" points="194.728,-46.9184 195.983,-36.3981 188.442,-43.8394 194.728,-46.9184"/>
+<text text-anchor="middle" x="190.069" y="-55.6" font-family="Helvetica,sans-Serif" font-size="8.00">...</text>
+</g>
+</g>
+</svg>
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql08.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql08.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql09-1.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql09-1.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql09.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql09.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql10.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql10.png
--- a/lod/tutorial/images/graph-databases-and-SPARQL/sparql11.png
+++ b/lod/tutorial/images/graph-databases-and-SPARQL/sparql11.png
--- a/lod/tutorial/js/bootstrap-4-navbar.js
+++ b/lod/tutorial/js/bootstrap-4-navbar.js
@@ -0,0 +1,34 @@
+
+/*!
+ * Bootstrap 4 multi dropdown navbar ( https://bootstrapthemes.co/demo/resource/bootstrap-4-multi-dropdown-navbar/ )
+ * Copyright 2017.
+ * Licensed under the GPL license
+ */
+
+
+$( document ).ready( function () {
+    $( '.mobile-drop a.dropdown-toggle' ).on( 'click', function ( e ) {
+        var $el = $( this );
+        var $parent = $( this ).offsetParent( ".mobile-drop" );
+        if ($('.show.mobile-drop').length > 0){
+          $('.show.mobile-drop').each(function(item){
+            $(this).toggleClass('show');
+          });
+        }
+
+        var $subMenu = $( this ).next( ".mobile-drop" );
+        $subMenu.toggleClass( 'show' );
+
+        $( this ).parent( "li" ).toggleClass( 'show' );
+
+        $( this ).parents( 'li.nav-item.dropdown.mobile-drop.show' ).on( 'click', function ( e ) {
+            $( '.mobile-drop .show' ).removeClass( "show" );
+        } );
+
+         if ( !$parent.parent().hasClass( 'navbar-nav' ) ) {
+            $el.next().css( { "top": $el[0].offsetTop, "left": $parent.outerWidth() - 4 } );
+        }
+
+        return false;
+    } );
+} );
--- a/lod/tutorial/js/ext_links.js
+++ b/lod/tutorial/js/ext_links.js
@@ -0,0 +1,8 @@
+$(document).ready(function() {
+   $('a').each(function() {
+      var a = new RegExp('/' + window.location.host + '/');
+      if (!a.test(this.href)) {
+         $(this).attr("target", "_blank");
+      }
+   });
+});
--- a/lod/tutorial/js/header_links.js
+++ b/lod/tutorial/js/header_links.js
@@ -0,0 +1,13 @@
+// http://ben.balter.com/2014/03/13/pages-anchor-links/
+
+$(function() {
+  return $("h2, h3, h4, h5, h6").each(function(i, el) {
+    var $el, icon, id;
+    $el = $(el);
+    id = $el.attr('id');
+    icon = '<i class="fa fa-link" style="font-size: 0.8em"></i>';
+    if (id) {
+      return $el.append($("<a />").addClass("header-link").attr("href", "#" + id).html(icon));
+    }
+  });
+});
--- a/lod/tutorial/sparql-datos-abiertos-enlazados.html
+++ b/lod/tutorial/sparql-datos-abiertos-enlazados.html
--- a/lod/tutorial/sparql-intro.ipynb
+++ b/lod/tutorial/sparql-intro.ipynb
--- a/lod/tutorial/sparql-vanGogh.ipynb
+++ b/lod/tutorial/sparql-vanGogh.ipynb
--- a/lod/upload.sh
+++ b/lod/upload.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+
+# This is a bit messy
+if [ "$#" -lt 1 ]; then
+        graph_base="http://example.com/sitc/submission"
+        endpoint="http://fuseki.gsi.upm.es/hotels/data"
+else if [ "$#" -lt 2 ]; then
+        endpoint=$1
+        graph_base="http://example.com/sitc"
+     else
+         if [ "$#" -lt 3 ]; then
+             endpoint=$1
+             graph_base=$2
+         else
+             echo "Usage: $0 [<endpoint>] [<graph_base_uri>]"
+             echo
+             exit 1
+         fi
+     fi
+fi
+
+
+upload(){
+    name=$1
+    file=$2
+    echo '###'
+    echo "Uploading: $file"
+    echo "Graph: $graph_base/$name"
+    echo "Endpoint: $endpoint"
+    curl -X POST \
+         --digest -u "admin:$PASSWORD" \
+         -H Content-Type:text/turtle \
+         -T "$file" \
+         --data-urlencode "graph=$graph_base/$name" \
+         -G "$endpoint"
+
+}
+
+
+total=0
+printf 'Password: '
+stty -echo; read -r PASSWORD; stty echo; echo
+
+echo "Uploading synthetic"
+upload "synthetic" synthetic/reviews.ttl || exit 1
+
+for i in *.ttl; do
+    identifier=$(echo "${i%.ttl}" | md5sum | awk '{print $1}')
+    echo "Uploading $i"
+    upload "$identifier" "$i"
+    total=$((total + 1))
+done
+echo Uploaded $total
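The loop in upload.sh names each uploaded graph after the MD5 of the file name without its `.ttl` suffix. A minimal Python sketch of that naming rule follows, useful for predicting which graph a given file ends up in. The function name `graph_uri` is hypothetical, and the base URI in the usage line is the script's one-argument default; nothing here is part of the repository.

```python
import hashlib


def graph_uri(graph_base: str, ttl_file: str) -> str:
    """Return the named-graph URI the shell loop would derive for one file."""
    # ${i%.ttl} strips the suffix in the script; do the same here.
    stem = ttl_file[: -len(".ttl")] if ttl_file.endswith(".ttl") else ttl_file
    # `echo` appends a newline, so md5sum hashes the stem plus "\n".
    identifier = hashlib.md5((stem + "\n").encode("utf-8")).hexdigest()
    return f"{graph_base}/{identifier}"


if __name__ == "__main__":
    # Assumed base URI: the default used when only the endpoint is given.
    print(graph_uri("http://example.com/sitc", "reviews.ttl"))
```

Because the identifier depends only on the file name, uploading a file again targets the same graph URI.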
--- a/ml1/2_0_0_Intro_ML.ipynb
+++ b/ml1/2_0_0_Intro_ML.ipynb
@@ -71,8 +71,7 @@
   "source": [
    "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n",
-    "* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
-    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
   ]
  },
  {
@@ -88,7 +87,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -102,7 +101,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.7"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml1/2_0_1_Objectives.ipynb
+++ b/ml1/2_0_1_Objectives.ipynb
@@ -63,9 +63,7 @@
   "metadata": {},
   "source": [
    "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
-    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n",
-    "* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
-    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n"
   ]
  },
  {
@@ -81,7 +79,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -95,7 +93,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.7"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml1/2_1_Intro_ScikitLearn.ipynb
+++ b/ml1/2_1_Intro_ScikitLearn.ipynb
@@ -87,10 +87,10 @@
   "metadata": {},
   "source": [
    "Scikit-learn provides algorithms for solving the following problems:\n",
-    "* **Classification**: Identifying to which category an object belongs to. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, kNN, ...), SVM, Random forest, Perceptron, etc. \n",
+    "* **Classification**: Identifying which category an object belongs to. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, C4.5, ...), kNN, SVM, Random forest, Perceptron, etc. \n",
    "* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n",
    "* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n",
-    "* ** Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
+    "* **Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
   ]
  },
  {
--- a/ml1/2_2_Read_Data.ipynb
+++ b/ml1/2_2_Read_Data.ipynb
@@ -36,7 +36,7 @@
   "source": [
    "The goal of this notebook is to learn how to read and load a sample dataset.\n",
    "\n",
-    "Scikit-learn comes with some bundled [datasets](http://scikit-learn.org/stable/datasets/): iris, digits, boston, etc.\n",
+    "Scikit-learn comes with some bundled [datasets](https://scikit-learn.org/stable/datasets.html): iris, digits, boston, etc.\n",
    "\n",
    "In this notebook we are going to use the Iris dataset."
   ]
@@ -54,7 +54,7 @@
   "source": [
    "The [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), available at [UCI dataset repository](https://archive.ics.uci.edu/ml/datasets/Iris), is a classic dataset for classification.\n",
    "\n",
-    "The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features.\n",
+    "The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, a machine learning model will learn to differentiate the species of Iris.\n",
    "\n",
    "![Iris](files/images/iris-dataset.jpg)"
   ]
@@ -63,7 +63,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "In ordert to read the dataset, we import the datasets bundle and then load the Iris dataset. "
+    "In order to read the dataset, we import the datasets bundle and then load the Iris dataset. "
   ]
  },
  {
--- a/ml1/2_3_0_Visualisation.ipynb
+++ b/ml1/2_3_0_Visualisation.ipynb
@@ -228,7 +228,6 @@
   "source": [
    "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
    "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
-    "* [Mastering Pandas](http://proquest.safaribooksonline.com/book/programming/python/9781783981960), Femi Anthony, Packt Publishing, 2015.\n",
    "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
    "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
    "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n",
--- a/ml1/2_3_1_Advanced_Visualisation.ipynb
+++ b/ml1/2_3_1_Advanced_Visualisation.ipynb
--- a/ml1/2_4_Preprocessing.ipynb
+++ b/ml1/2_4_Preprocessing.ipynb
@@ -163,7 +163,6 @@
   "source": [
    "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
    "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
-    "* [Mastering Pandas](http://proquest.safaribooksonline.com/book/programming/python/9781783981960), Femi Anthony, Packt Publishing, 2015.\n",
    "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
    "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
    "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)"
--- a/ml1/2_5_0_Machine_Learning.ipynb
+++ b/ml1/2_5_0_Machine_Learning.ipynb
@@ -154,7 +154,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "* [General concepts of machine learning with scikit-learn](http://www.astroml.org/sklearn_tutorial/general_concepts.html)\n",
+    "* [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/index.html)\n",
    "* [A Tour of Machine Learning Algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/)"
   ]
  },
@@ -177,7 +177,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -191,7 +191,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.5.6"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml1/2_5_1_kNN_Model.ipynb
+++ b/ml1/2_5_1_kNN_Model.ipynb
--- a/ml1/2_5_2_Decision_Tree_Model.ipynb
+++ b/ml1/2_5_2_Decision_Tree_Model.ipynb
@@ -130,12 +130,7 @@
    {
     "data": {
      "text/plain": [
-       "DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=3,\n",
-       "            max_features=None, max_leaf_nodes=None,\n",
-       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
-       "            min_samples_leaf=1, min_samples_split=2,\n",
-       "            min_weight_fraction_leaf=0.0, presort=False, random_state=1,\n",
-       "            splitter='best')"
+       "DecisionTreeClassifier(max_depth=3, random_state=1)"
      ]
     },
     "execution_count": 2,
@@ -277,20 +272,23 @@
   "metadata": {},
   "outputs": [
    {
-     "ename": "ModuleNotFoundError",
-     "evalue": "No module named 'pydotplus'",
+     "ename": "InvocationException",
+     "evalue": "GraphViz's executables not found",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
-      "\u001b[0;31mModuleNotFoundError\u001b[0m                       Traceback (most recent call last)",
-      "\u001b[0;32m<ipython-input-7-1bf5ec7fb043>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mIPython\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdisplay\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mImage\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      2\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0msklearn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexternals\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msix\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mpydotplus\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpydot\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0mdot_data\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
-      "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pydotplus'"
+      "\u001b[0;31mInvocationException\u001b[0m                       Traceback (most recent call last)",
+      "\u001b[0;32m/tmp/ipykernel_47326/3723147494.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m     12\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     13\u001b[0m \u001b[0mgraph\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpydot\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraph_from_dot_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdot_data\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgetvalue\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 14\u001b[0;31m \u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite_png\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'iris-tree.png'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     15\u001b[0m \u001b[0mImage\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgraph\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate_png\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
+      "\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36m<lambda>\u001b[0;34m(path, f, prog)\u001b[0m\n\u001b[1;32m   1808\u001b[0m                 \u001b[0;32mlambda\u001b[0m \u001b[0mpath\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1809\u001b[0m                 \u001b[0mf\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mfrmt\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1810\u001b[0;31m                 \u001b[0mprog\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprog\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   1811\u001b[0m             )\n\u001b[1;32m   1812\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
+      "\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36mwrite\u001b[0;34m(self, path, prog, format)\u001b[0m\n\u001b[1;32m   1916\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1917\u001b[0m             \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1918\u001b[0;31m                 \u001b[0mfobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mprog\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mformat\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   1919\u001b[0m         \u001b[0;32mfinally\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1920\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0mclose\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
+      "\u001b[0;32m~/anaconda3/lib/python3.8/site-packages/pydotplus/graphviz.py\u001b[0m in \u001b[0;36mcreate\u001b[0;34m(self, prog, format)\u001b[0m\n\u001b[1;32m   1957\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprogs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfind_graphviz\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1958\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprogs\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1959\u001b[0;31m                 raise InvocationException(\n\u001b[0m\u001b[1;32m   1960\u001b[0m                     'GraphViz\\'s executables not found')\n\u001b[1;32m   1961\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
+      "\u001b[0;31mInvocationException\u001b[0m: GraphViz's executables not found"
     ]
    }
   ],
   "source": [
    "from IPython.display import Image \n",
-    "from sklearn.externals.six import StringIO\n",
+    "from six import StringIO\n",
    "import pydotplus as pydot\n",
    "\n",
    "dot_data = StringIO()  \n",
@@ -510,10 +508,8 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n",
-    "* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
-    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
-    "* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
+    "* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
+    "* [Parameter estimation using grid search with cross-validation](https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)\n",
    "* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
   ]
  },
@@ -529,8 +525,17 @@
  }
 ],
 "metadata": {
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -544,7 +549,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml1/2_6_Model_Tuning.ipynb
+++ b/ml1/2_6_Model_Tuning.ipynb
@@ -39,7 +39,7 @@
    "* [Train classifier](#Train-classifier)\n",
    "* [More about Pipelines](#More-about-Pipelines)\n",
    "* [Tuning the algorithm](#Tuning-the-algorithm)\n",
-    "\t* [Grid Search for Parameter optimization](#Grid-Search-for-Parameter-optimization)\n",
+    "\t* [Grid Search for Hyperparameter optimization](#Grid-Search-for-Hyperparameter-optimization)\n",
    "* [Evaluating the algorithm](#Evaluating-the-algorithm)\n",
    "\t* [K-Fold validation](#K-Fold-validation)\n",
    "* [References](#References)\n"
@@ -56,9 +56,9 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 9.47. Could we get a better accuracy if we tune the parameters of the estimator?\n",
+    "In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 0.947. Could we get a better accuracy if we tune the hyperparameters of the estimator?\n",
    "\n",
-    "The goal of this notebook is to learn how to tune an algorithm by opimizing its parameters using grid search."
+    "The goal of this notebook is to learn how to tune an algorithm by optimizing its hyperparameters using grid search."
   ]
  },
  {
@@ -300,21 +300,21 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "You can try different values for these parameters and observe the results."
+    "You can try different values for these hyperparameters and observe the results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Grid Search for Parameter optimization"
+    "### Grid Search for Hyperparameter optimization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Changing manually the parameters to find their optimal values is not practical. Instead, we can consider to find the optimal value of the parameters as an *optimization problem*. \n",
+    "Manually changing the hyperparameters to find their optimal values is not practical. Instead, we can treat the search for the optimal hyperparameter values as an *optimization problem*. \n",
    "\n",
    "The sklearn comes with several optimization techniques for this purpose, such as  **grid search** and  **randomized search**. In this notebook we are going to introduce the former one."
   ]
@@ -323,7 +323,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "The sklearn provides an object that, given data, computes the score during the fit of an estimator on a parameter grid and chooses the parameters to maximize the cross-validation score. "
+    "Scikit-learn provides an object that, given data, computes the score during the fit of an estimator on a hyperparameter grid and chooses the hyperparameters to maximize the cross-validation score. "
   ]
  },
  {
@@ -371,7 +371,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can now evaluate the KFold with this optimized parameter as follows."
+    "We can now evaluate the KFold with this optimized hyperparameter as follows."
   ]
  },
  {
@@ -405,7 +405,7 @@
   "source": [
    "We have got an *improvement* from 0.947 to 0.953 with k-fold.\n",
    "\n",
-    "We are now to try to fit the best combination of the parameters of the algorithm. It can take some time to compute it."
+    "We will now try to find the best combination of the hyperparameters of the algorithm. It can take some time to compute it."
   ]
  },
  {
@@ -414,12 +414,12 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# Set the parameters by cross-validation\n",
+    "# Set the hyperparameters by cross-validation\n",
    "\n",
-    "from sklearn.metrics import classification_report\n",
+    "from sklearn.metrics import classification_report, recall_score, precision_score, make_scorer\n",
    "\n",
-    "# set of parameters to test\n",
-    "tuned_parameters = [{'max_depth': np.arange(3, 10),\n",
+    "# set of hyperparameters to test\n",
+    "tuned_hyperparameters = [{'max_depth': np.arange(3, 10),\n",
    "#                     'max_weights': [1, 10, 100, 1000]},\n",
    "                     'criterion': ['gini', 'entropy'], \n",
    "                     'splitter': ['best', 'random'],\n",
@@ -431,14 +431,19 @@
    "scores = ['precision', 'recall']\n",
    "\n",
    "for score in scores:\n",
-    "    print(\"# Tuning hyper-parameters for %s\" % score)\n",
+    "    print(\"# Tuning hyperparameters for %s\" % score)\n",
    "    print()\n",
    "\n",
+    "    if score == 'precision':\n",
+    "        scorer = make_scorer(precision_score, average='weighted', zero_division=0)\n",
+    "    elif score == 'recall':\n",
+    "        scorer = make_scorer(recall_score, average='weighted', zero_division=0)\n",
+    "    \n",
    "    # cv = the fold of the cross-validation cv, defaulted to 5\n",
-    "    gs = GridSearchCV(DecisionTreeClassifier(), tuned_parameters, cv=10, scoring='%s_weighted' % score)\n",
+    "    gs = GridSearchCV(DecisionTreeClassifier(), tuned_hyperparameters, cv=10, scoring=scorer)\n",
    "    gs.fit(x_train, y_train)\n",
    "\n",
-    "    print(\"Best parameters set found on development set:\")\n",
+    "    print(\"Best hyperparameters set found on development set:\")\n",
    "    print()\n",
    "    print(gs.best_params_)\n",
    "    print()\n",
@@ -512,10 +517,8 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n",
-    "* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
-    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
-    "* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
+    "* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
+    "* [Hyperparameter estimation using grid search with cross-validation](https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)\n",
    "* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
   ]
  },
@@ -538,7 +541,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -552,7 +555,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.7"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml1/2_7_Model_Persistence.ipynb
+++ b/ml1/2_7_Model_Persistence.ipynb
@@ -117,7 +117,7 @@
   "outputs": [],
   "source": [
    "# save model\n",
-    "from sklearn.externals import joblib\n",
+    "import joblib\n",
    "joblib.dump(model, 'filename.pkl') \n",
    "\n",
    "#load model\n",
@@ -136,7 +136,9 @@
   "metadata": {},
   "source": [
    "* [Tutorial scikit-learn](http://scikit-learn.org/stable/tutorial/basic/tutorial.html)\n",
-    "* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)"
+    "* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)\n",
+    "* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka and Vahid Mirjalili, Packt Publishing, 2019."
   ]
  },
  {
@@ -151,8 +153,17 @@
  }
 ],
 "metadata": {
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -166,7 +177,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.7"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml1/util_ds.py
+++ b/ml1/util_ds.py
@@ -47,7 +47,7 @@ def get_code(tree, feature_names, target_names,

    recurse(left, right, threshold, features, 0, 0)

-# Taken from http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#example-tree-plot-iris-py
+# Taken from https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html
 import numpy as np
 import matplotlib.pyplot as plt

@@ -114,4 +114,4 @@ def plot_tree_iris():

    plt.suptitle("Decision surface of a decision tree using paired features")
    plt.legend()
-    plt.show()  
+    plt.show()  
--- a/ml1/util_knn.py
+++ b/ml1/util_knn.py
@@ -2,6 +2,7 @@ import numpy as np
 import matplotlib.pyplot as plt
 from matplotlib.colors import ListedColormap
 from sklearn import neighbors, datasets
+import seaborn as sns
 from sklearn.neighbors import KNeighborsClassifier

 # Taken from http://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html
@@ -19,9 +20,9 @@ def plot_classification_iris():
    h = .02  # step size in the mesh
    n_neighbors = 15

-    # Create color maps
-    cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF'])
-    cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])
+    # Create color maps
+    cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
+    cmap_bold = ['darkorange', 'c', 'darkblue']

    for weights in ['uniform', 'distance']:
        # we create an instance of Neighbours Classifier and fit the data.
@@ -29,7 +30,7 @@ def plot_classification_iris():
        clf.fit(X, y)

        # Plot the decision boundary. For that, we will assign a color to each
-        # point in the mesh [x_min, m_max]x[y_min, y_max].
+        # point in the mesh [x_min, x_max]x[y_min, y_max].
        x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
        y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
        xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
@@ -38,14 +39,17 @@ def plot_classification_iris():

        # Put the result into a color plot
        Z = Z.reshape(xx.shape)
-        plt.figure()
-        plt.pcolormesh(xx, yy, Z, cmap=cmap_light)
+        plt.figure(figsize=(8, 6))
+        plt.contourf(xx, yy, Z, cmap=cmap_light)

        # Plot also the training points
-        plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold)
+        sns.scatterplot(x=X[:, 0], y=X[:, 1], hue=iris.target_names[y],
+                        palette=cmap_bold, alpha=1.0, edgecolor="black")
        plt.xlim(xx.min(), xx.max())
        plt.ylim(yy.min(), yy.max())
        plt.title("3-Class classification (k = %i, weights = '%s')"
-                % (n_neighbors, weights))
+                  % (n_neighbors, weights))
+        plt.xlabel(iris.feature_names[0])
+        plt.ylabel(iris.feature_names[1])

-    plt.show()
+    plt.show()
--- a/ml2/3_0_0_Intro_ML_2.ipynb
+++ b/ml2/3_0_0_Intro_ML_2.ipynb
@@ -74,9 +74,7 @@
   "metadata": {},
   "source": [
    "* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
-    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n",
-    "* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
-    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+    "* [Scikit-learn videos and notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n"
   ]
  },
  {
@@ -92,7 +90,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -106,7 +104,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/3_1_Read_Data.ipynb
+++ b/ml2/3_1_Read_Data.ipynb
@@ -213,8 +213,7 @@
    "* [Pandas API input-output](http://pandas.pydata.org/pandas-docs/stable/api.html#input-output)\n",
    "* [Pandas API - pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)\n",
    "* [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html)\n",
-    "* [An introduction to NumPy and Scipy](http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf)\n",
-    "* [NumPy tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)"
+    "* [An introduction to NumPy and Scipy](https://sites.engineering.ucsb.edu/~shell/che210d/numpy.pdf)\n"
   ]
  },
  {
--- a/ml2/3_2_Pandas.ipynb
+++ b/ml2/3_2_Pandas.ipynb
@@ -433,10 +433,9 @@
   "metadata": {},
   "source": [
    "* [Pandas](http://pandas.pydata.org/)\n",
-    "* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n",
-    "* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
+    "* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
    "* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
-    "* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)"
+    "* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)"
   ]
  },
  {
@@ -458,7 +457,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -472,7 +471,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/3_3_Data_Munging_with_Pandas.ipynb
+++ b/ml2/3_3_Data_Munging_with_Pandas.ipynb
@@ -373,8 +373,8 @@
   "source": [
    "#Mean age of  passengers per Passenger class\n",
    "\n",
-    "#First we calculate the mean\n",
-    "df.groupby('Pclass').mean()"
+    "#First we calculate the mean for the numeric columns\n",
+    "df.select_dtypes(np.number).groupby('Pclass').mean()"
   ]
  },
  {
@@ -404,7 +404,7 @@
   "outputs": [],
   "source": [
    "#Mean Age and SibSp of passengers grouped by passenger class and sex\n",
-    "df.groupby(['Pclass', 'Sex'])['Age','SibSp'].mean()"
+    "df.groupby(['Pclass', 'Sex'])[['Age','SibSp']].mean()"
   ]
  },
  {
@@ -414,7 +414,7 @@
   "outputs": [],
   "source": [
    "#Show mean  Age and  SibSp for passengers  older than 25 grouped by Passenger Class and Sex\n",
-    "df[df.Age > 25].groupby(['Pclass', 'Sex'])['Age','SibSp'].mean()"
+    "df[df.Age > 25].groupby(['Pclass', 'Sex'])[['Age','SibSp']].mean()"
   ]
  },
  {
@@ -424,7 +424,7 @@
   "outputs": [],
   "source": [
    "# Mean age, SibSp , Survived of passengers older than 25 which survived, grouped by Passenger Class and Sex \n",
-    "df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].mean()"
+    "df[(df.Age > 25) & (df.Survived == 1)].groupby(['Pclass', 'Sex'])[['Age','SibSp','Survived']].mean()"
   ]
  },
  {
@@ -436,7 +436,7 @@
    "# We can also decide which function apply in each column\n",
    "\n",
    "#Show mean Age, mean SibSp, and number of passengers older than 25 that survived,  grouped by Passenger Class and Sex\n",
-    "df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].agg({'Age': np.mean, \n",
+    "df[(df.Age > 25) & (df.Survived == 1)].groupby(['Pclass', 'Sex'])[['Age','SibSp','Survived']].agg({'Age': np.mean, \n",
    "                                                                         'SibSp': np.mean, 'Survived': np.sum})"
   ]
  },
@@ -600,8 +600,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# Fill missing values with the median\n",
-    "df_filled = df.fillna(df.median())\n",
+    "# Fill missing values with the median; numeric_only=True restricts the median to the numeric columns\n",
+    "df_filled = df.fillna(df.median(numeric_only=True))\n",
    "df_filled[-5:]"
   ]
  },
@@ -685,7 +685,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# But we are working on a copy \n",
+    "# But we are working on a copy, so we get a warning\n",
    "df.iloc[889]['Sex'] = np.nan"
   ]
  },
@@ -695,7 +695,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# If we want to change, we should not chain selections\n",
+    "# If we want to change it, we should not chain selections\n",
    "# The selection can be done with the column name\n",
    "df.loc[889, 'Sex']"
   ]
@@ -932,11 +932,11 @@
   "metadata": {},
   "source": [
    "* [Pandas](http://pandas.pydata.org/)\n",
-    "* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n",
-    "* [Useful Pandas Snippets](https://gist.github.com/bsweger/e5817488d161f37dcbd2)\n",
-    "* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
+    "* [Learning Pandas, Michael Heydt, Packt Publishing, 2017](https://learning.oreilly.com/library/view/learning-pandas/9781787123137/)\n",
+    "* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
    "* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
-    "* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)"
+    "* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)\n",
+    "* [Useful Pandas Snippets](https://gist.github.com/bsweger/e5817488d161f37dcbd2)"
   ]
  },
  {
@@ -958,7 +958,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -972,7 +972,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/3_4_Visualisation_Pandas.ipynb
+++ b/ml2/3_4_Visualisation_Pandas.ipynb
@@ -220,7 +220,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# Analise distributon\n",
+    "# Analyse distribution\n",
    "df.hist(figsize=(10,10))\n",
    "plt.show()"
   ]
@@ -233,7 +233,7 @@
   "source": [
    "# We can see the pairwise correlation between variables. A value near 0 means low correlation\n",
    "# while a value  near -1 or 1 indicates strong correlation.\n",
-    "df.corr()"
+    "df.corr(numeric_only=True)"
   ]
  },
  {
@@ -249,11 +249,10 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# General description of relationship betweek variables uwing Seaborn PairGrid\n",
+    "# General description of relationship between variables using Seaborn PairGrid\n",
    "# We use df_clean, since the null values of df would gives us an error, you can check it.\n",
    "g = sns.PairGrid(df_clean, hue=\"Survived\")\n",
-    "g.map_diag(plt.hist)\n",
-    "g.map_offdiag(plt.scatter)\n",
+    "g.map(sns.scatterplot)\n",
    "g.add_legend()"
   ]
  },
@@ -367,7 +366,7 @@
   "outputs": [],
   "source": [
    "# Now we visualise age and survived to see if there is some relationship\n",
-    "sns.FacetGrid(df, hue=\"Survived\", size=5).map(sns.kdeplot, \"Age\").add_legend()"
+    "sns.FacetGrid(df, hue=\"Survived\", height=5).map(sns.kdeplot, \"Age\").add_legend()"
   ]
  },
  {
@@ -567,7 +566,7 @@
   "outputs": [],
   "source": [
    "# Plot with seaborn\n",
-    "sns.countplot('Sex', data=df)"
+    "sns.countplot(x='Sex', data=df)"
   ]
  },
  {
@@ -683,16 +682,6 @@
    "df.groupby('Pclass').size()"
   ]
  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Distribution\n",
-    "sns.countplot('Pclass', data=df)"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -725,7 +714,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "sns.factorplot('Pclass',data=df,hue='Sex',kind='count')"
+    "sns.catplot(x='Pclass',data=df,hue='Sex',kind='count')"
   ]
  },
  {
@@ -906,7 +895,7 @@
   "outputs": [],
   "source": [
    "# Distribution\n",
-    "sns.countplot('Embarked', data=df)"
+    "sns.countplot(x='Embarked', data=df)"
   ]
  },
  {
@@ -997,7 +986,7 @@
   "outputs": [],
   "source": [
    "# Distribution\n",
-    "sns.countplot('SibSp', data=df)"
+    "sns.countplot(x='SibSp', data=df)"
   ]
  },
  {
@@ -1180,7 +1169,7 @@
   "outputs": [],
   "source": [
    "# Distribution\n",
-    "sns.countplot('Parch', data=df)"
+    "sns.countplot(x='Parch', data=df)"
   ]
  },
  {
@@ -1233,7 +1222,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df.groupby(['Pclass', 'Sex', 'Parch'])['Parch', 'SibSp', 'Survived'].agg({'Parch': np.size, 'SibSp': np.mean, 'Survived': np.mean})"
+    "df.groupby(['Pclass', 'Sex', 'Parch'])[['Parch', 'SibSp', 'Survived']].agg({'Parch': np.size, 'SibSp': np.mean, 'Survived': np.mean})"
   ]
  },
  {
@@ -1576,7 +1565,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -1590,7 +1579,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/3_5_Exercise_1.ipynb
+++ b/ml2/3_5_Exercise_1.ipynb
@@ -72,7 +72,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Assign the variable *df* a Dataframe with the Titanic Dataset from the URL https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv\"\n",
+    "Assign the variable *df* a Dataframe with the Titanic Dataset from the URL https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv.\n",
    "\n",
    "Print *df*."
   ]
@@ -214,7 +214,7 @@
   "outputs": [],
   "source": [
    "df['FamilySize'] = df['SibSp'] + df['Parch']\n",
-    "df.head()"
+    "df"
   ]
  },
  {
@@ -377,8 +377,8 @@
   "outputs": [],
   "source": [
    "# Group ages to simplify machine learning algorithms.  0: 0-5, 1: 6-10, 2: 11-15, 3: 16-59 and 4: 60-80\n",
-    "df['AgeGroup'] = 0\n",
-    "df.loc[(.Age<6),'AgeGroup'] = 0\n",
+    "df['AgeGroup'] = np.nan\n",
+    "df.loc[(df.Age<6),'AgeGroup'] = 0\n",
    "df.loc[(df.Age>=6) & (df.Age < 11),'AgeGroup'] = 1\n",
    "df.loc[(df.Age>=11) & (df.Age < 16),'AgeGroup'] = 2\n",
    "df.loc[(df.Age>=16) & (df.Age < 60),'AgeGroup'] = 3\n",
@@ -404,8 +404,8 @@
    "        if np.isnan(big_string):\n",
    "            return 'X'\n",
    "    for substring in substrings:\n",
-    "        if big_string.find(substring) != 1:\n",
-    "            return substring\n",
+    "        if substring in big_string:\n",
+    "            return substring\n",
    "    print(big_string)\n",
    "    return 'X'\n",
    " \n",
@@ -478,8 +478,17 @@
  }
 ],
 "metadata": {
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -493,7 +502,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/3_6_Machine_Learning.ipynb
+++ b/ml2/3_6_Machine_Learning.ipynb
@@ -78,7 +78,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka and Vahid Mirjalili, Packt Publishing, 2019."
   ]
  },
  {
@@ -100,7 +100,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -114,7 +114,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/3_7_SVM.ipynb
+++ b/ml2/3_7_SVM.ipynb
@@ -222,7 +222,7 @@
    "kernel = types_of_kernels[0]\n",
    "gamma = 3.0\n",
    "\n",
-    "# Create kNN model\n",
+    "# Create SVM model\n",
    "model = SVC(kernel=kernel, probability=True, gamma=gamma)"
   ]
  },
@@ -276,7 +276,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can evaluate the accuracy if the model always predict the most frequent class, following this [refeference](http://blog.kaggle.com/2015/10/23/scikit-learn-video-9-better-evaluation-of-classification-models/)."
+    "We can evaluate the accuracy the model would obtain if it always predicted the most frequent class, following this [reference](https://medium.com/analytics-vidhya/model-validation-for-classification-5ff4a0373090)."
   ]
  },
  {
@@ -351,10 +351,10 @@
    "We can obtain more information from the confusion matrix and the F1-score metric.\n",
    "In a confusion matrix, we can see:\n",
    "\n",
-    "||**Predicted**: 0| **Predicted: 1**|\n",
-    "|---------------------------|\n",
-    "|**Actual: 0**| TN | FP |\n",
-    "|**Actual: 1**| FN|TP|\n",
+    "|             |**Predicted: 0**| **Predicted: 1**|\n",
+    "|-------------|----------------|-----------------|\n",
+    "|**Actual: 0**| TN             | FP              |\n",
+    "|**Actual: 1**| FN             | TP              |\n",
    "\n",
    "* **True negatives (TN)**: actual negatives that were predicted as negatives\n",
    "* **False positives (FP)**: actual negatives that were predicted as positives\n",
@@ -418,7 +418,7 @@
    "plt.ylim([0.0, 1.0])\n",
    "plt.title('ROC curve for Titanic')\n",
    "plt.xlabel('False Positive Rate (1 - Specificity)')\n",
-    "plt.xlabel('True Positive Rate (Sensitivity)')\n",
+    "plt.ylabel('True Positive Rate (Sensitivity)')\n",
    "plt.grid(True)"
   ]
  },
@@ -535,13 +535,13 @@
   "source": [
    "# This step will take some time\n",
    "# Cross-validation\n",
-    "cv = KFold(n_splits=5, shuffle=False, random_state=33)\n",
+    "cv = KFold(n_splits=5, shuffle=True, random_state=33)\n",
    "# StratifiedKFold is a variation of k-fold which returns stratified folds:\n",
    "# each set contains approximately the same percentage of samples of each target class as the complete set.\n",
-    "#cv = StratifiedKFold(y, n_folds=3, shuffle=False, random_state=33)\n",
+    "#cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=33)\n",
    "scores = cross_val_score(model, X, y, cv=cv)\n",
    "print(\"Scores in every iteration\", scores)\n",
-    "print(\"Accuracy: %0.2f (+/- %0.2f)\" % (scores.mean(), scores.std() * 2))\n"
+    "print(\"Accuracy: %0.2f (+/- %0.2f)\" % (scores.mean(), scores.std() * 2))"
   ]
  },
  {
@@ -644,7 +644,7 @@
   "source": [
    "* [Titanic Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
    "* [API SVC scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html)\n",
-    "* [Better evaluation of classification models](http://blog.kaggle.com/2015/10/23/scikit-learn-video-9-better-evaluation-of-classification-models/)"
+    "* [How to choose the right metric for evaluating an ML model](https://www.kaggle.com/vipulgandhi/how-to-choose-right-metric-for-evaluating-ml-model)"
   ]
  },
  {
@@ -666,7 +666,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -680,7 +680,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
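Review note on the `KFold` change in `ml2/3_7_SVM.ipynb`: to my understanding, recent scikit-learn releases reject a `random_state` when `shuffle=False`, which is presumably why the cell now passes `shuffle=True`. The minimal sketch below shows both splitters with the current keyword names (`n_splits` rather than `n_folds`). The synthetic dataset from `make_classification` and the variable names `strat_cv` and `strat_scores` are assumptions for illustration; the notebook itself works on the Titanic features.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the Titanic features (assumption for illustration)
X, y = make_classification(n_samples=200, n_features=8, random_state=33)
model = SVC(kernel="rbf", gamma="scale")

# Plain k-fold: random_state only has an effect when shuffle=True
cv = KFold(n_splits=5, shuffle=True, random_state=33)
scores = cross_val_score(model, X, y, cv=cv)
print("Scores in every iteration", scores)
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))

# Stratified k-fold: each fold keeps roughly the class proportions of y.
# The splitter takes no data at construction time; y is supplied later
# by cross_val_score when it calls split(X, y).
strat_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=33)
strat_scores = cross_val_score(model, X, y, cv=strat_cv)
print("Stratified scores", strat_scores)
```

Fixing `random_state` keeps the fold assignment reproducible between runs, so score differences reflect the model rather than the split.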
--- a/ml2/3_8_Exercise_2.ipynb
+++ b/ml2/3_8_Exercise_2.ipynb
@@ -39,7 +39,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "In this exercise we are going to put in practice what we have learnt in the notebooks of the session. \n",
+    "In this exercise, we are going to put into practice what we have learnt in the notebooks of the session. \n",
    "\n",
    "In the previous notebook we have been applying the SVM machine learning algorithm.\n",
    "\n",
@@ -67,7 +67,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
@@ -81,7 +81,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.7.1"
+   "version": "3.8.12"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
--- a/ml2/plot_learning_curve.py
+++ b/ml2/plot_learning_curve.py
@@ -1,21 +1,21 @@
 """
-Taken from http://scikit-learn.org/stable/auto_examples/model_selection/plot_learning_curve.html
-
 ========================
 Plotting Learning Curves
 ========================
+In the first column, first row the learning curve of a naive Bayes classifier
+is shown for the digits dataset. Note that the training score and the
+cross-validation score are both not very good at the end. However, the shape
+of the curve can be found in more complex datasets very often: the training
+score is very high at the beginning and decreases and the cross-validation
+score is very low at the beginning and increases. In the second column, first
+row we see the learning curve of an SVM with RBF kernel. We can see clearly
+that the training score is still around the maximum and the validation score
+could be increased with more training samples. The plots in the second row
+show the times required by the models to train with various sizes of training
+dataset. The plots in the third row show how much time was required to train
+the models for each training size.

-On the left side the learning curve of a naive Bayes classifier is shown for
-the digits dataset. Note that the training score and the cross-validation score
-are both not very good at the end. However, the shape of the curve can be found
-in more complex datasets very often: the training score is very high at the
-beginning and decreases and the cross-validation score is very low at the
-beginning and increases. On the right side we see the learning curve of an SVM
-with RBF kernel. We can see clearly that the training score is still around
-the maximum and the validation score could be increased with more training
-samples.
 """
-#print(__doc__)

 import numpy as np
 import matplotlib.pyplot as plt
@@ -23,86 +23,181 @@ from sklearn.naive_bayes import GaussianNB
 from sklearn.svm import SVC
 from sklearn.datasets import load_digits
 from sklearn.model_selection import learning_curve
+from sklearn.model_selection import ShuffleSplit


-def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
-                        n_jobs=1, train_sizes=np.linspace(.1, 1.0, 5)):
+def plot_learning_curve(
+    estimator,
+    title,
+    X,
+    y,
+    axes=None,
+    ylim=None,
+    cv=None,
+    n_jobs=None,
+    train_sizes=np.linspace(0.1, 1.0, 5),
+):
    """
-    Generate a simple plot of the test and traning learning curve.
+    Generate 3 plots: the test and training learning curve, the training
+    samples vs fit times curve, the fit times vs score curve.

    Parameters
    ----------
-    estimator : object type that implements the "fit" and "predict" methods
-        An object of that type which is cloned for each validation.
+    estimator : estimator instance
+        An estimator instance implementing `fit` and `predict` methods which
+        will be cloned for each validation.

-    title : string
+    title : str
        Title for the chart.

-    X : array-like, shape (n_samples, n_features)
-        Training vector, where n_samples is the number of samples and
-        n_features is the number of features.
+    X : array-like of shape (n_samples, n_features)
+        Training vector, where ``n_samples`` is the number of samples and
+        ``n_features`` is the number of features.

-    y : array-like, shape (n_samples) or (n_samples, n_features), optional
-        Target relative to X for classification or regression;
+    y : array-like of shape (n_samples) or (n_samples, n_features)
+        Target relative to ``X`` for classification or regression;
        None for unsupervised learning.

-    ylim : tuple, shape (ymin, ymax), optional
-        Defines minimum and maximum yvalues plotted.
+    axes : array-like of shape (3,), default=None
+        Axes to use for plotting the curves.

-    cv : integer, cross-validation generator, optional
-        If an integer is passed, it is the number of folds (defaults to 3).
-        Specific cross-validation objects can be passed, see
-        sklearn.model_selection module for the list of possible objects
+    ylim : tuple of shape (2,), default=None
+        Defines minimum and maximum y-values plotted, e.g. (ymin, ymax).

-    n_jobs : integer, optional
-        Number of jobs to run in parallel (default 1).
+    cv : int, cross-validation generator or an iterable, default=None
+        Determines the cross-validation splitting strategy.
+        Possible inputs for cv are:
+
+          - None, to use the default 5-fold cross-validation,
+          - integer, to specify the number of folds.
+          - :term:`CV splitter`,
+          - An iterable yielding (train, test) splits as arrays of indices.
+
+        For integer/None inputs, if ``y`` is binary or multiclass,
+        :class:`StratifiedKFold` is used. If the estimator is not a classifier
+        or if ``y`` is neither binary nor multiclass, :class:`KFold` is used.
+
+        Refer to the :ref:`User Guide <cross_validation>` for the various
+        cross-validators that can be used here.
+
+    n_jobs : int or None, default=None
+        Number of jobs to run in parallel.
+        ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
+        ``-1`` means using all processors. See :term:`Glossary <n_jobs>`
+        for more details.
+
+    train_sizes : array-like of shape (n_ticks,)
+        Relative or absolute numbers of training examples that will be used to
+        generate the learning curve. If the ``dtype`` is float, it is regarded
+        as a fraction of the maximum size of the training set (that is
+        determined by the selected validation method), i.e. it has to be within
+        (0, 1]. Otherwise it is interpreted as absolute sizes of the training
+        sets. Note that for classification the number of samples usually has
+        to be big enough to contain at least one sample from each class.
+        (default: np.linspace(0.1, 1.0, 5))
    """
-    plt.figure()
-    plt.title(title)
+    if axes is None:
+        _, axes = plt.subplots(1, 3, figsize=(20, 5))
+
+    axes[0].set_title(title)
    if ylim is not None:
-        plt.ylim(*ylim)
-    plt.xlabel("Training examples")
-    plt.ylabel("Score")
-    train_sizes, train_scores, test_scores = learning_curve(
-        estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)
+        axes[0].set_ylim(*ylim)
+    axes[0].set_xlabel("Training examples")
+    axes[0].set_ylabel("Score")
+
+    train_sizes, train_scores, test_scores, fit_times, _ = learning_curve(
+        estimator,
+        X,
+        y,
+        cv=cv,
+        n_jobs=n_jobs,
+        train_sizes=train_sizes,
+        return_times=True,
+    )
    train_scores_mean = np.mean(train_scores, axis=1)
    train_scores_std = np.std(train_scores, axis=1)
    test_scores_mean = np.mean(test_scores, axis=1)
    test_scores_std = np.std(test_scores, axis=1)
-    plt.grid()
+    fit_times_mean = np.mean(fit_times, axis=1)
+    fit_times_std = np.std(fit_times, axis=1)

-    plt.fill_between(train_sizes, train_scores_mean - train_scores_std,
-                     train_scores_mean + train_scores_std, alpha=0.1,
-                     color="r")
-    plt.fill_between(train_sizes, test_scores_mean - test_scores_std,
-                     test_scores_mean + test_scores_std, alpha=0.1, color="g")
-    plt.plot(train_sizes, train_scores_mean, 'o-', color="r",
-             label="Training score")
-    plt.plot(train_sizes, test_scores_mean, 'o-', color="g",
-             label="Cross-validation score")
+    # Plot learning curve
+    axes[0].grid()
+    axes[0].fill_between(
+        train_sizes,
+        train_scores_mean - train_scores_std,
+        train_scores_mean + train_scores_std,
+        alpha=0.1,
+        color="r",
+    )
+    axes[0].fill_between(
+        train_sizes,
+        test_scores_mean - test_scores_std,
+        test_scores_mean + test_scores_std,
+        alpha=0.1,
+        color="g",
+    )
+    axes[0].plot(
+        train_sizes, train_scores_mean, "o-", color="r", label="Training score"
+    )
+    axes[0].plot(
+        train_sizes, test_scores_mean, "o-", color="g", label="Cross-validation score"
+    )
+    axes[0].legend(loc="best")
+
+    # Plot n_samples vs fit_times
+    axes[1].grid()
+    axes[1].plot(train_sizes, fit_times_mean, "o-")
+    axes[1].fill_between(
+        train_sizes,
+        fit_times_mean - fit_times_std,
+        fit_times_mean + fit_times_std,
+        alpha=0.1,
+    )
+    axes[1].set_xlabel("Training examples")
+    axes[1].set_ylabel("fit_times")
+    axes[1].set_title("Scalability of the model")
+
+    # Plot fit_time vs score
+    fit_time_argsort = fit_times_mean.argsort()
+    fit_time_sorted = fit_times_mean[fit_time_argsort]
+    test_scores_mean_sorted = test_scores_mean[fit_time_argsort]
+    test_scores_std_sorted = test_scores_std[fit_time_argsort]
+    axes[2].grid()
+    axes[2].plot(fit_time_sorted, test_scores_mean_sorted, "o-")
+    axes[2].fill_between(
+        fit_time_sorted,
+        test_scores_mean_sorted - test_scores_std_sorted,
+        test_scores_mean_sorted + test_scores_std_sorted,
+        alpha=0.1,
+    )
+    axes[2].set_xlabel("fit_times")
+    axes[2].set_ylabel("Score")
+    axes[2].set_title("Performance of the model")

-    plt.legend(loc="best")
    return plt


-#digits = load_digits()
-#X, y = digits.data, digits.target
+fig, axes = plt.subplots(3, 2, figsize=(10, 15))

+X, y = load_digits(return_X_y=True)

-#title = "Learning Curves (Naive Bayes)"
-# Cross validation with 100 iterations to get smoother mean test and train
+title = "Learning Curves (Naive Bayes)"
+# Cross validation with 50 iterations to get smoother mean test and train
 # score curves, each time with 20% data randomly selected as a validation set.
-#cv = cross_validation.ShuffleSplit(digits.data.shape[0], n_iter=100,
-#                                   test_size=0.2, random_state=0)
+cv = ShuffleSplit(n_splits=50, test_size=0.2, random_state=0)

-#estimator = GaussianNB()
-#plot_learning_curve(estimator, title, X, y, ylim=(0.7, 1.01), cv=cv, n_jobs=4)
+estimator = GaussianNB()
+plot_learning_curve(
+    estimator, title, X, y, axes=axes[:, 0], ylim=(0.7, 1.01), cv=cv, n_jobs=4
+)

-#title = "Learning Curves (SVM, RBF kernel, $\gamma=0.001$)"
+title = r"Learning Curves (SVM, RBF kernel, $\gamma=0.001$)"
 # SVC is more expensive so we do a lower number of CV iterations:
-#cv = cross_validation.ShuffleSplit(digits.data.shape[0], n_iter=10,
-#	                                   test_size=0.2, random_state=0)
-#estimator = SVC(gamma=0.001)
-#plot_learning_curve(estimator, title, X, y, (0.7, 1.01), cv=cv, n_jobs=4)
+cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
+estimator = SVC(gamma=0.001)
+plot_learning_curve(
+    estimator, title, X, y, axes=axes[:, 1], ylim=(0.7, 1.01), cv=cv, n_jobs=4
+)

-#plt.show()
+plt.show()
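Review note on `ml2/plot_learning_curve.py`: a minimal sketch of the call that the rewritten `plot_learning_curve` relies on. With `return_times=True`, `learning_curve` returns fit and score times in addition to the scores, each with one row per training size and one column per CV split. The estimator, the 5-split `ShuffleSplit`, and the variable names are chosen here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import ShuffleSplit, learning_curve
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)

# Fewer splits than in the module's demo, to keep the sketch fast
cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)

train_sizes, train_scores, test_scores, fit_times, score_times = learning_curve(
    GaussianNB(),
    X,
    y,
    cv=cv,
    train_sizes=np.linspace(0.1, 1.0, 5),
    return_times=True,
)

# Average over the CV splits (axis=1), as plot_learning_curve does
test_scores_mean = np.mean(test_scores, axis=1)
fit_times_mean = np.mean(fit_times, axis=1)

for n, score, seconds in zip(train_sizes, test_scores_mean, fit_times_mean):
    print("%5d samples: CV score %.3f, fit time %.4f s" % (n, score, seconds))
```

The demo code at the bottom of the module is no longer commented out, so it would run whenever the module is imported. If other code imports `plot_learning_curve` from this file, guarding the demo with `if __name__ == "__main__":` is one option.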
--- a/ml2/plot_svm.py
+++ b/ml2/plot_svm.py
@@ -3,7 +3,7 @@ import matplotlib.pyplot as plt
 import numpy as np
 from sklearn import svm

-#Taken from http://nbviewer.jupyter.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb
+# Taken from http://nbviewer.jupyter.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb

 def plot_svm(df):
 	# set plotting parameters
--- a/ml21/.gitkeep
+++ b/ml21/.gitkeep
@@ -0,0 +1 @@
+
--- a/ml21/preprocessing/.gitkeep
+++ b/ml21/preprocessing/.gitkeep
@@ -0,0 +1 @@
+
--- a/ml21/preprocessing/00_Intro_Preprocessing.ipynb
+++ b/ml21/preprocessing/00_Intro_Preprocessing.ipynb
@@ -0,0 +1,157 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Introduction to Preprocessing\n",
+    "In this session, we will get more insight regarding how to preprocess data.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Objectives\n",
+    "The main objectives of this session are:\n",
+    "* Understanding the need for preprocessing\n",
+    "* Understanding different preprocessing techniques\n",
+    "* Experimenting with several environments for preprocessing"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Table of Contents"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "1. [Home](00_Intro_Preprocessing.ipynb)\n",
+    "3. [Initial Check](02_Initial_Check.ipynb)\n",
+    "4. [Filter Data](03_Filter_Data.ipynb)\n",
+    "5. [Unknown values](04_Unknown_Values.ipynb)\n",
+    "6. [Duplicated values](05_Duplicated_Values.ipynb)\n",
+    "7. [Rescaling Data](06_Rescaling_Data.ipynb)\n",
+    "8. [Binarize Data](07_Binarize_Data.ipynb)\n",
+    "9. [Categorical features](08_Categorical.ipynb)\n",
+    "10. [String Data](09_String_Data.ipynb)\n",
+    "12. [Handy libraries for preprocessing](11_0_Handy.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
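Review note on `ml21/preprocessing/00_Intro_Preprocessing.ipynb`: several of the checks listed in its table of contents (initial check, unknown values, duplicated values) can be previewed on a toy frame. The sketch below is only an illustration under stated assumptions: the four-row `DataFrame` is made up, its column names merely mimic the Titanic data used in the course, and only standard pandas attributes are used.

```python
import pandas as pd

# Hypothetical toy data; not taken from the course datasets
df = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3, 4],
        "Sex": ["male", "female", "female", "male"],
        "Age": [22.0, 38.0, None, 35.0],
    }
)

# Initial check: size and column types
n_rows, n_cols = df.shape
print(df.dtypes)

# Which columns could serve as a key (no repeated values)?
unique_flags = {col: df[col].is_unique for col in df.columns}
print(unique_flags)

# Unknown values: count of missing entries per column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Duplicated values: number of fully repeated rows
n_duplicated_rows = int(df.duplicated().sum())
print(n_duplicated_rows)
```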
--- a/ml21/preprocessing/02_Initial_Check.ipynb
+++ b/ml21/preprocessing/02_Initial_Check.ipynb
@@ -0,0 +1,714 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "# Initial Check with Pandas\n",
+    "\n",
+    "We can start with a quick quality check."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Load and check data\n",
+    "Check which data you are loading."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>PassengerId</th>\n",
+       "      <th>Survived</th>\n",
+       "      <th>Pclass</th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>SibSp</th>\n",
+       "      <th>Parch</th>\n",
+       "      <th>Ticket</th>\n",
+       "      <th>Fare</th>\n",
+       "      <th>Cabin</th>\n",
+       "      <th>Embarked</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Braund, Mr. Owen Harris</td>\n",
+       "      <td>male</td>\n",
+       "      <td>22.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>A/5 21171</td>\n",
+       "      <td>7.2500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
+       "      <td>female</td>\n",
+       "      <td>38.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>PC 17599</td>\n",
+       "      <td>71.2833</td>\n",
+       "      <td>C85</td>\n",
+       "      <td>C</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>3</td>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Heikkinen, Miss. Laina</td>\n",
+       "      <td>female</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>STON/O2. 3101282</td>\n",
+       "      <td>7.9250</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
+       "      <td>female</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>113803</td>\n",
+       "      <td>53.1000</td>\n",
+       "      <td>C123</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>5</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Allen, Mr. William Henry</td>\n",
+       "      <td>male</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>373450</td>\n",
+       "      <td>8.0500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>6</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Moran, Mr. James</td>\n",
+       "      <td>male</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>330877</td>\n",
+       "      <td>8.4583</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Q</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>7</td>\n",
+       "      <td>0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>McCarthy, Mr. Timothy J</td>\n",
+       "      <td>male</td>\n",
+       "      <td>54.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>17463</td>\n",
+       "      <td>51.8625</td>\n",
+       "      <td>E46</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7</th>\n",
+       "      <td>8</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Palsson, Master. Gosta Leonard</td>\n",
+       "      <td>male</td>\n",
+       "      <td>2.0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>1</td>\n",
+       "      <td>349909</td>\n",
+       "      <td>21.0750</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>8</th>\n",
+       "      <td>9</td>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)</td>\n",
+       "      <td>female</td>\n",
+       "      <td>27.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "      <td>347742</td>\n",
+       "      <td>11.1333</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>9</th>\n",
+       "      <td>10</td>\n",
+       "      <td>1</td>\n",
+       "      <td>2</td>\n",
+       "      <td>Nasser, Mrs. Nicholas (Adele Achem)</td>\n",
+       "      <td>female</td>\n",
+       "      <td>14.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>237736</td>\n",
+       "      <td>30.0708</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>C</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   PassengerId  Survived  Pclass  \\\n",
+       "0            1         0       3   \n",
+       "1            2         1       1   \n",
+       "2            3         1       3   \n",
+       "3            4         1       1   \n",
+       "4            5         0       3   \n",
+       "5            6         0       3   \n",
+       "6            7         0       1   \n",
+       "7            8         0       3   \n",
+       "8            9         1       3   \n",
+       "9           10         1       2   \n",
+       "\n",
+       "                                                Name     Sex   Age  SibSp  \\\n",
+       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
+       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
+       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
+       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
+       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
+       "5                                   Moran, Mr. James    male   NaN      0   \n",
+       "6                            McCarthy, Mr. Timothy J    male  54.0      0   \n",
+       "7                     Palsson, Master. Gosta Leonard    male   2.0      3   \n",
+       "8  Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)  female  27.0      0   \n",
+       "9                Nasser, Mrs. Nicholas (Adele Achem)  female  14.0      1   \n",
+       "\n",
+       "   Parch            Ticket     Fare Cabin Embarked  \n",
+       "0      0         A/5 21171   7.2500   NaN        S  \n",
+       "1      0          PC 17599  71.2833   C85        C  \n",
+       "2      0  STON/O2. 3101282   7.9250   NaN        S  \n",
+       "3      0            113803  53.1000  C123        S  \n",
+       "4      0            373450   8.0500   NaN        S  \n",
+       "5      0            330877   8.4583   NaN        Q  \n",
+       "6      0             17463  51.8625   E46        S  \n",
+       "7      1            349909  21.0750   NaN        S  \n",
+       "8      2            347742  11.1333   NaN        S  \n",
+       "9      0            237736  30.0708   NaN        C  "
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "df = pd.read_csv('https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv')\n",
+    "df.head(10)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Check the number of columns and rows"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(891, 12)"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df.shape"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Check names and types of columns\n",
+    "Check the column names and data types, for example whether dates are stored as strings."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',\n",
+      "       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],\n",
+      "      dtype='object')\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "PassengerId      int64\n",
+       "Survived         int64\n",
+       "Pclass           int64\n",
+       "Name            object\n",
+       "Sex             object\n",
+       "Age            float64\n",
+       "SibSp            int64\n",
+       "Parch            int64\n",
+       "Ticket          object\n",
+       "Fare           float64\n",
+       "Cabin           object\n",
+       "Embarked        object\n",
+       "dtype: object"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Get column names\n",
+    "print(df.columns)\n",
+    "# Get column data types\n",
+    "df.dtypes"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "## Check if the column is unique"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "PassengerId is unique: True\n",
+      "Survived is unique: False\n",
+      "Pclass is unique: False\n",
+      "Name is unique: True\n",
+      "Sex is unique: False\n",
+      "Age is unique: False\n",
+      "SibSp is unique: False\n",
+      "Parch is unique: False\n",
+      "Ticket is unique: False\n",
+      "Fare is unique: False\n",
+      "Cabin is unique: False\n",
+      "Embarked is unique: False\n"
+     ]
+    }
+   ],
+   "source": [
+    "for i in column_names:\n",
+    "    print('{} is unique: {}'.format(i, df[i].is_unique))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Check if the dataframe has an index\n",
+    "We will need it to do joins or merges."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "RangeIndex(start=0, stop=891, step=1)"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Check the index. If df is not a DataFrame, this raises an error such as: AttributeError: 'function' object has no attribute 'index'\n",
+    "df.index"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,\n",
+       "        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,\n",
+       "        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,\n",
+       "        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,\n",
+       "        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,\n",
+       "        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,\n",
+       "        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,\n",
+       "        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,\n",
+       "       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,\n",
+       "       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,\n",
+       "       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,\n",
+       "       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,\n",
+       "       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,\n",
+       "       169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181,\n",
+       "       182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194,\n",
+       "       195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207,\n",
+       "       208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220,\n",
+       "       221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233,\n",
+       "       234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246,\n",
+       "       247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259,\n",
+       "       260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272,\n",
+       "       273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285,\n",
+       "       286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298,\n",
+       "       299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311,\n",
+       "       312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324,\n",
+       "       325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337,\n",
+       "       338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350,\n",
+       "       351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363,\n",
+       "       364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376,\n",
+       "       377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389,\n",
+       "       390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402,\n",
+       "       403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415,\n",
+       "       416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428,\n",
+       "       429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441,\n",
+       "       442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454,\n",
+       "       455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467,\n",
+       "       468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480,\n",
+       "       481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493,\n",
+       "       494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506,\n",
+       "       507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519,\n",
+       "       520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532,\n",
+       "       533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545,\n",
+       "       546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558,\n",
+       "       559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571,\n",
+       "       572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584,\n",
+       "       585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597,\n",
+       "       598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610,\n",
+       "       611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623,\n",
+       "       624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636,\n",
+       "       637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649,\n",
+       "       650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662,\n",
+       "       663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675,\n",
+       "       676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688,\n",
+       "       689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701,\n",
+       "       702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714,\n",
+       "       715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727,\n",
+       "       728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740,\n",
+       "       741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753,\n",
+       "       754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766,\n",
+       "       767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779,\n",
+       "       780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792,\n",
+       "       793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805,\n",
+       "       806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818,\n",
+       "       819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831,\n",
+       "       832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844,\n",
+       "       845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857,\n",
+       "       858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870,\n",
+       "       871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883,\n",
+       "       884, 885, 886, 887, 888, 889, 890])"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Check the index values\n",
+    "df.index.values"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "# If index does not exist\n",
+    "df.set_index('column_name_to_use', inplace=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "PassengerId      0\n",
+       "Survived         0\n",
+       "Pclass           0\n",
+       "Name             0\n",
+       "Sex              0\n",
+       "Age            177\n",
+       "SibSp            0\n",
+       "Parch            0\n",
+       "Ticket           0\n",
+       "Fare             0\n",
+       "Cabin          687\n",
+       "Embarked         2\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Count missing values per column\n",
+    "df.isnull().sum()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/03_Filter_Data.ipynb
+++ b/ml21/preprocessing/03_Filter_Data.ipynb
@@ -0,0 +1,150 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "# Filter Data\n",
+    "\n",
+    "Select the columns you want and delete the others."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "# Build the list of columns to drop, selected here by position\n",
+    "columns_to_drop = [column_names[i] for i in [1, 3, 5]]\n",
+    "# Drop the unwanted columns\n",
+    "df.drop(columns_to_drop, inplace=True, axis=1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/04_Unknown_Values.ipynb
+++ b/ml21/preprocessing/04_Unknown_Values.ipynb
@@ -0,0 +1,591 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "# Unknown values\n",
+    "\n",
+    "There are two possible approaches: **remove** the rows that contain unknown values or **fill** them in. Which one is appropriate depends on the case."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Filling NaN values\n",
+    "If we need to fill errors or blanks, we can use the method **fillna()**; to discard them instead, we can use **dropna()**.\n",
+    "\n",
+    "* For **string** fields, we can fill NaN with **' '**.\n",
+    "\n",
+    "* For **numbers**, we can fill with the **mean** or **median** value. \n"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "# Fill NaN with ' '\n",
+    "df['col'] = df['col'].fillna(' ')\n",
+    "# Fill NaN with 99\n",
+    "df['col'] = df['col'].fillna(99)\n",
+    "# Fill NaN with the mean of the column\n",
+    "df['col'] = df['col'].fillna(df['col'].mean())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Propagate non-null values forward or backward\n",
+    "You can also **propagate** non-null values with these methods:\n",
+    "\n",
+    "* **ffill**: fill each gap by propagating the last valid observation forward.\n",
+    "* **bfill**: fill each gap with the next valid observation, propagated backward.\n",
+    "* **interpolate**: fill NaN values using interpolation.\n",
+    "\n",
+    "For example, **ffill** replaces each NaN with the previous non-NaN value. \n",
+    "\n",
+    "You can fill at most one consecutive NaN per gap (**limit=1**) or all of them (the default). Passing inplace=True modifies the DataFrame instead of returning a new one."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(data={'col1':[np.nan, np.nan, 2,3,4, np.nan, np.nan]})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "0   NaN\n",
+       "1   NaN\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0\n",
+       "5   NaN\n",
+       "6   NaN"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We forward-fill the value 4.0 into the next NaN only (limit=1). The leading NaNs remain because there is no earlier value to propagate."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "0   NaN\n",
+       "1   NaN\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0\n",
+       "5   4.0\n",
+       "6   NaN"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df.ffill(limit=1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df.ffill()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can also backfill with **bfill**. Since we do not pass *limit*, every NaN that has a later valid value is filled; the trailing NaNs remain."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "0   2.0\n",
+       "1   2.0\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0\n",
+       "5   NaN\n",
+       "6   NaN"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df.bfill()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Removing NaN values\n",
+    "We can remove them by row or column (use inplace=True if you want to modify the DataFrame)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0"
+      ]
+     },
+     "execution_count": 26,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Drop rows that contain any NaN\n",
+    "df1 = df.dropna()\n",
+    "# Drop columns that contain any NaN (axis=1 -> drop columns, axis=0 -> drop rows)\n",
+    "df2 = df.dropna(axis=1)\n",
+    "# Keep only the columns with at least 90% non-NaN values; drop the others\n",
+    "df3 = df.dropna(thresh=int(df.shape[0] * .9), axis=1)\n",
+    "df1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/05_Duplicated_Values.ipynb
+++ b/ml21/preprocessing/05_Duplicated_Values.ipynb
--- a/ml21/preprocessing/06_Rescaling_Data.ipynb
+++ b/ml21/preprocessing/06_Rescaling_Data.ipynb
--- a/ml21/preprocessing/07_Binarize_Data.ipynb
+++ b/ml21/preprocessing/07_Binarize_Data.ipynb
@@ -0,0 +1,198 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Binarize Data\n",
+    "* We can transform our data using a binary threshold. All values above the threshold are marked 1, and all values equal to or below are marked 0.\n",
+    "* This is called binarizing your data or thresholding your data. \n",
+    "\n",
+    "* It can be helpful when you have probabilities that you want to make crisp values."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Binarize Data with Scikit-Learn\n",
+    "We can create new binary attributes in Python using Scikit-learn with the Binarizer class.\n",
+    "The example that follows marks values greater than 1.0 as 1 and all other values as 0."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from sklearn.preprocessing import Binarizer\n",
+    "\n",
+    "X = [[ 1., -1.,  2.],\n",
+    "     [ 2.,  0.,  0.],\n",
+    "     [ 0.,  1.1, -1.]]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "transformer = Binarizer(threshold=1.0).fit(X)  # values > 1.0 become 1, the rest 0"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([[0., 0., 1.],\n",
+       "       [1., 0., 0.],\n",
+       "       [0., 1., 0.]])"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "transformer.transform(X)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
+    "* [Binarizer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Binarizer.html), Scikit Learn"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/08_Categorical.ipynb
+++ b/ml21/preprocessing/08_Categorical.ipynb
@@ -0,0 +1,812 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Categorical Data\n",
+    "\n",
+    "For many ML algorithms, we need to transform categorical data into numbers.\n",
+    "\n",
+    "For example:\n",
+    "* **'Sex'** with values *'M'*, *'F'*, *'Unknown'*. \n",
+    "* **'Position'** with values *'phD'*, *'Professor'*, *'TA'*, *'graduate'*.\n",
+    "* **'Temperature'** with values *'low'*, *'medium'*, *'high'*.\n",
+    "\n",
+    "There are two main approaches:\n",
+    "* Integer encoding\n",
+    "* One hot encoding"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Integer Encoding\n",
+    "We assign a number to every value:\n",
+    "\n",
+    "['M', 'F', 'Unknown', 'M'] --> [0, 1, 2, 0]\n",
+    "\n",
+    "['phD', 'Professor', 'TA','graduate', 'phD'] --> [0, 1, 2, 3, 0]\n",
+    "\n",
+    "['low', 'medium', 'high', 'low'] --> [0, 1, 2, 0]\n",
+    "\n",
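+    "The main problem with this representation is that integers have a natural order, so some ML algorithms may interpret the codes as ordered quantities (e.g., *'Unknown'* > *'F'* > *'M'*) even when the categories have no order. \n",
+    "\n",
+    "In our examples, this representation can be suitable for **temperature**, whose values are ordered, but not for the other two."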
+   ]
+  },
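+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The mapping above can be sketched in plain Python. This is only an illustration: the helper name `integer_encode` is hypothetical, and assigning codes in order of first appearance is an assumption of this example (Scikit-learn's `LabelEncoder` sorts the values instead).\n",
+    "\n",
+    "```python\n",
+    "def integer_encode(values, order=None):\n",
+    "    # Use the given category order, or else the order of first appearance\n",
+    "    categories = list(order) if order is not None else list(dict.fromkeys(values))\n",
+    "    mapping = {category: code for code, category in enumerate(categories)}\n",
+    "    return [mapping[value] for value in values], mapping\n",
+    "\n",
+    "codes, mapping = integer_encode(['M', 'F', 'Unknown', 'M'])\n",
+    "print(codes)       # [0, 1, 2, 0]\n",
+    "\n",
+    "# For an ordinal variable, pass the order explicitly so that the codes respect it\n",
+    "temp_codes, _ = integer_encode(['high', 'low', 'medium'],\n",
+    "                               order=['low', 'medium', 'high'])\n",
+    "print(temp_codes)  # [2, 0, 1]\n",
+    "```\n",
+    "\n",
+    "Passing an explicit order is what makes the integer codes meaningful for a variable such as temperature."
+   ]
+  },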
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## One Hot Encoding\n",
+    "A binary column is created for each value of the categorical variable."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "Sex                                               M  F  U\n",
+    "-----                                            ---------\n",
+    "M                                                 1  0  0\n",
+    "F                     is transformed into         0  1  0\n",
+    "Unknown                                           0  0  1\n",
+    "M                                                 1  0  0"
+   ]
+  },
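+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The table above can be reproduced with a short plain-Python sketch. The helper name `one_hot` is hypothetical, and the column order (first appearance) is an assumption of this example; library implementations usually sort the categories.\n",
+    "\n",
+    "```python\n",
+    "def one_hot(values):\n",
+    "    # One column per distinct value, in order of first appearance\n",
+    "    categories = list(dict.fromkeys(values))\n",
+    "    rows = [[1 if value == category else 0 for category in categories]\n",
+    "            for value in values]\n",
+    "    return categories, rows\n",
+    "\n",
+    "categories, rows = one_hot(['M', 'F', 'Unknown', 'M'])\n",
+    "print(categories)  # ['M', 'F', 'Unknown']\n",
+    "for row in rows:\n",
+    "    print(row)\n",
+    "```\n",
+    "\n",
+    "Each row contains exactly one 1, in the column of its category."
+   ]
+  },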
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Transforming categorical data with Pandas and Scikit-Learn\n",
+    "\n",
+    "We can use:\n",
+    "* **get_dummies()** from Pandas (one hot encoding)\n",
+    "* **LabelEncoder** (integer encoding) and **OneHotEncoder** (one hot encoding) from Scikit-Learn. \n",
+    "\n",
+    "We are going to see both approaches, starting with one hot encoding."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### One Hot Encoding\n",
+    "We can use Pandas (*get_dummies*) or Scikit-Learn (*OneHotEncoder*)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "     Name  Age     Sex   Position\n",
+      "0  Marius   18    Male   graduate\n",
+      "1   Maria   19  Female  professor\n",
+      "2    John   20    Male         TA\n",
+      "3   Carla   30  Female        phD\n"
+     ]
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "data = {\"Name\": [\"Marius\", \"Maria\", \"John\", \"Carla\"],\n",
+    "        \"Age\": [18, 19, 20, 30],\n",
+    "\t\t\"Sex\": [\"Male\", \"Female\", \"Male\", \"Female\"],\n",
+    "        \"Position\": [\"graduate\", \"professor\", \"TA\", \"phD\"]\n",
+    "       }\n",
+    "df = pd.DataFrame(data)\n",
+    "print(df)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>sex_encoded</th>\n",
+       "      <th>position_encoded</th>\n",
+       "      <th>Sex_Female</th>\n",
+       "      <th>Sex_Male</th>\n",
+       "      <th>Position_TA</th>\n",
+       "      <th>Position_graduate</th>\n",
+       "      <th>Position_phD</th>\n",
+       "      <th>Position_professor</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Marius</td>\n",
+       "      <td>18</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Maria</td>\n",
+       "      <td>19</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>John</td>\n",
+       "      <td>20</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Carla</td>\n",
+       "      <td>30</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Name  Age  sex_encoded  position_encoded  Sex_Female  Sex_Male  \\\n",
+       "0  Marius   18            1                 1       False      True   \n",
+       "1   Maria   19            0                 3        True     False   \n",
+       "2    John   20            1                 0       False      True   \n",
+       "3   Carla   30            0                 2        True     False   \n",
+       "\n",
+       "   Position_TA  Position_graduate  Position_phD  Position_professor  \n",
+       "0        False               True         False               False  \n",
+       "1        False              False         False                True  \n",
+       "2         True              False         False               False  \n",
+       "3        False              False          True               False  "
+      ]
+     },
+     "execution_count": 18,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df_onehot = pd.get_dummies(df, columns=['Sex', 'Position'])\n",
+    "df_onehot"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also use *OneHotEncoder* from Scikit-Learn."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Sex_Female</th>\n",
+       "      <th>Sex_Male</th>\n",
+       "      <th>Position_TA</th>\n",
+       "      <th>Position_graduate</th>\n",
+       "      <th>Position_phD</th>\n",
+       "      <th>Position_professor</th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>sex_encoded</th>\n",
+       "      <th>position_encoded</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>Marius</td>\n",
+       "      <td>18</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>Maria</td>\n",
+       "      <td>19</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>John</td>\n",
+       "      <td>20</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>Carla</td>\n",
+       "      <td>30</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "  Sex_Female Sex_Male Position_TA Position_graduate Position_phD  \\\n",
+       "0        0.0      1.0         0.0               1.0          0.0   \n",
+       "1        1.0      0.0         0.0               0.0          0.0   \n",
+       "2        0.0      1.0         1.0               0.0          0.0   \n",
+       "3        1.0      0.0         0.0               0.0          1.0   \n",
+       "\n",
+       "  Position_professor    Name Age sex_encoded position_encoded  \n",
+       "0                0.0  Marius  18           1                1  \n",
+       "1                1.0   Maria  19           0                3  \n",
+       "2                0.0    John  20           1                0  \n",
+       "3                0.0   Carla  30           0                2  "
+      ]
+     },
+     "execution_count": 27,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from sklearn.preprocessing import OneHotEncoder\n",
+    "from sklearn.compose import make_column_transformer\n",
+    "\n",
+    "df_onehotencoder = df\n",
+    "# create OneHotEncoder object\n",
+    "encoder = OneHotEncoder()\n",
+    "\n",
+    "# Transformer for several columns: encode 'Sex' and 'Position', pass the rest through unchanged\n",
+    "transformer = make_column_transformer(\n",
+    "  (encoder, ['Sex', 'Position']),\n",
+    "  remainder='passthrough',\n",
+    "  verbose_feature_names_out=False)\n",
+    "\n",
+    "# transform\n",
+    "transformed = transformer.fit_transform(df_onehotencoder)\n",
+    "\n",
+    "df_onehotencoder = pd.DataFrame(\n",
+    "  transformed,\n",
+    "  columns=transformer.get_feature_names_out())\n",
+    "df_onehotencoder"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Pandas' *get_dummies* is easier for transforming DataFrames. *OneHotEncoder* learns the categories when it is fitted, so it can apply the same encoding to new data, which makes it a good fit for integrating the step in a machine learning pipeline."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Integer encoding\n",
+    "We will use **LabelEncoder**. It is possible to get the original values with *inverse_transform*. See [LabelEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Position</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Marius</td>\n",
+       "      <td>18</td>\n",
+       "      <td>Male</td>\n",
+       "      <td>graduate</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Maria</td>\n",
+       "      <td>19</td>\n",
+       "      <td>Female</td>\n",
+       "      <td>professor</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>John</td>\n",
+       "      <td>20</td>\n",
+       "      <td>Male</td>\n",
+       "      <td>TA</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Carla</td>\n",
+       "      <td>30</td>\n",
+       "      <td>Female</td>\n",
+       "      <td>phD</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Name  Age     Sex   Position\n",
+       "0  Marius   18    Male   graduate\n",
+       "1   Maria   19  Female  professor\n",
+       "2    John   20    Male         TA\n",
+       "3   Carla   30  Female        phD"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from sklearn.preprocessing import LabelEncoder\n",
+    "# creating instance of labelencoder\n",
+    "labelencoder = LabelEncoder()\n",
+    "# df_encoded is an alias of df (not a copy): columns added below also appear in df\n",
+    "df_encoded = df\n",
+    "# Values of the categorical columns (for reference only; LabelEncoder learns them from the data)\n",
+    "sex_values = ('Male', 'Female')\n",
+    "position_values = ('graduate', 'professor', 'TA', 'phD')\n",
+    "df_encoded"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Position</th>\n",
+       "      <th>sex_encoded</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Marius</td>\n",
+       "      <td>18</td>\n",
+       "      <td>Male</td>\n",
+       "      <td>graduate</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Maria</td>\n",
+       "      <td>19</td>\n",
+       "      <td>Female</td>\n",
+       "      <td>professor</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>John</td>\n",
+       "      <td>20</td>\n",
+       "      <td>Male</td>\n",
+       "      <td>TA</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Carla</td>\n",
+       "      <td>30</td>\n",
+       "      <td>Female</td>\n",
+       "      <td>phD</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Name  Age     Sex   Position  sex_encoded\n",
+       "0  Marius   18    Male   graduate            1\n",
+       "1   Maria   19  Female  professor            0\n",
+       "2    John   20    Male         TA            1\n",
+       "3   Carla   30  Female        phD            0"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df_encoded['sex_encoded'] = labelencoder.fit_transform(df_encoded['Sex'])\n",
+    "df_encoded"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Position</th>\n",
+       "      <th>sex_encoded</th>\n",
+       "      <th>position_encoded</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Marius</td>\n",
+       "      <td>18</td>\n",
+       "      <td>Male</td>\n",
+       "      <td>graduate</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Maria</td>\n",
+       "      <td>19</td>\n",
+       "      <td>Female</td>\n",
+       "      <td>professor</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>John</td>\n",
+       "      <td>20</td>\n",
+       "      <td>Male</td>\n",
+       "      <td>TA</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Carla</td>\n",
+       "      <td>30</td>\n",
+       "      <td>Female</td>\n",
+       "      <td>phD</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Name  Age     Sex   Position  sex_encoded  position_encoded\n",
+       "0  Marius   18    Male   graduate            1                 1\n",
+       "1   Maria   19  Female  professor            0                 3\n",
+       "2    John   20    Male         TA            1                 0\n",
+       "3   Carla   30  Female        phD            0                 2"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df_encoded['position_encoded'] = labelencoder.fit_transform(df_encoded['Position'])\n",
+    "df_encoded"
+   ]
+  },
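+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The reverse mapping with *inverse_transform* can be sketched as follows. This example assumes one encoder per column (the names `position_encoder` and `positions` are illustrative): in the cells above the same `labelencoder` is fitted again for 'Position', so it can no longer decode the 'Sex' codes.\n",
+    "\n",
+    "```python\n",
+    "from sklearn.preprocessing import LabelEncoder\n",
+    "\n",
+    "positions = ['graduate', 'professor', 'TA', 'phD']\n",
+    "\n",
+    "position_encoder = LabelEncoder()\n",
+    "codes = position_encoder.fit_transform(positions)\n",
+    "\n",
+    "# classes_ holds the sorted categories; each code is an index into this array\n",
+    "print(position_encoder.classes_)\n",
+    "print(codes)\n",
+    "\n",
+    "# Map the integer codes back to the original strings\n",
+    "restored = position_encoder.inverse_transform(codes)\n",
+    "print(list(restored))\n",
+    "```\n",
+    "\n",
+    "The codes follow the sorted order of the categories, which is why 'TA' is encoded as 0 in the table above."
+   ]
+  },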
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
+    "* [Binarizer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Binarizer.html), Scikit-learn."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/09_String_Data.ipynb
+++ b/ml21/preprocessing/09_String_Data.ipynb
@@ -0,0 +1,652 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# String Data\n",
+    "It is common to clean string columns so that they follow a predefined format (e.g., emails, URLs, ...).\n",
+    "\n",
+    "We can do it using regular expressions or specific libraries."
+   ]
+  },
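+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a minimal sketch of the regular-expression approach, assuming a deliberately simplified notion of a well-formed email (the pattern and the helper name `clean_email` are illustrative, not a complete validator):\n",
+    "\n",
+    "```python\n",
+    "import re\n",
+    "\n",
+    "# Simplified shape: something@something.tld, with no spaces and a single '@'\n",
+    "EMAIL_PATTERN = re.compile(r'^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$')\n",
+    "\n",
+    "def clean_email(raw):\n",
+    "    # Strip surrounding whitespace; return None if the value does not look like an email\n",
+    "    candidate = raw.strip()\n",
+    "    return candidate if EMAIL_PATTERN.match(candidate) else None\n",
+    "\n",
+    "print(clean_email('  pepe@gmail.com '))  # pepe@gmail.com\n",
+    "print(clean_email('This my address'))    # None\n",
+    "```\n",
+    "\n",
+    "Values that come back as `None` can then be dropped or flagged for manual review."
+   ]
+  },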
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Beautifier\n",
+    "A simple [library](https://github.com/labtocat/beautifier) to clean up and prettify URL patterns, domains, and so on. The library helps to clean Unicode, special characters, and unnecessary redirection patterns from the URLs and gives you clean data.\n",
+    "\n",
+    "Install with **'pip install beautifier'**."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Email cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from beautifier import Email\n",
+    "email = Email('me@imsach.in')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'imsach.in'"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email.domain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'me'"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email.username"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email.is_free_email"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "email2 = Email('This my address')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email2.is_valid"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "email3 = Email('pepe@gmail.com')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email3.is_valid"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email3.is_free_email"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## URL cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from beautifier import Url\n",
+    "url = Url('https://in.linkedin.com/in/sachinphilip?authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'https://in.linkedin.com/in/sachinphilip'"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'in.linkedin.com'"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.domain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['authtoken=887nasdadasd6hasdtg21', 'secret=98jy766yhhuhnjk']"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.param"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk'"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.parameters"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'sachinphilip'"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.username"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Unicode\n",
+    "Problem: Some Unicode text has been corrupted. Its bytes were decoded with the wrong character encoding, so we see the wrong characters.\n",
+    "\n",
+    "**Mojibake** is text displayed in an unintended character encoding. Example: \"Ã©\" shown instead of \"é\".\n",
+    "\n",
+    "We will use the library **ftfy** (fixes text for you) to fix it.\n",
+    "\n",
+    "First, you should install the library: **conda install ftfy** (or **pip install ftfy**)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "¯\\_(ツ)_/¯\n",
+      "Party\n",
+      "I'm\n"
+     ]
+    }
+   ],
+   "source": [
+    "import ftfy\n",
+    "foo = '&macr;\\\\_(ã\\x83\\x84)_/&macr;'\n",
+    "bar = '\\ufeffParty'\n",
+    "baz = '\\001\\033[36;44mI&#x92;m'\n",
+    "print(ftfy.fix_text(foo))\n",
+    "print(ftfy.fix_text(bar))\n",
+    "print(ftfy.fix_text(baz))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can inspect the code points of a string with `explain_unicode` to see which characters are causing the problem."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "U+0026  &       [Po] AMPERSAND\n",
+      "U+006D  m       [Ll] LATIN SMALL LETTER M\n",
+      "U+0061  a       [Ll] LATIN SMALL LETTER A\n",
+      "U+0063  c       [Ll] LATIN SMALL LETTER C\n",
+      "U+0072  r       [Ll] LATIN SMALL LETTER R\n",
+      "U+003B  ;       [Po] SEMICOLON\n",
+      "U+005C  \\       [Po] REVERSE SOLIDUS\n",
+      "U+005F  _       [Pc] LOW LINE\n",
+      "U+0028  (       [Ps] LEFT PARENTHESIS\n",
+      "U+00E3  ã       [Ll] LATIN SMALL LETTER A WITH TILDE\n",
+      "U+0083  \\x83    [Cc] <unknown>\n",
+      "U+0084  \\x84    [Cc] <unknown>\n",
+      "U+0029  )       [Pe] RIGHT PARENTHESIS\n",
+      "U+005F  _       [Pc] LOW LINE\n",
+      "U+002F  /       [Po] SOLIDUS\n",
+      "U+0026  &       [Po] AMPERSAND\n",
+      "U+006D  m       [Ll] LATIN SMALL LETTER M\n",
+      "U+0061  a       [Ll] LATIN SMALL LETTER A\n",
+      "U+0063  c       [Ll] LATIN SMALL LETTER C\n",
+      "U+0072  r       [Ll] LATIN SMALL LETTER R\n",
+      "U+003B  ;       [Po] SEMICOLON\n"
+     ]
+    }
+   ],
+   "source": [
+    "ftfy.explain_unicode(foo)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Dates\n",
+    "Sometimes we want to extract dates from text. We can use regular expressions or handy packages, such as [**python-dateutil**](https://dateutil.readthedocs.io/en/stable/). An alternative is [arrow](https://arrow.readthedocs.io/en/latest/).\n",
+    "\n",
+    "Install the library: **pip install python-dateutil**."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2019-08-22 10:22:46+00:00\n"
+     ]
+    }
+   ],
+   "source": [
+    "from dateutil.parser import parse\n",
+    "now = parse(\"Thu Aug 22 10:22:46 UTC 2019\")\n",
+    "print(now)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2019-08-08 10:20:00\n"
+     ]
+    }
+   ],
+   "source": [
+    "dt = parse(\"Today is Thursday 8, 2019 at 10:20:00AM\", fuzzy=True)\n",
+    "print(dt)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/), A. Sharma, 2018.\n",
+    "* [Beautifier](https://github.com/labtocat/beautifier) package\n",
+    "* [Ftfy](https://ftfy.readthedocs.io/en/latest/) package\n",
+    "* [python-dateutil](https://dateutil.readthedocs.io/en/stable/) package"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
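
The mojibake in the Unicode section of the notebook above can also be undone by hand when the encoding mix-up is known. The following is a minimal sketch using only the standard library; it assumes the sample string `foo` was produced by reading UTF-8 bytes as Latin-1, and it is not how ftfy works internally (ftfy detects the problem with heuristics, while here the encodings are hard-coded).

```python
import html

# Same sample string as `foo` in the notebook: HTML entities plus mojibake.
foo = "&macr;\\_(ã\x83\x84)_/&macr;"

# How the damage arises: the UTF-8 bytes of 'ツ' are read back as Latin-1.
garbled = "ツ".encode("utf-8").decode("latin-1")

# Reverse it: re-encode as Latin-1 to recover the raw bytes, then decode as UTF-8.
# This must happen before unescaping, while every character still maps to one byte
# of the original UTF-8 sequence.
repaired = foo.encode("latin-1").decode("utf-8")

# Finally resolve the HTML entities (&macr; becomes the macron character).
fixed = html.unescape(repaired)
print(fixed)
```

The order of the two steps matters: unescaping first would introduce a literal macron, whose Latin-1 byte is not valid UTF-8 on its own, so the decode step would fail.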
--- a/ml21/preprocessing/11_0_Handy.ipynb
+++ b/ml21/preprocessing/11_0_Handy.ipynb
@@ -0,0 +1,139 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Handy libraries\n",
+    "Libraries that help with several preprocessing tasks.\n",
+    "\n",
+    "* [datacleaner](11_1_datacleaner.ipynb)\n",
+    "* [autoclean](11_3_autoclean.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/), A. Sharma, 2018.\n",
+    "* [Handy Python Libraries for Formatting and Cleaning Data](https://mode.com/blog/python-data-cleaning-libraries), M. Bierly, 2016."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": false
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/11_1_datacleaner.ipynb
+++ b/ml21/preprocessing/11_1_datacleaner.ipynb
@@ -0,0 +1,673 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# Datacleaner\n",
+    "[Datacleaner](https://github.com/rhiever/datacleaner) supports:\n",
+    "\n",
+    "* drop rows with missing values\n",
+    "* replace missing values with the mode or median on a column-by-column basis\n",
+    "* encode non-numeric variables with numerical equivalents\n",
+    "\n",
+    "\n",
+    "Install with\n",
+    "\n",
+    "**pip install datacleaner**"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>PassengerId</th>\n",
+       "      <th>Survived</th>\n",
+       "      <th>Pclass</th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>SibSp</th>\n",
+       "      <th>Parch</th>\n",
+       "      <th>Ticket</th>\n",
+       "      <th>Fare</th>\n",
+       "      <th>Cabin</th>\n",
+       "      <th>Embarked</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Braund, Mr. Owen Harris</td>\n",
+       "      <td>male</td>\n",
+       "      <td>22.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>A/5 21171</td>\n",
+       "      <td>7.2500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
+       "      <td>female</td>\n",
+       "      <td>38.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>PC 17599</td>\n",
+       "      <td>71.2833</td>\n",
+       "      <td>C85</td>\n",
+       "      <td>C</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>3</td>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Heikkinen, Miss. Laina</td>\n",
+       "      <td>female</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>STON/O2. 3101282</td>\n",
+       "      <td>7.9250</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
+       "      <td>female</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>113803</td>\n",
+       "      <td>53.1000</td>\n",
+       "      <td>C123</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>5</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Allen, Mr. William Henry</td>\n",
+       "      <td>male</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>373450</td>\n",
+       "      <td>8.0500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>886</th>\n",
+       "      <td>887</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "      <td>Montvila, Rev. Juozas</td>\n",
+       "      <td>male</td>\n",
+       "      <td>27.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>211536</td>\n",
+       "      <td>13.0000</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>887</th>\n",
+       "      <td>888</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Graham, Miss. Margaret Edith</td>\n",
+       "      <td>female</td>\n",
+       "      <td>19.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>112053</td>\n",
+       "      <td>30.0000</td>\n",
+       "      <td>B42</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>888</th>\n",
+       "      <td>889</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
+       "      <td>female</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>1</td>\n",
+       "      <td>2</td>\n",
+       "      <td>W./C. 6607</td>\n",
+       "      <td>23.4500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>889</th>\n",
+       "      <td>890</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Behr, Mr. Karl Howell</td>\n",
+       "      <td>male</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>111369</td>\n",
+       "      <td>30.0000</td>\n",
+       "      <td>C148</td>\n",
+       "      <td>C</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>890</th>\n",
+       "      <td>891</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Dooley, Mr. Patrick</td>\n",
+       "      <td>male</td>\n",
+       "      <td>32.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>370376</td>\n",
+       "      <td>7.7500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Q</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>891 rows × 12 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     PassengerId  Survived  Pclass  \\\n",
+       "0              1         0       3   \n",
+       "1              2         1       1   \n",
+       "2              3         1       3   \n",
+       "3              4         1       1   \n",
+       "4              5         0       3   \n",
+       "..           ...       ...     ...   \n",
+       "886          887         0       2   \n",
+       "887          888         1       1   \n",
+       "888          889         0       3   \n",
+       "889          890         1       1   \n",
+       "890          891         0       3   \n",
+       "\n",
+       "                                                  Name     Sex   Age  SibSp  \\\n",
+       "0                              Braund, Mr. Owen Harris    male  22.0      1   \n",
+       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
+       "2                               Heikkinen, Miss. Laina  female  26.0      0   \n",
+       "3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
+       "4                             Allen, Mr. William Henry    male  35.0      0   \n",
+       "..                                                 ...     ...   ...    ...   \n",
+       "886                              Montvila, Rev. Juozas    male  27.0      0   \n",
+       "887                       Graham, Miss. Margaret Edith  female  19.0      0   \n",
+       "888           Johnston, Miss. Catherine Helen \"Carrie\"  female   NaN      1   \n",
+       "889                              Behr, Mr. Karl Howell    male  26.0      0   \n",
+       "890                                Dooley, Mr. Patrick    male  32.0      0   \n",
+       "\n",
+       "     Parch            Ticket     Fare Cabin Embarked  \n",
+       "0        0         A/5 21171   7.2500   NaN        S  \n",
+       "1        0          PC 17599  71.2833   C85        C  \n",
+       "2        0  STON/O2. 3101282   7.9250   NaN        S  \n",
+       "3        0            113803  53.1000  C123        S  \n",
+       "4        0            373450   8.0500   NaN        S  \n",
+       "..     ...               ...      ...   ...      ...  \n",
+       "886      0            211536  13.0000   NaN        S  \n",
+       "887      0            112053  30.0000   B42        S  \n",
+       "888      2        W./C. 6607  23.4500   NaN        S  \n",
+       "889      0            111369  30.0000  C148        C  \n",
+       "890      0            370376   7.7500   NaN        Q  \n",
+       "\n",
+       "[891 rows x 12 columns]"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "\n",
+    "from datacleaner import autoclean\n",
+    "\n",
+    "df = pd.read_csv('https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv')\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>PassengerId</th>\n",
+       "      <th>Survived</th>\n",
+       "      <th>Pclass</th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>SibSp</th>\n",
+       "      <th>Parch</th>\n",
+       "      <th>Ticket</th>\n",
+       "      <th>Fare</th>\n",
+       "      <th>Cabin</th>\n",
+       "      <th>Embarked</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>108</td>\n",
+       "      <td>1</td>\n",
+       "      <td>22.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>523</td>\n",
+       "      <td>7.2500</td>\n",
+       "      <td>47</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>190</td>\n",
+       "      <td>0</td>\n",
+       "      <td>38.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>596</td>\n",
+       "      <td>71.2833</td>\n",
+       "      <td>81</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>3</td>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>353</td>\n",
+       "      <td>0</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>669</td>\n",
+       "      <td>7.9250</td>\n",
+       "      <td>47</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>272</td>\n",
+       "      <td>0</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>49</td>\n",
+       "      <td>53.1000</td>\n",
+       "      <td>55</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>5</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>15</td>\n",
+       "      <td>1</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>472</td>\n",
+       "      <td>8.0500</td>\n",
+       "      <td>47</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>886</th>\n",
+       "      <td>887</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "      <td>548</td>\n",
+       "      <td>1</td>\n",
+       "      <td>27.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>101</td>\n",
+       "      <td>13.0000</td>\n",
+       "      <td>47</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>887</th>\n",
+       "      <td>888</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>303</td>\n",
+       "      <td>0</td>\n",
+       "      <td>19.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>14</td>\n",
+       "      <td>30.0000</td>\n",
+       "      <td>30</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>888</th>\n",
+       "      <td>889</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>413</td>\n",
+       "      <td>0</td>\n",
+       "      <td>28.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>2</td>\n",
+       "      <td>675</td>\n",
+       "      <td>23.4500</td>\n",
+       "      <td>47</td>\n",
+       "      <td>2</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>889</th>\n",
+       "      <td>890</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>81</td>\n",
+       "      <td>1</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>8</td>\n",
+       "      <td>30.0000</td>\n",
+       "      <td>60</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>890</th>\n",
+       "      <td>891</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>220</td>\n",
+       "      <td>1</td>\n",
+       "      <td>32.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>466</td>\n",
+       "      <td>7.7500</td>\n",
+       "      <td>47</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>891 rows × 12 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     PassengerId  Survived  Pclass  Name  Sex   Age  SibSp  Parch  Ticket  \\\n",
+       "0              1         0       3   108    1  22.0      1      0     523   \n",
+       "1              2         1       1   190    0  38.0      1      0     596   \n",
+       "2              3         1       3   353    0  26.0      0      0     669   \n",
+       "3              4         1       1   272    0  35.0      1      0      49   \n",
+       "4              5         0       3    15    1  35.0      0      0     472   \n",
+       "..           ...       ...     ...   ...  ...   ...    ...    ...     ...   \n",
+       "886          887         0       2   548    1  27.0      0      0     101   \n",
+       "887          888         1       1   303    0  19.0      0      0      14   \n",
+       "888          889         0       3   413    0  28.0      1      2     675   \n",
+       "889          890         1       1    81    1  26.0      0      0       8   \n",
+       "890          891         0       3   220    1  32.0      0      0     466   \n",
+       "\n",
+       "        Fare  Cabin  Embarked  \n",
+       "0     7.2500     47         2  \n",
+       "1    71.2833     81         0  \n",
+       "2     7.9250     47         2  \n",
+       "3    53.1000     55         2  \n",
+       "4     8.0500     47         2  \n",
+       "..       ...    ...       ...  \n",
+       "886  13.0000     47         2  \n",
+       "887  30.0000     30         2  \n",
+       "888  23.4500     47         2  \n",
+       "889  30.0000     60         0  \n",
+       "890   7.7500     47         1  \n",
+       "\n",
+       "[891 rows x 12 columns]"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df_clean = autoclean(df, copy=True)\n",
+    "df_clean"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/), A. Sharma, 2018.\n",
+    "* [Handy Python Libraries for Formatting and Cleaning Data](https://mode.com/blog/python-data-cleaning-libraries), M. Bierly, 2016."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "datacleaner": {
+   "position": {
+    "top": "50px"
+   },
+   "python": {
+    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
+   },
+   "window_display": true
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
--- a/ml21/preprocessing/11_3_autoclean.ipynb
+++ b/ml21/preprocessing/11_3_autoclean.ipynb
@@ -0,0 +1,578 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "849ad57e-6adb-4c2e-afd6-73db37eef572",
+   "metadata": {},
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "179cc802-9f1d-40b0-bf0c-9d4fb7ea1262",
+   "metadata": {},
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9858d815-0390-4e77-a5ff-a8d2a1960981",
+   "metadata": {},
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "238bab60-75f0-4d29-ab05-66afc463b506",
+   "metadata": {},
+   "source": [
+    "# Autoclean\n",
+    "[AutoClean](https://github.com/elisemercury/AutoClean) is a simple library for cleaning data.\n",
+    "It supports:\n",
+    "\n",
+    "* Handling of duplicates\n",
+    "* Various imputation methods for missing values\n",
+    "* Handling of outliers\n",
+    "* Encoding of categorical data (OneHot, Label)\n",
+    "* Extraction of datetime values\n",
+    "\n",
+    "Install the package: **pip install py-AutoClean**.\n",
+    "\n",
+    "Parameters:\n",
+    "\n",
+    "* **duplicates**\n",
+    "    * default: False\n",
+    "    * other values: 'auto', True\n",
+    "* **missing_num**\n",
+    "    * default: False\n",
+    "    * other values: 'auto', 'linreg', 'knn', 'mean', 'median', 'most_frequent', 'delete'\n",
+    "* **missing_categ**\n",
+    "    * default: False\n",
+    "    * other values: 'auto', 'logreg', 'knn', 'most_frequent', 'delete'\n",
+    "* **encode_categ**\n",
+    "    * default: False\n",
+    "    * other values: 'auto', ['onehot'], ['label']; to encode only specific columns, add a list of column names or indexes: ['auto', ['col1', 2]]\n",
+    "* **extract_datetime**\n",
+    "    * default: False\n",
+    "    * other values: 'auto', 'D', 'M', 'Y', 'h', 'm', 's'\n",
+    "* **outliers**\n",
+    "    * default: False\n",
+    "    * other values: 'auto', 'winz', 'delete'\n",
+    "* **outlier_param**: default: 1.5; other values: any int or float, False\n",
+    "* **logfile**\n",
+    "    * default: True\n",
+    "    * other values: False\n",
+    "* **verbose**\n",
+    "    * default: False\n",
+    "    * other values: True"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "id": "491b034b-994e-4f06-b4bc-df0590a62aab",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>PassengerId</th>\n",
+       "      <th>Survived</th>\n",
+       "      <th>Pclass</th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>SibSp</th>\n",
+       "      <th>Parch</th>\n",
+       "      <th>Ticket</th>\n",
+       "      <th>Fare</th>\n",
+       "      <th>Cabin</th>\n",
+       "      <th>Embarked</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Braund, Mr. Owen Harris</td>\n",
+       "      <td>male</td>\n",
+       "      <td>22.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>A/5 21171</td>\n",
+       "      <td>7.2500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
+       "      <td>female</td>\n",
+       "      <td>38.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>PC 17599</td>\n",
+       "      <td>71.2833</td>\n",
+       "      <td>C85</td>\n",
+       "      <td>C</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>3</td>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Heikkinen, Miss. Laina</td>\n",
+       "      <td>female</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>STON/O2. 3101282</td>\n",
+       "      <td>7.9250</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
+       "      <td>female</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>113803</td>\n",
+       "      <td>53.1000</td>\n",
+       "      <td>C123</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>5</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Allen, Mr. William Henry</td>\n",
+       "      <td>male</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>373450</td>\n",
+       "      <td>8.0500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>886</th>\n",
+       "      <td>887</td>\n",
+       "      <td>0</td>\n",
+       "      <td>2</td>\n",
+       "      <td>Montvila, Rev. Juozas</td>\n",
+       "      <td>male</td>\n",
+       "      <td>27.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>211536</td>\n",
+       "      <td>13.0000</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>887</th>\n",
+       "      <td>888</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Graham, Miss. Margaret Edith</td>\n",
+       "      <td>female</td>\n",
+       "      <td>19.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>112053</td>\n",
+       "      <td>30.0000</td>\n",
+       "      <td>B42</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>888</th>\n",
+       "      <td>889</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
+       "      <td>female</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>1</td>\n",
+       "      <td>2</td>\n",
+       "      <td>W./C. 6607</td>\n",
+       "      <td>23.4500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>S</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>889</th>\n",
+       "      <td>890</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Behr, Mr. Karl Howell</td>\n",
+       "      <td>male</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>111369</td>\n",
+       "      <td>30.0000</td>\n",
+       "      <td>C148</td>\n",
+       "      <td>C</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>890</th>\n",
+       "      <td>891</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Dooley, Mr. Patrick</td>\n",
+       "      <td>male</td>\n",
+       "      <td>32.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>370376</td>\n",
+       "      <td>7.7500</td>\n",
+       "      <td>NaN</td>\n",
+       "      <td>Q</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>891 rows × 12 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     PassengerId  Survived  Pclass  \\\n",
+       "0              1         0       3   \n",
+       "1              2         1       1   \n",
+       "2              3         1       3   \n",
+       "3              4         1       1   \n",
+       "4              5         0       3   \n",
+       "..           ...       ...     ...   \n",
+       "886          887         0       2   \n",
+       "887          888         1       1   \n",
+       "888          889         0       3   \n",
+       "889          890         1       1   \n",
+       "890          891         0       3   \n",
+       "\n",
+       "                                                  Name     Sex   Age  SibSp  \\\n",
+       "0                              Braund, Mr. Owen Harris    male  22.0      1   \n",
+       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
+       "2                               Heikkinen, Miss. Laina  female  26.0      0   \n",
+       "3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
+       "4                             Allen, Mr. William Henry    male  35.0      0   \n",
+       "..                                                 ...     ...   ...    ...   \n",
+       "886                              Montvila, Rev. Juozas    male  27.0      0   \n",
+       "887                       Graham, Miss. Margaret Edith  female  19.0      0   \n",
+       "888           Johnston, Miss. Catherine Helen \"Carrie\"  female   NaN      1   \n",
+       "889                              Behr, Mr. Karl Howell    male  26.0      0   \n",
+       "890                                Dooley, Mr. Patrick    male  32.0      0   \n",
+       "\n",
+       "     Parch            Ticket     Fare Cabin Embarked  \n",
+       "0        0         A/5 21171   7.2500   NaN        S  \n",
+       "1        0          PC 17599  71.2833   C85        C  \n",
+       "2        0  STON/O2. 3101282   7.9250   NaN        S  \n",
+       "3        0            113803  53.1000  C123        S  \n",
+       "4        0            373450   8.0500   NaN        S  \n",
+       "..     ...               ...      ...   ...      ...  \n",
+       "886      0            211536  13.0000   NaN        S  \n",
+       "887      0            112053  30.0000   B42        S  \n",
+       "888      2        W./C. 6607  23.4500   NaN        S  \n",
+       "889      0            111369  30.0000  C148        C  \n",
+       "890      0            370376   7.7500   NaN        Q  \n",
+       "\n",
+       "[891 rows x 12 columns]"
+      ]
+     },
+     "execution_count": 29,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "\n",
+    "from AutoClean import AutoClean\n",
+    "\n",
+    "df = pd.read_csv('https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv')\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 36,
+   "id": "d842eedf-3971-4966-a8b4-543bb56dd60d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "AutoClean process completed in 0.289385 seconds\n",
+      "Logfile saved to: /home/cif/GoogleDrive/cursos/summer-school-romania/2019/notebooks/preprocessing/autoclean.log\n"
+     ]
+    }
+   ],
+   "source": [
+    "autoclean = AutoClean(df, mode='auto')\n",
+    "\n",
+    "# To control each preprocessing step ourselves, use mode='manual' and set the parameters explicitly\n",
+    "#autoclean = AutoClean(df, mode='manual', duplicates=False, missing_num=False, missing_categ=False, encode_categ=False, extract_datetime=False, outliers=False, outlier_param=1.5, logfile=True, verbose=False)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 38,
+   "id": "4ede7c55-475a-4748-8cc4-788f46c88b26",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>PassengerId</th>\n",
+       "      <th>Survived</th>\n",
+       "      <th>Pclass</th>\n",
+       "      <th>Name</th>\n",
+       "      <th>Sex</th>\n",
+       "      <th>Age</th>\n",
+       "      <th>SibSp</th>\n",
+       "      <th>Parch</th>\n",
+       "      <th>Ticket</th>\n",
+       "      <th>Fare</th>\n",
+       "      <th>Cabin</th>\n",
+       "      <th>Embarked</th>\n",
+       "      <th>Sex_female</th>\n",
+       "      <th>Sex_male</th>\n",
+       "      <th>Embarked_C</th>\n",
+       "      <th>Embarked_Q</th>\n",
+       "      <th>Embarked_S</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Braund, Mr. Owen Harris</td>\n",
+       "      <td>male</td>\n",
+       "      <td>22.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>A/5 21171</td>\n",
+       "      <td>7.2500</td>\n",
+       "      <td>C128</td>\n",
+       "      <td>S</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
+       "      <td>female</td>\n",
+       "      <td>38.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>PC 17599</td>\n",
+       "      <td>65.6344</td>\n",
+       "      <td>C85</td>\n",
+       "      <td>C</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>3</td>\n",
+       "      <td>1</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Heikkinen, Miss. Laina</td>\n",
+       "      <td>female</td>\n",
+       "      <td>26.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>STON/O2. 3101282</td>\n",
+       "      <td>7.9250</td>\n",
+       "      <td>C128</td>\n",
+       "      <td>S</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>4</td>\n",
+       "      <td>1</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
+       "      <td>female</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>0</td>\n",
+       "      <td>113803</td>\n",
+       "      <td>53.1000</td>\n",
+       "      <td>C123</td>\n",
+       "      <td>S</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>5</td>\n",
+       "      <td>0</td>\n",
+       "      <td>3</td>\n",
+       "      <td>Allen, Mr. William Henry</td>\n",
+       "      <td>male</td>\n",
+       "      <td>35.0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>0</td>\n",
+       "      <td>373450</td>\n",
+       "      <td>8.0500</td>\n",
+       "      <td>C128</td>\n",
+       "      <td>S</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "      <td>False</td>\n",
+       "      <td>False</td>\n",
+       "      <td>True</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   PassengerId  Survived  Pclass  \\\n",
+       "0            1         0       3   \n",
+       "1            2         1       1   \n",
+       "2            3         1       3   \n",
+       "3            4         1       1   \n",
+       "4            5         0       3   \n",
+       "\n",
+       "                                                Name     Sex   Age  SibSp  \\\n",
+       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
+       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
+       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
+       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
+       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
+       "\n",
+       "   Parch            Ticket     Fare Cabin Embarked  Sex_female  Sex_male  \\\n",
+       "0      0         A/5 21171   7.2500  C128        S       False      True   \n",
+       "1      0          PC 17599  65.6344   C85        C        True     False   \n",
+       "2      0  STON/O2. 3101282   7.9250  C128        S        True     False   \n",
+       "3      0            113803  53.1000  C123        S        True     False   \n",
+       "4      0            373450   8.0500  C128        S       False      True   \n",
+       "\n",
+       "   Embarked_C  Embarked_Q  Embarked_S  \n",
+       "0       False       False        True  \n",
+       "1        True       False       False  \n",
+       "2       False       False        True  \n",
+       "3       False       False        True  \n",
+       "4       False       False        True  "
+      ]
+     },
+     "execution_count": 38,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df_clean = autoclean.output\n",
+    "df_clean[0:5]"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
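Review note on `11_3_autoclean.ipynb`: the notebook runs AutoClean in `auto` mode without showing what kind of work that involves. The sketch below does roughly comparable steps by hand with pandas (deduplication, imputation, IQR-based capping of outliers, one-hot encoding). It is an illustration under assumptions, not AutoClean's implementation: the toy data, the helper name `clean_by_hand`, and the choice of median/mode imputation are all hypothetical. The capped `Fare` value 65.6344 in the notebook output is at least consistent with capping at Q3 + 1.5 × IQR, which is the rule used here.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values, not the Titanic file used in the notebook)
toy = pd.DataFrame({
    "fare": [7.25, 8.05, 7.925, 13.0, 30.0, 512.0],
    "sex": ["male", "male", "female", None, "female", "male"],
    "age": [22.0, np.nan, 26.0, 35.0, np.nan, 32.0],
})


def clean_by_hand(df, outlier_param=1.5):
    """Hand-written cleaning pipeline; an assumption-laden stand-in for 'auto' mode."""
    out = df.drop_duplicates().copy()
    num_cols = out.select_dtypes(include="number").columns
    cat_cols = out.select_dtypes(exclude="number").columns

    # Impute: median for numeric columns, most frequent value for the others
    for col in num_cols:
        out[col] = out[col].fillna(out[col].median())
    for col in cat_cols:
        out[col] = out[col].fillna(out[col].mode().iloc[0])

    # Cap outliers at Q1 - k*IQR and Q3 + k*IQR (winsorizing)
    for col in num_cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - outlier_param * iqr, q3 + outlier_param * iqr)

    # One-hot encode the categorical columns, keeping the originals alongside
    dummies = pd.get_dummies(out[cat_cols])
    return pd.concat([out, dummies], axis=1)


cleaned = clean_by_hand(toy)
print(cleaned)
```

Keeping the original categorical columns next to their one-hot columns mirrors the shape of the notebook output (`Sex` alongside `Sex_female` and `Sex_male`); drop them before training if a model should only see numeric features.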
--- a/ml21/preprocessing/5_Duplicated_Values.ipynb
+++ b/ml21/preprocessing/5_Duplicated_Values.ipynb
@@ -0,0 +1,502 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "# Duplicated values\n",
+    "\n",
+    "There are two possible approaches: **remove** these rows or **fill** them. The right choice depends on the case.\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Filling NaN values\n",
+    "If we need to handle errors or blanks, we can fill them with **fillna()** or remove them with **dropna()**.\n",
+    "\n",
+    "* For **string** fields, we can fill NaN with **' '**.\n",
+    "\n",
+    "* For **numbers**, we can fill with the **mean** or **median** value. \n"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "# Fill NaN with ' '\n",
+    "df['col'] = df['col'].fillna(' ')\n",
+    "# Fill NaN with 99\n",
+    "df['col'] = df['col'].fillna(99)\n",
+    "# Fill NaN with the mean of the column\n",
+    "df['col'] = df['col'].fillna(df['col'].mean())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Propagate non-null values forward or backwards\n",
+    "You can also propagate non-null values forward with **ffill()** or backwards\n",
+    "with **bfill()** (older pandas versions spell these `fillna(method='pad')` and `fillna(method='bfill')`).\n",
+    "A forward fill replaces each NaN with the previous non-NaN value in the\n",
+    "dataframe. Pass `limit=1` to fill at most one consecutive NaN, or omit it to fill them all."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.DataFrame(data={'col1':[np.nan, np.nan, 2,3,4, np.nan, np.nan]})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "0   NaN\n",
+       "1   NaN\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0\n",
+       "5   NaN\n",
+       "6   NaN"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "0   NaN\n",
+       "1   NaN\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0\n",
+       "5   4.0\n",
+       "6   NaN"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Forward-fill: copy 4.0 into the next NaN only (limit=1); leading NaNs have no previous value\n",
+    "df.ffill(limit=1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can also backfill with **bfill**."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>col1</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   col1\n",
+       "0   2.0\n",
+       "1   2.0\n",
+       "2   2.0\n",
+       "3   3.0\n",
+       "4   4.0\n",
+       "5   NaN\n",
+       "6   NaN"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Backfill: the first two NaN values take the next available value (2.0)\n",
+    "df.bfill()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Removing NaN values\n",
+    "We can remove them by row or column."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "source": [
+    "# Drop any rows which have any NaNs\n",
+    "df.dropna()\n",
+    "# Drop columns that have any NaNs\n",
+    "df.dropna(axis=1)\n",
+    "# Keep only the columns that have at least 90% non-NaN values\n",
+    "df.dropna(thresh=int(df.shape[0] * .9), axis=1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.4"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
--- a/ml21/preprocessing/9_String_Data.ipynb
+++ b/ml21/preprocessing/9_String_Data.ipynb
@@ -0,0 +1,619 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "## [Introduction to Preprocessing](00_Intro_Preprocessing.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "# String Data\n",
+    "It is common to clean string columns so that they follow a predefined format (e.g. emails, URLs, ...).\n",
+    "\n",
+    "We can do it using regular expressions or specific libraries."
+   ]
+  },
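+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "As a rough sketch of the regular-expression route, the cell below checks whether a string looks like an email address. The pattern and the helper `is_valid_email` are illustrative assumptions: the pattern is deliberately simple and is not a complete validator."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import re\n",
+    "\n",
+    "# Deliberately simple pattern (an assumption, not a full RFC 5322 validator)\n",
+    "EMAIL_RE = re.compile(r'^[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+$')\n",
+    "\n",
+    "def is_valid_email(text):\n",
+    "    '''Return True if text looks like an email address.'''\n",
+    "    return EMAIL_RE.match(text.strip()) is not None\n",
+    "\n",
+    "for candidate in ['pepe@gmail.com', 'This my address']:\n",
+    "    print(candidate, is_valid_email(candidate))"
+   ]
+  },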
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Beautifier\n",
+    "A simple [library](https://github.com/labtocat/beautifier) to clean up and prettify URL patterns, domains and so on. It helps remove Unicode characters, special characters and unnecessary redirection patterns from URLs and gives you clean data.\n",
+    "\n",
+    "Install with **'pip install beautifier'**."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Email cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from beautifier import Email\n",
+    "email = Email('me@imsach.in')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'imsach.in'"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email.domain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'me'"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email.username"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email.is_free_email"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "email2 = Email('This my address')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "False"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email2.is_valid"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "email3 = Email('pepe@gmail.com')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 18,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email3.is_valid"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 27,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "email3.is_free_email"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## URL cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from beautifier import Url\n",
+    "url = Url('https://in.linkedin.com/in/sachinphilip?authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 31,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'https://in.linkedin.com/in/sachinphilip'"
+      ]
+     },
+     "execution_count": 31,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'in.linkedin.com'"
+      ]
+     },
+     "execution_count": 33,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.domain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 35,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['authtoken=887nasdadasd6hasdtg21', 'secret=98jy766yhhuhnjk']"
+      ]
+     },
+     "execution_count": 35,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.param"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 37,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk'"
+      ]
+     },
+     "execution_count": 37,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.parameters"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 39,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'sachinphilip'"
+      ]
+     },
+     "execution_count": 39,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "url.username"
+   ]
+  },
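+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "If installing Beautifier is not an option, similar information can be obtained with the standard library module `urllib.parse`. This is an alternative sketch under that assumption; it does not claim to reproduce how Beautifier works internally."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from urllib.parse import parse_qs, urlsplit, urlunsplit\n",
+    "\n",
+    "raw = 'https://in.linkedin.com/in/sachinphilip?authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk'\n",
+    "parts = urlsplit(raw)\n",
+    "\n",
+    "# Rebuild the URL without query string and fragment\n",
+    "clean = urlunsplit((parts.scheme, parts.netloc, parts.path, '', ''))\n",
+    "print(clean)\n",
+    "# Host name of the URL\n",
+    "print(parts.netloc)\n",
+    "# Query parameters as a dict mapping each name to a list of values\n",
+    "print(parse_qs(parts.query))"
+   ]
+  },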
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Unicode\n",
+    "Problem: some Unicode text has been broken, so we see characters rendered in a different character set.\n",
+    "\n",
+    "A **mojibake** is text displayed in an unintended character encoding, for example \"Ã©\" shown instead of \"é\".\n",
+    "\n",
+    "We will use the library **ftfy** (fixes text for you) to fix it.\n",
+    "\n",
+    "First, you should install the library: **conda install ftfy**."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 41,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "¯\\_(ツ)_/¯\n",
+      "Party\n",
+      "I'm\n"
+     ]
+    }
+   ],
+   "source": [
+    "import ftfy\n",
+    "foo = '&macr;\\\\_(ã\\x83\\x84)_/&macr;'\n",
+    "bar = '\\ufeffParty'\n",
+    "baz = '\\001\\033[36;44mI&#x92;m'\n",
+    "print(ftfy.fix_text(foo))\n",
+    "print(ftfy.fix_text(bar))\n",
+    "print(ftfy.fix_text(baz))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "We can inspect a string code point by code point with `explain_unicode`, which helps to understand what ftfy has to fix."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "ftfy.explain_unicode(foo)"
+   ]
+  },
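+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "To see how a mojibake arises in the first place, the sketch below uses only the standard library: it decodes UTF-8 bytes with the wrong codec and then reverses the mistake. The sample word and the choice of Latin-1 as the wrong codec are arbitrary assumptions."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "original = 'café'\n",
+    "\n",
+    "# UTF-8 bytes read with the wrong codec produce a mojibake\n",
+    "broken = original.encode('utf-8').decode('latin-1')\n",
+    "print(broken)\n",
+    "\n",
+    "# Reversing the wrong step recovers the text when no bytes were lost on the way\n",
+    "repaired = broken.encode('latin-1').decode('utf-8')\n",
+    "print(repaired == original)"
+   ]
+  },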
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "slide"
+    }
+   },
+   "source": [
+    "## Dates\n",
+    "Sometimes we want to extract dates from text. We can use regular expressions or handy packages, such as **python-dateutil**.\n",
+    "\n",
+    "Install the library: **pip install python-dateutil**."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2019-08-22 10:22:46+00:00\n"
+     ]
+    }
+   ],
+   "source": [
+    "from dateutil.parser import parse\n",
+    "now = parse(\"Thu Aug 22 10:22:46 UTC 2019\")\n",
+    "print(now)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2019-08-22 10:20:00\n"
+     ]
+    }
+   ],
+   "source": [
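+    "# fuzzy=True skips tokens that are not part of a date; fields missing from the text default to today's date\n",
+    "dt = parse(\"Today is Thursday 8, 2019 at 10:20:00AM\", fuzzy=True)\n",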
+    "print(dt)"
+   ]
+  },
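+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "subslide"
+    }
+   },
+   "source": [
+    "The regular-expression route mentioned at the start of this section can be sketched as follows. The sample sentence is made up, and the pattern assumes dates written as YYYY-MM-DD; other layouts would need other patterns."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "slideshow": {
+     "slide_type": "fragment"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import re\n",
+    "from datetime import datetime\n",
+    "\n",
+    "# Hypothetical sample text\n",
+    "text = 'The report was filed on 2019-08-22 and revised on 2019-09-03.'\n",
+    "\n",
+    "# Find every substring shaped like YYYY-MM-DD\n",
+    "found = re.findall(r'\\b\\d{4}-\\d{2}-\\d{2}\\b', text)\n",
+    "# Convert the matches to date objects (raises ValueError for impossible dates)\n",
+    "dates = [datetime.strptime(d, '%Y-%m-%d').date() for d in found]\n",
+    "print(dates)"
+   ]
+  },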
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "slideshow": {
+     "slide_type": "skip"
+    }
+   },
+   "source": [
+    "# References\n",
+    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
+    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)\n",
+    "* Beautifier https://github.com/labtocat/beautifier\n",
+    "* Ftfy https://ftfy.readthedocs.io/en/latest/\n",
+    "* python-dateutil https://dateutil.readthedocs.io/en/stable/"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Slideshow",
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.4"
+  },
+  "latex_envs": {
+   "LaTeX_envs_menu_present": true,
+   "autocomplete": true,
+   "bibliofile": "biblio.bib",
+   "cite_by": "apalike",
+   "current_citInitial": 1,
+   "eqLabelWithNumbers": true,
+   "eqNumInitial": 1,
+   "hotkeys": {
+    "equation": "Ctrl-E",
+    "itemize": "Ctrl-I"
+   },
+   "labels_anchors": false,
+   "latex_user_defs": false,
+   "report_style_numbering": false,
+   "user_envs_cfg": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
--- a/ml21/preprocessing/images/EscUpmPolit_p.gif
+++ b/ml21/preprocessing/images/EscUpmPolit_p.gif
--- a/ml21/preprocessing/images/titanic.jpg
+++ b/ml21/preprocessing/images/titanic.jpg
--- a/ml21/visualization/.gitkeep
+++ b/ml21/visualization/.gitkeep
@@ -0,0 +1 @@
+
Author	SHA1	Message	Date
Carlos A. Iglesias	743c57691f	Delete sna/t.txt	2024-04-17 17:24:12 +02:00
Carlos A. Iglesias	2c53b81299	Uploaded SNA files	2024-04-17 17:23:28 +02:00
Carlos A. Iglesias	dd6c053109	Add files via upload	2024-04-17 17:22:36 +02:00
Carlos A. Iglesias	e35e0a11e9	Create t.txt	2024-04-17 17:22:20 +02:00
Carlos A. Iglesias	7315b681e4	Update README.md	2024-04-17 17:21:21 +02:00
Carlos A. Iglesias	3fac9c6f78	Add files via upload	2024-04-04 18:27:48 +02:00
Carlos A. Iglesias	21819abeae	Added visualization notebooks	2024-04-03 22:53:02 +02:00
Carlos A. Iglesias	0d4c0c706d	Added images	2024-04-03 22:51:58 +02:00
Carlos A. Iglesias	8de629b495	Create .gitkeep	2024-04-03 22:51:19 +02:00
Carlos A. Iglesias	86114b4a56	Added preprocessing notebooks	2024-04-03 22:50:36 +02:00
Carlos A. Iglesias	1a3f618995	Add files via upload	2024-04-03 21:52:25 +02:00
Carlos A. Iglesias	a1121c03a5	Create .gitkeep - Added preprocessing notebooks	2024-04-03 21:51:34 +02:00
Carlos A. Iglesias	715d0cb77f	Create .gitkeep Added new set of exercises	2024-04-03 21:50:50 +02:00
Carlos A. Iglesias	0150ce7cf7	Update 3_7_SVM.ipynb Updated formatted table	2024-02-22 12:23:08 +01:00
Carlos A. Iglesias	08dfe5c147	Update 3_4_Visualisation_Pandas.ipynb Updated code to last version of seaborn	2024-02-22 11:55:35 +01:00
Carlos A. Iglesias	78e62af098	Update 3_3_Data_Munging_with_Pandas.ipynb Updated to last version of scikit	2024-02-21 12:29:04 +01:00
Carlos A. Iglesias	3f5eba3e84	Update 3_2_Pandas.ipynb Updated links	2024-02-21 12:16:12 +01:00
Carlos A. Iglesias	2de1cda8f1	Update 3_1_Read_Data.ipynb Updated links	2024-02-21 12:14:25 +01:00
Carlos A. Iglesias	cc442c35f3	Update 3_0_0_Intro_ML_2.ipynb Updated links	2024-02-21 12:12:14 +01:00
Carlos A. Iglesias	1100c352fa	Update 2_6_Model_Tuning.ipynb updated links	2024-02-21 11:47:34 +01:00
Carlos A. Iglesias	9b573d292d	Update 2_5_2_Decision_Tree_Model.ipynb Updated links	2024-02-21 11:41:42 +01:00
Carlos A. Iglesias	dd8a4f50d8	Update 2_5_2_Decision_Tree_Model.ipynb Updated links	2024-02-21 11:40:59 +01:00
Carlos A. Iglesias	47148f2ccc	Update util_ds.py Updated links	2024-02-21 11:40:06 +01:00
Carlos A. Iglesias	8ffda8123a	Update 2_5_1_kNN_Model.ipynb Updated links	2024-02-21 11:07:38 +01:00
Carlos A. Iglesias	6629837e7d	Update 2_5_0_Machine_Learning.ipynb Updated links	2024-02-21 11:06:21 +01:00
Carlos A. Iglesias	ba08a9a264	Update 2_4_Preprocessing.ipynb Updated links	2024-02-21 11:02:09 +01:00
Carlos A. Iglesias	4b8fd30f42	Update 2_3_1_Advanced_Visualisation.ipynb Updated links	2024-02-21 11:00:53 +01:00
Carlos A. Iglesias	d879369930	Update 2_3_0_Visualisation.ipynb Updated links	2024-02-21 10:57:34 +01:00
Carlos A. Iglesias	4da01f3ae6	Update 2_0_0_Intro_ML.ipynb Updated links	2024-02-21 10:44:43 +01:00
Carlos A. Iglesias	da9a01e26b	Update 2_0_1_Objectives.ipynb Updated links	2024-02-21 10:43:40 +01:00
Carlos A. Iglesias	dc23b178d7	Delete python/plurals.py	2024-02-08 18:32:43 +01:00
Carlos A. Iglesias	5410d6115d	Delete python/catalog.py	2024-02-08 18:32:18 +01:00
Carlos A. Iglesias	6749aa5deb	Added files for modules	2024-02-08 18:26:08 +01:00
Carlos A. Iglesias	c31e6c1676	Update 1_2_Numbers_Strings.ipynb	2024-02-08 17:47:42 +01:00
Carlos A. Iglesias	1c7496c8ac	Update 1_2_Numbers_Strings.ipynb Improved formatting.	2024-02-08 17:46:18 +01:00
Carlos A. Iglesias	35b1ae4ec8	Update 1_8_Classes.ipynb Improved formatting.	2024-02-08 17:43:25 +01:00
Carlos A. Iglesias	58fc6f5e9c	Update 1_4_Sets.ipynb Typo corrected.	2024-02-08 17:42:45 +01:00
Carlos A. Iglesias	91147becee	Update 1_3_Sequences.ipynb Formatting improvement.	2024-02-08 17:41:15 +01:00
Carlos A. Iglesias	1530995243	Update 1_0_Intro_Python.ipynb Updated links.	2024-02-08 17:36:46 +01:00
Carlos A. Iglesias	0c0960cec7	Update 1_7_Variables.ipynb typo in bold markdown Typo in bold markdown	2024-02-08 17:33:48 +01:00
cif	3363c953f4	Borrada versión anterior	2023-04-27 15:43:44 +02:00
cif	542ce2708d	Actualizada práctica a gymnasium y extendida	2023-04-27 15:42:01 +02:00
cif	380340d66d	Updated 4_4 to use get_feature_names_out() instead of get_feature_names	2023-04-23 16:41:53 +02:00
cif	7f49f8990b	Updated 4_4 - using feature_log_prob_ instead of coef_ (deprecated)	2023-04-23 16:37:48 +02:00
Carlos A. Iglesias	419ea57824	Transparencias con Spacy	2023-04-20 18:20:44 +02:00
Carlos A. Iglesias	7d6010114d	Upload data for assignment	2023-04-20 18:17:12 +02:00
Carlos A. Iglesias	f9d8234e14	Added exercise with Spacy	2023-04-20 16:20:28 +02:00
Carlos A. Iglesias	d41fa61c65	Delete 0_2_NLP_Assignment.ipynb	2023-04-20 16:19:57 +02:00
Carlos A. Iglesias	05a4588acf	Exercise with Spacy	2023-04-20 16:18:47 +02:00
Carlos A. Iglesias	50933f6c94	Update 3_7_SVM.ipynb Fixed typo and updated link	2023-03-09 18:04:14 +01:00
J. Fernando Sánchez	68ba528dd7	Fix typos	2023-02-20 19:43:36 +01:00
J. Fernando Sánchez	897bb487b1	Actualizar ejercicios LOD	2023-02-13 18:26:14 +01:00
Oscar Araque	41d3bdea75	minor typos in ml1	2022-09-05 18:20:29 +02:00
Carlos A. Iglesias	0a9cd3bd5e	Update 3_7_SVM.ipynb Fixed typo in a comment	2022-03-17 17:58:09 +01:00
Carlos A. Iglesias	2c7c9e58e0	Update 3_7_SVM.ipynb Fixed bug in ROC curve visualization	2022-03-17 17:50:27 +01:00
cif	f0278aea33	Updated	2022-03-07 14:19:44 +01:00
cif	7bf0fb6479	Updated	2022-03-07 14:17:02 +01:00
cif	4d87b07ed9	Updated visualization	2022-03-07 14:16:14 +01:00
cif	7d71ba5f7a	Updated references	2022-03-07 13:03:48 +01:00
cif	1124c9129c	Fixed URL	2022-03-07 13:01:21 +01:00
cif	df6449b55f	Updated to last version of seaborn	2022-03-07 12:57:17 +01:00
cif	d99eeb733a	Updated median with only numeric values	2022-03-07 12:44:14 +01:00
cif	a43fb4c78c	Updated references	2022-03-07 12:28:10 +01:00
Carlos A. Iglesias	bf21e3ceab	Update 3_1_Read_Data.ipynb Updated references	2022-03-07 11:01:34 +01:00
Carlos A. Iglesias	e41d233828	Update 3_0_0_Intro_ML_2.ipynb Updated bibliography	2022-03-07 10:58:29 +01:00
Carlos A. Iglesias	a7c6be5b96	Update 2_6_Model_Tuning.ipynb Fixed typo.	2022-02-28 12:51:18 +01:00
Carlos A. Iglesias	11a1ea80d3	Update 2_6_Model_Tuning.ipynb Fixed typos.	2022-02-28 12:45:40 +01:00
Carlos A. Iglesias	a209d18a5b	Update 2_5_1_kNN_Model.ipynb Fixed typo.	2022-02-28 12:38:27 +01:00
cif	ffefd8c2e3	Actualizada bibliografía	2022-02-21 13:55:09 +01:00
cif	f43cde73e4	Actualizada bibliografía	2022-02-21 13:51:21 +01:00
cif	8784fdc773	Actualizada bibliografía	2022-02-21 13:39:33 +01:00
cif	a6d5f9ddeb	Actualizada bibliografía	2022-02-21 13:32:07 +01:00
cif	2e72a4d729	Actualizada bibliografía	2022-02-21 13:29:33 +01:00
cif	9426b4c061	Actualizada bibliografía	2022-02-21 13:26:24 +01:00
cif	5e5979d515	Actualizados enlace	2022-02-21 13:22:46 +01:00
cif	270dcec611	Actualizados enlace	2022-02-21 13:09:21 +01:00
Carlos A. Iglesias	e6e52b43ee	Update 2_4_Preprocessing.ipynb Actualizado enlace de Packt.	2022-02-21 12:57:53 +01:00
Carlos A. Iglesias	3b7675fa3f	Update 2_3_0_Visualisation.ipynb Actualizado enlace de bibliografía de packt.	2022-02-21 12:56:22 +01:00
Carlos A. Iglesias	44c63412f9	Update 2_2_Read_Data.ipynb Updated scikit url	2022-02-21 12:26:30 +01:00
Carlos A. Iglesias	5febbc21a4	Update 2_1_Intro_ScikitLearn.ipynb Errata en dimensionality.	2022-02-21 12:22:15 +01:00
J. Fernando Sánchez	66ed4ba258	Minor changes LOD 01 and 03	2022-02-15 20:48:49 +01:00
Carlos A. Iglesias	95cd25aef4	Update 1__10_Modules_Packages.ipynb Fixed link to module tutorial	2022-02-10 17:51:32 +01:00
J. Fernando Sánchez	955e74fc8e	Add requirements Now the dependencies should be automatically installed if you open the repo through Jupyter Binder	2021-11-10 08:48:54 +01:00
cif2cif	6743dad100	Cleaned output	2021-06-07 10:38:53 +02:00
cif2cif	729f7684c2	Cleaned output	2021-06-07 10:36:12 +02:00
cif2cif	ae8d3d3ba2	Updated with the new libraries	2021-05-07 11:10:21 +02:00
cif2cif	2ba0e2f3d9	updated to last version of OpenGym	2021-04-19 19:10:03 +02:00
cif2cif	c9114cc796	Fixed broken link and bug of sklearn-deap with scikit 0.24	2021-04-19 17:47:22 +02:00
cif2cif	b80c097362	Merge branch 'master' of https://github.com/gsi-upm/sitc	2021-04-06 10:21:25 +02:00
cif2cif	161cd8492b	Fixed bug in substrings_in_string and set default df[AgeGroup] to np.nan	2021-04-06 10:20:29 +02:00
Oscar Araque	3d6d96dd8a	updated ml1/2_6: using scorer to avoid traning warnings	2021-03-11 16:28:14 +01:00
cif2cif	44aa3d24fb	Updated joblib import to sklearn 0.23	2021-02-27 21:30:21 +01:00
cif2cif	8925a4a3c1	Clean 2_5_2	2021-02-27 20:51:52 +01:00
cif2cif	23913811df	Clean 2_5_1	2021-02-27 20:50:03 +01:00
cif2cif	7b4391f187	Updated reference to module six modified in scikit-learn 0.23	2021-02-27 20:21:15 +01:00
cif2cif	0c100dbadc	Merge branch 'master' of https://github.com/gsi-upm/sitc	2021-02-27 20:12:26 +01:00
cif2cif	2f7cbe9e45	Updated util_knn.py to new version of scikit	2021-02-27 20:11:17 +01:00
J. Fernando Sánchez	b43125ca59	LOD: minor changes	2021-02-22 17:32:31 +01:00
cif2cif	5144b7f228	Added intro RDF and tutorial	2021-02-22 12:55:40 +01:00
cif2cif	8b6d6de169	added tutorial SPARQL	2021-02-18 18:10:59 +01:00
Carlos A. Iglesias	7271b5e632	Update README.md Added clone comment	2021-02-09 19:54:56 +01:00
Carlos A. Iglesias	bd99321d6b	Update README.md Fixed a typo	2021-02-09 19:53:55 +01:00
Carlos A. Iglesias	91b8f66056	Update 1_1_Notebooks.ipynb Changes URL for Anacoda	2021-02-09 19:53:14 +01:00
Carlos A. Iglesias	242a0a9252	Update 4_7_Exercises.ipynb	2020-04-29 18:46:31 +02:00
Carlos A. Iglesias	d8d25c4dc3	Update 4_7_Exercises.ipynb	2020-04-29 18:11:50 +02:00
Carlos A. Iglesias	4979fe6877	Update 4_7_Exercises.ipynb	2020-04-29 18:10:10 +02:00
Carlos A. Iglesias	c5e0f146c4	Update 4_7_Exercises.ipynb	2020-04-29 18:06:24 +02:00
Carlos A. Iglesias	167475029e	Updated exercise 1 since the code of the previous link was outdated	2020-04-29 18:04:18 +02:00
Carlos A. Iglesias	da79a18bfc	Fixed broken link	2020-04-23 23:25:09 +02:00
Carlos A. Iglesias	47761c11aa	errata en algoritmos	2020-03-05 17:19:56 +01:00
J. Fernando Sánchez	fd5aa4a1fd	fix typo	2020-02-20 17:38:02 +01:00
J. Fernando Sánchez	396a7b17ca	update RDF example	2020-02-20 17:36:07 +01:00
J. Fernando Sánchez	2248188219	Updated URL rdf and LOD	2020-02-20 11:28:55 +01:00
Carlos A. Iglesias	21dc5ec3de	Update 1__10_Modules_Packages.ipynb	2020-02-13 18:10:08 +01:00
Carlos A. Iglesias	db99033727	Update 1__10_Modules_Packages.ipynb	2020-02-13 18:08:25 +01:00
Carlos A. Iglesias	5459e801d5	Update 1_7_Variables.ipynb	2020-02-13 17:56:36 +01:00
Carlos A. Iglesias	75f08ea170	Merge pull request #5 from gsi-upm/dveni-patch-2 Update 4_1_Lexical_Processing.ipynb	2019-11-27 10:19:12 +01:00
Dani Vera	19ea5dff09	Update 4_1_Lexical_Processing.ipynb	2019-11-26 15:14:40 +01:00
Carlos A. Iglesias	e70689072f	Merge pull request #4 from gsi-upm/dveni-patch-1 Update 3_3_Data_Munging_with_Pandas.ipynb	2019-09-19 10:46:19 +02:00