Merge 23dfc8663f into 3363c953f4

2025-01-07 03:31:28 +00:00 · 2023-05-19 23:21:11 +02:00
79 changed files with 62 additions and 26805 deletions
--- a/README.md
+++ b/README.md
@ -1,7 +1,7 @@
 # sitc
 Exercises for Intelligent Systems Course at Universidad Politécnica de Madrid, Telecommunication Engineering School. This material is used in the subjects
- CDAW (Ciencia de datos y aprendizaje en automático en la web de datos) - Master Universitario de Ingeniería de Telecomunicación (MUIT)
- ABID (Analítica de Big Data) - Master Universitario en Ingeniera de Redes y Servicios Telemáticos)
+- SITC (Sistemas Inteligentes y Tecnologías del Conocimiento) - Master Universitario de Ingeniería de Telecomunicación (MUIT)
+- TIAD (Tecnologías Inteligentes de Análisis de Datos) - Master Universitario en Ingeniería de Redes y Servicios Telemáticos

 For following this course:
 - Follow the instructions to install the environment: https://github.com/gsi-upm/sitc/blob/master/python/1_1_Notebooks.ipynb (Just install 'conda')
@ -9,13 +9,11 @@ For following this course:
 - Run in a terminal in the folder sitc: jupyter notebook (and enjoy)

 Topics
-* Python: a quick introduction to Python
+* Python: quick introduction to Python
 * ML-1: introduction to machine learning with scikit-learn
 * ML-2: introduction to machine learning with pandas and scikit-learn
-* ML-21: preprocessing and visualizatoin
 * ML-3: introduction to machine learning. Neural Computing
 * ML-4: introduction to Evolutionary Computing
 * ML-5: introduction to Reinforcement Learning
 * NLP: introduction to NLP
 * LOD: Linked Open Data, exercises and example code
-* SNA: Social Network Analysis
--- a/images/.p
+++ b/images/.p
@ -1 +0,0 @@
-
--- a/images/EscUpmPolit_p.gif
+++ b/images/EscUpmPolit_p.gif
--- a/images/cart.png
+++ b/images/cart.png
--- a/images/data-chart-type.png
+++ b/images/data-chart-type.png
--- a/images/frozenlake-problem.png
+++ b/images/frozenlake-problem.png
--- a/images/frozenlake-world.png
+++ b/images/frozenlake-world.png
--- a/images/gym-maze.gif
+++ b/images/gym-maze.gif
--- a/images/iris-dataset.jpg
+++ b/images/iris-dataset.jpg
--- a/images/machine-learning-process.jpg
+++ b/images/machine-learning-process.jpg
--- a/images/multilayerperceptron_network.png
+++ b/images/multilayerperceptron_network.png
--- a/images/plot_ML_flow_chart_1.png
+++ b/images/plot_ML_flow_chart_1.png
--- a/images/plot_ML_flow_chart_2.png
+++ b/images/plot_ML_flow_chart_2.png
--- a/images/plot_ML_flow_chart_3.png
+++ b/images/plot_ML_flow_chart_3.png
--- a/images/qlearning-algo.png
+++ b/images/qlearning-algo.png
--- a/images/recording.gif
+++ b/images/recording.gif
--- a/images/titanic.jpg
+++ b/images/titanic.jpg
--- a/ml1/2_0_0_Intro_ML.ipynb
+++ b/ml1/2_0_0_Intro_ML.ipynb
@ -71,6 +71,7 @@
   "source": [
    "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n",
+    "* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
   ]
  },
--- a/ml1/2_0_1_Objectives.ipynb
+++ b/ml1/2_0_1_Objectives.ipynb
@ -63,7 +63,9 @@
   "metadata": {},
   "source": [
    "* [Scikit-learn web page](http://scikit-learn.org/stable/)\n",
-    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n"
+    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n",
+    "* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019."
   ]
  },
  {
--- a/ml1/2_3_0_Visualisation.ipynb
+++ b/ml1/2_3_0_Visualisation.ipynb
@ -228,6 +228,7 @@
   "source": [
    "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
    "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
+    "* [Mastering Pandas](https://learning.oreilly.com/library/view/mastering-pandas/9781789343236/), Femi Anthony, Packt Publishing, 2015.\n",
    "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
    "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
    "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n",
--- a/ml1/2_3_1_Advanced_Visualisation.ipynb
+++ b/ml1/2_3_1_Advanced_Visualisation.ipynb
@ -408,6 +408,7 @@
   "source": [
    "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
    "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
+    "* [Mastering Pandas](https://learning.oreilly.com/library/view/mastering-pandas/9781789343236/), Femi Anthony, Packt Publishing, 2015.\n",
    "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
    "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
    "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)\n",
--- a/ml1/2_4_Preprocessing.ipynb
+++ b/ml1/2_4_Preprocessing.ipynb
@ -163,6 +163,7 @@
   "source": [
    "* [Feature selection](http://scikit-learn.org/stable/modules/feature_selection.html)\n",
    "* [Classification probability](http://scikit-learn.org/stable/auto_examples/classification/plot_classification_probability.html)\n",
+    "* [Mastering Pandas](https://learning.oreilly.com/library/view/mastering-pandas/9781789343236/), Femi Anthony, Packt Publishing, 2015.\n",
    "* [Matplotlib web page](http://matplotlib.org/index.html)\n",
    "* [Using matlibplot in IPython](http://ipython.readthedocs.org/en/stable/interactive/plotting.html)\n",
    "* [Seaborn Tutorial](https://stanford.edu/~mwaskom/software/seaborn/tutorial.html)"
--- a/ml1/2_5_0_Machine_Learning.ipynb
+++ b/ml1/2_5_0_Machine_Learning.ipynb
@ -154,7 +154,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "* [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/index.html)\n",
+    "* [General concepts of machine learning with scikit-learn](https://ogrisel.github.io/scikit-learn.org/sklearn-tutorial/auto_examples/tutorial/plot_ML_flow_chart.html)\n",
    "* [A Tour of Machine Learning Algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/)"
   ]
  },
--- a/ml1/2_5_1_kNN_Model.ipynb
+++ b/ml1/2_5_1_kNN_Model.ipynb
@ -379,7 +379,8 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "* [KNeighborsClassifier API scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html)\n"
+    "* [KNeighborsClassifier API scikit-learn](http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html)\n",
+    "* [Learning scikit-learn: Machine Learning in Python](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n"
   ]
  },
  {
--- a/ml1/2_5_2_Decision_Tree_Model.ipynb
+++ b/ml1/2_5_2_Decision_Tree_Model.ipynb
@ -509,6 +509,8 @@
   "metadata": {},
   "source": [
    "* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
+    "* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019.\n",
    "* [Parameter estimation using grid search with cross-validation](https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)\n",
    "* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
   ]
--- a/ml1/2_6_Model_Tuning.ipynb
+++ b/ml1/2_6_Model_Tuning.ipynb
@ -518,6 +518,8 @@
   "metadata": {},
   "source": [
    "* [Plot the decision surface of a decision tree on the iris dataset](https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html)\n",
+    "* [scikit-learn : Machine Learning Simplified](https://learning.oreilly.com/library/view/scikit-learn-machine/9781788833479/), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2017.\n",
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka, Packt Publishing, 2019.\n",
    "* [Hyperparameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
    "* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
   ]
--- a/ml1/util_ds.py
+++ b/ml1/util_ds.py
@ -47,7 +47,7 @@ def get_code(tree, feature_names, target_names,

    recurse(left, right, threshold, features, 0, 0)

-# Taken from https://scikit-learn.org/stable/auto_examples/tree/plot_iris_dtc.html
+# Taken from http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#example-tree-plot-iris-py
 import numpy as np
 import matplotlib.pyplot as plt

@ -114,4 +114,4 @@ def plot_tree_iris():

    plt.suptitle("Decision surface of a decision tree using paired features")
    plt.legend()
-    plt.show()  
+    plt.show()  
--- a/ml2/3_0_0_Intro_ML_2.ipynb
+++ b/ml2/3_0_0_Intro_ML_2.ipynb
@ -74,7 +74,9 @@
   "metadata": {},
   "source": [
    "* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
-    "* [Scikit-learn videos and notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Marham\n"
+    "* [Scikit-learn videos and notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n",
+    "* [Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits](https://learning.oreilly.com/library/view/hands-on-machine-learning/9781838826048/), Tarek Amr, Packt Publishing, 2020.\n",
+    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka and Vahid Mirjalili, Packt Publishing, 2019."
   ]
  },
  {
--- a/ml2/3_1_Read_Data.ipynb
+++ b/ml2/3_1_Read_Data.ipynb
@ -50,30 +50,30 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "In this session, we will work with the Titanic dataset. This dataset is provided by [Kaggle](http://www.kaggle.com). Kaggle is a crowdsourcing platform that organizes competitions where researchers and companies post their data and users compete to obtain the best models.\n",
+    "In this session we will work with the Titanic dataset. This dataset is provided by [Kaggle](http://www.kaggle.com). Kaggle is a crowdsourcing platform that organizes competitions where researchers and companies post their data and users compete to obtain the best models.\n",
    "\n",
    "![Titanic](images/titanic.jpg)\n",
    "\n",
    "\n",
-    "The main objective is to predict which passengers survived the sinking of the Titanic.\n",
+    "The main objective is predicting which passengers survived the sinking of the Titanic.\n",
    "\n",
    "The data is available [here](https://www.kaggle.com/c/titanic/data). There are two files, one for training ([train.csv](files/data-titanic/train.csv)) and another file for testing [test.csv](files/data-titanic/test.csv). A local copy has been included in this notebook under the folder *data-titanic*.\n",
    "\n",
    "\n",
    "Here follows a description of the variables.\n",
    "\n",
-    "|  Variable  |          Description            |       Values    |\n",
-    "|------------|---------------------------------|-----------------|\n",
-    "|  survival  |           Survival              |(0 = No; 1 = Yes)|\n",
-    "|    Pclass  |             Name                |                 |\n",
-    "|     Sex    |              Sex                |   male, female  |\n",
-    "|     Age    |              Age                |                 |\n",
-    "|    SibSp   |Number of Siblings/Spouses Aboard|                 |\n",
-    "|     Parch  |Number of Parents/Children Aboard|                 |\n",
-    "|     Ticket |        Ticket Number            |                 |\n",
-    "|     Fare   |       Passenger Fare            |                 |\n",
-    "|     Cabin  |            Cabin                |                 |\n",
-    "|   Embarked |       Port of Embarkation       | (C = Cherbourg; Q = Queenstown; S = Southampton)|\n",
+    "| Variable | Description | Values |\n",
+    "|----------|-------------|--------|\n",
+    "| survival | Survival | (0 = No; 1 = Yes) |\n",
+    "| Pclass | Passenger Class | |\n",
+    "| Sex | Sex | male, female |\n",
+    "| Age | Age | |\n",
+    "| SibSp | Number of Siblings/Spouses Aboard | |\n",
+    "| Parch | Number of Parents/Children Aboard | |\n",
+    "| Ticket | Ticket Number | |\n",
+    "| Fare | Passenger Fare | |\n",
+    "| Cabin | Cabin | |\n",
+    "| Embarked | Port of Embarkation | (C = Cherbourg; Q = Queenstown; S = Southampton) |\n",
    "\n",
    "\n",
    "The definitions used for SibSp and Parch are:\n",
@ -213,7 +213,8 @@
    "* [Pandas API input-output](http://pandas.pydata.org/pandas-docs/stable/api.html#input-output)\n",
    "* [Pandas API - pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html)\n",
    "* [DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html)\n",
-    "* [An introduction to NumPy and Scipy](https://sites.engineering.ucsb.edu/~shell/che210d/numpy.pdf)\n"
+    "* [An introduction to NumPy and Scipy](https://sites.engineering.ucsb.edu/~shell/che210d/numpy.pdf)\n",
+    "* [NumPy tutorial](https://numpy.org/doc/stable/)"
   ]
  },
  {
--- a/ml2/3_2_Pandas.ipynb
+++ b/ml2/3_2_Pandas.ipynb
@ -433,6 +433,7 @@
   "metadata": {},
   "source": [
    "* [Pandas](http://pandas.pydata.org/)\n",
+    "* [Learning Pandas, Michael Heydt, Packt Publishing, 2017](https://learning.oreilly.com/library/view/learning-pandas/9781787123137/)\n",
    "* [Pandas. Introduction to Data Structures](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html)\n",
    "* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
    "* [Boolean Operators in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-operators)"
--- a/ml2/3_3_Data_Munging_with_Pandas.ipynb
+++ b/ml2/3_3_Data_Munging_with_Pandas.ipynb
@ -373,8 +373,8 @@
   "source": [
    "#Mean age of  passengers per Passenger class\n",
    "\n",
-    "#First we calculate the mean for the numeric columns\n",
-    "df.select_dtypes(np.number).groupby('Pclass').mean()"
+    "#First we calculate the mean of the numeric columns\n",
+    "df.groupby('Pclass').mean(numeric_only=True)"
   ]
  },
  {
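The cell in this hunk averages every column per passenger class. In pandas 2.x that aggregation has to be limited to numeric columns, because text columns such as `Name` have no mean. A minimal sketch on a made-up frame (hypothetical data, not the Titanic file) showing two equivalent ways to do that:

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic DataFrame
df = pd.DataFrame({
    "Pclass": [1, 1, 3, 3],
    "Age": [38.0, 35.0, 22.0, 26.0],
    "Sex": ["female", "female", "male", "female"],
})

# Option 1: keep only numeric columns before grouping
mean_by_class = df.select_dtypes("number").groupby("Pclass").mean()

# Option 2: let the aggregation skip non-numeric columns
mean_by_class_alt = df.groupby("Pclass").mean(numeric_only=True)

print(mean_by_class)
```

Both options drop the `Sex` column from the result; which one reads better is a matter of taste.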
--- a/ml2/3_4_Visualisation_Pandas.ipynb
+++ b/ml2/3_4_Visualisation_Pandas.ipynb
@ -220,7 +220,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# Analise distribution\n",
+    "# Analyse distribution\n",
    "df.hist(figsize=(10,10))\n",
    "plt.show()"
   ]
@ -233,7 +233,7 @@
   "source": [
    "# We can see the pairwise correlation between variables. A value near 0 means low correlation\n",
    "# while a value  near -1 or 1 indicates strong correlation.\n",
-    "df.corr(numeric_only = True)"
+    "df.corr(numeric_only=True)"
   ]
  },
  {
@ -249,10 +249,11 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "# General description of relationship between variables uwing Seaborn PairGrid\n",
+    "# General description of relationship between variables using Seaborn PairGrid\n",
    "# We use df_clean, since the null values of df would gives us an error, you can check it.\n",
    "g = sns.PairGrid(df_clean, hue=\"Survived\")\n",
-    "g.map(sns.scatterplot)\n",
+    "g.map_diag(plt.hist)\n",
+    "g.map_offdiag(plt.scatter)\n",
    "g.add_legend()"
   ]
  },
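Pairwise correlation is only defined for numeric columns. In pandas 1.5 and later, `DataFrame.corr` accepts `numeric_only=True` to skip text columns, and in pandas 2.x omitting it on a frame with text columns raises an error. A small sketch on hypothetical data (the column values are invented for illustration):

```python
import pandas as pd

# Hypothetical toy data; "Sex" is a text column and cannot be correlated
df = pd.DataFrame({
    "Fare": [10.0, 20.0, 30.0, 40.0],
    "Pclass": [3, 3, 2, 1],
    "Sex": ["male", "female", "female", "male"],
})

# Pearson correlation between the numeric columns only
corr = df.corr(numeric_only=True)
print(corr)
```

In this toy frame the fare rises as the class number falls, so the off-diagonal value is negative.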
--- a/ml2/3_7_SVM.ipynb
+++ b/ml2/3_7_SVM.ipynb
@ -351,10 +351,10 @@
    "We can obtain more information from the confussion matrix and the metric F1-score.\n",
    "In a confussion matrix, we can see:\n",
    "\n",
-    "|             |**Predicted**: 0| **Predicted: 1**|\n",
-    "|-------------|----------------|-----------------|\n",
-    "|**Actual: 0**| TN             | FP              |\n",
-    "|**Actual: 1**| FN             | TP              |\n",
+    "|             | **Predicted: 0** | **Predicted: 1** |\n",
+    "|-------------|------------------|------------------|\n",
+    "|**Actual: 0**| TN | FP |\n",
+    "|**Actual: 1**| FN | TP |\n",
    "\n",
    "* **True negatives (TN)**: actual negatives that were predicted as negatives\n",
    "* **False positives (FP)**: actual negatives that were predicted as positives\n",
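The table in this hunk lays out the four cells of a binary confusion matrix. A plain-Python sketch, using made-up labels (an assumption for illustration, not data from the notebook), of how the cells and the F1-score follow from paired actual and predicted values:

```python
# Hypothetical labels for illustration only
actual = [0, 0, 1, 1, 1, 0, 1, 0]
predicted = [0, 1, 1, 0, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))

# Count each cell of the confusion matrix
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives

# F1 is the harmonic mean of precision and recall
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(tn, fp, fn, tp, f1)
```

scikit-learn provides `confusion_matrix` and `f1_score` for the same purpose; the manual version is only meant to make the definitions concrete.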
--- a/ml21/.gitkeep
+++ b/ml21/.gitkeep
@ -1 +0,0 @@
-
--- a/ml21/preprocessing/.gitkeep
+++ b/ml21/preprocessing/.gitkeep
@ -1 +0,0 @@
-
--- a/ml21/preprocessing/00_Intro_Preprocessing.ipynb
+++ b/ml21/preprocessing/00_Intro_Preprocessing.ipynb
@ -1,157 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Introduction to Preprocessing\n",
-    "In this session, we will get more insight regarding how to preprocess data.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Objectives\n",
-    "The main objectives of this session are:\n",
-    "* Understanding the need for preprocessing\n",
-    "* Understanding different preprocessing techniques\n",
-    "* Experimenting with several environments for preprocessing"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Table of Contents"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "1. [Home](00_Intro_Preprocessing.ipynb)\n",
-    "3. [Initial Check](02_Initial_Check.ipynb)\n",
-    "4. [Filter Data](03_Filter_Data.ipynb)\n",
-    "5. [Unknown values](04_Unknown_Values.ipynb)\n",
-    "6. [Duplicated values](05_Duplicated_Values.ipynb)\n",
-    "7. [Rescaling Data](06_Rescaling_Data.ipynb)\n",
-    "8. [Binarize Data](07_Binarize_Data.ipynb)\n",
-    "9. [Categorial features](08_Categorical.ipynb)\n",
-    "10. [String Data](09_String_Data.ipynb)\n",
-    "12. [Handy libraries for preprocessing](11_0_Handy.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/ml21/preprocessing/02_Initial_Check.ipynb
+++ b/ml21/preprocessing/02_Initial_Check.ipynb
@ -1,714 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Initial Check with Pandas\n",
-    "\n",
-    "We can start with a quick quality check."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "## Load and check data\n",
-    "Check which data you are loading."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>PassengerId</th>\n",
-       "      <th>Survived</th>\n",
-       "      <th>Pclass</th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>SibSp</th>\n",
-       "      <th>Parch</th>\n",
-       "      <th>Ticket</th>\n",
-       "      <th>Fare</th>\n",
-       "      <th>Cabin</th>\n",
-       "      <th>Embarked</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Braund, Mr. Owen Harris</td>\n",
-       "      <td>male</td>\n",
-       "      <td>22.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>A/5 21171</td>\n",
-       "      <td>7.2500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
-       "      <td>female</td>\n",
-       "      <td>38.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>PC 17599</td>\n",
-       "      <td>71.2833</td>\n",
-       "      <td>C85</td>\n",
-       "      <td>C</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>3</td>\n",
-       "      <td>1</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Heikkinen, Miss. Laina</td>\n",
-       "      <td>female</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>STON/O2. 3101282</td>\n",
-       "      <td>7.9250</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>4</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
-       "      <td>female</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>113803</td>\n",
-       "      <td>53.1000</td>\n",
-       "      <td>C123</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>5</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Allen, Mr. William Henry</td>\n",
-       "      <td>male</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>373450</td>\n",
-       "      <td>8.0500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>6</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Moran, Mr. James</td>\n",
-       "      <td>male</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>330877</td>\n",
-       "      <td>8.4583</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>Q</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>7</td>\n",
-       "      <td>0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>McCarthy, Mr. Timothy J</td>\n",
-       "      <td>male</td>\n",
-       "      <td>54.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>17463</td>\n",
-       "      <td>51.8625</td>\n",
-       "      <td>E46</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>7</th>\n",
-       "      <td>8</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Palsson, Master. Gosta Leonard</td>\n",
-       "      <td>male</td>\n",
-       "      <td>2.0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>1</td>\n",
-       "      <td>349909</td>\n",
-       "      <td>21.0750</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>8</th>\n",
-       "      <td>9</td>\n",
-       "      <td>1</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)</td>\n",
-       "      <td>female</td>\n",
-       "      <td>27.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "      <td>347742</td>\n",
-       "      <td>11.1333</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>9</th>\n",
-       "      <td>10</td>\n",
-       "      <td>1</td>\n",
-       "      <td>2</td>\n",
-       "      <td>Nasser, Mrs. Nicholas (Adele Achem)</td>\n",
-       "      <td>female</td>\n",
-       "      <td>14.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>237736</td>\n",
-       "      <td>30.0708</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>C</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   PassengerId  Survived  Pclass  \\\n",
-       "0            1         0       3   \n",
-       "1            2         1       1   \n",
-       "2            3         1       3   \n",
-       "3            4         1       1   \n",
-       "4            5         0       3   \n",
-       "5            6         0       3   \n",
-       "6            7         0       1   \n",
-       "7            8         0       3   \n",
-       "8            9         1       3   \n",
-       "9           10         1       2   \n",
-       "\n",
-       "                                                Name     Sex   Age  SibSp  \\\n",
-       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
-       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
-       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
-       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
-       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
-       "5                                   Moran, Mr. James    male   NaN      0   \n",
-       "6                            McCarthy, Mr. Timothy J    male  54.0      0   \n",
-       "7                     Palsson, Master. Gosta Leonard    male   2.0      3   \n",
-       "8  Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)  female  27.0      0   \n",
-       "9                Nasser, Mrs. Nicholas (Adele Achem)  female  14.0      1   \n",
-       "\n",
-       "   Parch            Ticket     Fare Cabin Embarked  \n",
-       "0      0         A/5 21171   7.2500   NaN        S  \n",
-       "1      0          PC 17599  71.2833   C85        C  \n",
-       "2      0  STON/O2. 3101282   7.9250   NaN        S  \n",
-       "3      0            113803  53.1000  C123        S  \n",
-       "4      0            373450   8.0500   NaN        S  \n",
-       "5      0            330877   8.4583   NaN        Q  \n",
-       "6      0             17463  51.8625   E46        S  \n",
-       "7      1            349909  21.0750   NaN        S  \n",
-       "8      2            347742  11.1333   NaN        S  \n",
-       "9      0            237736  30.0708   NaN        C  "
-      ]
-     },
-     "execution_count": 2,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "import pandas as pd\n",
-    "df = pd.read_csv('https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv')\n",
-    "df.head(10)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Check number of columns and rows"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "(891, 12)"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df.shape"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "## Check names and types of columns\n",
-    "Check each column's data type, for example whether dates were parsed as dates or left as strings."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',\n",
-      "       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],\n",
-      "      dtype='object')\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "PassengerId      int64\n",
-       "Survived         int64\n",
-       "Pclass           int64\n",
-       "Name            object\n",
-       "Sex             object\n",
-       "Age            float64\n",
-       "SibSp            int64\n",
-       "Parch            int64\n",
-       "Ticket          object\n",
-       "Fare           float64\n",
-       "Cabin           object\n",
-       "Embarked        object\n",
-       "dtype: object"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Get column names\n",
-    "print(df.columns)\n",
-    "# Get column data types\n",
-    "df.dtypes"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "## Check whether each column's values are unique"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "PassengerId is unique: True\n",
-      "Survived is unique: False\n",
-      "Pclass is unique: False\n",
-      "Name is unique: True\n",
-      "Sex is unique: False\n",
-      "Age is unique: False\n",
-      "SibSp is unique: False\n",
-      "Parch is unique: False\n",
-      "Ticket is unique: False\n",
-      "Fare is unique: False\n",
-      "Cabin is unique: False\n",
-      "Embarked is unique: False\n"
-     ]
-    }
-   ],
-   "source": [
-    "for col in df.columns:\n",
-    "    print('{} is unique: {}'.format(col, df[col].is_unique))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Check if the dataframe has an index\n",
-    "We will need it to do joins or merges."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "RangeIndex(start=0, stop=891, step=1)"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Inspect the index. A DataFrame always has one; a RangeIndex means no column has been set as the index\n",
-    "df.index"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,\n",
-       "        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,\n",
-       "        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,\n",
-       "        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,\n",
-       "        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,\n",
-       "        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,\n",
-       "        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,\n",
-       "        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,\n",
-       "       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,\n",
-       "       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,\n",
-       "       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,\n",
-       "       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,\n",
-       "       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,\n",
-       "       169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181,\n",
-       "       182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194,\n",
-       "       195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207,\n",
-       "       208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220,\n",
-       "       221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233,\n",
-       "       234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246,\n",
-       "       247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259,\n",
-       "       260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272,\n",
-       "       273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285,\n",
-       "       286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298,\n",
-       "       299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311,\n",
-       "       312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324,\n",
-       "       325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337,\n",
-       "       338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350,\n",
-       "       351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363,\n",
-       "       364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376,\n",
-       "       377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389,\n",
-       "       390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402,\n",
-       "       403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415,\n",
-       "       416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428,\n",
-       "       429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441,\n",
-       "       442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454,\n",
-       "       455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467,\n",
-       "       468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480,\n",
-       "       481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493,\n",
-       "       494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506,\n",
-       "       507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519,\n",
-       "       520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532,\n",
-       "       533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545,\n",
-       "       546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558,\n",
-       "       559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571,\n",
-       "       572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584,\n",
-       "       585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597,\n",
-       "       598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610,\n",
-       "       611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623,\n",
-       "       624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636,\n",
-       "       637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649,\n",
-       "       650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662,\n",
-       "       663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675,\n",
-       "       676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688,\n",
-       "       689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701,\n",
-       "       702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714,\n",
-       "       715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727,\n",
-       "       728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740,\n",
-       "       741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753,\n",
-       "       754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766,\n",
-       "       767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779,\n",
-       "       780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792,\n",
-       "       793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805,\n",
-       "       806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818,\n",
-       "       819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831,\n",
-       "       832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844,\n",
-       "       845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857,\n",
-       "       858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870,\n",
-       "       871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883,\n",
-       "       884, 885, 886, 887, 888, 889, 890])"
-      ]
-     },
-     "execution_count": 5,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Check the index values\n",
-    "df.index.values"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# To use a column as the index instead of the default RangeIndex\n",
-    "df.set_index('column_name_to_use', inplace=True)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "PassengerId      0\n",
-       "Survived         0\n",
-       "Pclass           0\n",
-       "Name             0\n",
-       "Sex              0\n",
-       "Age            177\n",
-       "SibSp            0\n",
-       "Parch            0\n",
-       "Ticket           0\n",
-       "Fare             0\n",
-       "Cabin          687\n",
-       "Embarked         2\n",
-       "dtype: int64"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Count missing values per column\n",
-    "df.isnull().sum()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
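
The inspection steps in the notebook above (shape, column types, uniqueness, missing values) can be gathered into one summary table. This is only a minimal sketch: it assumes a small made-up DataFrame in place of the Titanic CSV, and the `summary` name and toy values are illustrative, not part of the original notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic CSV used in the notebook.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, None, 26.0, 35.0],
})

# One row per column: dtype, whether all values are unique, and missing count.
summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "is_unique": [df[col].is_unique for col in df.columns],
    "n_missing": df.isnull().sum(),
})

print(df.shape)   # (rows, columns)
print(summary)
```

A column that is unique and has no missing values (here `PassengerId`) is a reasonable candidate for `set_index` before joins or merges.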
--- a/ml21/preprocessing/03_Filter_Data.ipynb
+++ b/ml21/preprocessing/03_Filter_Data.ipynb
@ -1,150 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Filter Data\n",
-    "\n",
-    "Select the columns you want and delete the others."
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "source": [
-    "# Build the list of columns to drop (here selected by position)\n",
-    "columns_to_drop = [df.columns[i] for i in [1, 3, 5]]\n",
-    "# Drop the unwanted columns in place\n",
-    "df.drop(columns=columns_to_drop, inplace=True)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.13"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
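
The column filtering described in the notebook above can be tried on a toy frame. A minimal sketch, assuming a made-up four-column DataFrame (the column names `a` to `d` are hypothetical); it shows dropping by position and the equivalent approach of selecting the columns to keep.

```python
import pandas as pd

# Hypothetical toy frame; the notebook itself does not define one.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# Pick the columns to drop by position, as in the notebook's raw cell.
columns_to_drop = [df.columns[i] for i in [1, 3]]

# Without inplace=True, drop() returns a new DataFrame and leaves df untouched.
filtered = df.drop(columns=columns_to_drop)

# Equivalent: select the columns to keep instead of naming those to drop.
kept = df[["a", "c"]]

print(filtered.columns.tolist())
```

Selecting the columns to keep is often easier to read when only a few columns survive; dropping is more convenient when only a few are removed.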
--- a/ml21/preprocessing/04_Unknown_Values.ipynb
+++ b/ml21/preprocessing/04_Unknown_Values.ipynb
@ -1,591 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Unknown values\n",
-    "\n",
-    "Two possible approaches are to **remove** the affected rows or to **fill** the missing values. Which one is appropriate depends on the case."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "import pandas as pd\n",
-    "import numpy as np"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Filling NaN values\n",
-    "If we need to fill errors or blanks, we can use the method **fillna()**; to discard them instead, we can use **dropna()**.\n",
-    "\n",
-    "* For **string** fields, we can fill NaN with **' '**.\n",
-    "\n",
-    "* For **numbers**, we can fill with the **mean** or **median** value. \n"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Fill NaN with ' '\n",
-    "df['col'] = df['col'].fillna(' ')\n",
-    "# Fill NaN with 99\n",
-    "df['col'] = df['col'].fillna(99)\n",
-    "# Fill NaN with the mean of the column\n",
-    "df['col'] = df['col'].fillna(df['col'].mean())"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Propagate non-null values forward or backward\n",
-    "You can also **propagate** non-null values with these methods:\n",
-    "\n",
-    "* **ffill**: Fill gaps by propagating the last valid observation forward.\n",
-    "* **bfill**: Fill gaps by propagating the next valid observation backward.\n",
-    "* **interpolate**: Fill NaN values using interpolation.\n",
-    "\n",
-    "For example, **ffill** fills each NaN with the previous non-NaN value.\n",
-    "\n",
-    "Use **limit=1** to fill at most one consecutive NaN per gap, or omit *limit* to fill them all. You can also pass inplace=True to fill in place."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 17,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "df = pd.DataFrame(data={'col1':[np.nan, np.nan, 2,3,4, np.nan, np.nan]})"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "0   NaN\n",
-       "1   NaN\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0\n",
-       "5   NaN\n",
-       "6   NaN"
-      ]
-     },
-     "execution_count": 11,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We forward-fill the value 4.0 into only the next NaN (limit=1)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "0   NaN\n",
-       "1   NaN\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0\n",
-       "5   4.0\n",
-       "6   NaN"
-      ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df.ffill(limit=1)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "df.ffill()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "We can also backfill with **bfill**. Since we do not pass *limit*, every NaN that has a later valid value is filled. The trailing NaNs remain because no valid value follows them."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "0   2.0\n",
-       "1   2.0\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0\n",
-       "5   NaN\n",
-       "6   NaN"
-      ]
-     },
-     "execution_count": 13,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df.bfill()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Removing NaN values\n",
-    "We can remove them by row or column (use inplace=True if you want to modify the DataFrame)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 26,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0"
-      ]
-     },
-     "execution_count": 26,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Drop rows that contain any NaN\n",
-    "df1 = df.dropna()\n",
-    "# Drop columns that contain any NaN (axis=1 -> columns, axis=0 -> rows)\n",
-    "df2 = df.dropna(axis=1)\n",
-    "# Keep only columns with at least 90% non-NaN values; drop the rest\n",
-    "df3 = df.dropna(thresh=int(df.shape[0] * .9), axis=1)\n",
-    "df1"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.13"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
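
The fill and drop strategies from the notebook above can be compared side by side on its `col1` example. A minimal sketch: the `gap` series and the `wide` frame (with its `mostly_full` and `sparse` columns) are hypothetical additions used to illustrate `interpolate` and `thresh`, which the notebook describes but does not run on data with a visible effect.

```python
import numpy as np
import pandas as pd

# Same single-column example as in the notebook.
df = pd.DataFrame({"col1": [np.nan, np.nan, 2, 3, 4, np.nan, np.nan]})

# Fill every NaN with the column mean (mean of 2, 3, 4 is 3.0).
mean_filled = df["col1"].fillna(df["col1"].mean())

# Forward fill at most one NaN after each valid value.
ffilled = df["col1"].ffill(limit=1)

# Backward fill: leading NaNs take the next valid value, trailing NaNs stay.
bfilled = df["col1"].bfill()

# Linear interpolation across an interior gap (hypothetical extra series).
gap = pd.Series([1.0, np.nan, 3.0])
interpolated = gap.interpolate()

# thresh keeps columns with at least that many non-NaN values.
wide = pd.DataFrame({
    "mostly_full": [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],  # 9 of 10 present
    "sparse": [1] + [np.nan] * 9,                        # 1 of 10 present
})
kept = wide.dropna(thresh=int(wide.shape[0] * 0.9), axis=1)

print(mean_filled.tolist())
print(kept.columns.tolist())
```

Note that `thresh` states how many non-NaN values a column needs in order to be kept, so a higher threshold removes more columns.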
--- a/ml21/preprocessing/05_Duplicated_Values.ipynb
+++ b/ml21/preprocessing/05_Duplicated_Values.ipynb
--- a/ml21/preprocessing/06_Rescaling_Data.ipynb
+++ b/ml21/preprocessing/06_Rescaling_Data.ipynb
--- a/ml21/preprocessing/07_Binarize_Data.ipynb
+++ b/ml21/preprocessing/07_Binarize_Data.ipynb
@ -1,198 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Binarize Data\n",
-    "* We can transform our data using a binary threshold. All values above the threshold are marked 1, and all values equal to or below are marked 0.\n",
-    "* This is called binarizing your data or thresholding your data. \n",
-    "\n",
-    "* It can be helpful when you have probabilities that you want to make crisp values."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Binarize Data with Scikit-Learn\n",
-    "We can create new binary attributes in Python using Scikit-learn with the Binarizer class."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "from sklearn.preprocessing import Binarizer\n",
-    "\n",
-    "X = [[ 1., -1.,  2.],\n",
-    "     [ 2.,  0.,  0.],\n",
-    "     [ 0.,  1.1, -1.]]"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "transformer = Binarizer(threshold=1.0).fit(X) # threshold 1.0"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "array([[0., 0., 1.],\n",
-       "       [1., 0., 0.],\n",
-       "       [0., 1., 0.]])"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "transformer.transform(X)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Binarizer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Binarizer.html), Scikit Learn"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.13"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
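Note on 07_Binarize_Data.ipynb: the thresholding rule stated in the removed notebook (values strictly above the threshold become 1, all others 0) can be checked with a short standalone sketch. It reuses the notebook's matrix and threshold; the names `binarized` and `manual` and the NumPy comparison are illustrative additions, not part of the original cells.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Same data as the notebook's example matrix.
X = np.array([[1.0, -1.0, 2.0],
              [2.0, 0.0, 0.0],
              [0.0, 1.1, -1.0]])

# Binarizer is stateless, so fit() learns nothing from X;
# fit_transform() is used here only for brevity.
binarized = Binarizer(threshold=1.0).fit_transform(X)

# Equivalent NumPy expression (illustrative): strictly greater than
# the threshold maps to 1.0, everything else to 0.0.
manual = (X > 1.0).astype(float)

print(binarized)
```

Because the comparison is strict, the entry equal to the threshold (1.0 in the first row) maps to 0, which agrees with the output recorded in the notebook.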
--- a/ml21/preprocessing/08_Categorical.ipynb
+++ b/ml21/preprocessing/08_Categorical.ipynb
@ -1,812 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Categorical Data\n",
-    "\n",
-    "For many ML algorithms, we need to transform categorical data into numbers.\n",
-    "\n",
-    "For example:\n",
-    "* **'Sex'** with values *'M'*, *'F'*, *'Unknown'*. \n",
-    "* **'Position'** with values *'phD'*, *'Professor'*, *'TA'*, *'graduate'*.\n",
-    "* **'Temperature'** with values *'low'*, *'medium'*, *'high'*.\n",
-    "\n",
-    "There are two main approaches:\n",
-    "* Integer encoding\n",
-    "* One hot encoding"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Integer Encoding\n",
-    "We assign a number to every value:\n",
-    "\n",
-    "['M', 'F', 'Unknown', 'M'] --> [0, 1, 2, 0]\n",
-    "\n",
-    "['phD', 'Professor', 'TA','graduate', 'phD'] --> [0, 1, 2, 3, 0]\n",
-    "\n",
-    "['low', 'medium', 'high', 'low'] --> [0, 1, 2, 0]\n",
-    "\n",
-    "The main problem with this representation is that integers have a natural order, so some ML algorithms may infer an ordering among the categories that does not exist.\n",
-    "\n",
-    "In our examples, this representation can be suitable for **temperature**, but not for the other two."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## One Hot Encoding\n",
-    "A binary column is created for each value of the categorical variable."
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "source": [
-    "Sex                                               M  F U\n",
-    "-----                                            ---------\n",
-    "M                                                 1  0 0\n",
-    "F                     is transformed into         0  1 0\n",
-    "Unknown                                           0  0 1\n",
-    "M                                                 1  0 0 "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Transforming categorical data with Pandas and Scikit-Learn\n",
-    "\n",
-    "We can use:\n",
-    "* **get_dummies()** from Pandas (one hot encoding)\n",
-    "* **LabelEncoder** (integer encoding) and **OneHotEncoder** (one hot encoding) from Scikit-Learn. \n",
-    "\n",
-    "We are going to learn both approaches."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### One Hot Encoding\n",
-    "We can use Pandas (*get_dummies*) or Scikit-Learn (*OneHotEncoder*)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "     Name  Age     Sex   Position\n",
-      "0  Marius   18    Male   graduate\n",
-      "1   Maria   19  Female  professor\n",
-      "2    John   20    Male         TA\n",
-      "3   Carla   30  Female        phD\n"
-     ]
-    }
-   ],
-   "source": [
-    "import pandas as pd\n",
-    "\n",
-    "data = {\"Name\": [\"Marius\", \"Maria\", \"John\", \"Carla\"],\n",
-    "        \"Age\": [18, 19, 20, 30],\n",
-    "\t\t\"Sex\": [\"Male\", \"Female\", \"Male\", \"Female\"],\n",
-    "        \"Position\": [\"graduate\", \"professor\", \"TA\", \"phD\"]\n",
-    "       }\n",
-    "df = pd.DataFrame(data)\n",
-    "print(df)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 18,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>sex_encoded</th>\n",
-       "      <th>position_encoded</th>\n",
-       "      <th>Sex_Female</th>\n",
-       "      <th>Sex_Male</th>\n",
-       "      <th>Position_TA</th>\n",
-       "      <th>Position_graduate</th>\n",
-       "      <th>Position_phD</th>\n",
-       "      <th>Position_professor</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>Marius</td>\n",
-       "      <td>18</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>Maria</td>\n",
-       "      <td>19</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>John</td>\n",
-       "      <td>20</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>Carla</td>\n",
-       "      <td>30</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     Name  Age  sex_encoded  position_encoded  Sex_Female  Sex_Male  \\\n",
-       "0  Marius   18            1                 1       False      True   \n",
-       "1   Maria   19            0                 3        True     False   \n",
-       "2    John   20            1                 0       False      True   \n",
-       "3   Carla   30            0                 2        True     False   \n",
-       "\n",
-       "   Position_TA  Position_graduate  Position_phD  Position_professor  \n",
-       "0        False               True         False               False  \n",
-       "1        False              False         False                True  \n",
-       "2         True              False         False               False  \n",
-       "3        False              False          True               False  "
-      ]
-     },
-     "execution_count": 18,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df_onehot = pd.get_dummies(df, columns=['Sex', 'Position'])\n",
-    "df_onehot"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We can also use *OneHotEncoder* from Scikit-Learn."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 27,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Sex_Female</th>\n",
-       "      <th>Sex_Male</th>\n",
-       "      <th>Position_TA</th>\n",
-       "      <th>Position_graduate</th>\n",
-       "      <th>Position_phD</th>\n",
-       "      <th>Position_professor</th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>sex_encoded</th>\n",
-       "      <th>position_encoded</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>0.0</td>\n",
-       "      <td>1.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>1.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>Marius</td>\n",
-       "      <td>18</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>1.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>1.0</td>\n",
-       "      <td>Maria</td>\n",
-       "      <td>19</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>0.0</td>\n",
-       "      <td>1.0</td>\n",
-       "      <td>1.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>John</td>\n",
-       "      <td>20</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>1.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>1.0</td>\n",
-       "      <td>0.0</td>\n",
-       "      <td>Carla</td>\n",
-       "      <td>30</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "  Sex_Female Sex_Male Position_TA Position_graduate Position_phD  \\\n",
-       "0        0.0      1.0         0.0               1.0          0.0   \n",
-       "1        1.0      0.0         0.0               0.0          0.0   \n",
-       "2        0.0      1.0         1.0               0.0          0.0   \n",
-       "3        1.0      0.0         0.0               0.0          1.0   \n",
-       "\n",
-       "  Position_professor    Name Age sex_encoded position_encoded  \n",
-       "0                0.0  Marius  18           1                1  \n",
-       "1                1.0   Maria  19           0                3  \n",
-       "2                0.0    John  20           1                0  \n",
-       "3                0.0   Carla  30           0                2  "
-      ]
-     },
-     "execution_count": 27,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from sklearn.preprocessing import OneHotEncoder\n",
-    "from sklearn.compose import make_column_transformer\n",
-    "\n",
-    "df_onehotencoder = df\n",
-    "# create OneHotEncoder object\n",
-    "encoder = OneHotEncoder()\n",
-    "\n",
-    "# Transformer for several columns\n",
-    "transformer = make_column_transformer(\n",
-    "  (OneHotEncoder(), ['Sex', 'Position']),\n",
-    "  remainder='passthrough',\n",
-    "  verbose_feature_names_out=False)\n",
-    "\n",
-    "# transform\n",
-    "transformed = transformer.fit_transform(df_onehotencoder)\n",
-    "\n",
-    "df_onehotencoder = pd.DataFrame(\n",
-    "  transformed,\n",
-    "  columns=transformer.get_feature_names_out())\n",
-    "df_onehotencoder"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Pandas' *get_dummies* is easier for transforming DataFrames. *OneHotEncoder* is more efficient and fits well as a step in a machine learning pipeline."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Integer encoding\n",
-    "We will use **LabelEncoder**. It is possible to get the original values with *inverse_transform*. See [LabelEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Position</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>Marius</td>\n",
-       "      <td>18</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>graduate</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>Maria</td>\n",
-       "      <td>19</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>professor</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>John</td>\n",
-       "      <td>20</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>TA</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>Carla</td>\n",
-       "      <td>30</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>phD</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     Name  Age     Sex   Position\n",
-       "0  Marius   18    Male   graduate\n",
-       "1   Maria   19  Female  professor\n",
-       "2    John   20    Male         TA\n",
-       "3   Carla   30  Female        phD"
-      ]
-     },
-     "execution_count": 14,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from sklearn.preprocessing import LabelEncoder\n",
-    "# creating instance of labelencoder\n",
-    "labelencoder = LabelEncoder()\n",
-    "df_encoded = df\n",
-    "# Assigning numerical values and storing in another column\n",
-    "sex_values = ('Male', 'Female')\n",
-    "position_values = ('graduate', 'professor', 'TA', 'phD')\n",
-    "df_encoded"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 16,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Position</th>\n",
-       "      <th>sex_encoded</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>Marius</td>\n",
-       "      <td>18</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>graduate</td>\n",
-       "      <td>1</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>Maria</td>\n",
-       "      <td>19</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>professor</td>\n",
-       "      <td>0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>John</td>\n",
-       "      <td>20</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>TA</td>\n",
-       "      <td>1</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>Carla</td>\n",
-       "      <td>30</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>phD</td>\n",
-       "      <td>0</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     Name  Age     Sex   Position  sex_encoded\n",
-       "0  Marius   18    Male   graduate            1\n",
-       "1   Maria   19  Female  professor            0\n",
-       "2    John   20    Male         TA            1\n",
-       "3   Carla   30  Female        phD            0"
-      ]
-     },
-     "execution_count": 16,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df_encoded['sex_encoded'] = labelencoder.fit_transform(df_encoded['Sex'])\n",
-    "df_encoded"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 17,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Position</th>\n",
-       "      <th>sex_encoded</th>\n",
-       "      <th>position_encoded</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>Marius</td>\n",
-       "      <td>18</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>graduate</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>Maria</td>\n",
-       "      <td>19</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>professor</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>John</td>\n",
-       "      <td>20</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>TA</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>Carla</td>\n",
-       "      <td>30</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>phD</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     Name  Age     Sex   Position  sex_encoded  position_encoded\n",
-       "0  Marius   18    Male   graduate            1                 1\n",
-       "1   Maria   19  Female  professor            0                 3\n",
-       "2    John   20    Male         TA            1                 0\n",
-       "3   Carla   30  Female        phD            0                 2"
-      ]
-     },
-     "execution_count": 17,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df_encoded['position_encoded'] = labelencoder.fit_transform(df_encoded['Position'])\n",
-    "df_encoded"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Binarizer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Binarizer.html), Scikit Learn"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.13"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
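Note on 08_Categorical.ipynb: the removed notebook says integer encoding suits ordered values such as temperature and that *inverse_transform* recovers the original labels. The sketch below illustrates both points under stated assumptions: the `Temperature` column is hypothetical (the notebook's DataFrame has no such column), and `OrdinalEncoder` with an explicit category order is a swapped-in technique that the notebook does not use.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

df = pd.DataFrame({
    "Sex": ["Male", "Female", "Male", "Female"],
    # Hypothetical ordered column, added only for illustration.
    "Temperature": ["low", "medium", "high", "low"],
})

# LabelEncoder assigns codes by sorted class name: Female -> 0, Male -> 1.
label_encoder = LabelEncoder()
sex_codes = label_encoder.fit_transform(df["Sex"])

# inverse_transform maps the integer codes back to the original labels.
sex_labels = label_encoder.inverse_transform(sex_codes)

# Alphabetical sorting would give high < low < medium, so the intended
# order low < medium < high is passed explicitly to OrdinalEncoder.
ordinal_encoder = OrdinalEncoder(categories=[["low", "medium", "high"]])
temperature_codes = ordinal_encoder.fit_transform(df[["Temperature"]]).ravel()

print(sex_codes, list(sex_labels), temperature_codes)
```

The explicit category list matters because both encoders otherwise order classes alphabetically, which would not reflect the natural ordering of the temperature values.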
--- a/ml21/preprocessing/09_String_Data.ipynb
+++ b/ml21/preprocessing/09_String_Data.ipynb
@ -1,652 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# String Data\n",
-    "It is common to clean string columns so that they follow a predefined format (e.g., emails, URLs, ...).\n",
-    "\n",
-    "We can do it using regular expressions or specific libraries."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Beautifier\n",
-    "A simple [library](https://github.com/labtocat/beautifier) to clean up and prettify URL patterns, domains, and so on. The library helps to clean Unicode, special characters, and unnecessary redirection patterns from the URLs and gives you clean data.\n",
-    "\n",
-    "Install with **'pip install beautifier'**."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Email cleanup"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "from beautifier import Email\n",
-    "email = Email('me@imsach.in')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'imsach.in'"
-      ]
-     },
-     "execution_count": 2,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email.domain"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'me'"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email.username"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "False"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email.is_free_email"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "email2 = Email('This my address')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "False"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email2.is_valid"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "email3 = Email('pepe@gmail.com')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "True"
-      ]
-     },
-     "execution_count": 8,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email3.is_valid"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "True"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email3.is_free_email"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## URL cleanup"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "from beautifier import Url\n",
-    "url = Url('https://in.linkedin.com/in/sachinphilip?authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'https://in.linkedin.com/in/sachinphilip'"
-      ]
-     },
-     "execution_count": 11,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.cleanup"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'in.linkedin.com'"
-      ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.domain"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "['authtoken=887nasdadasd6hasdtg21', 'secret=98jy766yhhuhnjk']"
-      ]
-     },
-     "execution_count": 13,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.param"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk'"
-      ]
-     },
-     "execution_count": 14,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.parameters"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'sachinphilip'"
-      ]
-     },
-     "execution_count": 15,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.username"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Unicode\n",
-    "Problem: some Unicode text has been broken, so we see characters decoded with the wrong character encoding.\n",
-    "\n",
-    "A **mojibake** is text displayed in an unintended character encoding. Example: \"<22>\".\n",
-    "\n",
-    "We will use the library **ftfy** (fixes text for you) to fix it.\n",
-    "\n",
-    "First, you should install the library: **conda install ftfy** (or **pip install ftfy**)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 16,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "¯\\_(ツ)_/¯\n",
-      "Party\n",
-      "I'm\n"
-     ]
-    }
-   ],
-   "source": [
-    "import ftfy\n",
-    "foo = '&macr;\\\\_(ã\\x83\\x84)_/&macr;'\n",
-    "bar = '\\ufeffParty'\n",
-    "baz = '\\001\\033[36;44mI&#x92;m'\n",
-    "print(ftfy.fix_text(foo))\n",
-    "print(ftfy.fix_text(bar))\n",
-    "print(ftfy.fix_text(baz))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "We can inspect the individual code points of a string with **explain_unicode** to understand what ftfy has to fix."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 17,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "U+0026  &       [Po] AMPERSAND\n",
-      "U+006D  m       [Ll] LATIN SMALL LETTER M\n",
-      "U+0061  a       [Ll] LATIN SMALL LETTER A\n",
-      "U+0063  c       [Ll] LATIN SMALL LETTER C\n",
-      "U+0072  r       [Ll] LATIN SMALL LETTER R\n",
-      "U+003B  ;       [Po] SEMICOLON\n",
-      "U+005C  \\       [Po] REVERSE SOLIDUS\n",
-      "U+005F  _       [Pc] LOW LINE\n",
-      "U+0028  (       [Ps] LEFT PARENTHESIS\n",
-      "U+00E3  ã       [Ll] LATIN SMALL LETTER A WITH TILDE\n",
-      "U+0083  \\x83    [Cc] <unknown>\n",
-      "U+0084  \\x84    [Cc] <unknown>\n",
-      "U+0029  )       [Pe] RIGHT PARENTHESIS\n",
-      "U+005F  _       [Pc] LOW LINE\n",
-      "U+002F  /       [Po] SOLIDUS\n",
-      "U+0026  &       [Po] AMPERSAND\n",
-      "U+006D  m       [Ll] LATIN SMALL LETTER M\n",
-      "U+0061  a       [Ll] LATIN SMALL LETTER A\n",
-      "U+0063  c       [Ll] LATIN SMALL LETTER C\n",
-      "U+0072  r       [Ll] LATIN SMALL LETTER R\n",
-      "U+003B  ;       [Po] SEMICOLON\n"
-     ]
-    }
-   ],
-   "source": [
-    "ftfy.explain_unicode(foo)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Dates\n",
-    "Sometimes we want to extract a date from text. We can use regular expressions or handy packages, such as [**python-dateutil**](https://dateutil.readthedocs.io/en/stable/). An alternative is [arrow](https://arrow.readthedocs.io/en/latest/).\n",
-    "\n",
-    "Install the library: **pip install python-dateutil**."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 18,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "2019-08-22 10:22:46+00:00\n"
-     ]
-    }
-   ],
-   "source": [
-    "from dateutil.parser import parse\n",
-    "now = parse(\"Thu Aug 22 10:22:46 UTC 2019\")\n",
-    "print(now)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 19,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "2019-08-08 10:20:00\n"
-     ]
-    }
-   ],
-   "source": [
-    "dt = parse(\"Today is Thursday 8, 2019 at 10:20:00AM\", fuzzy=True)\n",
-    "print(dt)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/), A. Sharma, 2018.\n",
-    "* [Beautifier](https://github.com/labtocat/beautifier) package\n",
-    "* [Ftfy](https://ftfy.readthedocs.io/en/latest/) package\n",
-    "* [python-dateutil](https://dateutil.readthedocs.io/en/stable/) package"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.13"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/ml21/preprocessing/11_0_Handy.ipynb
+++ b/ml21/preprocessing/11_0_Handy.ipynb
@ -1,139 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Handy libraries\n",
-    "Libraries that help with several preprocessing tasks.\n",
-    "\n",
-    "* [datacleaner](11_1_datacleaner.ipynb)\n",
-    "* [autoclean](11_3_autoclean.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/), A. Sharma, 2018.\n",
-    "* [Handy Python Libraries for Formatting and Cleaning Data](https://mode.com/blog/python-data-cleaning-libraries), M. Bierly, 2016.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/ml21/preprocessing/11_1_datacleaner.ipynb
+++ b/ml21/preprocessing/11_1_datacleaner.ipynb
@ -1,673 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Datacleaner\n",
-    "[Datacleaner](https://github.com/rhiever/datacleaner) supports:\n",
-    "\n",
-    "* drop rows with missing values\n",
-    "* replace missing values with the mode or median on a column-by-column basis\n",
-    "* encode non-numeric variables with numerical equivalents\n",
-    "\n",
-    "\n",
-    "Install with\n",
-    "\n",
-    "**pip install datacleaner**"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>PassengerId</th>\n",
-       "      <th>Survived</th>\n",
-       "      <th>Pclass</th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>SibSp</th>\n",
-       "      <th>Parch</th>\n",
-       "      <th>Ticket</th>\n",
-       "      <th>Fare</th>\n",
-       "      <th>Cabin</th>\n",
-       "      <th>Embarked</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Braund, Mr. Owen Harris</td>\n",
-       "      <td>male</td>\n",
-       "      <td>22.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>A/5 21171</td>\n",
-       "      <td>7.2500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
-       "      <td>female</td>\n",
-       "      <td>38.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>PC 17599</td>\n",
-       "      <td>71.2833</td>\n",
-       "      <td>C85</td>\n",
-       "      <td>C</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>3</td>\n",
-       "      <td>1</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Heikkinen, Miss. Laina</td>\n",
-       "      <td>female</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>STON/O2. 3101282</td>\n",
-       "      <td>7.9250</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>4</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
-       "      <td>female</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>113803</td>\n",
-       "      <td>53.1000</td>\n",
-       "      <td>C123</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>5</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Allen, Mr. William Henry</td>\n",
-       "      <td>male</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>373450</td>\n",
-       "      <td>8.0500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>...</th>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>886</th>\n",
-       "      <td>887</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "      <td>Montvila, Rev. Juozas</td>\n",
-       "      <td>male</td>\n",
-       "      <td>27.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>211536</td>\n",
-       "      <td>13.0000</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>887</th>\n",
-       "      <td>888</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Graham, Miss. Margaret Edith</td>\n",
-       "      <td>female</td>\n",
-       "      <td>19.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>112053</td>\n",
-       "      <td>30.0000</td>\n",
-       "      <td>B42</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>888</th>\n",
-       "      <td>889</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
-       "      <td>female</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>1</td>\n",
-       "      <td>2</td>\n",
-       "      <td>W./C. 6607</td>\n",
-       "      <td>23.4500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>889</th>\n",
-       "      <td>890</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Behr, Mr. Karl Howell</td>\n",
-       "      <td>male</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>111369</td>\n",
-       "      <td>30.0000</td>\n",
-       "      <td>C148</td>\n",
-       "      <td>C</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>890</th>\n",
-       "      <td>891</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Dooley, Mr. Patrick</td>\n",
-       "      <td>male</td>\n",
-       "      <td>32.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>370376</td>\n",
-       "      <td>7.7500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>Q</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "<p>891 rows × 12 columns</p>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     PassengerId  Survived  Pclass  \\\n",
-       "0              1         0       3   \n",
-       "1              2         1       1   \n",
-       "2              3         1       3   \n",
-       "3              4         1       1   \n",
-       "4              5         0       3   \n",
-       "..           ...       ...     ...   \n",
-       "886          887         0       2   \n",
-       "887          888         1       1   \n",
-       "888          889         0       3   \n",
-       "889          890         1       1   \n",
-       "890          891         0       3   \n",
-       "\n",
-       "                                                  Name     Sex   Age  SibSp  \\\n",
-       "0                              Braund, Mr. Owen Harris    male  22.0      1   \n",
-       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
-       "2                               Heikkinen, Miss. Laina  female  26.0      0   \n",
-       "3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
-       "4                             Allen, Mr. William Henry    male  35.0      0   \n",
-       "..                                                 ...     ...   ...    ...   \n",
-       "886                              Montvila, Rev. Juozas    male  27.0      0   \n",
-       "887                       Graham, Miss. Margaret Edith  female  19.0      0   \n",
-       "888           Johnston, Miss. Catherine Helen \"Carrie\"  female   NaN      1   \n",
-       "889                              Behr, Mr. Karl Howell    male  26.0      0   \n",
-       "890                                Dooley, Mr. Patrick    male  32.0      0   \n",
-       "\n",
-       "     Parch            Ticket     Fare Cabin Embarked  \n",
-       "0        0         A/5 21171   7.2500   NaN        S  \n",
-       "1        0          PC 17599  71.2833   C85        C  \n",
-       "2        0  STON/O2. 3101282   7.9250   NaN        S  \n",
-       "3        0            113803  53.1000  C123        S  \n",
-       "4        0            373450   8.0500   NaN        S  \n",
-       "..     ...               ...      ...   ...      ...  \n",
-       "886      0            211536  13.0000   NaN        S  \n",
-       "887      0            112053  30.0000   B42        S  \n",
-       "888      2        W./C. 6607  23.4500   NaN        S  \n",
-       "889      0            111369  30.0000  C148        C  \n",
-       "890      0            370376   7.7500   NaN        Q  \n",
-       "\n",
-       "[891 rows x 12 columns]"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "import pandas as pd\n",
-    "import numpy as np\n",
-    "\n",
-    "from datacleaner import autoclean\n",
-    "\n",
-    "df = pd.read_csv('https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv')\n",
-    "df"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>PassengerId</th>\n",
-       "      <th>Survived</th>\n",
-       "      <th>Pclass</th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>SibSp</th>\n",
-       "      <th>Parch</th>\n",
-       "      <th>Ticket</th>\n",
-       "      <th>Fare</th>\n",
-       "      <th>Cabin</th>\n",
-       "      <th>Embarked</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>108</td>\n",
-       "      <td>1</td>\n",
-       "      <td>22.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>523</td>\n",
-       "      <td>7.2500</td>\n",
-       "      <td>47</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>190</td>\n",
-       "      <td>0</td>\n",
-       "      <td>38.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>596</td>\n",
-       "      <td>71.2833</td>\n",
-       "      <td>81</td>\n",
-       "      <td>0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>3</td>\n",
-       "      <td>1</td>\n",
-       "      <td>3</td>\n",
-       "      <td>353</td>\n",
-       "      <td>0</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>669</td>\n",
-       "      <td>7.9250</td>\n",
-       "      <td>47</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>4</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>272</td>\n",
-       "      <td>0</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>49</td>\n",
-       "      <td>53.1000</td>\n",
-       "      <td>55</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>5</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>15</td>\n",
-       "      <td>1</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>472</td>\n",
-       "      <td>8.0500</td>\n",
-       "      <td>47</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>...</th>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>886</th>\n",
-       "      <td>887</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "      <td>548</td>\n",
-       "      <td>1</td>\n",
-       "      <td>27.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>101</td>\n",
-       "      <td>13.0000</td>\n",
-       "      <td>47</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>887</th>\n",
-       "      <td>888</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>303</td>\n",
-       "      <td>0</td>\n",
-       "      <td>19.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>14</td>\n",
-       "      <td>30.0000</td>\n",
-       "      <td>30</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>888</th>\n",
-       "      <td>889</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>413</td>\n",
-       "      <td>0</td>\n",
-       "      <td>28.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>2</td>\n",
-       "      <td>675</td>\n",
-       "      <td>23.4500</td>\n",
-       "      <td>47</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>889</th>\n",
-       "      <td>890</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>81</td>\n",
-       "      <td>1</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>8</td>\n",
-       "      <td>30.0000</td>\n",
-       "      <td>60</td>\n",
-       "      <td>0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>890</th>\n",
-       "      <td>891</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>220</td>\n",
-       "      <td>1</td>\n",
-       "      <td>32.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>466</td>\n",
-       "      <td>7.7500</td>\n",
-       "      <td>47</td>\n",
-       "      <td>1</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "<p>891 rows × 12 columns</p>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     PassengerId  Survived  Pclass  Name  Sex   Age  SibSp  Parch  Ticket  \\\n",
-       "0              1         0       3   108    1  22.0      1      0     523   \n",
-       "1              2         1       1   190    0  38.0      1      0     596   \n",
-       "2              3         1       3   353    0  26.0      0      0     669   \n",
-       "3              4         1       1   272    0  35.0      1      0      49   \n",
-       "4              5         0       3    15    1  35.0      0      0     472   \n",
-       "..           ...       ...     ...   ...  ...   ...    ...    ...     ...   \n",
-       "886          887         0       2   548    1  27.0      0      0     101   \n",
-       "887          888         1       1   303    0  19.0      0      0      14   \n",
-       "888          889         0       3   413    0  28.0      1      2     675   \n",
-       "889          890         1       1    81    1  26.0      0      0       8   \n",
-       "890          891         0       3   220    1  32.0      0      0     466   \n",
-       "\n",
-       "        Fare  Cabin  Embarked  \n",
-       "0     7.2500     47         2  \n",
-       "1    71.2833     81         0  \n",
-       "2     7.9250     47         2  \n",
-       "3    53.1000     55         2  \n",
-       "4     8.0500     47         2  \n",
-       "..       ...    ...       ...  \n",
-       "886  13.0000     47         2  \n",
-       "887  30.0000     30         2  \n",
-       "888  23.4500     47         2  \n",
-       "889  30.0000     60         0  \n",
-       "890   7.7500     47         1  \n",
-       "\n",
-       "[891 rows x 12 columns]"
-      ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df_clean = autoclean(df, copy=True)\n",
-    "df_clean"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019.\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/), A. Sharma, 2018.\n",
-    "* [Handy Python Libraries for Formatting and Cleaning Data](https://mode.com/blog/python-data-cleaning-libraries), M. Bierly, 2016.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": true
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/ml21/preprocessing/11_3_autoclean.ipynb
+++ b/ml21/preprocessing/11_3_autoclean.ipynb
@ -1,578 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "849ad57e-6adb-4c2e-afd6-73db37eef572",
-   "metadata": {},
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "179cc802-9f1d-40b0-bf0c-9d4fb7ea1262",
-   "metadata": {},
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "9858d815-0390-4e77-a5ff-a8d2a1960981",
-   "metadata": {},
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "238bab60-75f0-4d29-ab05-66afc463b506",
-   "metadata": {},
-   "source": [
-    "# Autoclean\n",
-    "[AutoClean](https://github.com/elisemercury/AutoClean) is a simple library for cleaning data. It supports:\n",
-    "\n",
-    "* Handling of duplicates\n",
-    "* Various imputation methods for missing values\n",
-    "* Handling of outliers\n",
-    "* Encoding of categorical data (OneHot, Label)\n",
-    "* Extraction of datetime values\n",
-    "\n",
-    "Install the package: **pip install py-AutoClean**.\n",
-    "\n",
-    "Parameters:\n",
-    "\n",
-    "* **duplicates**\n",
-    "    * default: False\n",
-    "    * other values: 'auto', True\n",
-    "* **missing_num**\n",
-    "    * default: False\n",
-    "    * other values: 'auto', 'linreg', 'knn', 'mean', 'median', 'most_frequent', 'delete', False\n",
-    "* **missing_categ**\n",
-    "    * default: False\n",
-    "    * other values: 'auto', 'logreg', 'knn', 'most_frequent', 'delete', False\n",
-    "* **encode_categ**\n",
-    "    * default: False\n",
-    "    * other values: 'auto', ['onehot'], ['label'], False; to encode only specific columns, add a list of column names or indexes: ['auto', ['col1', 2]]\n",
-    "* **extract_datetime**\n",
-    "    * default: False\n",
-    "    * other values: 'auto', 'D', 'M', 'Y', 'h', 'm', 's'\n",
-    "* **outliers**\n",
-    "    * default: False\n",
-    "    * other values: 'auto', 'winz', 'delete'\n",
-    "* **outlier_param**\n",
-    "    * default: 1.5\n",
-    "    * other values: any int or float, False\n",
-    "* **logfile**\n",
-    "    * default: True\n",
-    "    * other values: False\n",
-    "* **verbose**\n",
-    "    * default: False\n",
-    "    * other values: True"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 29,
-   "id": "491b034b-994e-4f06-b4bc-df0590a62aab",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>PassengerId</th>\n",
-       "      <th>Survived</th>\n",
-       "      <th>Pclass</th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>SibSp</th>\n",
-       "      <th>Parch</th>\n",
-       "      <th>Ticket</th>\n",
-       "      <th>Fare</th>\n",
-       "      <th>Cabin</th>\n",
-       "      <th>Embarked</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Braund, Mr. Owen Harris</td>\n",
-       "      <td>male</td>\n",
-       "      <td>22.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>A/5 21171</td>\n",
-       "      <td>7.2500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
-       "      <td>female</td>\n",
-       "      <td>38.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>PC 17599</td>\n",
-       "      <td>71.2833</td>\n",
-       "      <td>C85</td>\n",
-       "      <td>C</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>3</td>\n",
-       "      <td>1</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Heikkinen, Miss. Laina</td>\n",
-       "      <td>female</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>STON/O2. 3101282</td>\n",
-       "      <td>7.9250</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>4</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
-       "      <td>female</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>113803</td>\n",
-       "      <td>53.1000</td>\n",
-       "      <td>C123</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>5</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Allen, Mr. William Henry</td>\n",
-       "      <td>male</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>373450</td>\n",
-       "      <td>8.0500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>...</th>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "      <td>...</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>886</th>\n",
-       "      <td>887</td>\n",
-       "      <td>0</td>\n",
-       "      <td>2</td>\n",
-       "      <td>Montvila, Rev. Juozas</td>\n",
-       "      <td>male</td>\n",
-       "      <td>27.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>211536</td>\n",
-       "      <td>13.0000</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>887</th>\n",
-       "      <td>888</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Graham, Miss. Margaret Edith</td>\n",
-       "      <td>female</td>\n",
-       "      <td>19.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>112053</td>\n",
-       "      <td>30.0000</td>\n",
-       "      <td>B42</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>888</th>\n",
-       "      <td>889</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
-       "      <td>female</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>1</td>\n",
-       "      <td>2</td>\n",
-       "      <td>W./C. 6607</td>\n",
-       "      <td>23.4500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>S</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>889</th>\n",
-       "      <td>890</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Behr, Mr. Karl Howell</td>\n",
-       "      <td>male</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>111369</td>\n",
-       "      <td>30.0000</td>\n",
-       "      <td>C148</td>\n",
-       "      <td>C</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>890</th>\n",
-       "      <td>891</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Dooley, Mr. Patrick</td>\n",
-       "      <td>male</td>\n",
-       "      <td>32.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>370376</td>\n",
-       "      <td>7.7500</td>\n",
-       "      <td>NaN</td>\n",
-       "      <td>Q</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "<p>891 rows × 12 columns</p>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "     PassengerId  Survived  Pclass  \\\n",
-       "0              1         0       3   \n",
-       "1              2         1       1   \n",
-       "2              3         1       3   \n",
-       "3              4         1       1   \n",
-       "4              5         0       3   \n",
-       "..           ...       ...     ...   \n",
-       "886          887         0       2   \n",
-       "887          888         1       1   \n",
-       "888          889         0       3   \n",
-       "889          890         1       1   \n",
-       "890          891         0       3   \n",
-       "\n",
-       "                                                  Name     Sex   Age  SibSp  \\\n",
-       "0                              Braund, Mr. Owen Harris    male  22.0      1   \n",
-       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
-       "2                               Heikkinen, Miss. Laina  female  26.0      0   \n",
-       "3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
-       "4                             Allen, Mr. William Henry    male  35.0      0   \n",
-       "..                                                 ...     ...   ...    ...   \n",
-       "886                              Montvila, Rev. Juozas    male  27.0      0   \n",
-       "887                       Graham, Miss. Margaret Edith  female  19.0      0   \n",
-       "888           Johnston, Miss. Catherine Helen \"Carrie\"  female   NaN      1   \n",
-       "889                              Behr, Mr. Karl Howell    male  26.0      0   \n",
-       "890                                Dooley, Mr. Patrick    male  32.0      0   \n",
-       "\n",
-       "     Parch            Ticket     Fare Cabin Embarked  \n",
-       "0        0         A/5 21171   7.2500   NaN        S  \n",
-       "1        0          PC 17599  71.2833   C85        C  \n",
-       "2        0  STON/O2. 3101282   7.9250   NaN        S  \n",
-       "3        0            113803  53.1000  C123        S  \n",
-       "4        0            373450   8.0500   NaN        S  \n",
-       "..     ...               ...      ...   ...      ...  \n",
-       "886      0            211536  13.0000   NaN        S  \n",
-       "887      0            112053  30.0000   B42        S  \n",
-       "888      2        W./C. 6607  23.4500   NaN        S  \n",
-       "889      0            111369  30.0000  C148        C  \n",
-       "890      0            370376   7.7500   NaN        Q  \n",
-       "\n",
-       "[891 rows x 12 columns]"
-      ]
-     },
-     "execution_count": 29,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "import pandas as pd\n",
-    "import numpy as np\n",
-    "\n",
-    "from AutoClean import AutoClean\n",
-    "\n",
-    "df = pd.read_csv('https://raw.githubusercontent.com/gsi-upm/sitc/master/ml2/data-titanic/train.csv')\n",
-    "df"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 36,
-   "id": "d842eedf-3971-4966-a8b4-543bb56dd60d",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "AutoClean process completed in 0.289385 seconds\n",
-      "Logfile saved to: /home/cif/GoogleDrive/cursos/summer-school-romania/2019/notebooks/preprocessing/autoclean.log\n"
-     ]
-    }
-   ],
-   "source": [
-    "autoclean = AutoClean(df, mode='auto')\n",
-    "\n",
-    "# We can control the preprocessing\n",
-    "#autoclean = AutoClean(df, mode='auto', duplicates=False, missing_num=False, missing_categ=False, encode_categ=False, extract_datetime=False, outliers=False, outlier_param=1.5, logfile=True, verbose=False)\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 38,
-   "id": "4ede7c55-475a-4748-8cc4-788f46c88b26",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>PassengerId</th>\n",
-       "      <th>Survived</th>\n",
-       "      <th>Pclass</th>\n",
-       "      <th>Name</th>\n",
-       "      <th>Sex</th>\n",
-       "      <th>Age</th>\n",
-       "      <th>SibSp</th>\n",
-       "      <th>Parch</th>\n",
-       "      <th>Ticket</th>\n",
-       "      <th>Fare</th>\n",
-       "      <th>Cabin</th>\n",
-       "      <th>Embarked</th>\n",
-       "      <th>Sex_female</th>\n",
-       "      <th>Sex_male</th>\n",
-       "      <th>Embarked_C</th>\n",
-       "      <th>Embarked_Q</th>\n",
-       "      <th>Embarked_S</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Braund, Mr. Owen Harris</td>\n",
-       "      <td>male</td>\n",
-       "      <td>22.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>A/5 21171</td>\n",
-       "      <td>7.2500</td>\n",
-       "      <td>C128</td>\n",
-       "      <td>S</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
-       "      <td>female</td>\n",
-       "      <td>38.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>PC 17599</td>\n",
-       "      <td>65.6344</td>\n",
-       "      <td>C85</td>\n",
-       "      <td>C</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>3</td>\n",
-       "      <td>1</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Heikkinen, Miss. Laina</td>\n",
-       "      <td>female</td>\n",
-       "      <td>26.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>STON/O2. 3101282</td>\n",
-       "      <td>7.9250</td>\n",
-       "      <td>C128</td>\n",
-       "      <td>S</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>4</td>\n",
-       "      <td>1</td>\n",
-       "      <td>1</td>\n",
-       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
-       "      <td>female</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>1</td>\n",
-       "      <td>0</td>\n",
-       "      <td>113803</td>\n",
-       "      <td>53.1000</td>\n",
-       "      <td>C123</td>\n",
-       "      <td>S</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>5</td>\n",
-       "      <td>0</td>\n",
-       "      <td>3</td>\n",
-       "      <td>Allen, Mr. William Henry</td>\n",
-       "      <td>male</td>\n",
-       "      <td>35.0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>0</td>\n",
-       "      <td>373450</td>\n",
-       "      <td>8.0500</td>\n",
-       "      <td>C128</td>\n",
-       "      <td>S</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "      <td>False</td>\n",
-       "      <td>False</td>\n",
-       "      <td>True</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   PassengerId  Survived  Pclass  \\\n",
-       "0            1         0       3   \n",
-       "1            2         1       1   \n",
-       "2            3         1       3   \n",
-       "3            4         1       1   \n",
-       "4            5         0       3   \n",
-       "\n",
-       "                                                Name     Sex   Age  SibSp  \\\n",
-       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
-       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
-       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
-       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
-       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
-       "\n",
-       "   Parch            Ticket     Fare Cabin Embarked  Sex_female  Sex_male  \\\n",
-       "0      0         A/5 21171   7.2500  C128        S       False      True   \n",
-       "1      0          PC 17599  65.6344   C85        C        True     False   \n",
-       "2      0  STON/O2. 3101282   7.9250  C128        S        True     False   \n",
-       "3      0            113803  53.1000  C123        S        True     False   \n",
-       "4      0            373450   8.0500  C128        S       False      True   \n",
-       "\n",
-       "   Embarked_C  Embarked_Q  Embarked_S  \n",
-       "0       False       False        True  \n",
-       "1        True       False       False  \n",
-       "2       False       False        True  \n",
-       "3       False       False        True  \n",
-       "4       False       False        True  "
-      ]
-     },
-     "execution_count": 38,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df_clean = autoclean.output\n",
-    "df_clean[0:5]"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/ml21/preprocessing/5_Duplicated_Values.ipynb
+++ b/ml21/preprocessing/5_Duplicated_Values.ipynb
@ -1,502 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Duplicated values\n",
-    "\n",
-    "There are two possible approaches: **remove** these rows or **fill** them. The right choice depends on the case.\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "import pandas as pd\n",
-    "import numpy as np"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Filling NaN values\n",
-    "If we need to fill errors or blanks, we can use the method **fillna()**; to remove them instead, we can use **dropna()**.\n",
-    "\n",
-    "* For **string** fields, we can fill NaN with **' '**.\n",
-    "\n",
-    "* For **numbers**, we can fill with the **mean** or **median** value. \n"
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "source": [
-    "# Fill NaN with ' '\n",
-    "df['col'] = df['col'].fillna(' ')\n",
-    "# Fill NaN with 99\n",
-    "df['col'] = df['col'].fillna(99)\n",
-    "# Fill NaN with the mean of the column\n",
-    "df['col'] = df['col'].fillna(df['col'].mean())"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Propagate non-null values forward or backwards\n",
-    "You can also propagate non-null values forward or backward by passing\n",
-    "`method='pad'` (forward) or `method='bfill'` (backward) to **fillna()**. With 'pad', each NaN\n",
-    "is filled with the previous non-NaN value. Set `limit=1` to fill only one\n",
-    "consecutive NaN, or omit `limit` to fill them all."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "df = pd.DataFrame(data={'col1':[np.nan, np.nan, 2,3,4, np.nan, np.nan]})"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "0   NaN\n",
-       "1   NaN\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0\n",
-       "5   NaN\n",
-       "6   NaN"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "0   NaN\n",
-       "1   NaN\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0\n",
-       "5   4.0\n",
-       "6   NaN"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Forward-fill the value 4.0 into the next NaN only (limit=1)\n",
-    "df.fillna(method='pad', limit=1)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "We can also backfill with **bfill**."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>col1</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>2.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>3.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>4.0</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>NaN</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   col1\n",
-       "0   2.0\n",
-       "1   2.0\n",
-       "2   2.0\n",
-       "3   3.0\n",
-       "4   4.0\n",
-       "5   NaN\n",
-       "6   NaN"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# Fill the first two NaN values with the first available value\n",
-    "df.fillna(method='bfill')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Removing NaN values\n",
-    "We can remove them by row or column."
-   ]
-  },
-  {
-   "cell_type": "raw",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "source": [
-    "/# Drop any rows which have any nans\n",
-    "df.dropna()\n",
-    "/# Drop columns that have any nans\n",
-    "df.dropna(axis=1)\n",
-    "/# Keep only columns which have at least 90% non-NaNs (drop the rest)\n",
-    "df.dropna(thresh=int(df.shape[0] * .9), axis=1)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.4"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 1
-}
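Review note on the deleted missing-values notebook above: a minimal sketch of the same forward-fill, backfill and drop steps, assuming a recent pandas release, where `fillna(method=...)` is deprecated in favour of `ffill` and `bfill`. The `df` below is a hypothetical one-column frame chosen to reproduce the NaN pattern seen in the notebook outputs; `padded`, `backfilled`, `min_valid` and `dense_columns` are illustrative names that do not appear in the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical data reproducing the NaN pattern shown in the notebook outputs
df = pd.DataFrame({"col1": [np.nan, np.nan, 2.0, 3.0, 4.0, np.nan, np.nan]})

# Forward fill: each valid value is propagated over at most one following NaN
padded = df.ffill(limit=1)

# Backward fill: leading NaNs take the next valid value, trailing NaNs remain
backfilled = df.bfill()

# thresh is the minimum number of non-NaN values a column needs to be kept
min_valid = int(df.shape[0] * 0.9)
dense_columns = df.dropna(thresh=min_valid, axis=1)

print(padded)
print(backfilled)
print(dense_columns.shape)
```

With `thresh`, `dropna` keeps the columns that reach the required count of non-NaN values, so in this sketch the single sparse column is dropped and only the index remains.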
--- a/ml21/preprocessing/9_String_Data.ipynb
+++ b/ml21/preprocessing/9_String_Data.ipynb
@ -1,619 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Preprocessing](00_Intro_Preprocessing.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# String Data\n",
-    "It is common to clean string columns so that they follow a predefined format (e.g. emails, URLs, ...).\n",
-    "\n",
-    "We can do it using regular expressions or specific libraries."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Beautifier\n",
-    "A simple [library](https://github.com/labtocat/beautifier) to clean up and prettify URL patterns, domains and so on. It helps to remove Unicode characters, special characters and unnecessary redirection patterns from URLs and gives you clean data.\n",
-    "\n",
-    "Install with **'pip install beautifier'**."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Email cleanup"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "from beautifier import Email\n",
-    "email = Email('me@imsach.in')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'imsach.in'"
-      ]
-     },
-     "execution_count": 5,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email.domain"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'me'"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email.username"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "False"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email.is_free_email"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "email2 = Email('This my address')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "False"
-      ]
-     },
-     "execution_count": 15,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email2.is_valid"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 23,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "email3 = Email('pepe@gmail.com')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 18,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "True"
-      ]
-     },
-     "execution_count": 18,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email3.is_valid"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 27,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "True"
-      ]
-     },
-     "execution_count": 27,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "email3.is_free_email"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## URL cleanup"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 29,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "from beautifier import Url\n",
-    "url = Url('https://in.linkedin.com/in/sachinphilip?authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 31,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'https://in.linkedin.com/in/sachinphilip'"
-      ]
-     },
-     "execution_count": 31,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.cleanup"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 33,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'in.linkedin.com'"
-      ]
-     },
-     "execution_count": 33,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.domain"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 35,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "['authtoken=887nasdadasd6hasdtg21', 'secret=98jy766yhhuhnjk']"
-      ]
-     },
-     "execution_count": 35,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.param"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 37,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk'"
-      ]
-     },
-     "execution_count": 37,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.parameters"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 39,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'sachinphilip'"
-      ]
-     },
-     "execution_count": 39,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "url.username"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Unicode\n",
-    "Problem: some Unicode text has been broken because it was decoded with the wrong character encoding, so we see the wrong characters.\n",
-    "\n",
-    "A **mojibake** is a character displayed in an unintended character encoding. Example: \"<22>\".\n",
-    "\n",
-    "We will use the library **ftfy** (fixes text for you) to fix it.\n",
-    "\n",
-    "First, you should install the library: **conda install ftfy**."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 41,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "¯\\_(ツ)_/¯\n",
-      "Party\n",
-      "I'm\n"
-     ]
-    }
-   ],
-   "source": [
-    "import ftfy\n",
-    "foo = '&macr;\\\\_(ã\\x83\\x84)_/&macr;'\n",
-    "bar = '\\ufeffParty'\n",
-    "baz = '\\001\\033[36;44mI&#x92;m'\n",
-    "print(ftfy.fix_text(foo))\n",
-    "print(ftfy.fix_text(bar))\n",
-    "print(ftfy.fix_text(baz))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "We can inspect the code points of a string (number, glyph, Unicode category and name) to understand what ftfy has to fix."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "ftfy.explain_unicode(foo)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Dates\n",
-    "Sometimes we want to extract dates from text. We can use regular expressions or handy packages, such as **python-dateutil**.\n",
-    "\n",
-    "Install the library: **pip install python-dateutil**."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "2019-08-22 10:22:46+00:00\n"
-     ]
-    }
-   ],
-   "source": [
-    "from dateutil.parser import parse\n",
-    "now = parse(\"Thu Aug 22 10:22:46 UTC 2019\")\n",
-    "print(now)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "2019-08-22 10:20:00\n"
-     ]
-    }
-   ],
-   "source": [
-    "dt = parse(\"Today is Thursday 8, 2019 at 10:20:00AM\", fuzzy=True)\n",
-    "print(dt)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Cleaning and Prepping Data with Python for Data Science — Best Practices and Helpful Packages](https://medium.com/@rrfd/cleaning-and-prepping-data-with-python-for-data-science-best-practices-and-helpful-packages-af1edfbe2a3), DeFilippi, 2019\n",
-    "* [Data Preprocessing for Machine learning in Python, GeeksForGeeks](https://www.geeksforgeeks.org/data-preprocessing-machine-learning-python/)\n",
-    "* Beautifier https://github.com/labtocat/beautifier\n",
-    "* Ftfy https://ftfy.readthedocs.io/en/latest/\n",
-    "* python-dateutil https://dateutil.readthedocs.io/en/stable/"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.7.4"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 1
-}
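Review note on the deleted string-data notebook above: a minimal sketch of the same URL and email clean-up using only the standard library (`urllib.parse`) instead of Beautifier, in case the third-party package is unavailable. This is a swapped-in technique, not Beautifier's implementation. `clean_url` and `split_email` are hypothetical helpers written for this note, and the email check is deliberately crude: it only looks for a single `@` with text on both sides.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def clean_url(raw_url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(raw_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def split_email(address):
    """Return (username, domain), or None when the address is clearly malformed."""
    username, sep, domain = address.partition("@")
    if not sep or not username or not domain or "@" in domain or " " in address:
        return None
    return username, domain


raw = ("https://in.linkedin.com/in/sachinphilip"
       "?authtoken=887nasdadasd6hasdtg21&secret=98jy766yhhuhnjk")

cleaned = clean_url(raw)                  # URL with tracking parameters removed
domain = urlsplit(raw).netloc             # host part of the URL
params = parse_qsl(urlsplit(raw).query)   # list of (name, value) pairs

print(cleaned)
print(domain)
print(params)
print(split_email("me@imsach.in"))
print(split_email("This my address"))
```

Stripping the query string wholesale is the simplest policy; a pipeline that needs to keep some parameters could filter `params` and rebuild the query instead.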
--- a/ml21/preprocessing/images/EscUpmPolit_p.gif
+++ b/ml21/preprocessing/images/EscUpmPolit_p.gif
--- a/ml21/preprocessing/images/titanic.jpg
+++ b/ml21/preprocessing/images/titanic.jpg
--- a/ml21/visualization/.gitkeep
+++ b/ml21/visualization/.gitkeep
@ -1 +0,0 @@
-
--- a/ml21/visualization/00_Intro_Visualization.ipynb
+++ b/ml21/visualization/00_Intro_Visualization.ipynb
@ -1,185 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Introduction to Visualization\n",
-    " \n",
-    "In this session, we will get more insight regarding how to visualize data.\n",
-    "\n",
-    "# Objectives\n",
-    "\n",
-    "The main objectives of this session are:\n",
-    "* Understanding how to visualize data\n",
-    "* Understanding the purpose of different charts \n",
-    "* Experimenting with several environments for visualizing data\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Seaborn\n",
-    "\n",
-    "Seaborn is a Python data visualization library built on top of matplotlib. Its main characteristics are:\n",
-    "\n",
-    "* A dataset-oriented API for examining relationships between multiple variables\n",
-    "* Specialized support for using categorical variables to show observations or aggregate statistics\n",
-    "* Options for visualizing univariate or bivariate distributions and for comparing them between subsets of data\n",
-    "* Automatic estimation and plotting of linear regression models for different kinds of dependent variables\n",
-    "* Convenient views of the overall structure of complex datasets\n",
-    "* High-level abstractions for structuring multi-plot grids that let you quickly build complex visualizations\n",
-    "* Concise control over matplotlib figure styling with several built-in themes\n",
-    "* Tools for choosing color palettes that faithfully reveal patterns in your data\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "## Install\n",
-    "Use:\n",
-    "\n",
-    "**conda install seaborn**\n",
-    "\n",
-    "or \n",
-    "\n",
-    "**pip install seaborn**"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Table of Contents"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "source": [
-    "1. [Home](00_Intro_Visualization.ipynb)\n",
-    "2. [Dataset](01_Dataset.ipynb)\n",
-    "3. [Comparison Charts](02_Comparison_Charts.ipynb)\n",
-    "     1. [More Comparison Charts](02_01_More_Comparison_Charts.ipynb)\n",
-    "4. [Distribution Charts](03_Distribution_Charts.ipynb)\n",
-    "5. [Hierarchical charts](04_Hierarchical_Charts.ipynb)\n",
-    "6. [Relational charts](05_Relational_Charts.ipynb)\n",
-    "7. [Spatial charts](06_Spatial_Charts.ipynb)\n",
-    "8. [Temporal charts](07_Temporal_Charts.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/ml21/visualization/01_Dataset.ipynb
+++ b/ml21/visualization/01_Dataset.ipynb
@ -1,363 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Visualization](00_Intro_Visualization.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "# Dataset\n",
-    "Seaborn provides several example datasets. We can list the available datasets and load them by name. \n",
-    "\n",
-    "The datasets are also available at https://github.com/mwaskom/seaborn-data."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "fragment"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "import pandas as pd\n",
-    "from matplotlib import pyplot as plt\n",
-    "import seaborn as sns"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "['anagrams',\n",
-       " 'anscombe',\n",
-       " 'attention',\n",
-       " 'brain_networks',\n",
-       " 'car_crashes',\n",
-       " 'diamonds',\n",
-       " 'dots',\n",
-       " 'dowjones',\n",
-       " 'exercise',\n",
-       " 'flights',\n",
-       " 'fmri',\n",
-       " 'geyser',\n",
-       " 'glue',\n",
-       " 'healthexp',\n",
-       " 'iris',\n",
-       " 'mpg',\n",
-       " 'penguins',\n",
-       " 'planets',\n",
-       " 'seaice',\n",
-       " 'taxis',\n",
-       " 'tips',\n",
-       " 'titanic']"
-      ]
-     },
-     "execution_count": 2,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "sns.get_dataset_names()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<div>\n",
-       "<style scoped>\n",
-       "    .dataframe tbody tr th:only-of-type {\n",
-       "        vertical-align: middle;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe tbody tr th {\n",
-       "        vertical-align: top;\n",
-       "    }\n",
-       "\n",
-       "    .dataframe thead th {\n",
-       "        text-align: right;\n",
-       "    }\n",
-       "</style>\n",
-       "<table border=\"1\" class=\"dataframe\">\n",
-       "  <thead>\n",
-       "    <tr style=\"text-align: right;\">\n",
-       "      <th></th>\n",
-       "      <th>total_bill</th>\n",
-       "      <th>tip</th>\n",
-       "      <th>sex</th>\n",
-       "      <th>smoker</th>\n",
-       "      <th>day</th>\n",
-       "      <th>time</th>\n",
-       "      <th>size</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th>0</th>\n",
-       "      <td>16.99</td>\n",
-       "      <td>1.01</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>1</th>\n",
-       "      <td>10.34</td>\n",
-       "      <td>1.66</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>3</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>2</th>\n",
-       "      <td>21.01</td>\n",
-       "      <td>3.50</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>3</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>3</th>\n",
-       "      <td>23.68</td>\n",
-       "      <td>3.31</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>4</th>\n",
-       "      <td>24.59</td>\n",
-       "      <td>3.61</td>\n",
-       "      <td>Female</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>4</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>5</th>\n",
-       "      <td>25.29</td>\n",
-       "      <td>4.71</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>4</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>6</th>\n",
-       "      <td>8.77</td>\n",
-       "      <td>2.00</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>7</th>\n",
-       "      <td>26.88</td>\n",
-       "      <td>3.12</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>4</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>8</th>\n",
-       "      <td>15.04</td>\n",
-       "      <td>1.96</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th>9</th>\n",
-       "      <td>14.78</td>\n",
-       "      <td>3.23</td>\n",
-       "      <td>Male</td>\n",
-       "      <td>No</td>\n",
-       "      <td>Sun</td>\n",
-       "      <td>Dinner</td>\n",
-       "      <td>2</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n",
-       "</div>"
-      ],
-      "text/plain": [
-       "   total_bill   tip     sex smoker  day    time  size\n",
-       "0       16.99  1.01  Female     No  Sun  Dinner     2\n",
-       "1       10.34  1.66    Male     No  Sun  Dinner     3\n",
-       "2       21.01  3.50    Male     No  Sun  Dinner     3\n",
-       "3       23.68  3.31    Male     No  Sun  Dinner     2\n",
-       "4       24.59  3.61  Female     No  Sun  Dinner     4\n",
-       "5       25.29  4.71    Male     No  Sun  Dinner     4\n",
-       "6        8.77  2.00    Male     No  Sun  Dinner     2\n",
-       "7       26.88  3.12    Male     No  Sun  Dinner     4\n",
-       "8       15.04  1.96    Male     No  Sun  Dinner     2\n",
-       "9       14.78  3.23    Male     No  Sun  Dinner     2"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "df = sns.load_dataset('tips')\n",
-    "df.head(10)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# References\n",
-    "* [Seaborn](http://seaborn.pydata.org/index.html) documentation"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.13"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/ml21/visualization/02_01_More_Comparison_Charts.ipynb
+++ b/ml21/visualization/02_01_More_Comparison_Charts.ipynb
--- a/ml21/visualization/02_Comparison_Charts.ipynb
+++ b/ml21/visualization/02_Comparison_Charts.ipynb
--- a/ml21/visualization/03_Distribution_Charts.ipynb
+++ b/ml21/visualization/03_Distribution_Charts.ipynb
--- a/ml21/visualization/04_Hierarchical_Charts.ipynb
+++ b/ml21/visualization/04_Hierarchical_Charts.ipynb
--- a/ml21/visualization/05_Relational_Charts.ipynb
+++ b/ml21/visualization/05_Relational_Charts.ipynb
--- a/ml21/visualization/06_Spatial_Charts.ipynb
+++ b/ml21/visualization/06_Spatial_Charts.ipynb
--- a/ml21/visualization/07_Temporal_Charts.ipynb
+++ b/ml21/visualization/07_Temporal_Charts.ipynb
--- a/ml21/visualization/images/EscUpmPolit_p.gif
+++ b/ml21/visualization/images/EscUpmPolit_p.gif
--- a/ml4/2_5_1_Exercise.ipynb
+++ b/ml4/2_5_1_Exercise.ipynb
@ -187,9 +187,9 @@
   "metadata": {},
   "source": [
    "## Comparing\n",
-    "Your task is to modify the previous code to canonical GA configuration from Holland (look at the lesson's slides). In addition you should consult the [DEAP API](http://deap.readthedocs.io/en/master/api/tools.html#operators).\n",
+    "Your task is modify the previous code to canonical GA configuration from Holland (look at the lesson's slides). In addition you should consult the [DEAP API](http://deap.readthedocs.io/en/master/api/tools.html#operators).\n",
    "\n",
-    "Submit your notebook and include a modified code and a comparison of the effects of these changes. \n",
+    "Submit your notebook and include a the modified code, and a comparison of the effects of these changes. \n",
    "\n",
    "Discuss your findings."
   ]
@ -198,7 +198,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Optional. Optimizing ML hyperparameters\n",
+    "## Optimizing ML hyperparameters\n",
    "\n",
    "One of the applications of Genetic Algorithms is the optimization of ML hyperparameters. Previously we have used GridSearch from Scikit. Using (sklearn-deap)[[References](#References)], optimize the Titatic hyperparameters using both GridSearch and Genetic Algorithms. \n",
    "\n",
@ -206,7 +206,7 @@
    "\n",
    "Submit a notebook where you include well-crafted conclusions about the exercises, discussing the pros and cons of using genetic algorithms for this purpose.\n",
    "\n",
-    "Note: There is a problem with Scikit version 0.24. Comment on the different approaches."
+    "Note: There is a problem with the version 0.24 of scikit. Just comment the different approaches."
   ]
  },
  {
@ -222,7 +222,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Optional. Optimizing an ML pipeline with a genetic algorithm\n",
+    "## Optimizing a ML pipeline with a genetic algorithm\n",
    "\n",
    "The library [TPOT](#References) optimizes ML pipelines and comes with a lot of (examples)[https://epistasislab.github.io/tpot/examples/] and even notebooks, for example for the [iris dataset](https://github.com/EpistasisLab/tpot/blob/master/tutorials/IRIS.ipynb).\n",
    "\n",
--- a/ml5/2_6_1_Q-Learning_Basic.ipynb
+++ b/ml5/2_6_1_Q-Learning_Basic.ipynb
@ -44,7 +44,7 @@
    "First of all, install the Gymnasium library, which is a fork of the OpenAI Gym  library:\n",
    "\n",
    "```console\n",
-    "foo@bar:~$ pip install gymnasium[classic-control]\n",
+    "foo@bar:~$ conda install gymnasium\n",
    "```\n",
    "\n",
    "If you get an error 'No module named 'Box2D', install 'pybox2d'.\n"
--- a/nlp/0_1_LLM.ipynb
+++ b/nlp/0_1_LLM.ipynb
--- a/python/1_0_Intro_Python.ipynb
+++ b/python/1_0_Intro_Python.ipynb
@ -74,6 +74,7 @@
    "* [The Python tutorial](https://docs.python.org/3/tutorial/)\n",
    "* [Object-Oriented Programming in Python](http://python-textbok.readthedocs.org/en/latest/index.html)\n",
    "* [Python3 tutorial](http://www.python-course.eu/python3_course.php)\n",
+    "* [Python for the Busy Java Developer, Deepak Sarda, 2014](http://antrix.net/static/pages/python-for-java/online/)\n",
    "* [Style Guide for Python Code (PEP-0008)](https://www.python.org/dev/peps/pep-0008/)\n",
    "* [Python Slides](http://tdc-www.harvard.edu/Python.pdf)\n",
    "* [Python for Programmers - 1 day course](http://www.ucs.cam.ac.uk/docs/course-notes/unix-courses/archived/archived-python-courses/PythonProgIntro/files/notes.pdf)\n",
--- a/python/1_2_Numbers_Strings.ipynb
+++ b/python/1_2_Numbers_Strings.ipynb
@ -85,7 +85,7 @@
    "In Python3, there are the following [numeric types](https://docs.python.org/3/library/stdtypes.html#typesnumeric):\n",
    "* integers (int): 1, -1, ...\n",
    "* floating point numbers (float): 0.1, 1E2\n",
-    "* complex numbers (complex): 2 + 3j\n.",
+    "* complex numbers (complex): 2 + 3j\n",
    "Let's play a bit"
   ]
  },
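Note on `python/1_2_Numbers_Strings.ipynb`: the three numeric types listed in this cell can be checked directly in the interpreter. This is a minimal sketch; the variable names are only illustrative.

```python
# The three built-in numeric types named in the cell.
i = 1          # int
f = 1E2        # float: scientific notation always yields a float
c = 2 + 3j     # complex

print(type(i).__name__, type(f).__name__, type(c).__name__)
print(f)                # 100.0
print(c.real, c.imag)   # 2.0 3.0

# Mixed arithmetic widens to the more general type.
int_plus_float = i + f
float_plus_complex = f + c
print(int_plus_float, float_plus_complex)
```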
--- a/python/1_3_Sequences.ipynb
+++ b/python/1_3_Sequences.ipynb
@ -377,7 +377,7 @@
    "\n",
    "Tuples are faster than lists. Its main usage is when the collection is constant, or you do not want it can be changed (write protected). \n",
    "\n",
-    "Tuples can be converted into lists and vice-versa, with the methods *list()* and *tuple()*."
+    "Tuples can be converted into lists and vice-versa, with the methods list() and tuple()."
   ]
  },
  {
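Note on `python/1_3_Sequences.ipynb`: the conversion between tuples and lists with the built-in `list()` and `tuple()` can be sketched as follows. The values are made up for illustration.

```python
point = (1, 2, 3)

# list() builds a new, mutable list with the same elements.
as_list = list(point)
as_list.append(4)

# tuple() builds a new, immutable tuple from any iterable.
back = tuple(as_list)
print(as_list, back)

# The original tuple is untouched and cannot be modified in place.
try:
    point[0] = 99
except TypeError as err:
    error = str(err)
    print("TypeError:", error)
```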
--- a/python/1_4_Sets.ipynb
+++ b/python/1_4_Sets.ipynb
@ -37,7 +37,7 @@
    "\n",
    "A set object is an unordered collection of distinct objects. There are two built-in set types: **set** (mutable) and **frozenset** (inmutable).\n",
    "\n",
-    "A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is only one builtin mapping type: **dictionary**."
+    "A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is only one bultin mapping type: **dictionary**."
   ]
  },
  {
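Note on `python/1_4_Sets.ipynb`: a short sketch of the difference between `set`, `frozenset` and `dict` keys described in this cell. The values are illustrative only.

```python
# Duplicates collapse: a set holds distinct objects.
s = {1, 2, 2, 3}
s.add(4)                # set is mutable

fs = frozenset(s)       # immutable copy; it has no add() method
has_add = hasattr(fs, "add")

# Dictionary keys must be hashable: a frozenset qualifies, a set does not.
d = {fs: "frozensets are hashable"}
try:
    bad = {s: "sets are not"}
except TypeError as err:
    error = str(err)

print(s, has_add, error)
```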
--- a/python/1_7_Variables.ipynb
+++ b/python/1_7_Variables.ipynb
@ -65,7 +65,7 @@
    "Python is a **strongly typed** language and **dynamically typed** language.\n",
    "\n",
    "This means:\n",
-    "* **dynamically typed**: variables do not declare a static type (as in Java int a = 2;). Variables have no type themselves, they are just names that hold a reference to some object. The type of the variable is changed dynamically when you change the type of the assigned data object. \n",
+    "* ** dynamically typed**: variables do not declare a static type (as in Java int a = 2;). Variables have no type themselves, they are just names that hold a reference to some object. The type of the variable is changed dynamically when you change the type of the assigned data object. \n",
    "* **strongly typed**: the interpreter tracks variable types. There is no implicit type conversion. This means that all the type variables should be converted manually, preventing from unexpected behaviour. "
   ]
  },
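Note on `python/1_7_Variables.ipynb`: both properties can be observed in a few lines. This is a minimal sketch with illustrative names; note that numeric widening such as `1 + 1.0` is still allowed, so the strictness applies to unrelated types such as `str` and `int`.

```python
# Dynamic typing: the name is rebound; each object keeps its own type.
a = 2
type_before = type(a).__name__
a = "two"
type_after = type(a).__name__
print(type_before, type_after)

# Strong typing: str and int are never mixed implicitly.
try:
    "1" + 1
except TypeError as err:
    error = str(err)
    print("TypeError:", error)

# The conversion has to be explicit.
total = int("1") + 1
print(total)
```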
--- a/python/1_8_Classes.ipynb
+++ b/python/1_8_Classes.ipynb
@ -41,7 +41,7 @@
    "The first argument of instance class method is self, that refers to the current instance of the class.\n",
    "There is a special method, __init__ that initializes the object. It is like a constructor, but the object is already created when __init__ is called.\n",
    "\n",
-    "Instance attributes are define as *self.variables*. (self is the same than this in Java)."
+    "Instance attributes are define as self.variables. (self is the same than this in Java)."
   ]
  },
  {
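Note on `python/1_8_Classes.ipynb`: the roles of `self` and `__init__` described in this cell can be sketched with a small class. `Point` and `norm1` are hypothetical names chosen for this note.

```python
class Point:
    def __init__(self, x, y):
        # The object already exists here; __init__ only initializes it.
        self.x = x   # instance attribute
        self.y = y   # instance attribute

    def norm1(self):
        # self is the instance the method was called on.
        return abs(self.x) + abs(self.y)


p = Point(3, -4)
via_instance = p.norm1()        # self is passed implicitly
via_class = Point.norm1(p)      # equivalent call with self passed explicitly
print(via_instance, via_class)
```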
--- a/sna/0_Intro_Network_Analysis.ipynb
+++ b/sna/0_Intro_Network_Analysis.ipynb
@ -1,154 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Introduction to Network Analysis\n",
-    " \n",
-    "In this session, we are going to get more insight regarding how to analyze and visualize social networks.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Objectives\n",
-    "\n",
-    "The main objectives of this session are:\n",
-    "* Understanding why networks are important in data science\n",
-    "* Experimenting with network analysis with networkx."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Table of Contents"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "subslide"
-    }
-   },
-   "source": [
-    "1. [Home](0_Intro_Network_Analysis.ipynb)\n",
-    "2. [First Steps](1_First_Steps.ipynb)\n",
-    "3. [Working_with_Graphs](2_Working_with_Graphs.ipynb)\n",
-    "4. [Network Analysis](3_Network_Analysis.ipynb)\n",
-    "5. [Social Networks](4_Social_Networks.ipynb)\n",
-    "6. [Pandas integration](5_Pandas.ipynb)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence\n",
-    "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "datacleaner": {
-   "position": {
-    "top": "50px"
-   },
-   "python": {
-    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
-   },
-   "window_display": false
-  },
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/sna/1_First_Steps.ipynb
+++ b/sna/1_First_Steps.ipynb
--- a/sna/2_Working_with_Graphs.ipynb
+++ b/sna/2_Working_with_Graphs.ipynb
--- a/sna/2a_Florentine_Families_Star_Wars.ipynb
+++ b/sna/2a_Florentine_Families_Star_Wars.ipynb
@ -1,374 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "![](images/EscUpmPolit_p.gif \"UPM\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "# Course Notes for Learning Intelligent Systems"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## [Introduction to  Network Analysis](0_Intro_Network_Analysis.ipynb)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Exercise: Florentine families"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import networkx as nx\n",
-    "import warnings\n",
-    "warnings.simplefilter(action='ignore', category=FutureWarning)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "G_florentine = nx.florentine_families_graph()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "slide"
-    }
-   },
-   "source": [
-    "# Exercise: Star Wars"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import networkx as nx\n",
-    "\n",
-    "# Taken from https://gist.github.com/codingthat/be03565bd97e789a3835b50235ad562f\n",
-    "# The original dataset is from:\n",
-    "# Gabasova, E. (2016). Star Wars social network. DOI: https://doi.org/10.5281/zenodo.1411479\n",
-    "# \n",
-    "# Simplified by Federico Albanese.\n",
-    "\n",
-    "characters = [\"R2-D2\",\n",
-    "                \"CHEWBACCA\",\n",
-    "                \"C-3PO\",\n",
-    "                \"LUKE\",\n",
-    "                \"DARTH VADER\",\n",
-    "                \"CAMIE\",\n",
-    "                \"BIGGS\",\n",
-    "                \"LEIA\",\n",
-    "                \"BERU\",\n",
-    "                \"OWEN\",\n",
-    "                \"OBI-WAN\",\n",
-    "                \"MOTTI\",\n",
-    "                \"TARKIN\",\n",
-    "                \"HAN\",\n",
-    "                \"DODONNA\",\n",
-    "                \"GOLD LEADER\",\n",
-    "                \"WEDGE\",\n",
-    "                \"RED LEADER\",\n",
-    "                \"RED TEN\"]\n",
-    "\n",
-    "\n",
-    "edges = [(\"CHEWBACCA\", \"R2-D2\"),\n",
-    "        (\"C-3PO\",\"R2-D2\"),\n",
-    "        (\"BERU\", \"R2-D2\"),\n",
-    "        (\"LUKE\", \"R2-D2\"),\n",
-    "        (\"OWEN\", \"R2-D2\"),\n",
-    "        (\"OBI-WAN\", \"R2-D2\"),\n",
-    "        (\"LEIA\", \"R2-D2\"),\n",
-    "        (\"BIGGS\", \"R2-D2\"),\n",
-    "        (\"HAN\", \"R2-D2\"),\n",
-    "        (\"CHEWBACCA\", \"OBI-WAN\"),\n",
-    "        (\"C-3PO\", \"CHEWBACCA\"),\n",
-    "        (\"CHEWBACCA\", \"LUKE\"),\n",
-    "        (\"CHEWBACCA\", \"HAN\"),\n",
-    "        (\"CHEWBACCA\", \"LEIA\"),\n",
-    "        (\"CAMIE\", \"LUKE\"),\n",
-    "        (\"BIGGS\", \"CAMIE\"),\n",
-    "        (\"BIGGS\", \"LUKE\"),\n",
-    "        (\"DARTH VADER\", \"LEIA\"),\n",
-    "        (\"BERU\", \"LUKE\"),\n",
-    "        (\"BERU\", \"OWEN\"),\n",
-    "        (\"BERU\", \"C-3PO\"),\n",
-    "        (\"LUKE\", \"OWEN\"),\n",
-    "        (\"C-3PO\", \"LUKE\"),\n",
-    "        (\"C-3PO\", \"OWEN\"),\n",
-    "        (\"C-3PO\", \"LEIA\"),\n",
-    "        (\"LEIA\", \"LUKE\"),\n",
-    "        (\"BERU\", \"LEIA\"),\n",
-    "        (\"LUKE\", \"OBI-WAN\"),\n",
-    "        (\"C-3PO\", \"OBI-WAN\"),\n",
-    "        (\"LEIA\", \"OBI-WAN\"),\n",
-    "        (\"MOTTI\", \"TARKIN\"),\n",
-    "        (\"DARTH VADER\", \"MOTTI\"),\n",
-    "        (\"DARTH VADER\", \"TARKIN\"),\n",
-    "        (\"HAN\", \"OBI-WAN\"),\n",
-    "        (\"HAN\", \"LUKE\"),\n",
-    "        (\"C-3PO\", \"HAN\"),\n",
-    "        (\"LEIA\", \"MOTT\"),\n",
-    "        (\"LEIA\", \"TARKIN\"),\n",
-    "        (\"HAN\", \"LEIA\"),\n",
-    "        (\"DARTH VADER\", \"OBI-WAN\"),\n",
-    "        (\"DODONNA\", \"GOLD LEADER\"),\n",
-    "        (\"DODONNA\", \"WEDGE\"),\n",
-    "        (\"DODONNA\", \"LUKE\"),\n",
-    "        (\"GOLD LEADER\", \"WEDGE\"),\n",
-    "        (\"GOLD LEADER\", \"LUKE\"),\n",
-    "        (\"LUKE\", \"WEDGE\"),\n",
-    "        (\"BIGGS\", \"LEIA\"),\n",
-    "        (\"LEIA\", \"RED LEADER\"),\n",
-    "        (\"LUKE\", \"RED LEADER\"),\n",
-    "        (\"BIGGS\", \"RED LEADER\"),\n",
-    "        (\"BIGGS\", \"C-3PO\"),\n",
-    "        (\"C-3PO\", \"RED LEADER\"),\n",
-    "        (\"RED LEADER\", \"WEDGE\"),\n",
-    "        (\"GOLD LEADER\", \"RED LEADER\"),\n",
-    "        (\"BIGGS\", \"WEDGE\"),\n",
-    "        (\"RED LEADER\", \"RED TEN\"),\n",
-    "        (\"BIGGS\", \"GOLD LEADER\"),\n",
-    "        (\"LUKE\", \"RED TEN\")]\n",
-    "\n",
-    "G_starWars = nx.Graph()\n",
-    "\n",
-    "\n",
-    "G_starWars.add_nodes_from(characters)\n",
-    "G_starWars.add_edges_from(edges)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Exercise\n",
-    "In this exercise we are going to practice some of the concepts of the session.\n",
-    "\n",
-    "Answer the following questions using the object *G_starWars* and *G_florentine*."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Number of nodes"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Number of edges"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Get the list of nodes"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Get the list of edges"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Draw the graph\n",
-    "\n",
-    "Hint.  Use different layouts (circular, spring, ...)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Think of interesting micro, meso and macro metrics"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Analyze ego networks of interesting nodes."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Analyze communities"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "## Licence"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "slideshow": {
-     "slide_type": "skip"
-    }
-   },
-   "source": [
-    "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
-    "\n",
-    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
-   ]
-  }
- ],
- "metadata": {
-  "celltoolbar": "Slideshow",
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.7"
-  },
-  "latex_envs": {
-   "LaTeX_envs_menu_present": true,
-   "autocomplete": true,
-   "bibliofile": "biblio.bib",
-   "cite_by": "apalike",
-   "current_citInitial": 1,
-   "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
-   "hotkeys": {
-    "equation": "Ctrl-E",
-    "itemize": "Ctrl-I"
-   },
-   "labels_anchors": false,
-   "latex_user_defs": false,
-   "report_style_numbering": false,
-   "user_envs_cfg": false
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
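Note on `sna/2a_Florentine_Families_Star_Wars.ipynb`: networkx's `add_edges_from` creates any endpoint that is not already a node, so a misspelled label in an edge list shows up as an extra node rather than as an error. A quick consistency check before building the graph can be sketched with the standard library. The subset below is a small excerpt of the notebook's data with one deliberately misspelled endpoint added for illustration.

```python
# Small excerpt of the character list, for illustration only.
characters = ["LEIA", "LUKE", "MOTTI", "TARKIN"]

# The last edge contains a deliberately misspelled endpoint.
edges = [("LEIA", "LUKE"),
         ("MOTTI", "TARKIN"),
         ("LEIA", "MOTT")]

known = set(characters)

# Collect every endpoint that does not match a declared character.
unknown = sorted({name for edge in edges for name in edge if name not in known})

# Number of nodes the graph would end up with after adding all edges.
nodes_after_edges = len(known | {name for edge in edges for name in edge})

print(unknown, len(characters), nodes_after_edges)
```

Here the declared list has 4 characters but the resulting graph would have 5 nodes, which is the symptom to look for when the node count of a graph does not match the length of its node list.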
--- a/sna/3_Network_Analysis.ipynb
+++ b/sna/3_Network_Analysis.ipynb
--- a/sna/4_Social_Networks.ipynb
+++ b/sna/4_Social_Networks.ipynb
--- a/sna/5_Pandas.ipynb
+++ b/sna/5_Pandas.ipynb
--- a/sna/images/EscUpmPolit_p.gif
+++ b/sna/images/EscUpmPolit_p.gif