mirror of https://github.com/gsi-upm/sitc synced 2025-06-12 11:22:20 +00:00

Compare commits

...

32 Commits

Author SHA1 Message Date
Carlos A. Iglesias
b58370a19a
Update .gitignore 2025-06-02 17:23:44 +03:00
Carlos A. Iglesias
5c203b0884
Update spiral.py
Fixed typo
2025-06-02 17:22:55 +03:00
Carlos A. Iglesias
5bf815f60f
Update 2_4_2_Exercise_Optional.ipynb
Changed image path
2025-06-02 17:22:16 +03:00
Carlos A. Iglesias
90a3ff098b
Update 2_4_1_Exercise.ipynb
Changed image path
2025-06-02 17:21:25 +03:00
Carlos A. Iglesias
945a8a7fb6
Update 2_4_0_Intro_NN.ipynb
Changed image path
2025-06-02 17:19:19 +03:00
Carlos A. Iglesias
6532ef1b27
Update 2_8_Conclusions.ipynb
Changed image path
2025-06-02 17:18:31 +03:00
Carlos A. Iglesias
3a73b2b286
Update 2_7_Model_Persistence.ipynb
Changed image path
2025-06-02 17:17:43 +03:00
Carlos A. Iglesias
2e4ec3cfdc
Update 2_6_Model_Tuning.ipynb 2025-06-02 17:16:53 +03:00
Carlos A. Iglesias
21e7ae2f57
Update 2_5_2_Decision_Tree_Model.ipynb
Changed image path
2025-06-02 17:13:49 +03:00
Carlos A. Iglesias
7b4d16964d
Update 2_5_1_kNN_Model.ipynb
Changed image path
2025-06-02 17:11:45 +03:00
Carlos A. Iglesias
c5967746ea
Update 2_5_0_Machine_Learning.ipynb 2025-06-02 17:09:42 +03:00
Carlos A. Iglesias
ed7f0f3e1c
Update 2_5_0_Machine_Learning.ipynb 2025-06-02 17:09:13 +03:00
Carlos A. Iglesias
9324516c19
Update 2_5_0_Machine_Learning.ipynb
Changed image path
2025-06-02 17:08:03 +03:00
Carlos A. Iglesias
6fc5565ea0
Update 2_2_Read_Data.ipynb 2025-06-02 17:05:17 +03:00
Carlos A. Iglesias
1113485833
Add files via upload 2025-06-02 17:03:20 +03:00
Carlos A. Iglesias
0c3f317a85
Add files via upload 2025-06-02 17:02:46 +03:00
Carlos A. Iglesias
0b550c837b
Update 2_2_Read_Data.ipynb
Added figures
2025-06-02 17:00:58 +03:00
Carlos A. Iglesias
d7ce6df7fe
Update 2_2_Read_Data.ipynb 2025-06-02 16:57:54 +03:00
Carlos A. Iglesias
e2edae6049
Update 2_2_Read_Data.ipynb 2025-06-02 16:54:37 +03:00
Carlos A. Iglesias
4ea0146def
Update 2_2_Read_Data.ipynb 2025-06-02 16:54:06 +03:00
Carlos A. Iglesias
e7b2cee795
Add files via upload 2025-06-02 16:31:20 +03:00
Carlos A. Iglesias
9e1d0e5534
Add files via upload 2025-06-02 16:30:13 +03:00
Carlos A. Iglesias
f82203f371
Update 2_4_Preprocessing.ipynb
Changed image path
2025-06-02 16:29:26 +03:00
Carlos A. Iglesias
b9ecccdeab
Update 2_3_1_Advanced_Visualisation.ipynb 2025-06-02 16:28:06 +03:00
Carlos A. Iglesias
44a555ac2d
Update 2_3_1_Advanced_Visualisation.ipynb
Changed image path
2025-06-02 16:09:55 +03:00
Carlos A. Iglesias
ec11ff2d5e
Update 2_3_0_Visualisation.ipynb
Changed image path
2025-06-02 16:06:53 +03:00
Carlos A. Iglesias
ec02125396
Update 2_2_Read_Data.ipynb 2025-06-02 16:04:57 +03:00
Carlos A. Iglesias
b5f1a7dd22
Update 2_0_0_Intro_ML.ipynb 2025-06-02 16:03:03 +03:00
Carlos A. Iglesias
1cc1e45673
Update 2_2_Read_Data.ipynb
Changed image path
2025-06-02 16:02:45 +03:00
Carlos A. Iglesias
a2ad2c0e92
Update 2_1_Intro_ScikitLearn.ipynb
Changed images path
2025-06-02 16:00:59 +03:00
Carlos A. Iglesias
1add6a4c8e
Update 2_0_1_Objectives.ipynb
Changed image path
2025-06-02 15:58:32 +03:00
Carlos A. Iglesias
af78e6480d
Update 2_0_0_Intro_ML.ipynb
changed path to image
2025-06-02 15:57:25 +03:00
23 changed files with 134 additions and 125 deletions

BIN
images/iris-classes.png Normal file
Binary file not shown. Size: 1.4 MiB

BIN
images/iris-features.png Normal file
Binary file not shown. Size: 944 KiB

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -79,7 +79,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -40,10 +40,10 @@
"\n",
"* Learn to use scikit-learn\n",
"* Learn the basic steps to apply machine learning techniques: dataset analysis, load, preprocessing, training, validation, optimization and persistence.\n",
"* Learn how to do a exploratory data analysis\n",
"* Learn how to do an exploratory data analysis\n",
"* Learn how to visualise a dataset\n",
"* Learn how to load a bundled dataset\n",
"* Learn how to separate the dataset into traning and testing datasets\n",
"* Learn how to separate the dataset into training and testing datasets\n",
"* Learn how to train a classifier\n",
"* Learn how to predict with a trained classifier\n",
"* Learn how to evaluate the predictions\n",
@ -71,7 +71,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -87,7 +87,7 @@
"metadata": {},
"source": [
"Scikit-learn provides algorithms for solving the following problems:\n",
"* **Classification**: Identifying to which category an object belongs to. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, C4.5, ...), kNN, SVM, Random forest, Perceptron, etc. \n",
"* **Classification**: Identifying to which category an object belongs. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, C4.5, ...), kNN, SVM, Random forest, Perceptron, etc. \n",
"* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n",
"* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n",
"* **Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
@ -105,7 +105,7 @@
"metadata": {},
"source": [
"In addition, scikit-learn helps in several tasks:\n",
"* **Model selection**: Comparing, validating, choosing parameters and models, and persisting models. Some of the [available functionalities](http://scikit-learn.org/stable/model_selection.html#model-selection) are cross-validation or grid search for optimizing the parameters. \n",
"* **Model selection**: Comparing, validating, choosing parameters and models, and persisting models. Some [available functionalities](http://scikit-learn.org/stable/model_selection.html#model-selection) are cross-validation or grid search for optimizing the parameters. \n",
"* **Preprocessing**: Several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators. Some of the available [preprocessing functions](http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing) are scaling and normalizing data, or imputing missing values."
]
},
@ -128,9 +128,9 @@
"\n",
"If it is not installed, install it with conda: `conda install scikit-learn`.\n",
"\n",
"If you have installed scipy and numpy, you can also installed using pip: `pip install -U scikit-learn`.\n",
"If you have installed scipy and numpy, you can also install it using pip: `pip install -U scikit-learn`.\n",
"\n",
"It is not recommended to use pip for installing scipy and numpy. Instead, use conda or install the linux package *python-sklearn*."
"It is not recommended to use pip to install scipy and numpy. Instead, use conda or install the Linux package *python-sklearn*."
]
},
{
@ -156,7 +156,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")\n",
"![](./images/EscUpmPolit_p.gif \"UPM\")\n",
"\n",
"# Course Notes for Learning Intelligent Systems\n",
"\n",
@ -34,11 +34,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The goal of this notebook is to learn how to read and load a sample dataset.\n",
"This notebook aims to show how to read and load a sample dataset.\n",
"\n",
"Scikit-learn comes with some bundled [datasets](https://scikit-learn.org/stable/datasets.html): iris, digits, boston, etc.\n",
"\n",
"In this notebook we are going to use the Iris dataset."
"In this notebook, we will use the Iris dataset."
]
},
{
@ -54,16 +54,25 @@
"source": [
"The [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set), available at [UCI dataset repository](https://archive.ics.uci.edu/ml/datasets/Iris), is a classic dataset for classification.\n",
"\n",
"The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, a machine learning model will learn to differentiate the species of Iris.\n",
"The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. Based on the combination of these four features, a machine learning model will learn to differentiate the species of Iris.\n",
"\n",
"![Iris](files/images/iris-dataset.jpg)"
"![Iris dataset](./images/iris-dataset.jpg \"Iris\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to read the dataset, we import the datasets bundle and then load the Iris dataset. "
"Here you can see the species and the features.\n",
"![Iris features](./images/iris-features.png \"Iris features\")\n",
"![Iris classes](./images/iris-classes.png \"Iris classes\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To read the dataset, we import the datasets bundle and then load the Iris dataset. "
]
},
{
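The loading step described in this hunk can be sketched as follows. This is a minimal illustration, assuming scikit-learn is installed; the notebook's own code cells are not shown in the diff, so the variable name `iris` is borrowed from the later `print(iris.data.ndim)` cell and the rest is an assumption.

```python
# Minimal sketch (not the notebook's exact cell): load the bundled Iris dataset.
from sklearn import datasets

iris = datasets.load_iris()

# iris.data is a 2D array of shape (n_samples, n_features);
# iris.target holds one class index per sample.
print(iris.data.shape)
print(iris.target.shape)

# Human-readable names for the classes and the four measured features.
print(list(iris.target_names))
print(iris.feature_names)
```

The returned object exposes the feature matrix and the labels separately, which is the layout the later notebooks rely on when splitting and training.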
@ -180,7 +189,7 @@
"metadata": {},
"outputs": [],
"source": [
"#Using numpy, I can print the dimensions (here we are working with 2D matriz)\n",
"#Using numpy, I can print the dimensions (here we are working with a 2D matrix)\n",
"print(iris.data.ndim)"
]
},
@ -218,7 +227,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In following sessions we will learn how to load a dataset from a file (csv, excel, ...) using the pandas library."
"In the following sessions, we will learn how to load a dataset from a file (CSV, Excel, ...) using the pandas library."
]
},
{
@ -246,7 +255,7 @@
"source": [
"## Licence\n",
"\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -49,7 +49,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The goal of this notebook is to learn how to analyse a dataset. We will cover other tasks such as cleaning or munging (changing the format) the dataset in other sessions."
"This notebook aims to show how to analyse a dataset. We will cover other tasks, such as cleaning or munging (changing the format of) the dataset, in other sessions."
]
},
{
@ -65,13 +65,13 @@
"source": [
"This section covers different ways to inspect the distribution of samples per feature.\n",
"\n",
"First of all, let's see how many samples of each class we have, using a [histogram](https://en.wikipedia.org/wiki/Histogram). \n",
"First of all, let's see how many samples we have in each class using a [histogram](https://en.wikipedia.org/wiki/Histogram). \n",
"\n",
"A histogram is a graphical representation of the distribution of numerical data. It is an estimation of the probability distribution of a continuous variable (quantitative variable). \n",
"A histogram is a graphical representation of the distribution of numerical data. It estimates the probability distribution of a continuous variable (quantitative variable). \n",
"\n",
"For building a histogram, we need first to 'bin' the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. \n",
"For building a histogram, we need to 'bin' the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. \n",
"\n",
"In our case, since the values are not continuous and we have only three values, we do not need to bin them."
"Since the values are not continuous and we have only three values, we do not need to bin them."
]
},
{
@ -115,7 +115,7 @@
"metadata": {},
"source": [
"As can be seen, we have the same distribution of samples for every class.\n",
"The next step is to see the distribution of the features"
"The next step is to see the distribution of the features."
]
},
{
@ -184,7 +184,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As we can see, the Setosa class seems to be linearly separable with these two features.\n",
"As we can see, the Setosa class seems linearly separable with these two features.\n",
"\n",
"Another nice visualisation is given below."
]
@ -241,7 +241,7 @@
"source": [
"## Licence\n",
"\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -52,11 +52,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In the previous notebook we developed plots with the [matplotlib](http://matplotlib.org/) plotting library.\n",
"In the previous notebook, we developed plots with the [matplotlib](http://matplotlib.org/) plotting library.\n",
"\n",
"This notebook introduces another plotting library, [**seaborn**](https://stanford.edu/~mwaskom/software/seaborn/), which provides advanced facilities for data visualization.\n",
"\n",
"*Seaborn* is a library for making attractive and informative statistical graphics in Python. It is built on top of *matplotlib* and tightly integrated with the *PyData* stack, including support for *numpy* and *pandas* data structures and statistical routines from *scipy* and *statsmodels*.\n",
"*Seaborn* is a library that makes attractive and informative statistical graphics in Python. It is built on top of *matplotlib* and tightly integrated with the *PyData* stack, including support for *numpy* and *pandas* data structures and statistical routines from *scipy* and *statsmodels*.\n",
"\n",
"*Seaborn* requires its input to be *DataFrames* (a structure created with the library *pandas*)."
]
@ -197,9 +197,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"A very common way to use this plot colors the observations by a separate categorical variable. For example, the iris dataset has four measurements for each of the three different species of iris flowers.\n",
"A common way to use this plot is to color the observations by a separate categorical variable. For example, the iris dataset has four measurements for each of the three different species of iris flowers.\n",
"\n",
"We are going to color each class, so that we can easily identify **clustering** and **linear relationships**."
"We are going to color each class, so we can easily identify **clustering** and **linear relationships**."
]
},
{
@ -220,7 +220,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"By default every numeric column in the dataset is used, but you can focus on particular relationships if you want."
"By default, every numeric column in the dataset is used, but you can focus on particular relationships if you want."
]
},
{
@ -321,7 +321,7 @@
"metadata": {},
"outputs": [],
"source": [
"# One way we can extend this plot is adding a layer of individual points on top of\n",
"# One way we can extend this plot is by adding a layer of individual points on top of\n",
"# it through Seaborn's stripplot\n",
"# \n",
"# We'll use jitter=True so that all the points don't fall in single vertical lines\n",
@ -347,7 +347,7 @@
"outputs": [],
"source": [
"# A violin plot combines the benefits of the previous two plots and simplifies them\n",
"# Denser regions of the data are fatter, and sparser thiner in a violin plot\n",
"# Denser regions of the data are fatter, and sparser regions are thinner, in a violin plot\n",
"sns.violinplot(x=\"species\", y=\"petal length (cm)\", data=iris_df, size=6)"
]
},
@ -389,10 +389,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Depending on the data, we can choose which visualisation suits better. the following [diagram](http://www.labnol.org/software/find-right-chart-type-for-your-data/6523/) guides this selection.\n",
"Depending on the data, we can choose which visualisation suits us better. The following [diagram](http://www.labnol.org/software/find-right-chart-type-for-your-data/6523/) guides this selection.\n",
"\n",
"\n",
"![](files/images/data-chart-type.png \"Graphs\")"
"![](./images/data-chart-type.png \"Graphs\")"
]
},
{
@ -421,7 +421,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -76,7 +76,7 @@
"source": [
"A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
"\n",
"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
"We will use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
]
},
{
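The split described in this hunk can be sketched as below. This is an illustration under assumptions rather than the notebook's exact cell: the variable names and the seed value 33 are chosen for the example (33 mirrors the `random_state` used in the k-fold cell later in this diff).

```python
# Minimal sketch: 75% / 25% train/test split of the Iris dataset.
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
x_iris, y_iris = iris.data, iris.target

# test_size=0.25 keeps 25% of the samples for testing.
# A fixed random_state makes the shuffling, and so the split, reproducible;
# the value 33 is an arbitrary choice for this sketch.
x_train, x_test, y_train, y_test = train_test_split(
    x_iris, y_iris, test_size=0.25, random_state=33
)

print(x_train.shape, x_test.shape)
```

Calling `train_test_split` again with the same `random_state` returns the same partition, which is what makes results comparable between runs.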
@ -122,9 +122,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
"Standardization of datasets is a common requirement for many machine learning estimators implemented in scikit-learn; they might misbehave if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
"\n",
"The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set. Later, the same transformation will be applied on the testing set."
"The preprocessing module further provides a utility class `StandardScaler` to compute a training set's mean and standard deviation. Later, the same transformation will be applied to the testing set."
]
},
{
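The standardization step described in this hunk might look like the following sketch. The variable names and the split parameters are assumptions made for the example; the point it illustrates is that the scaler is fitted on the training set only and then reused unchanged on the testing set.

```python
# Minimal sketch: standardize features with statistics learned on the training set.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=33  # assumed values
)

# fit() computes the per-feature mean and standard deviation of the training set.
scaler = StandardScaler().fit(x_train)

# transform() applies the same shift and scale to any data, including the test set.
x_train_std = scaler.transform(x_train)
x_test_std = scaler.transform(x_test)

# Training features now have (approximately) zero mean and unit variance.
print(x_train_std.mean(axis=0))
print(x_train_std.std(axis=0))
```

The test set is not expected to have exactly zero mean after the transformation, because its own statistics were never used.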
@ -173,7 +173,7 @@
"metadata": {},
"source": [
"### Licences\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -53,9 +53,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This is an introduction of general ideas about machine learning and the interface of scikit-learn, taken from the [scikit-learn tutorial](http://www.astroml.org/sklearn_tutorial/general_concepts.html). \n",
"This is an introduction to general ideas about machine learning and the interface of scikit-learn, taken from the [scikit-learn tutorial](http://www.astroml.org/sklearn_tutorial/general_concepts.html). \n",
"\n",
"You can skip it during the lab session and read it later,"
"You can skip it during the lab session and read it later."
]
},
{
@ -69,21 +69,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Machine learning algorithms are programs that learn a model from a dataset with the aim of making predictions or learning structures to organize the data.\n",
"Machine learning algorithms are programs that learn a model from a dataset to make predictions or learn structures to organize the data.\n",
"\n",
"In scikit-learn, machine learning algorithms take as an input a *numpy* array (n_samples, n_features), where\n",
"* **n_samples**: number of samples. Each sample is an item to process (i.e. classify). A sample can be a document, a picture, a sound, a video, a row in database or CSV file, or whatever you can describe with a fixed set of quantitative traits.\n",
"* **n_features**: The number of features or distinct traits that can be used to describe each item in a quantitative manner.\n",
"In scikit-learn, machine learning algorithms take as input a *numpy* array (n_samples, n_features), where\n",
"* **n_samples**: number of samples. Each sample is an item to process (i.e., classify). A sample can be a document, a picture, a sound, a video, a row in a database or CSV file, or whatever you can describe with a fixed set of quantitative traits.\n",
"* **n_features**: The number of features or distinct traits that can be used to describe each item quantitatively.\n",
"\n",
"The number of features should be defined in advance. There is a specific type of feature sets that are high dimensional (e.g. millions of features), but most of the values are zero for a given sample. Using (numpy) arrays, all those values that are zero would also take up memory. For this reason, these feature sets are often represented with sparse matrices (scipy.sparse) instead of (numpy) arrays.\n",
"The number of features should be defined in advance. A specific type of feature set is high-dimensional (e.g., millions of features), but most values are zero for a given sample. Using (numpy) arrays, all those zero values would also take up memory. For this reason, these feature sets are often represented with sparse matrices (scipy.sparse) instead of (numpy) arrays.\n",
"\n",
"The first step in machine learning is **identifying the relevant features** from the input data, and the second step is **extracting the features** from the input data. \n",
"\n",
"[Machine learning algorithms](http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/) can be classified according to learning style into:\n",
"* **Supervised learning**: input data (training dataset) has a known label or result. Example problems are classification and regression. A model is prepared through a training process where it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.\n",
"* **Unsupervised learning**: input data is not labeled. A model is prepared by deducing structures present in the input data. This may be to extract general rules. Example problems are clustering, dimensionality reduction and association rule learning.\n",
"* **Semi-supervised learning**:i nput data is a mixture of labeled and unlabeled examples. There is a desired prediction problem but the model must learn the structures to organize the data as well as make predictions. Example problems are classification and regression."
]
"* **Unsupervised learning**: input data is not labeled. A model is prepared by deducing structures present in the input data. This may be to extract general rules. Example problems are clustering, dimensionality reduction, and association rule learning.\n",
"* **Semi-supervised learning**: input data is a mixture of labeled and unlabeled examples. There is a desired prediction problem, but the model must learn the structures to organize the data and make predictions. Example problems are classification and regression."
]
},
{
"cell_type": "markdown",
@ -96,8 +96,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In *supervised machine learning models*, the machine learning algorithm takes as an input a training dataset, composed of feature vectors and labels, and produces a predictive model which is used for make prediction on new data.\n",
"![](files/images/plot_ML_flow_chart_1.png)"
"In *supervised machine learning models*, the machine learning algorithm takes as input a training dataset, composed of feature vectors and labels, and produces a predictive model that is used to make predictions on new data.\n",
"![](./images/plot_ML_flow_chart_1.png)"
]
},
{
@ -111,7 +111,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In *unsupervised machine learning models*, the machine learning model algorithm takes as an input the feature vectors and produces a predictive model that is used to fit its parameters so as to best summarize regularities found in the data.\n",
"In *unsupervised machine learning models*, the machine learning algorithm takes as input the feature vectors and produces a model whose parameters are fit to best summarize the regularities found in the data.\n",
"![](files/images/plot_ML_flow_chart_3.png)"
]
},
@ -129,15 +129,15 @@
"scikit-learn has a uniform interface for all the estimators, but some methods are only available if the estimator is supervised or unsupervised:\n",
"\n",
"* Available in *all estimators*:\n",
" * **model.fit()**: fit training data. For supervised learning applications, this accepts two arguments: the data X and the labels y (e.g. model.fit(X, y)). For unsupervised learning applications, this accepts only a single argument, the data X (e.g. model.fit(X)).\n",
" * **model.fit()**: fit training data. For supervised learning applications, this accepts two arguments: the data X and the labels y (e.g., model.fit(X, y)). For unsupervised learning applications, this accepts only a single argument, the data X (e.g., model.fit(X)).\n",
"\n",
"* Available in *supervised estimators*:\n",
" * **model.predict()**: given a trained model, predict the label of a new set of data. This method accepts one argument, the new data X_new (e.g. model.predict(X_new)), and returns the learned label for each object in the array.\n",
" * **model.predict()**: given a trained model, predict the label of a new dataset. This method accepts one argument, the new data X_new (e.g., model.predict(X_new)), and returns the learned label for each object in the array.\n",
" * **model.predict_proba()**: For classification problems, some estimators also provide this method, which returns the probability that a new observation has each categorical label. In this case, the label with the highest probability is returned by model.predict().\n",
"\n",
"* Available in *unsupervised estimators*:\n",
" * **model.transform()**: given an unsupervised model, transform new data into the new basis. This also accepts one argument X_new, and returns the new representation of the data based on the unsupervised model.\n",
" * **model.fit_transform()**: some estimators implement this method, which performs a fit and a transform on the same input data.\n",
" * **model.fit_transform()**: Some estimators implement this method, which performs a fit and a transform on the same input data.\n",
"\n",
"\n",
"![](files/images/plot_ML_flow_chart_2.png)"
@ -169,7 +169,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]
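The estimator interface listed in this file's diff (`fit`, `predict`, `predict_proba`, `transform`, `fit_transform`) can be tried out with a short sketch. The choice of `KNeighborsClassifier` and `PCA` as the example supervised and unsupervised estimators is an assumption made here; the notebook text does not name specific estimators at this point.

```python
# Minimal sketch of the uniform estimator interface, using two example estimators.
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Supervised estimator: fit takes the data X and the labels y.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X, y)

X_new = X[:3]                        # stand-in for unseen samples
labels = model.predict(X_new)        # one predicted label per sample
probs = model.predict_proba(X_new)   # one row per sample, one column per class

# Unsupervised estimator: fit takes only X; fit_transform fits and transforms in one call.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(labels, probs.shape, X_2d.shape)
```

Both objects are driven through the same method names, which is what lets them be swapped or chained without changing the surrounding code.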

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -55,7 +55,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The goal of this notebook is to learn how to train a model, make predictions with that model and evaluate these predictions.\n",
"The goal of this notebook is to learn how to train a model, make predictions with that model, and evaluate these predictions.\n",
"\n",
"The notebook uses the [kNN (k nearest neighbors) algorithm](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm)."
]
@ -212,14 +212,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Precision, recall and f-score"
"### Precision, recall, and f-score"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For evaluating classification algorithms, we usually calculate three metrics: precision, recall and F1-score\n",
"For evaluating classification algorithms, we usually calculate three metrics: precision, recall, and F1-score\n",
"\n",
"* **Precision**: This computes the proportion of instances predicted as positives that were correctly evaluated (it measures how right our classifier is when it says that an instance is positive).\n",
"* **Recall**: This counts the proportion of positive instances that were correctly evaluated (measuring how right our classifier is when faced with a positive instance).\n",
@ -246,7 +246,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Another useful metric is the confusion matrix"
"Another useful metric is the confusion matrix."
]
},
{
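The precision and recall definitions in the hunk above can be read directly off a confusion matrix. What follows is a minimal sketch in plain Python, not the notebook's own code: the helper names `confusion_matrix` and `per_class_metrics` and the six toy labels are made up for illustration. Rows are true classes and columns are predicted classes, the same convention `sklearn.metrics.confusion_matrix` uses.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_metrics(matrix, labels):
    """Derive (precision, recall, F1) for each class from the confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                          # correctly predicted as this class
        predicted = sum(row[i] for row in matrix)  # column sum: everything predicted as this class
        actual = sum(matrix[i])                    # row sum: everything that truly is this class
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical toy labels (not results from the notebook).
labels = ["setosa", "versicolor", "virginica"]
y_true = ["setosa", "versicolor", "virginica", "virginica", "versicolor", "virginica"]
y_pred = ["setosa", "versicolor", "versicolor", "virginica", "versicolor", "virginica"]

matrix = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(matrix, labels)
for row in matrix:
    print(row)
print(metrics["virginica"])
```

In this toy data one 'virginica' sample is predicted as 'versicolor', so 'virginica' keeps perfect precision but loses recall, while 'versicolor' loses precision.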
@ -262,7 +262,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We see we classify well all the 'setosa' and 'versicolor' samples. "
"We classify all the 'setosa' and 'versicolor' samples well. "
]
},
{
@ -276,7 +276,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**."
"To avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**."
]
},
{
@ -298,7 +298,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n",
"\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
"# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"print(scores)"
]
@ -307,7 +307,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure"
"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure."
]
},
{
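As a sketch of the "mean and standard error" step mentioned above: the standard error of the mean is the sample standard deviation divided by the square root of the number of folds, which is what `scipy.stats.sem` computes with its default `ddof=1`. The version below uses only the standard library; the `scores` list is invented for illustration and is not output from the notebook.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(scores):
    """Return the mean and the standard error of the mean of the fold scores."""
    n = len(scores)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(scores), stdev(scores) / sqrt(n)


# Hypothetical fold accuracies for k=10 folds (made-up values).
scores = [1.0, 0.93, 1.0, 0.93, 0.87, 1.0, 0.93, 1.0, 0.93, 0.93]

avg, err = mean_and_sem(scores)
print("Mean score: {0:.3f} (+/- {1:.3f})".format(avg, err))
```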
@ -340,7 +340,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We are going to tune the algorithm, and calculate which is the best value for the k hyperparameter."
"We will tune the algorithm and calculate the best value for the k hyperparameter."
]
},
{
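One way to search for a good k is sketched below. It is an assumption-laden illustration rather than the notebook's cell: it scores each candidate with 10-fold cross-validation instead of a single `train_test_split`, and the candidate range 1 to 15 is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_iris, y_iris = iris.data, iris.target

# Same fold configuration as the cross-validation cells in this notebook.
cv = KFold(10, shuffle=True, random_state=33)

# Mean cross-validated accuracy for each candidate number of neighbours.
mean_scores = {}
for k in range(1, 16):
    model = KNeighborsClassifier(n_neighbors=k)
    mean_scores[k] = cross_val_score(model, x_iris, y_iris, cv=cv).mean()

best_k = max(mean_scores, key=mean_scores.get)
print(best_k, round(mean_scores[best_k], 3))
```

Averaging over folds makes the chosen k less sensitive to one particular partition of the data than a single split would be.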
@ -365,7 +365,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The result is very dependent of the input data. Execute again the train_test_split and test again how the result changes with k."
"The result is very dependent on the input data. Execute the train_test_split again and test how the result changes with k."
]
},
{
@ -387,7 +387,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -56,9 +56,9 @@
"source": [
"The goal of this notebook is to learn how to create a classification object using a [decision tree learning algorithm](https://en.wikipedia.org/wiki/Decision_tree_learning). \n",
"\n",
"There are a number of well known machine learning algorithms for decision tree learning, such as ID3, C4.5, C5.0 and CART. The scikit-learn uses an optimised version of the [CART (Classification and Regression Trees) algorithm](https://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees).\n",
"There are several well-known machine learning algorithms for decision tree learning, such as ID3, C4.5, C5.0, and CART. The scikit-learn uses an optimised version of the [CART (Classification and Regression Trees) algorithm](https://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees).\n",
"\n",
"This notebook will follow the same steps that the previous notebook for learning using the [kNN Model](2_5_1_kNN_Model.ipynb), and details some peculiarities of the decision tree algorithms.\n",
"This notebook will follow the same steps as the previous notebook for learning using the [kNN Model](2_5_1_kNN_Model.ipynb), and details some peculiarities of the decision tree algorithms.\n",
"\n",
"You need to install pydotplus: `conda install pydotplus` for the visualization."
]
@ -69,7 +69,7 @@
"source": [
"## Load data and preprocessing\n",
"\n",
"Here we repeat the same operations for loading data and preprocessing than in the previous notebooks."
"Here we repeat the same operations for loading data and preprocessing as in the previous notebooks."
]
},
{
@ -262,8 +262,8 @@
"The current version of pydot does not work well in Python 3.\n",
"For obtaining an image, you need to install `pip install pydotplus` and then `conda install graphviz`.\n",
"\n",
"You can skip this example. Since it can require installing additional packages, we include here the result.\n",
"![Decision Tree](files/images/cart.png)"
"You can skip this example. Since it can require installing additional packages, we have included the result here.\n",
"![Decision Tree](./images/cart.png)"
]
},
{
@ -330,7 +330,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we are going to export the pseudocode of the the learnt decision tree."
"Next, we will export the pseudocode of the learnt decision tree."
]
},
{
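For comparison, scikit-learn ships a text exporter, `sklearn.tree.export_text`, that prints the learnt rules as nested conditions. The sketch below is an alternative to whatever export code the notebook itself uses; the `max_depth=3` and `random_state=1` settings are assumptions chosen only to keep the output short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rules readable.
tree = DecisionTreeClassifier(max_depth=3, random_state=1)
tree.fit(iris.data, iris.target)

# One indented line per test or leaf, using the iris feature names.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```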
@ -378,14 +378,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Precision, recall and f-score"
"### Precision, recall, and f-score"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For evaluating classification algorithms, we usually calculate three metrics: precision, recall and F1-score\n",
"For evaluating classification algorithms, we usually calculate three metrics: precision, recall, and F1-score\n",
"\n",
"* **Precision**: This computes the proportion of instances predicted as positives that were correctly evaluated (it measures how right our classifier is when it says that an instance is positive).\n",
"* **Recall**: This counts the proportion of positive instances that were correctly evaluated (measuring how right our classifier is when faced with a positive instance).\n",
@ -412,7 +412,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Another useful metric is the confusion matrix"
"Another useful metric is the confusion matrix."
]
},
{
@ -428,7 +428,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We see we classify well all the 'setosa' and 'versicolor' samples. "
"We classify all the 'setosa' and 'versicolor' samples well. "
]
},
{
@ -442,7 +442,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**.\n",
"To avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**.\n",
"\n",
"Sklearn comes with other strategies for [cross validation](http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation), such as stratified K-fold, label k-fold, Leave-One-Out, Leave-P-Out, Leave-One-Label-Out, Leave-P-Label-Out or Shuffle & Split."
]
@ -466,7 +466,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n",
"\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
"# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"print(scores)"
]
@ -475,7 +475,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure"
"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure."
]
},
{
@ -518,7 +518,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -58,7 +58,7 @@
"source": [
"In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 9.47. Could we get a better accuracy if we tune the hyperparameters of the estimator?\n",
"\n",
"The goal of this notebook is to learn how to tune an algorithm by opimizing its hyperparameters using grid search."
"This notebook aims to learn how to tune an algorithm by optimizing its hyperparameters using grid search."
]
},
{
@ -137,7 +137,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n",
"\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
"# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"\n",
"from scipy.stats import sem\n",
@ -189,7 +189,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can get the list of parameters of the model. As you will observe, the parameters of the estimators in the pipeline can be accessed using the <estimator>__<parameter> syntax. We will use this for tuning the parameters."
"We can get the list of model parameters. As you will observe, the parameters of the estimators in the pipeline can be accessed using the <estimator>__<parameter> syntax. We will use this for tuning the parameters."
]
},
{
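A short sketch of the `<estimator>__<parameter>` syntax follows. The step names `scaler` and `ds` are hypothetical and may differ from the names used in the notebook's pipeline.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Step names ("scaler", "ds") are illustrative assumptions.
model = Pipeline([
    ("scaler", StandardScaler()),
    ("ds", DecisionTreeClassifier(random_state=1)),
])

# Nested parameters appear as <step name>__<parameter name>.
nested = sorted(name for name in model.get_params() if name.startswith("ds__"))
print(nested)

# The same syntax sets a parameter on a step inside the pipeline.
model.set_params(ds__max_depth=3)
print(model.named_steps["ds"].max_depth)
```

The same double-underscore names are the keys expected in a parameter grid when the pipeline is passed to a search object.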
@ -205,7 +205,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see what happens if we change a parameter"
"Let's see what happens if we change a parameter."
]
},
{
@ -284,7 +284,7 @@
"\n",
"Look at the [API](http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) of *scikit-learn* to understand better the algorithm, as well as which parameters can be tuned. As you see, we can change several ones, such as *criterion*, *splitter*, *max_features*, *max_depth*, *min_samples_split*, *class_weight*, etc.\n",
"\n",
"We can get the full list parameters of an estimator with the method *get_params()*. "
"We can get an estimator's full list of parameters with the method *get_params()*. "
]
},
{
@ -314,16 +314,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Changing manually the hyperparameters to find their optimal values is not practical. Instead, we can consider to find the optimal value of the hyperparameters as an *optimization problem*. \n",
"Changing manually the hyperparameters to find their optimal values is not practical. Instead, we can consider finding the optimal value of the hyperparameters as an *optimization problem*. \n",
"\n",
"The sklearn comes with several optimization techniques for this purpose, such as **grid search** and **randomized search**. In this notebook we are going to introduce the former one."
"Sklearn has several optimization techniques, such as **grid search** and **randomized search**. In this notebook, we are going to introduce the former one."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sklearn provides an object that, given data, computes the score during the fit of an estimator on a hyperparameter grid and chooses the hyperparameters to maximize the cross-validation score. "
"Sklearn provides an object that, given data, computes the score during the fit of an estimator on a hyperparameter grid and chooses the hyperparameters to maximize the cross-validation score. "
]
},
{
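The object described above is `GridSearchCV`. A minimal sketch of its use is given here under stated assumptions: the parameter grid is a small made-up example, and the estimator is a bare `DecisionTreeClassifier` rather than the notebook's pipeline.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Hypothetical grid: 4 depths x 2 criteria = 8 candidate combinations.
param_grid = {
    "max_depth": [2, 3, 4, 5],
    "criterion": ["gini", "entropy"],
}

cv = KFold(10, shuffle=True, random_state=33)

# Every combination is fitted and scored with cross-validation.
gs = GridSearchCV(DecisionTreeClassifier(random_state=1), param_grid, cv=cv)
gs.fit(iris.data, iris.target)

print(gs.best_params_)
print(round(gs.best_score_, 3))
```

After fitting, `gs.cv_results_` holds the score of each combination, and `gs.best_estimator_` is a model refitted on all the data with the best parameters.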
@ -351,7 +351,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are going to show the results of grid search"
"Now we are going to show the results of the grid search"
]
},
{
@ -392,7 +392,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n",
"\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
"# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"def mean_score(scores):\n",
" return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
@ -405,7 +405,7 @@
"source": [
"We have got an *improvement* from 0.947 to 0.953 with k-fold.\n",
"\n",
"We are now to try to fit the best combination of the hyperparameters of the algorithm. It can take some time to compute it."
"We are now trying to fit the best combination of the hyperparameters of the algorithm. It can take some time to compute it."
]
},
{
@ -492,7 +492,7 @@
"# create a k-fold cross validation iterator of k=10 folds\n",
"cv = KFold(10, shuffle=True, random_state=33)\n",
"\n",
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
"# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
"def mean_score(scores):\n",
" return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
@ -533,7 +533,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -48,9 +48,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The goal of this notebook is to learn how to save a model in the the scikit by using Pythons built-in persistence model, namely pickle\n",
"The goal of this notebook is to learn how to save a model in the scikit by using Pythons built-in persistence model, namely pickle\n",
"\n",
"First we recap the previous tasks: load data, preprocess and train the model."
"First, we recap the previous tasks: load data, preprocess, and train the model."
]
},
{
@ -107,7 +107,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"A more efficient alternative to pickle is joblib, especially for big data problems. In this case the model can only be saved to a file and not to a string."
"A more efficient alternative to pickle is joblib, especially for big data problems. In this case, the model can only be saved to a file and not to a string."
]
},
{
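The two persistence options can be sketched side by side. This is an illustration, not the notebook's cells: the kNN model, the temporary directory, and the file name `model.joblib` are assumptions.

```python
import os
import pickle
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
model = KNeighborsClassifier(n_neighbors=5).fit(iris.data, iris.target)

# pickle: serialize the fitted model to an in-memory bytes string and back.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

# joblib: serialize the fitted model to a file and load it again.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")  # hypothetical file name
    joblib.dump(model, path)
    restored_from_file = joblib.load(path)

# Both restored models should predict exactly like the original.
same_pickle = (restored.predict(iris.data) == model.predict(iris.data)).all()
same_joblib = (restored_from_file.predict(iris.data) == model.predict(iris.data)).all()
print(same_pickle, same_joblib)
```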
@ -146,7 +146,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -52,7 +52,7 @@
"\n",
"Particularly in high-dimensional spaces, data can more easily be separated linearly and the simplicity of classifiers such as naive Bayes and linear SVMs might lead to better generalization than is achieved by other classifiers.\n",
"\n",
"The plots show training points in solid colors and testing points semi-transparent. The lower right shows the classification accuracy on the test set.\n",
"The plots show training points in solid colors and testing points in semi-transparent colors. The lower right shows the classification accuracy on the test set.\n",
"\n",
"The [DummyClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html#sklearn.dummy.DummyClassifier) is a classifier that makes predictions using simple rules. It is useful as a simple baseline to compare with other (real) classifiers. \n",
"\n",
@ -94,7 +94,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

BIN
ml1/images/iris-classes.png Normal file

Binary file not shown. Size: 1.4 MiB

Binary file not shown. Size: 944 KiB

BIN
ml2/images/iris-classes.png Normal file

Binary file not shown. Size: 1.4 MiB

Binary file not shown. Size: 944 KiB

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -27,14 +27,14 @@
"source": [
"# Introduction to Neural Networks\n",
" \n",
"In this lab session, we are going to learn how to train a neural network.\n",
"In this lab session, we will learn how to train a neural network.\n",
"\n",
"# Objectives\n",
"\n",
"The main objectives of this session are:\n",
"* Put in practice the notions learn in class about neural computing\n",
"* Understand what an MLP is\n",
"* Learn to use some libraries, such as scikit-learn "
"* Learn to use some libraries, such as Scikit-learn."
]
},
{
@ -58,7 +58,7 @@
"metadata": {},
"source": [
"## Licence\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -39,7 +39,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Multilayer perceptrons, also called feedforward neural networks or deep feedforward networks, are the most basic deep learning models."
"Multilayer perceptrons, called feedforward neural networks or deep feedforward networks, are the most basic deep learning models."
]
},
{
@ -58,7 +58,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook we are going to try the spiral dataset with different algorthms. In particular, we are going to focus our attention on the MLP classifier.\n",
"In this notebook, we will try the spiral dataset with different algorithms. In particular, we are going to focus our attention on the MLP classifier.\n",
"\n",
"\n",
"Answer directly in your copy of the exercise and submit it as a moodle task."

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -39,10 +39,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook we are going to apply a MLP to a simple regression task: learning the Fresnel functions.\n",
"In this notebook, we are going to apply an MLP to a simple regression task: learning the Fresnel functions.\n",
"\n",
"\n",
"Answer directly in your copy of the exercise and submit it as a moodle task."
"Answer directly in your copy of the exercise and submit it as a Moodle task."
]
},
{
@ -92,7 +92,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Change this variables to change the train and test dataset."
"Change these variables to change the train and test dataset."
]
},
{

View File

@ -15,7 +15,7 @@ def gen_spiral_dataset(n_examples=500, n_classes=2, a=None, b=None, pi_space=3):
theta = np.linspace(0,pi_space*pi, num=n_examples)
xy = np.zeros((n_examples,2))
# logaritmic spirals
# logarithmic spirals
x_golden_parametric = lambda a, b, theta: a**(theta*b) * cos(theta)
y_golden_parametric = lambda a, b, theta: a**(theta*b) * sin(theta)
x_golden_parametric = np.vectorize(x_golden_parametric)