Not done reviewing ml2 yet

2025-12-30 23:58:15 +00:00 · 2016-03-28 14:03:08 +02:00
parent 67bf2f7360
commit 3165eac23c
15 changed files with 17215 additions and 419 deletions
--- a/ml2/3_0_0_Intro_ML_2.ipynb
+++ b/ml2/3_0_0_Intro_ML_2.ipynb
@@ -0,0 +1,114 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Introduction to Machine Learning II\n",
+    " \n",
+    "In this lab session, we will go deeper into some aspects that were introduced in the previous session. This time we will look in more detail at reading datasets, analysing data and selecting features. In addition, we will explore two more machine learning algorithms, Perceptron and SVM, applied to a binary classification problem based on the Titanic dataset.\n",
+    "\n",
+    "# Objectives\n",
+    "\n",
+    "In this session we are going to look at some aspects of machine learning in more detail.\n",
+    "\n",
+    "The main objectives of this session are:\n",
+    "* Learn how to read data from a file or URL with pandas\n",
+    "* Learn how to use the pandas DataFrame data structure\n",
+    "* Learn how to select features\n",
+    "* Better understand the Perceptron and SVM machine learning algorithms"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Table of Contents"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "1. [Home](3_0_0_Intro_ML_2.ipynb)\n",
+    "1. [The Titanic Dataset. Reading Data](3_1_Read_Data.ipynb)\n",
+    "1. [Introduction to Pandas](3_2_Pandas.ipynb)\n",
+    "1. [Preprocessing: Data Munging with DataFrames](3_3_Data_Munging_with_Pandas.ipynb)\n",
+    "2. [Preprocessing: Visualisation for DataFrames](3_4_Visualisation_Pandas.ipynb)\n",
+    "3. [Exercise 1](3_5_Exercise_1.ipynb)\n",
+    "1. [Machine Learning](3_6_Machine_Learning.ipynb)\n",
+    "   1. [SVM](3_7_SVM.ipynb)\n",
+    "5.  [Exercise 2](3_8_Exercise_2.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "* [IPython Notebook Tutorial for Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic/forums/t/5105/ipython-notebook-tutorial-for-titanic-machine-learning-from-disaster)\n",
+    "* [Scikit-learn videos](http://blog.kaggle.com/author/kevin-markham/) and [notebooks](https://github.com/justmarkham/scikit-learn-videos) by Kevin Markham\n",
+    "* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
+    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence\n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© 2016 Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1+"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
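The first objective listed above, reading data from a file or URL with pandas, could be sketched roughly as follows. This is a minimal illustration and not part of the original notebooks: the CSV text is a hypothetical three-row sample that only imitates a few Titanic column names, and `io.StringIO` stands in for a real file or URL so that the example runs offline.

```python
import io

import pandas as pd

# Hypothetical sample that imitates a few Titanic columns (not real data)
csv_text = """PassengerId,Survived,Pclass,Age
1,0,3,22
2,1,1,
3,1,3,26
"""

# read_csv accepts a local path, a URL, or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))

# Show the first rows and the column labels of the resulting DataFrame
print(df.head())
print(list(df.columns))
```

Passing a path or a URL string instead of the `StringIO` object should work the same way; the empty `Age` field in the second row is read as a missing value (NaN).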
--- a/ml2/3_1_Read_Data.ipynb
+++ b/ml2/3_1_Read_Data.ipynb
--- a/ml2/3_2_Pandas.ipynb
+++ b/ml2/3_2_Pandas.ipynb
@@ -0,0 +1,932 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## [Introduction to Machine Learning II](3_0_0_Intro_ML_2.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Table of Contents\n",
+    "\n",
+    "* [Introduction to Pandas](#Introduction-to-Pandas)\n",
+    "* [Series](#Series)\n",
+    "* [DataFrame](#DataFrame)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Introduction to Pandas\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook provides an overview of the *pandas* library. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "[Pandas](http://pandas.pydata.org/) is a Python library that provides easy-to-use data structures and data analysis tools.\n",
+    "\n",
+    "The main advantage of *Pandas* is that it provides extensive facilities for grouping, merging and querying pandas data structures. It also includes facilities for time series analysis, as well as I/O and visualisation facilities.\n",
+    "\n",
+    "Pandas is built on top of *NumPy*, so we will usually have to import both libraries.\n",
+    "\n",
+    "Pandas provides two main data structures:\n",
+    "* **Series** is a one-dimensional labelled object, capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). It is similar to an array, a list, a dictionary or a column in a table. Every value in a Series object has an index.\n",
+    "* **DataFrame** is a two dimensional labelled object with columns of potentially different types. It is similar to a database table, or a spreadsheet. It can be seen as a dictionary of Series that share the same index.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Series"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will use Series objects directly less often than DataFrames, so here we only provide a short introduction."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0     5\n",
+       "1    10\n",
+       "2    15\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "from pandas import Series, DataFrame\n",
+    "\n",
+    "# create series object from an array\n",
+    "s = Series([5, 10, 15])\n",
+    "s"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We see each value has an associated label starting with 0 if no index is specified when the Series object is created. \n",
+    "\n",
+    "It is similar to a dictionary. In fact, we can also create a Series object from a dictionary as follows. In this case, the indexes are the keys of the dictionary."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "a     5\n",
+       "b    10\n",
+       "c    15\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "d = {'a': 5, 'b': 10, 'c': 15}\n",
+    "s = Series(d)\n",
+    "s"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Index(['a', 'b', 'c'], dtype='object')"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# We can get the list of indexes\n",
+    "s.index"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([ 5, 10, 15])"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# and the values\n",
+    "s.values"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Another option is to create the Series object from two lists, for  values and indexes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3141991\n",
+       "Barcelona    1604555\n",
+       "Valencia      786189\n",
+       "Sevilla       693878\n",
+       "Zaragoza      664953\n",
+       "Malaga        569130\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Series with the 2015 population of the most populated cities in Spain\n",
+    "s = Series([3141991, 1604555, 786189, 693878, 664953, 569130], index=['Madrid', 'Barcelona', 'Valencia', 'Sevilla', \n",
+    "                                                                      'Zaragoza', 'Malaga'])\n",
+    "s"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "3141991"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Population of Madrid\n",
+    "s['Madrid']"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Indexing and slicing"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Until now, we have not seen any advantage in using pandas Series. We are now going to show some examples of their possibilities."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid        True\n",
+       "Barcelona     True\n",
+       "Valencia     False\n",
+       "Sevilla      False\n",
+       "Zaragoza     False\n",
+       "Malaga       False\n",
+       "dtype: bool"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "#Boolean condition\n",
+    "s > 1000000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3141991\n",
+       "Barcelona    1604555\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Cities with population greater than 1.000.000\n",
+    "s[s > 1000000]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Observe that (s > 1000000) returns a Series object. We can use this boolean vector as a filter to get a *slice* of the original series that contains only the elements where the value of the filter is True. The original Series s is not modified. This selection is called *boolean indexing*."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3141991\n",
+       "Barcelona    1604555\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Cities with population greater than the mean\n",
+    "s[s > s.mean()]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3141991\n",
+       "Barcelona    1604555\n",
+       "Valencia      786189\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Cities with population greater than the median\n",
+    "s[s > s.median()]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid        True\n",
+       "Barcelona     True\n",
+       "Valencia      True\n",
+       "Sevilla      False\n",
+       "Zaragoza     False\n",
+       "Malaga       False\n",
+       "dtype: bool"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Check cities with a population greater than 700.000\n",
+    "s > 700000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3141991\n",
+       "Barcelona    1604555\n",
+       "Valencia      786189\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# List cities with a population greater than 700.000\n",
+    "s[s > 700000]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid        True\n",
+       "Barcelona     True\n",
+       "Valencia      True\n",
+       "Sevilla      False\n",
+       "Zaragoza     False\n",
+       "Malaga       False\n",
+       "dtype: bool"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "#Another way to write the same boolean indexing selection\n",
+    "bigger_than_700000 = s > 700000\n",
+    "bigger_than_700000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3141991\n",
+       "Barcelona    1604555\n",
+       "Valencia      786189\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "#Cities with population > 700000\n",
+    "s[bigger_than_700000]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Operations on series"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also carry out other mathematical operations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       1570995.5\n",
+       "Barcelona     802277.5\n",
+       "Valencia      393094.5\n",
+       "Sevilla       346939.0\n",
+       "Zaragoza      332476.5\n",
+       "Malaga        284565.0\n",
+       "dtype: float64"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Divide population by 2\n",
+    "s / 2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "1243449.3333333333"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Get the average population\n",
+    "s.mean()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "3141991"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Get the highest population\n",
+    "s.max()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Item assignment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also change values directly or based on a condition. You can consult additional features in the manual."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3320000\n",
+       "Barcelona    1604555\n",
+       "Valencia      786189\n",
+       "Sevilla       693878\n",
+       "Zaragoza      664953\n",
+       "Malaga        569130\n",
+       "dtype: int64"
+      ]
+     },
+     "execution_count": 18,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Change population of one city\n",
+    "s['Madrid'] = 3320000\n",
+    "s"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Madrid       3652000.0\n",
+       "Barcelona    1765010.5\n",
+       "Valencia      864807.9\n",
+       "Sevilla       693878.0\n",
+       "Zaragoza      664953.0\n",
+       "Malaga        569130.0\n",
+       "dtype: float64"
+      ]
+     },
+     "execution_count": 19,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Increase by 10% cities with population greater than 700000\n",
+    "s[s > 700000] = 1.1 * s[s > 700000]\n",
+    "s"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# DataFrame"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As we said previously, **DataFrames** are two-dimensional data structures. You can see them as a dict of Series that share the same index."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>one</th>\n",
+       "      <th>two</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>a</th>\n",
+       "      <td>1.0</td>\n",
+       "      <td>1.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>b</th>\n",
+       "      <td>2.0</td>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>c</th>\n",
+       "      <td>3.0</td>\n",
+       "      <td>3.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>d</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   one  two\n",
+       "a  1.0  1.0\n",
+       "b  2.0  2.0\n",
+       "c  3.0  3.0\n",
+       "d  NaN  4.0"
+      ]
+     },
+     "execution_count": 20,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# We are going to create a DataFrame from a dict of Series\n",
+    "d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),\n",
+    "    'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}\n",
+    "df = DataFrame(d)\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this dataframe, the *indexes* (row labels) are *a*, *b*, *c* and *d* and the *columns* (column labels) are *one* and *two*.\n",
+    "\n",
+    "We see that the index of the resulting DataFrame is the union of the indexes of the Series, and missing values are included as NaN (to write this value we will use *np.nan*).\n",
+    "\n",
+    "If we specify an index, the dictionary is filtered."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>one</th>\n",
+       "      <th>two</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>d</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>4.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>b</th>\n",
+       "      <td>2.0</td>\n",
+       "      <td>2.0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>a</th>\n",
+       "      <td>1.0</td>\n",
+       "      <td>1.0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   one  two\n",
+       "d  NaN  4.0\n",
+       "b  2.0  2.0\n",
+       "a  1.0  1.0"
+      ]
+     },
+     "execution_count": 21,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# We can filter\n",
+    "df = DataFrame(d, index=['d', 'b', 'a'])\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Another option is to use the constructor with *index* and *columns*."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>two</th>\n",
+       "      <th>three</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>d</th>\n",
+       "      <td>4.0</td>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>b</th>\n",
+       "      <td>2.0</td>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>a</th>\n",
+       "      <td>1.0</td>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   two three\n",
+       "d  4.0   NaN\n",
+       "b  2.0   NaN\n",
+       "a  1.0   NaN"
+      ]
+     },
+     "execution_count": 22,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df = DataFrame(d, index=['d', 'b', 'a'], columns=['two', 'three'])\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the next notebook we are going to learn more about dataframes."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "* [Pandas](http://pandas.pydata.org/)\n",
+    "* [Learning Pandas, Michael Heydt, Packt Publishing, 2015](http://proquest.safaribooksonline.com/book/programming/python/9781783985128)\n",
+    "* [Pandas. Introduction to Data Structures](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dsintro)\n",
+    "* [Introducing Pandas Objects](https://www.oreilly.com/learning/introducing-pandas-objects)\n",
+    "* [Boolean Operators in Pandas](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-operators)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© 2016 Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1+"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
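The two ideas that the Pandas notebook above relies on most, boolean indexing on a Series and the alignment of a dict of Series into a DataFrame, could be condensed into one short sketch. It reuses the population figures and the `one`/`two` dictionary from the notebook; the combined mask with `&` and the variable names `mask`, `big` and `mid` are additions for illustration only.

```python
import pandas as pd

# Populations taken from the notebook's Series example
s = pd.Series(
    [3141991, 1604555, 786189, 693878, 664953, 569130],
    index=['Madrid', 'Barcelona', 'Valencia', 'Sevilla', 'Zaragoza', 'Malaga'],
)

# A comparison yields a boolean Series that shares the index of s
mask = s > 1000000

# Indexing with the mask keeps only the entries where it is True; s is unchanged
big = s[mask]

# Masks can be combined with & (and), | (or) and ~ (not); parentheses are required
mid = s[(s > 600000) & (s < 1000000)]

# A dict of Series is aligned on the union of the indexes; gaps are filled with NaN
d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

print(big)
print(mid)
print(df)
```

Note that Python's `and` and `or` keywords do not work element-wise on Series, which is why the sketch uses `&` with parenthesised comparisons.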
--- a/ml2/3_3_Data_Munging_with_Pandas.ipynb
+++ b/ml2/3_3_Data_Munging_with_Pandas.ipynb
--- a/ml2/3_4_Visualisation_Pandas.ipynb
+++ b/ml2/3_4_Visualisation_Pandas.ipynb
--- a/ml2/3_5_Exercise_1.ipynb
+++ b/ml2/3_5_Exercise_1.ipynb
@@ -0,0 +1,539 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## [Introduction to Machine Learning II](3_0_0_Intro_ML_2.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Exercise - The Titanic Dataset"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this exercise we are going to put into practice what we have learnt in the notebooks of the session. \n",
+    "\n",
+    "Answer directly in your copy of the exercise and submit it as a Moodle task."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "import seaborn as sns\n",
+    "import matplotlib.pyplot as plt\n",
+    "import numpy as np\n",
+    "sns.set(color_codes=True)\n",
+    "\n",
+    "# if matplotlib is not set inline, you will not see plots\n",
+    "%matplotlib inline"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Reading Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Assign to the variable *df* a DataFrame with the Titanic dataset from the URL https://raw.githubusercontent.com/cif2cif/sitc/master/ml2/data-titanic/train.csv\n",
+    "\n",
+    "Print *df*."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Munging and Exploratory visualisation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Obtain number of passengers and features of the dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Obtain general statistics (count, mean, std, min, max, 25%, 50%, 75%) about the column Age"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Obtain the median of the age of the passengers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Obtain number of missing values per feature"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": true
+   },
+   "source": [
+    "How many passengers have survived? List them grouped by Sex and Pclass.\n",
+    "\n",
+    "Assign the result to a variable df_1 and print it"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": false
+   },
+   "source": [
+    "Visualise df_1 as a histogram."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": true
+   },
+   "source": [
+    "# Feature Engineering"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here you can find some features that have been proposed for this dataset. Your task is to analyse them and provide some insights. \n",
+    "\n",
+    "Use pandas and visualisation to justify your conclusions"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature FamilySize "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "From SibSp and Parch we can define a new feature, 'FamilySize', which is the sum of both."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "df['FamilySize'] = df['SibSp'] + df['Parch']\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature Alone"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "It seems many people who went alone survived. We can define a new feature 'Alone'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "df['Alone'] = (df.FamilySize == 0)\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature Salutation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If we look closely at the Name variable, it contains a title (Mr., Miss., Mrs.). We can add a feature with this title."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "#Taken from http://www.analyticsvidhya.com/blog/2014/09/data-munging-python-using-pandas-baby-steps-python/\n",
+    "def name_extract(word):\n",
+    "    return word.split(',')[1].split('.')[0].strip()\n",
+    "\n",
+    "df['Salutation'] = df['Name'].apply(name_extract)\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can list the different salutations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "df['Salutation'].unique()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "df.groupby(['Salutation']).size()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "There are only 4 main salutations, so we group the remaining ones under 'Others'."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "# Keep the four main salutations and group every other title under 'Others'\n",
+    "def group_salutation(old_salutation):\n",
+    "    if old_salutation == 'Mr':\n",
+    "        return 'Mr'\n",
+    "    elif old_salutation == 'Mrs':\n",
+    "        return 'Mrs'\n",
+    "    elif old_salutation == 'Master':\n",
+    "        return 'Master'\n",
+    "    elif old_salutation == 'Miss':\n",
+    "        return 'Miss'\n",
+    "    else:\n",
+    "        return 'Others'\n",
+    "\n",
+    "df['Salutation'] = df['Salutation'].apply(group_salutation)\n",
+    "# Count the passengers in each group\n",
+    "df.groupby(['Salutation']).size()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "# Distribution\n",
+    "colors_sex = ['#ff69b4', 'b', 'r', 'y', 'm', 'c']\n",
+    "df.groupby('Salutation').size().plot(kind='bar', color=colors_sex)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "df.boxplot(column='Age', by = 'Salutation', sym='k.')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Features Children and Female"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "# Specific features for Children and Female since there are more survivors\n",
+    "df['Children']   = df['Age'].map(lambda x: 1 if x < 6.0 else 0)\n",
+    "df['Female']     = df['Gender'].map(lambda x: 1 if x == 0 else 0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature AgeGroup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# Group ages to simplify machine learning algorithms.  0: 0-5, 1: 6-10, 2: 11-15, 3: 16-59 and 4: 60-80\n",
+    "df['AgeGroup'] = 0\n",
+    "df.loc[(df.AgeFill<6),'AgeGroup'] = 0\n",
+    "df.loc[(df.AgeFill>=6) & (df.AgeFill < 11),'AgeGroup'] = 1\n",
+    "df.loc[(df.AgeFill>=11) & (df.AgeFill < 16),'AgeGroup'] = 2\n",
+    "df.loc[(df.AgeFill>=16) & (df.AgeFill < 60),'AgeGroup'] = 3\n",
+    "df.loc[(df.AgeFill>=60),'AgeGroup'] = 4"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature Deck\n",
+    "Most passengers with a recorded cabin travelled in 1st class; when the cabin is missing we label it ‘Unknown’. A cabin number looks like ‘C123’, where the letter refers to the deck."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "#Turning cabin number into Deck\n",
+    "cabin_list = ['A', 'B', 'C', 'D', 'E', 'F', 'T', 'G', 'Unknown']\n",
+    "df['Deck']=df['Cabin'].map(lambda x: substrings_in_string(x, cabin_list))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature FarePerPerson"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This feature is created from two previous features: Fare and FamilySize."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "df['FarePerPerson']= df['Fare'] / (df['FamilySize'] + 1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Feature AgeClass"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Since age and class are both numbers we can just multiply them and get a new feature.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "df['AgeClass']=df['Age']*df['Pclass']"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© 2016 Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1+"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
--- a/ml2/3_6_Machine_Learning.ipynb
+++ b/ml2/3_6_Machine_Learning.ipynb
@@ -0,0 +1,122 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## [Introduction to Machine Learning II](3_0_0_Intro_ML_2.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Machine Learning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the previous session, we learnt how to apply machine learning algorithms to the Iris dataset.\n",
+    "\n",
+    "We are now going to review the full process. As you have probably noticed, data preparation, cleaning and transformation take more than 90 % of the data mining effort.\n",
+    "\n",
+    "The phases are:\n",
+    "\n",
+    "* **Data ingestion**: reading the data from the data lake\n",
+    "* **Preprocessing**: \n",
+    "    * **Data cleaning (munging)**: fill missing values, smooth noisy data (binning methods), identify or remove outliers, and resolve inconsistencies\n",
+    "    * **Data integration**: Integrate multiple datasets\n",
+    "    * **Data transformation**: normalization (rescale numeric values between 0 and 1), standardisation (rescale values to have a mean of 0 and a standard deviation of 1), transformation for smoothing a variable (e.g. square root, ...), aggregation of data from several datasets\n",
+    "    * **Data reduction**: dimensionality reduction, clustering and sampling. \n",
+    "    * **Data discretization**: for numerical values and algorithms that do not accept continuous variables\n",
+    "    * **Feature engineering**: selection of the most relevant features, creation of new features and removal of non-relevant features\n",
+    "    * **Sampling**: divide the dataset into training and test datasets.\n",
+    "* **Machine learning**: apply machine learning algorithms and obtain an estimator, tuning its parameters.\n",
+    "* **Evaluation** of the model\n",
+    "* **Prediction**: use the model for new data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "\n",
+    "![Machine Learning Process from *Python Machine Learning* book](images/machine-learning-process.jpg)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Licence"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© 2016 Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1+"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
--- a/ml2/3_7_SVM.ipynb
+++ b/ml2/3_7_SVM.ipynb
--- a/ml2/3_8_Exercise_2.ipynb
+++ b/ml2/3_8_Exercise_2.ipynb
@@ -0,0 +1,89 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "![](images/EscUpmPolit_p.gif \"UPM\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Course Notes for Learning Intelligent Systems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## [Introduction to Machine Learning II](3_0_0_Intro_ML_2.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Exercise 2 - The Titanic Dataset"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this exercise we are going to put into practice what we have learnt in the notebooks of this session. \n",
+    "\n",
+    "In the previous notebook we applied the SVM machine learning algorithm.\n",
+    "\n",
+    "Your task is to apply other machine learning algorithms (at least 2) that you have seen in theory or others you are interested in.\n",
+    "\n",
+    "You should compare the algorithms and describe your experiments."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Licence"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
+    "\n",
+    "© 2016 Carlos A. Iglesias, Universidad Politécnica de Madrid."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1+"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
--- a/ml2/data-titanic/test.csv
+++ b/ml2/data-titanic/test.csv
@@ -1,419 +0,0 @@
-PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
-892,3,"Kelly, Mr. James",male,34.5,0,0,330911,7.8292,,Q
-893,3,"Wilkes, Mrs. James (Ellen Needs)",female,47,1,0,363272,7,,S
-894,2,"Myles, Mr. Thomas Francis",male,62,0,0,240276,9.6875,,Q
-895,3,"Wirz, Mr. Albert",male,27,0,0,315154,8.6625,,S
-896,3,"Hirvonen, Mrs. Alexander (Helga E Lindqvist)",female,22,1,1,3101298,12.2875,,S
-897,3,"Svensson, Mr. Johan Cervin",male,14,0,0,7538,9.225,,S
-898,3,"Connolly, Miss. Kate",female,30,0,0,330972,7.6292,,Q
-899,2,"Caldwell, Mr. Albert Francis",male,26,1,1,248738,29,,S
-900,3,"Abrahim, Mrs. Joseph (Sophie Halaut Easu)",female,18,0,0,2657,7.2292,,C
-901,3,"Davies, Mr. John Samuel",male,21,2,0,A/4 48871,24.15,,S
-902,3,"Ilieff, Mr. Ylio",male,,0,0,349220,7.8958,,S
-903,1,"Jones, Mr. Charles Cresson",male,46,0,0,694,26,,S
-904,1,"Snyder, Mrs. John Pillsbury (Nelle Stevenson)",female,23,1,0,21228,82.2667,B45,S
-905,2,"Howard, Mr. Benjamin",male,63,1,0,24065,26,,S
-906,1,"Chaffee, Mrs. Herbert Fuller (Carrie Constance Toogood)",female,47,1,0,W.E.P. 5734,61.175,E31,S
-907,2,"del Carlo, Mrs. Sebastiano (Argenia Genovesi)",female,24,1,0,SC/PARIS 2167,27.7208,,C
-908,2,"Keane, Mr. Daniel",male,35,0,0,233734,12.35,,Q
-909,3,"Assaf, Mr. Gerios",male,21,0,0,2692,7.225,,C
-910,3,"Ilmakangas, Miss. Ida Livija",female,27,1,0,STON/O2. 3101270,7.925,,S
-911,3,"Assaf Khalil, Mrs. Mariana (Miriam"")""",female,45,0,0,2696,7.225,,C
-912,1,"Rothschild, Mr. Martin",male,55,1,0,PC 17603,59.4,,C
-913,3,"Olsen, Master. Artur Karl",male,9,0,1,C 17368,3.1708,,S
-914,1,"Flegenheim, Mrs. Alfred (Antoinette)",female,,0,0,PC 17598,31.6833,,S
-915,1,"Williams, Mr. Richard Norris II",male,21,0,1,PC 17597,61.3792,,C
-916,1,"Ryerson, Mrs. Arthur Larned (Emily Maria Borie)",female,48,1,3,PC 17608,262.375,B57 B59 B63 B66,C
-917,3,"Robins, Mr. Alexander A",male,50,1,0,A/5. 3337,14.5,,S
-918,1,"Ostby, Miss. Helene Ragnhild",female,22,0,1,113509,61.9792,B36,C
-919,3,"Daher, Mr. Shedid",male,22.5,0,0,2698,7.225,,C
-920,1,"Brady, Mr. John Bertram",male,41,0,0,113054,30.5,A21,S
-921,3,"Samaan, Mr. Elias",male,,2,0,2662,21.6792,,C
-922,2,"Louch, Mr. Charles Alexander",male,50,1,0,SC/AH 3085,26,,S
-923,2,"Jefferys, Mr. Clifford Thomas",male,24,2,0,C.A. 31029,31.5,,S
-924,3,"Dean, Mrs. Bertram (Eva Georgetta Light)",female,33,1,2,C.A. 2315,20.575,,S
-925,3,"Johnston, Mrs. Andrew G (Elizabeth Lily"" Watson)""",female,,1,2,W./C. 6607,23.45,,S
-926,1,"Mock, Mr. Philipp Edmund",male,30,1,0,13236,57.75,C78,C
-927,3,"Katavelas, Mr. Vassilios (Catavelas Vassilios"")""",male,18.5,0,0,2682,7.2292,,C
-928,3,"Roth, Miss. Sarah A",female,,0,0,342712,8.05,,S
-929,3,"Cacic, Miss. Manda",female,21,0,0,315087,8.6625,,S
-930,3,"Sap, Mr. Julius",male,25,0,0,345768,9.5,,S
-931,3,"Hee, Mr. Ling",male,,0,0,1601,56.4958,,S
-932,3,"Karun, Mr. Franz",male,39,0,1,349256,13.4167,,C
-933,1,"Franklin, Mr. Thomas Parham",male,,0,0,113778,26.55,D34,S
-934,3,"Goldsmith, Mr. Nathan",male,41,0,0,SOTON/O.Q. 3101263,7.85,,S
-935,2,"Corbett, Mrs. Walter H (Irene Colvin)",female,30,0,0,237249,13,,S
-936,1,"Kimball, Mrs. Edwin Nelson Jr (Gertrude Parsons)",female,45,1,0,11753,52.5542,D19,S
-937,3,"Peltomaki, Mr. Nikolai Johannes",male,25,0,0,STON/O 2. 3101291,7.925,,S
-938,1,"Chevre, Mr. Paul Romaine",male,45,0,0,PC 17594,29.7,A9,C
-939,3,"Shaughnessy, Mr. Patrick",male,,0,0,370374,7.75,,Q
-940,1,"Bucknell, Mrs. William Robert (Emma Eliza Ward)",female,60,0,0,11813,76.2917,D15,C
-941,3,"Coutts, Mrs. William (Winnie Minnie"" Treanor)""",female,36,0,2,C.A. 37671,15.9,,S
-942,1,"Smith, Mr. Lucien Philip",male,24,1,0,13695,60,C31,S
-943,2,"Pulbaum, Mr. Franz",male,27,0,0,SC/PARIS 2168,15.0333,,C
-944,2,"Hocking, Miss. Ellen Nellie""""",female,20,2,1,29105,23,,S
-945,1,"Fortune, Miss. Ethel Flora",female,28,3,2,19950,263,C23 C25 C27,S
-946,2,"Mangiavacchi, Mr. Serafino Emilio",male,,0,0,SC/A.3 2861,15.5792,,C
-947,3,"Rice, Master. Albert",male,10,4,1,382652,29.125,,Q
-948,3,"Cor, Mr. Bartol",male,35,0,0,349230,7.8958,,S
-949,3,"Abelseth, Mr. Olaus Jorgensen",male,25,0,0,348122,7.65,F G63,S
-950,3,"Davison, Mr. Thomas Henry",male,,1,0,386525,16.1,,S
-951,1,"Chaudanson, Miss. Victorine",female,36,0,0,PC 17608,262.375,B61,C
-952,3,"Dika, Mr. Mirko",male,17,0,0,349232,7.8958,,S
-953,2,"McCrae, Mr. Arthur Gordon",male,32,0,0,237216,13.5,,S
-954,3,"Bjorklund, Mr. Ernst Herbert",male,18,0,0,347090,7.75,,S
-955,3,"Bradley, Miss. Bridget Delia",female,22,0,0,334914,7.725,,Q
-956,1,"Ryerson, Master. John Borie",male,13,2,2,PC 17608,262.375,B57 B59 B63 B66,C
-957,2,"Corey, Mrs. Percy C (Mary Phyllis Elizabeth Miller)",female,,0,0,F.C.C. 13534,21,,S
-958,3,"Burns, Miss. Mary Delia",female,18,0,0,330963,7.8792,,Q
-959,1,"Moore, Mr. Clarence Bloomfield",male,47,0,0,113796,42.4,,S
-960,1,"Tucker, Mr. Gilbert Milligan Jr",male,31,0,0,2543,28.5375,C53,C
-961,1,"Fortune, Mrs. Mark (Mary McDougald)",female,60,1,4,19950,263,C23 C25 C27,S
-962,3,"Mulvihill, Miss. Bertha E",female,24,0,0,382653,7.75,,Q
-963,3,"Minkoff, Mr. Lazar",male,21,0,0,349211,7.8958,,S
-964,3,"Nieminen, Miss. Manta Josefina",female,29,0,0,3101297,7.925,,S
-965,1,"Ovies y Rodriguez, Mr. Servando",male,28.5,0,0,PC 17562,27.7208,D43,C
-966,1,"Geiger, Miss. Amalie",female,35,0,0,113503,211.5,C130,C
-967,1,"Keeping, Mr. Edwin",male,32.5,0,0,113503,211.5,C132,C
-968,3,"Miles, Mr. Frank",male,,0,0,359306,8.05,,S
-969,1,"Cornell, Mrs. Robert Clifford (Malvina Helen Lamson)",female,55,2,0,11770,25.7,C101,S
-970,2,"Aldworth, Mr. Charles Augustus",male,30,0,0,248744,13,,S
-971,3,"Doyle, Miss. Elizabeth",female,24,0,0,368702,7.75,,Q
-972,3,"Boulos, Master. Akar",male,6,1,1,2678,15.2458,,C
-973,1,"Straus, Mr. Isidor",male,67,1,0,PC 17483,221.7792,C55 C57,S
-974,1,"Case, Mr. Howard Brown",male,49,0,0,19924,26,,S
-975,3,"Demetri, Mr. Marinko",male,,0,0,349238,7.8958,,S
-976,2,"Lamb, Mr. John Joseph",male,,0,0,240261,10.7083,,Q
-977,3,"Khalil, Mr. Betros",male,,1,0,2660,14.4542,,C
-978,3,"Barry, Miss. Julia",female,27,0,0,330844,7.8792,,Q
-979,3,"Badman, Miss. Emily Louisa",female,18,0,0,A/4 31416,8.05,,S
-980,3,"O'Donoghue, Ms. Bridget",female,,0,0,364856,7.75,,Q
-981,2,"Wells, Master. Ralph Lester",male,2,1,1,29103,23,,S
-982,3,"Dyker, Mrs. Adolf Fredrik (Anna Elisabeth Judith Andersson)",female,22,1,0,347072,13.9,,S
-983,3,"Pedersen, Mr. Olaf",male,,0,0,345498,7.775,,S
-984,1,"Davidson, Mrs. Thornton (Orian Hays)",female,27,1,2,F.C. 12750,52,B71,S
-985,3,"Guest, Mr. Robert",male,,0,0,376563,8.05,,S
-986,1,"Birnbaum, Mr. Jakob",male,25,0,0,13905,26,,C
-987,3,"Tenglin, Mr. Gunnar Isidor",male,25,0,0,350033,7.7958,,S
-988,1,"Cavendish, Mrs. Tyrell William (Julia Florence Siegel)",female,76,1,0,19877,78.85,C46,S
-989,3,"Makinen, Mr. Kalle Edvard",male,29,0,0,STON/O 2. 3101268,7.925,,S
-990,3,"Braf, Miss. Elin Ester Maria",female,20,0,0,347471,7.8542,,S
-991,3,"Nancarrow, Mr. William Henry",male,33,0,0,A./5. 3338,8.05,,S
-992,1,"Stengel, Mrs. Charles Emil Henry (Annie May Morris)",female,43,1,0,11778,55.4417,C116,C
-993,2,"Weisz, Mr. Leopold",male,27,1,0,228414,26,,S
-994,3,"Foley, Mr. William",male,,0,0,365235,7.75,,Q
-995,3,"Johansson Palmquist, Mr. Oskar Leander",male,26,0,0,347070,7.775,,S
-996,3,"Thomas, Mrs. Alexander (Thamine Thelma"")""",female,16,1,1,2625,8.5167,,C
-997,3,"Holthen, Mr. Johan Martin",male,28,0,0,C 4001,22.525,,S
-998,3,"Buckley, Mr. Daniel",male,21,0,0,330920,7.8208,,Q
-999,3,"Ryan, Mr. Edward",male,,0,0,383162,7.75,,Q
-1000,3,"Willer, Mr. Aaron (Abi Weller"")""",male,,0,0,3410,8.7125,,S
-1001,2,"Swane, Mr. George",male,18.5,0,0,248734,13,F,S
-1002,2,"Stanton, Mr. Samuel Ward",male,41,0,0,237734,15.0458,,C
-1003,3,"Shine, Miss. Ellen Natalia",female,,0,0,330968,7.7792,,Q
-1004,1,"Evans, Miss. Edith Corse",female,36,0,0,PC 17531,31.6792,A29,C
-1005,3,"Buckley, Miss. Katherine",female,18.5,0,0,329944,7.2833,,Q
-1006,1,"Straus, Mrs. Isidor (Rosalie Ida Blun)",female,63,1,0,PC 17483,221.7792,C55 C57,S
-1007,3,"Chronopoulos, Mr. Demetrios",male,18,1,0,2680,14.4542,,C
-1008,3,"Thomas, Mr. John",male,,0,0,2681,6.4375,,C
-1009,3,"Sandstrom, Miss. Beatrice Irene",female,1,1,1,PP 9549,16.7,G6,S
-1010,1,"Beattie, Mr. Thomson",male,36,0,0,13050,75.2417,C6,C
-1011,2,"Chapman, Mrs. John Henry (Sara Elizabeth Lawry)",female,29,1,0,SC/AH 29037,26,,S
-1012,2,"Watt, Miss. Bertha J",female,12,0,0,C.A. 33595,15.75,,S
-1013,3,"Kiernan, Mr. John",male,,1,0,367227,7.75,,Q
-1014,1,"Schabert, Mrs. Paul (Emma Mock)",female,35,1,0,13236,57.75,C28,C
-1015,3,"Carver, Mr. Alfred John",male,28,0,0,392095,7.25,,S
-1016,3,"Kennedy, Mr. John",male,,0,0,368783,7.75,,Q
-1017,3,"Cribb, Miss. Laura Alice",female,17,0,1,371362,16.1,,S
-1018,3,"Brobeck, Mr. Karl Rudolf",male,22,0,0,350045,7.7958,,S
-1019,3,"McCoy, Miss. Alicia",female,,2,0,367226,23.25,,Q
-1020,2,"Bowenur, Mr. Solomon",male,42,0,0,211535,13,,S
-1021,3,"Petersen, Mr. Marius",male,24,0,0,342441,8.05,,S
-1022,3,"Spinner, Mr. Henry John",male,32,0,0,STON/OQ. 369943,8.05,,S
-1023,1,"Gracie, Col. Archibald IV",male,53,0,0,113780,28.5,C51,C
-1024,3,"Lefebre, Mrs. Frank (Frances)",female,,0,4,4133,25.4667,,S
-1025,3,"Thomas, Mr. Charles P",male,,1,0,2621,6.4375,,C
-1026,3,"Dintcheff, Mr. Valtcho",male,43,0,0,349226,7.8958,,S
-1027,3,"Carlsson, Mr. Carl Robert",male,24,0,0,350409,7.8542,,S
-1028,3,"Zakarian, Mr. Mapriededer",male,26.5,0,0,2656,7.225,,C
-1029,2,"Schmidt, Mr. August",male,26,0,0,248659,13,,S
-1030,3,"Drapkin, Miss. Jennie",female,23,0,0,SOTON/OQ 392083,8.05,,S
-1031,3,"Goodwin, Mr. Charles Frederick",male,40,1,6,CA 2144,46.9,,S
-1032,3,"Goodwin, Miss. Jessie Allis",female,10,5,2,CA 2144,46.9,,S
-1033,1,"Daniels, Miss. Sarah",female,33,0,0,113781,151.55,,S
-1034,1,"Ryerson, Mr. Arthur Larned",male,61,1,3,PC 17608,262.375,B57 B59 B63 B66,C
-1035,2,"Beauchamp, Mr. Henry James",male,28,0,0,244358,26,,S
-1036,1,"Lindeberg-Lind, Mr. Erik Gustaf (Mr Edward Lingrey"")""",male,42,0,0,17475,26.55,,S
-1037,3,"Vander Planke, Mr. Julius",male,31,3,0,345763,18,,S
-1038,1,"Hilliard, Mr. Herbert Henry",male,,0,0,17463,51.8625,E46,S
-1039,3,"Davies, Mr. Evan",male,22,0,0,SC/A4 23568,8.05,,S
-1040,1,"Crafton, Mr. John Bertram",male,,0,0,113791,26.55,,S
-1041,2,"Lahtinen, Rev. William",male,30,1,1,250651,26,,S
-1042,1,"Earnshaw, Mrs. Boulton (Olive Potter)",female,23,0,1,11767,83.1583,C54,C
-1043,3,"Matinoff, Mr. Nicola",male,,0,0,349255,7.8958,,C
-1044,3,"Storey, Mr. Thomas",male,60.5,0,0,3701,,,S
-1045,3,"Klasen, Mrs. (Hulda Kristina Eugenia Lofqvist)",female,36,0,2,350405,12.1833,,S
-1046,3,"Asplund, Master. Filip Oscar",male,13,4,2,347077,31.3875,,S
-1047,3,"Duquemin, Mr. Joseph",male,24,0,0,S.O./P.P. 752,7.55,,S
-1048,1,"Bird, Miss. Ellen",female,29,0,0,PC 17483,221.7792,C97,S
-1049,3,"Lundin, Miss. Olga Elida",female,23,0,0,347469,7.8542,,S
-1050,1,"Borebank, Mr. John James",male,42,0,0,110489,26.55,D22,S
-1051,3,"Peacock, Mrs. Benjamin (Edith Nile)",female,26,0,2,SOTON/O.Q. 3101315,13.775,,S
-1052,3,"Smyth, Miss. Julia",female,,0,0,335432,7.7333,,Q
-1053,3,"Touma, Master. Georges Youssef",male,7,1,1,2650,15.2458,,C
-1054,2,"Wright, Miss. Marion",female,26,0,0,220844,13.5,,S
-1055,3,"Pearce, Mr. Ernest",male,,0,0,343271,7,,S
-1056,2,"Peruschitz, Rev. Joseph Maria",male,41,0,0,237393,13,,S
-1057,3,"Kink-Heilmann, Mrs. Anton (Luise Heilmann)",female,26,1,1,315153,22.025,,S
-1058,1,"Brandeis, Mr. Emil",male,48,0,0,PC 17591,50.4958,B10,C
-1059,3,"Ford, Mr. Edward Watson",male,18,2,2,W./C. 6608,34.375,,S
-1060,1,"Cassebeer, Mrs. Henry Arthur Jr (Eleanor Genevieve Fosdick)",female,,0,0,17770,27.7208,,C
-1061,3,"Hellstrom, Miss. Hilda Maria",female,22,0,0,7548,8.9625,,S
-1062,3,"Lithman, Mr. Simon",male,,0,0,S.O./P.P. 251,7.55,,S
-1063,3,"Zakarian, Mr. Ortin",male,27,0,0,2670,7.225,,C
-1064,3,"Dyker, Mr. Adolf Fredrik",male,23,1,0,347072,13.9,,S
-1065,3,"Torfa, Mr. Assad",male,,0,0,2673,7.2292,,C
-1066,3,"Asplund, Mr. Carl Oscar Vilhelm Gustafsson",male,40,1,5,347077,31.3875,,S
-1067,2,"Brown, Miss. Edith Eileen",female,15,0,2,29750,39,,S
-1068,2,"Sincock, Miss. Maude",female,20,0,0,C.A. 33112,36.75,,S
-1069,1,"Stengel, Mr. Charles Emil Henry",male,54,1,0,11778,55.4417,C116,C
-1070,2,"Becker, Mrs. Allen Oliver (Nellie E Baumgardner)",female,36,0,3,230136,39,F4,S
-1071,1,"Compton, Mrs. Alexander Taylor (Mary Eliza Ingersoll)",female,64,0,2,PC 17756,83.1583,E45,C
-1072,2,"McCrie, Mr. James Matthew",male,30,0,0,233478,13,,S
-1073,1,"Compton, Mr. Alexander Taylor Jr",male,37,1,1,PC 17756,83.1583,E52,C
-1074,1,"Marvin, Mrs. Daniel Warner (Mary Graham Carmichael Farquarson)",female,18,1,0,113773,53.1,D30,S
-1075,3,"Lane, Mr. Patrick",male,,0,0,7935,7.75,,Q
-1076,1,"Douglas, Mrs. Frederick Charles (Mary Helene Baxter)",female,27,1,1,PC 17558,247.5208,B58 B60,C
-1077,2,"Maybery, Mr. Frank Hubert",male,40,0,0,239059,16,,S
-1078,2,"Phillips, Miss. Alice Frances Louisa",female,21,0,1,S.O./P.P. 2,21,,S
-1079,3,"Davies, Mr. Joseph",male,17,2,0,A/4 48873,8.05,,S
-1080,3,"Sage, Miss. Ada",female,,8,2,CA. 2343,69.55,,S
-1081,2,"Veal, Mr. James",male,40,0,0,28221,13,,S
-1082,2,"Angle, Mr. William A",male,34,1,0,226875,26,,S
-1083,1,"Salomon, Mr. Abraham L",male,,0,0,111163,26,,S
-1084,3,"van Billiard, Master. Walter John",male,11.5,1,1,A/5. 851,14.5,,S
-1085,2,"Lingane, Mr. John",male,61,0,0,235509,12.35,,Q
-1086,2,"Drew, Master. Marshall Brines",male,8,0,2,28220,32.5,,S
-1087,3,"Karlsson, Mr. Julius Konrad Eugen",male,33,0,0,347465,7.8542,,S
-1088,1,"Spedden, Master. Robert Douglas",male,6,0,2,16966,134.5,E34,C
-1089,3,"Nilsson, Miss. Berta Olivia",female,18,0,0,347066,7.775,,S
-1090,2,"Baimbrigge, Mr. Charles Robert",male,23,0,0,C.A. 31030,10.5,,S
-1091,3,"Rasmussen, Mrs. (Lena Jacobsen Solvang)",female,,0,0,65305,8.1125,,S
-1092,3,"Murphy, Miss. Nora",female,,0,0,36568,15.5,,Q
-1093,3,"Danbom, Master. Gilbert Sigvard Emanuel",male,0.33,0,2,347080,14.4,,S
-1094,1,"Astor, Col. John Jacob",male,47,1,0,PC 17757,227.525,C62 C64,C
-1095,2,"Quick, Miss. Winifred Vera",female,8,1,1,26360,26,,S
-1096,2,"Andrew, Mr. Frank Thomas",male,25,0,0,C.A. 34050,10.5,,S
-1097,1,"Omont, Mr. Alfred Fernand",male,,0,0,F.C. 12998,25.7417,,C
-1098,3,"McGowan, Miss. Katherine",female,35,0,0,9232,7.75,,Q
-1099,2,"Collett, Mr. Sidney C Stuart",male,24,0,0,28034,10.5,,S
-1100,1,"Rosenbaum, Miss. Edith Louise",female,33,0,0,PC 17613,27.7208,A11,C
-1101,3,"Delalic, Mr. Redjo",male,25,0,0,349250,7.8958,,S
-1102,3,"Andersen, Mr. Albert Karvin",male,32,0,0,C 4001,22.525,,S
-1103,3,"Finoli, Mr. Luigi",male,,0,0,SOTON/O.Q. 3101308,7.05,,S
-1104,2,"Deacon, Mr. Percy William",male,17,0,0,S.O.C. 14879,73.5,,S
-1105,2,"Howard, Mrs. Benjamin (Ellen Truelove Arman)",female,60,1,0,24065,26,,S
-1106,3,"Andersson, Miss. Ida Augusta Margareta",female,38,4,2,347091,7.775,,S
-1107,1,"Head, Mr. Christopher",male,42,0,0,113038,42.5,B11,S
-1108,3,"Mahon, Miss. Bridget Delia",female,,0,0,330924,7.8792,,Q
-1109,1,"Wick, Mr. George Dennick",male,57,1,1,36928,164.8667,,S
-1110,1,"Widener, Mrs. George Dunton (Eleanor Elkins)",female,50,1,1,113503,211.5,C80,C
-1111,3,"Thomson, Mr. Alexander Morrison",male,,0,0,32302,8.05,,S
-1112,2,"Duran y More, Miss. Florentina",female,30,1,0,SC/PARIS 2148,13.8583,,C
-1113,3,"Reynolds, Mr. Harold J",male,21,0,0,342684,8.05,,S
-1114,2,"Cook, Mrs. (Selena Rogers)",female,22,0,0,W./C. 14266,10.5,F33,S
-1115,3,"Karlsson, Mr. Einar Gervasius",male,21,0,0,350053,7.7958,,S
-1116,1,"Candee, Mrs. Edward (Helen Churchill Hungerford)",female,53,0,0,PC 17606,27.4458,,C
-1117,3,"Moubarek, Mrs. George (Omine Amenia"" Alexander)""",female,,0,2,2661,15.2458,,C
-1118,3,"Asplund, Mr. Johan Charles",male,23,0,0,350054,7.7958,,S
-1119,3,"McNeill, Miss. Bridget",female,,0,0,370368,7.75,,Q
-1120,3,"Everett, Mr. Thomas James",male,40.5,0,0,C.A. 6212,15.1,,S
-1121,2,"Hocking, Mr. Samuel James Metcalfe",male,36,0,0,242963,13,,S
-1122,2,"Sweet, Mr. George Frederick",male,14,0,0,220845,65,,S
-1123,1,"Willard, Miss. Constance",female,21,0,0,113795,26.55,,S
-1124,3,"Wiklund, Mr. Karl Johan",male,21,1,0,3101266,6.4958,,S
-1125,3,"Linehan, Mr. Michael",male,,0,0,330971,7.8792,,Q
-1126,1,"Cumings, Mr. John Bradley",male,39,1,0,PC 17599,71.2833,C85,C
-1127,3,"Vendel, Mr. Olof Edvin",male,20,0,0,350416,7.8542,,S
-1128,1,"Warren, Mr. Frank Manley",male,64,1,0,110813,75.25,D37,C
-1129,3,"Baccos, Mr. Raffull",male,20,0,0,2679,7.225,,C
-1130,2,"Hiltunen, Miss. Marta",female,18,1,1,250650,13,,S
-1131,1,"Douglas, Mrs. Walter Donald (Mahala Dutton)",female,48,1,0,PC 17761,106.425,C86,C
-1132,1,"Lindstrom, Mrs. Carl Johan (Sigrid Posse)",female,55,0,0,112377,27.7208,,C
-1133,2,"Christy, Mrs. (Alice Frances)",female,45,0,2,237789,30,,S
-1134,1,"Spedden, Mr. Frederic Oakley",male,45,1,1,16966,134.5,E34,C
-1135,3,"Hyman, Mr. Abraham",male,,0,0,3470,7.8875,,S
-1136,3,"Johnston, Master. William Arthur Willie""""",male,,1,2,W./C. 6607,23.45,,S
-1137,1,"Kenyon, Mr. Frederick R",male,41,1,0,17464,51.8625,D21,S
-1138,2,"Karnes, Mrs. J Frank (Claire Bennett)",female,22,0,0,F.C.C. 13534,21,,S
-1139,2,"Drew, Mr. James Vivian",male,42,1,1,28220,32.5,,S
-1140,2,"Hold, Mrs. Stephen (Annie Margaret Hill)",female,29,1,0,26707,26,,S
-1141,3,"Khalil, Mrs. Betros (Zahie Maria"" Elias)""",female,,1,0,2660,14.4542,,C
-1142,2,"West, Miss. Barbara J",female,0.92,1,2,C.A. 34651,27.75,,S
-1143,3,"Abrahamsson, Mr. Abraham August Johannes",male,20,0,0,SOTON/O2 3101284,7.925,,S
-1144,1,"Clark, Mr. Walter Miller",male,27,1,0,13508,136.7792,C89,C
-1145,3,"Salander, Mr. Karl Johan",male,24,0,0,7266,9.325,,S
-1146,3,"Wenzel, Mr. Linhart",male,32.5,0,0,345775,9.5,,S
-1147,3,"MacKay, Mr. George William",male,,0,0,C.A. 42795,7.55,,S
-1148,3,"Mahon, Mr. John",male,,0,0,AQ/4 3130,7.75,,Q
-1149,3,"Niklasson, Mr. Samuel",male,28,0,0,363611,8.05,,S
-1150,2,"Bentham, Miss. Lilian W",female,19,0,0,28404,13,,S
-1151,3,"Midtsjo, Mr. Karl Albert",male,21,0,0,345501,7.775,,S
-1152,3,"de Messemaeker, Mr. Guillaume Joseph",male,36.5,1,0,345572,17.4,,S
-1153,3,"Nilsson, Mr. August Ferdinand",male,21,0,0,350410,7.8542,,S
-1154,2,"Wells, Mrs. Arthur Henry (Addie"" Dart Trevaskis)""",female,29,0,2,29103,23,,S
-1155,3,"Klasen, Miss. Gertrud Emilia",female,1,1,1,350405,12.1833,,S
-1156,2,"Portaluppi, Mr. Emilio Ilario Giuseppe",male,30,0,0,C.A. 34644,12.7375,,C
-1157,3,"Lyntakoff, Mr. Stanko",male,,0,0,349235,7.8958,,S
-1158,1,"Chisholm, Mr. Roderick Robert Crispin",male,,0,0,112051,0,,S
-1159,3,"Warren, Mr. Charles William",male,,0,0,C.A. 49867,7.55,,S
-1160,3,"Howard, Miss. May Elizabeth",female,,0,0,A. 2. 39186,8.05,,S
-1161,3,"Pokrnic, Mr. Mate",male,17,0,0,315095,8.6625,,S
-1162,1,"McCaffry, Mr. Thomas Francis",male,46,0,0,13050,75.2417,C6,C
-1163,3,"Fox, Mr. Patrick",male,,0,0,368573,7.75,,Q
-1164,1,"Clark, Mrs. Walter Miller (Virginia McDowell)",female,26,1,0,13508,136.7792,C89,C
-1165,3,"Lennon, Miss. Mary",female,,1,0,370371,15.5,,Q
-1166,3,"Saade, Mr. Jean Nassr",male,,0,0,2676,7.225,,C
-1167,2,"Bryhl, Miss. Dagmar Jenny Ingeborg ",female,20,1,0,236853,26,,S
-1168,2,"Parker, Mr. Clifford Richard",male,28,0,0,SC 14888,10.5,,S
-1169,2,"Faunthorpe, Mr. Harry",male,40,1,0,2926,26,,S
-1170,2,"Ware, Mr. John James",male,30,1,0,CA 31352,21,,S
-1171,2,"Oxenham, Mr. Percy Thomas",male,22,0,0,W./C. 14260,10.5,,S
-1172,3,"Oreskovic, Miss. Jelka",female,23,0,0,315085,8.6625,,S
-1173,3,"Peacock, Master. Alfred Edward",male,0.75,1,1,SOTON/O.Q. 3101315,13.775,,S
-1174,3,"Fleming, Miss. Honora",female,,0,0,364859,7.75,,Q
-1175,3,"Touma, Miss. Maria Youssef",female,9,1,1,2650,15.2458,,C
-1176,3,"Rosblom, Miss. Salli Helena",female,2,1,1,370129,20.2125,,S
-1177,3,"Dennis, Mr. William",male,36,0,0,A/5 21175,7.25,,S
-1178,3,"Franklin, Mr. Charles (Charles Fardon)",male,,0,0,SOTON/O.Q. 3101314,7.25,,S
-1179,1,"Snyder, Mr. John Pillsbury",male,24,1,0,21228,82.2667,B45,S
-1180,3,"Mardirosian, Mr. Sarkis",male,,0,0,2655,7.2292,F E46,C
-1181,3,"Ford, Mr. Arthur",male,,0,0,A/5 1478,8.05,,S
-1182,1,"Rheims, Mr. George Alexander Lucien",male,,0,0,PC 17607,39.6,,S
-1183,3,"Daly, Miss. Margaret Marcella Maggie""""",female,30,0,0,382650,6.95,,Q
-1184,3,"Nasr, Mr. Mustafa",male,,0,0,2652,7.2292,,C
-1185,1,"Dodge, Dr. Washington",male,53,1,1,33638,81.8583,A34,S
-1186,3,"Wittevrongel, Mr. Camille",male,36,0,0,345771,9.5,,S
-1187,3,"Angheloff, Mr. Minko",male,26,0,0,349202,7.8958,,S
-1188,2,"Laroche, Miss. Louise",female,1,1,2,SC/Paris 2123,41.5792,,C
-1189,3,"Samaan, Mr. Hanna",male,,2,0,2662,21.6792,,C
-1190,1,"Loring, Mr. Joseph Holland",male,30,0,0,113801,45.5,,S
-1191,3,"Johansson, Mr. Nils",male,29,0,0,347467,7.8542,,S
-1192,3,"Olsson, Mr. Oscar Wilhelm",male,32,0,0,347079,7.775,,S
-1193,2,"Malachard, Mr. Noel",male,,0,0,237735,15.0458,D,C
-1194,2,"Phillips, Mr. Escott Robert",male,43,0,1,S.O./P.P. 2,21,,S
-1195,3,"Pokrnic, Mr. Tome",male,24,0,0,315092,8.6625,,S
-1196,3,"McCarthy, Miss. Catherine Katie""""",female,,0,0,383123,7.75,,Q
-1197,1,"Crosby, Mrs. Edward Gifford (Catherine Elizabeth Halstead)",female,64,1,1,112901,26.55,B26,S
-1198,1,"Allison, Mr. Hudson Joshua Creighton",male,30,1,2,113781,151.55,C22 C26,S
-1199,3,"Aks, Master. Philip Frank",male,0.83,0,1,392091,9.35,,S
-1200,1,"Hays, Mr. Charles Melville",male,55,1,1,12749,93.5,B69,S
-1201,3,"Hansen, Mrs. Claus Peter (Jennie L Howard)",female,45,1,0,350026,14.1083,,S
-1202,3,"Cacic, Mr. Jego Grga",male,18,0,0,315091,8.6625,,S
-1203,3,"Vartanian, Mr. David",male,22,0,0,2658,7.225,,C
-1204,3,"Sadowitz, Mr. Harry",male,,0,0,LP 1588,7.575,,S
-1205,3,"Carr, Miss. Jeannie",female,37,0,0,368364,7.75,,Q
-1206,1,"White, Mrs. John Stuart (Ella Holmes)",female,55,0,0,PC 17760,135.6333,C32,C
-1207,3,"Hagardon, Miss. Kate",female,17,0,0,AQ/3. 30631,7.7333,,Q
-1208,1,"Spencer, Mr. William Augustus",male,57,1,0,PC 17569,146.5208,B78,C
-1209,2,"Rogers, Mr. Reginald Harry",male,19,0,0,28004,10.5,,S
-1210,3,"Jonsson, Mr. Nils Hilding",male,27,0,0,350408,7.8542,,S
-1211,2,"Jefferys, Mr. Ernest Wilfred",male,22,2,0,C.A. 31029,31.5,,S
-1212,3,"Andersson, Mr. Johan Samuel",male,26,0,0,347075,7.775,,S
-1213,3,"Krekorian, Mr. Neshan",male,25,0,0,2654,7.2292,F E57,C
-1214,2,"Nesson, Mr. Israel",male,26,0,0,244368,13,F2,S
-1215,1,"Rowe, Mr. Alfred G",male,33,0,0,113790,26.55,,S
-1216,1,"Kreuchen, Miss. Emilie",female,39,0,0,24160,211.3375,,S
-1217,3,"Assam, Mr. Ali",male,23,0,0,SOTON/O.Q. 3101309,7.05,,S
-1218,2,"Becker, Miss. Ruth Elizabeth",female,12,2,1,230136,39,F4,S
-1219,1,"Rosenshine, Mr. George (Mr George Thorne"")""",male,46,0,0,PC 17585,79.2,,C
-1220,2,"Clarke, Mr. Charles Valentine",male,29,1,0,2003,26,,S
-1221,2,"Enander, Mr. Ingvar",male,21,0,0,236854,13,,S
-1222,2,"Davies, Mrs. John Morgan (Elizabeth Agnes Mary White) ",female,48,0,2,C.A. 33112,36.75,,S
-1223,1,"Dulles, Mr. William Crothers",male,39,0,0,PC 17580,29.7,A18,C
-1224,3,"Thomas, Mr. Tannous",male,,0,0,2684,7.225,,C
-1225,3,"Nakid, Mrs. Said (Waika Mary"" Mowad)""",female,19,1,1,2653,15.7417,,C
-1226,3,"Cor, Mr. Ivan",male,27,0,0,349229,7.8958,,S
-1227,1,"Maguire, Mr. John Edward",male,30,0,0,110469,26,C106,S
-1228,2,"de Brito, Mr. Jose Joaquim",male,32,0,0,244360,13,,S
-1229,3,"Elias, Mr. Joseph",male,39,0,2,2675,7.2292,,C
-1230,2,"Denbury, Mr. Herbert",male,25,0,0,C.A. 31029,31.5,,S
-1231,3,"Betros, Master. Seman",male,,0,0,2622,7.2292,,C
-1232,2,"Fillbrook, Mr. Joseph Charles",male,18,0,0,C.A. 15185,10.5,,S
-1233,3,"Lundstrom, Mr. Thure Edvin",male,32,0,0,350403,7.5792,,S
-1234,3,"Sage, Mr. John George",male,,1,9,CA. 2343,69.55,,S
-1235,1,"Cardeza, Mrs. James Warburton Martinez (Charlotte Wardle Drake)",female,58,0,1,PC 17755,512.3292,B51 B53 B55,C
-1236,3,"van Billiard, Master. James William",male,,1,1,A/5. 851,14.5,,S
-1237,3,"Abelseth, Miss. Karen Marie",female,16,0,0,348125,7.65,,S
-1238,2,"Botsford, Mr. William Hull",male,26,0,0,237670,13,,S
-1239,3,"Whabee, Mrs. George Joseph (Shawneene Abi-Saab)",female,38,0,0,2688,7.2292,,C
-1240,2,"Giles, Mr. Ralph",male,24,0,0,248726,13.5,,S
-1241,2,"Walcroft, Miss. Nellie",female,31,0,0,F.C.C. 13528,21,,S
-1242,1,"Greenfield, Mrs. Leo David (Blanche Strouse)",female,45,0,1,PC 17759,63.3583,D10 D12,C
-1243,2,"Stokes, Mr. Philip Joseph",male,25,0,0,F.C.C. 13540,10.5,,S
-1244,2,"Dibden, Mr. William",male,18,0,0,S.O.C. 14879,73.5,,S
-1245,2,"Herman, Mr. Samuel",male,49,1,2,220845,65,,S
-1246,3,"Dean, Miss. Elizabeth Gladys Millvina""""",female,0.17,1,2,C.A. 2315,20.575,,S
-1247,1,"Julian, Mr. Henry Forbes",male,50,0,0,113044,26,E60,S
-1248,1,"Brown, Mrs. John Murray (Caroline Lane Lamson)",female,59,2,0,11769,51.4792,C101,S
-1249,3,"Lockyer, Mr. Edward",male,,0,0,1222,7.8792,,S
-1250,3,"O'Keefe, Mr. Patrick",male,,0,0,368402,7.75,,Q
-1251,3,"Lindell, Mrs. Edvard Bengtsson (Elin Gerda Persson)",female,30,1,0,349910,15.55,,S
-1252,3,"Sage, Master. William Henry",male,14.5,8,2,CA. 2343,69.55,,S
-1253,2,"Mallet, Mrs. Albert (Antoinette Magnin)",female,24,1,1,S.C./PARIS 2079,37.0042,,C
-1254,2,"Ware, Mrs. John James (Florence Louise Long)",female,31,0,0,CA 31352,21,,S
-1255,3,"Strilic, Mr. Ivan",male,27,0,0,315083,8.6625,,S
-1256,1,"Harder, Mrs. George Achilles (Dorothy Annan)",female,25,1,0,11765,55.4417,E50,C
-1257,3,"Sage, Mrs. John (Annie Bullen)",female,,1,9,CA. 2343,69.55,,S
-1258,3,"Caram, Mr. Joseph",male,,1,0,2689,14.4583,,C
-1259,3,"Riihivouri, Miss. Susanna Juhantytar Sanni""""",female,22,0,0,3101295,39.6875,,S
-1260,1,"Gibson, Mrs. Leonard (Pauline C Boeson)",female,45,0,1,112378,59.4,,C
-1261,2,"Pallas y Castello, Mr. Emilio",male,29,0,0,SC/PARIS 2147,13.8583,,C
-1262,2,"Giles, Mr. Edgar",male,21,1,0,28133,11.5,,S
-1263,1,"Wilson, Miss. Helen Alice",female,31,0,0,16966,134.5,E39 E41,C
-1264,1,"Ismay, Mr. Joseph Bruce",male,49,0,0,112058,0,B52 B54 B56,S
-1265,2,"Harbeck, Mr. William H",male,44,0,0,248746,13,,S
-1266,1,"Dodge, Mrs. Washington (Ruth Vidaver)",female,54,1,1,33638,81.8583,A34,S
-1267,1,"Bowen, Miss. Grace Scott",female,45,0,0,PC 17608,262.375,,C
-1268,3,"Kink, Miss. Maria",female,22,2,0,315152,8.6625,,S
-1269,2,"Cotterill, Mr. Henry Harry""""",male,21,0,0,29107,11.5,,S
-1270,1,"Hipkins, Mr. William Edward",male,55,0,0,680,50,C39,S
-1271,3,"Asplund, Master. Carl Edgar",male,5,4,2,347077,31.3875,,S
-1272,3,"O'Connor, Mr. Patrick",male,,0,0,366713,7.75,,Q
-1273,3,"Foley, Mr. Joseph",male,26,0,0,330910,7.8792,,Q
-1274,3,"Risien, Mrs. Samuel (Emma)",female,,0,0,364498,14.5,,S
-1275,3,"McNamee, Mrs. Neal (Eileen O'Leary)",female,19,1,0,376566,16.1,,S
-1276,2,"Wheeler, Mr. Edwin Frederick""""",male,,0,0,SC/PARIS 2159,12.875,,S
-1277,2,"Herman, Miss. Kate",female,24,1,2,220845,65,,S
-1278,3,"Aronsson, Mr. Ernst Axel Algot",male,24,0,0,349911,7.775,,S
-1279,2,"Ashby, Mr. John",male,57,0,0,244346,13,,S
-1280,3,"Canavan, Mr. Patrick",male,21,0,0,364858,7.75,,Q
-1281,3,"Palsson, Master. Paul Folke",male,6,3,1,349909,21.075,,S
-1282,1,"Payne, Mr. Vivian Ponsonby",male,23,0,0,12749,93.5,B24,S
-1283,1,"Lines, Mrs. Ernest H (Elizabeth Lindsey James)",female,51,0,1,PC 17592,39.4,D28,S
-1284,3,"Abbott, Master. Eugene Joseph",male,13,0,2,C.A. 2673,20.25,,S
-1285,2,"Gilbert, Mr. William",male,47,0,0,C.A. 30769,10.5,,S
-1286,3,"Kink-Heilmann, Mr. Anton",male,29,3,1,315153,22.025,,S
-1287,1,"Smith, Mrs. Lucien Philip (Mary Eloise Hughes)",female,18,1,0,13695,60,C31,S
-1288,3,"Colbert, Mr. Patrick",male,24,0,0,371109,7.25,,Q
-1289,1,"Frolicher-Stehli, Mrs. Maxmillian (Margaretha Emerentia Stehli)",female,48,1,1,13567,79.2,B41,C
-1290,3,"Larsson-Rondberg, Mr. Edvard A",male,22,0,0,347065,7.775,,S
-1291,3,"Conlon, Mr. Thomas Henry",male,31,0,0,21332,7.7333,,Q
-1292,1,"Bonnell, Miss. Caroline",female,30,0,0,36928,164.8667,C7,S
-1293,2,"Gale, Mr. Harry",male,38,1,0,28664,21,,S
-1294,1,"Gibson, Miss. Dorothy Winifred",female,22,0,1,112378,59.4,,C
-1295,1,"Carrau, Mr. Jose Pedro",male,17,0,0,113059,47.1,,S
-1296,1,"Frauenthal, Mr. Isaac Gerald",male,43,1,0,17765,27.7208,D40,C
-1297,2,"Nourney, Mr. Alfred (Baron von Drachstedt"")""",male,20,0,0,SC/PARIS 2166,13.8625,D38,C
-1298,2,"Ware, Mr. William Jeffery",male,23,1,0,28666,10.5,,S
-1299,1,"Widener, Mr. George Dunton",male,50,1,1,113503,211.5,C80,C
-1300,3,"Riordan, Miss. Johanna Hannah""""",female,,0,0,334915,7.7208,,Q
-1301,3,"Peacock, Miss. Treasteall",female,3,1,1,SOTON/O.Q. 3101315,13.775,,S
-1302,3,"Naughton, Miss. Hannah",female,,0,0,365237,7.75,,Q
-1303,1,"Minahan, Mrs. William Edward (Lillian E Thorpe)",female,37,1,0,19928,90,C78,Q
-1304,3,"Henriksson, Miss. Jenny Lovisa",female,28,0,0,347086,7.775,,S
-1305,3,"Spector, Mr. Woolf",male,,0,0,A.5. 3236,8.05,,S
-1306,1,"Oliva y Ocana, Dona. Fermina",female,39,0,0,PC 17758,108.9,C105,C
-1307,3,"Saether, Mr. Simon Sivertsen",male,38.5,0,0,SOTON/O.Q. 3101262,7.25,,S
-1308,3,"Ware, Mr. Frederick",male,,0,0,359309,8.05,,S
-1309,3,"Peter, Master. Michael J",male,,1,1,2668,22.3583,,C
--- a/ml2/images/EscUpmPolit_p.gif
+++ b/ml2/images/EscUpmPolit_p.gif
--- a/ml2/images/machine-learning-process.jpg
+++ b/ml2/images/machine-learning-process.jpg
--- a/ml2/images/titanic.jpg
+++ b/ml2/images/titanic.jpg
--- a/ml2/plot_learning_curve.py
+++ b/ml2/plot_learning_curve.py
@@ -0,0 +1,109 @@
+"""
+Taken from http://scikit-learn.org/stable/auto_examples/model_selection/plot_learning_curve.html
+
+========================
+Plotting Learning Curves
+========================
+
+On the left side the learning curve of a naive Bayes classifier is shown for
+the digits dataset. Note that the training score and the cross-validation score
+are both not very good at the end. However, the shape of the curve can be found
+in more complex datasets very often: the training score is very high at the
+beginning and decreases and the cross-validation score is very low at the
+beginning and increases. On the right side we see the learning curve of an SVM
+with RBF kernel. We can see clearly that the training score is still around
+the maximum and the validation score could be increased with more training
+samples.
+"""
+#print(__doc__)
+
+import numpy as np
+import matplotlib.pyplot as plt
+from sklearn import model_selection
+from sklearn.naive_bayes import GaussianNB
+from sklearn.svm import SVC
+from sklearn.datasets import load_digits
+from sklearn.model_selection import learning_curve
+
+
+def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
+                        n_jobs=1, train_sizes=np.linspace(.1, 1.0, 5)):
+    """
+    Generate a simple plot of the test and training learning curves.
+
+    Parameters
+    ----------
+    estimator : object type that implements the "fit" and "predict" methods
+        An object of that type which is cloned for each validation.
+
+    title : string
+        Title for the chart.
+
+    X : array-like, shape (n_samples, n_features)
+        Training vector, where n_samples is the number of samples and
+        n_features is the number of features.
+
+    y : array-like, shape (n_samples) or (n_samples, n_features), optional
+        Target relative to X for classification or regression;
+        None for unsupervised learning.
+
+    ylim : tuple, shape (ymin, ymax), optional
+        Defines the minimum and maximum y-values plotted.
+
+    cv : integer, cross-validation generator, optional
+        If an integer is passed, it is the number of folds; if None,
+        scikit-learn's default cross-validation strategy is used.
+        Specific cross-validation objects can also be passed.
+
+    n_jobs : integer, optional
+        Number of jobs to run in parallel (default 1).
+    """
+    plt.figure()
+    plt.title(title)
+    if ylim is not None:
+        plt.ylim(*ylim)
+    plt.xlabel("Training examples")
+    plt.ylabel("Score")
+    train_sizes, train_scores, test_scores = learning_curve(
+        estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)
+    train_scores_mean = np.mean(train_scores, axis=1)
+    train_scores_std = np.std(train_scores, axis=1)
+    test_scores_mean = np.mean(test_scores, axis=1)
+    test_scores_std = np.std(test_scores, axis=1)
+    plt.grid()
+
+    plt.fill_between(train_sizes, train_scores_mean - train_scores_std,
+                     train_scores_mean + train_scores_std, alpha=0.1,
+                     color="r")
+    plt.fill_between(train_sizes, test_scores_mean - test_scores_std,
+                     test_scores_mean + test_scores_std, alpha=0.1, color="g")
+    plt.plot(train_sizes, train_scores_mean, 'o-', color="r",
+             label="Training score")
+    plt.plot(train_sizes, test_scores_mean, 'o-', color="g",
+             label="Cross-validation score")
+
+    plt.legend(loc="best")
+    return plt
+
+
+#digits = load_digits()
+#X, y = digits.data, digits.target
+
+
+#title = "Learning Curves (Naive Bayes)"
+# Cross validation with 100 iterations to get smoother mean test and train
+# score curves, each time with 20% data randomly selected as a validation set.
+#cv = model_selection.ShuffleSplit(n_splits=100,
+#                                  test_size=0.2, random_state=0)
+
+#estimator = GaussianNB()
+#plot_learning_curve(estimator, title, X, y, ylim=(0.7, 1.01), cv=cv, n_jobs=4)
+
+#title = r"Learning Curves (SVM, RBF kernel, $\gamma=0.001$)"
+# SVC is more expensive so we do a lower number of CV iterations:
+#cv = model_selection.ShuffleSplit(n_splits=10,
+#                                  test_size=0.2, random_state=0)
+#estimator = SVC(gamma=0.001)
+#plot_learning_curve(estimator, title, X, y, (0.7, 1.01), cv=cv, n_jobs=4)
+
+#plt.show()
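Review note on `plot_learning_curve`: the shaded bands it draws are, for each training-set size, the mean score across CV folds plus or minus one standard deviation. The sketch below reproduces only that aggregation with the standard library, so it can be checked without scikit-learn or matplotlib. The score values and the `score_band` helper are made up for illustration and are not part of the committed file.

```python
from statistics import mean, pstdev

# Hypothetical scores: one row per training-set size, one column per CV fold.
train_sizes = [100, 200, 300]
test_scores = [
    [0.70, 0.74, 0.72],
    [0.80, 0.78, 0.82],
    [0.85, 0.85, 0.85],
]


def score_band(scores):
    """Return a (lower, mean, upper) tuple per row: mean -/+ one std.

    pstdev is the population standard deviation, which matches the
    default behaviour of np.std (ddof=0) used in plot_learning_curve.
    """
    band = []
    for row in scores:
        m = mean(row)
        s = pstdev(row)
        band.append((m - s, m, m + s))
    return band


band = score_band(test_scores)
for size, (lower, centre, upper) in zip(train_sizes, band):
    print(f"{size:>4} samples: {centre:.3f} (band {lower:.3f} to {upper:.3f})")
```

A band that stays wide as the training size grows suggests the score still varies a lot between folds, which is worth checking before reading too much into the mean curve.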
--- a/ml2/plot_svm.py
+++ b/ml2/plot_svm.py
@@ -0,0 +1,80 @@
+from patsy import dmatrices
+import matplotlib.pyplot as plt
+import numpy as np
+from sklearn import svm
+
+#Taken from http://nbviewer.jupyter.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb
+
+def plot_svm(df):
+    # set plotting parameters
+    plt.figure(figsize=(8, 6))
+
+    # create a formula for our machine learning algorithms
+    formula_ml = 'Survived ~ C(Pclass) + C(Sex) + Age + SibSp + Parch + C(Embarked)'
+    # create a regression-friendly design matrix
+    y, x = dmatrices(formula_ml, data=df, return_type='matrix')
+
+    # select which features we would like to analyze
+    # try changing the selection here for different output.
+    # Choose: [2,3] - pretty sweet DBs, [3,1] - standard DBs, [7,3] - very cool DBs,
+    # [3,6] - very long complex DBs, could take over an hour to calculate!
+    feature_1 = 2
+    feature_2 = 3
+
+    X = np.asarray(x)
+    X = X[:, [feature_1, feature_2]]
+
+
+    y = np.asarray(y)
+    # y must be 1-dimensional, so we flatten it; dmatrices returns it as a 2-D column.
+    y = y.flatten()
+
+    n_sample = len(X)
+
+    np.random.seed(0)
+    order = np.random.permutation(n_sample)
+
+    X = X[order]
+    y = y[order].astype(float)
+
+    # hold out the last 10% of the shuffled samples as a test set
+    ninety_percent_of_sample = int(.9 * n_sample)
+    X_train = X[:ninety_percent_of_sample]
+    y_train = y[:ninety_percent_of_sample]
+    X_test = X[ninety_percent_of_sample:]
+    y_test = y[ninety_percent_of_sample:]
+
+    # create a list of the types of kernels we will use for our analysis
+    types_of_kernels = ['linear', 'rbf', 'poly']
+
+    # specify our color map for plotting the results
+    color_map = plt.cm.RdBu_r
+
+    # fit the model
+    for fig_num, kernel in enumerate(types_of_kernels):
+        clf = svm.SVC(kernel=kernel, gamma=3)
+        clf.fit(X_train, y_train)
+
+        plt.figure(fig_num)
+        plt.scatter(X[:, 0], X[:, 1], c=y, zorder=10, cmap=color_map)
+
+        # circle out the test data
+        plt.scatter(X_test[:, 0], X_test[:, 1], s=80, facecolors='none', zorder=10)
+
+        plt.axis('tight')
+        x_min = X[:, 0].min()
+        x_max = X[:, 0].max()
+        y_min = X[:, 1].min()
+        y_max = X[:, 1].max()
+
+        XX, YY = np.mgrid[x_min:x_max:200j, y_min:y_max:200j]
+        Z = clf.decision_function(np.c_[XX.ravel(), YY.ravel()])
+
+        # put the result into a color plot
+        Z = Z.reshape(XX.shape)
+        plt.pcolormesh(XX, YY, Z > 0, cmap=color_map)
+        plt.contour(XX, YY, Z, colors=['k', 'k', 'k'], linestyles=['--', '-', '--'],
+                    levels=[-.5, 0, .5])
+
+        plt.title(kernel)
+        plt.show()
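Review note on `plot_svm`: the train/test step is a single shuffled 90/10 hold-out split, not k-fold cross-validation. The sketch below shows the same idea with the standard library only; `holdout_split` and the stand-in `rows` list are hypothetical names used for illustration, and the shuffle order will differ from the `np.random.permutation` order in the committed code.

```python
import random


def holdout_split(rows, train_fraction=0.9, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    order = list(range(len(rows)))
    # A dedicated Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(order)
    cut = int(train_fraction * len(rows))
    train = [rows[i] for i in order[:cut]]
    test = [rows[i] for i in order[cut:]]
    return train, test


rows = list(range(20))  # stand-in for the (features, label) rows
train, test = holdout_split(rows)
print(len(train), len(test))
```

Because only one split is evaluated, any score measured on the held-out rows depends on the seed; repeating the split with several seeds gives a rough sense of that variance.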