mirror of
https://github.com/gsi-upm/sitc
synced 2024-12-22 03:38:13 +00:00
Review J
parent 65d1dc162f
commit 62f4fce1ed
@@ -102,7 +102,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.1"
+"version": "3.5.1+"
 }
 },
 "nbformat": 4,
@@ -122,7 +122,7 @@
 "source": [
 "If you installed the conda distribution, scikit-learn is already installed! This is the best option.\n",
 "\n",
-"In case it is an old installation, you can updated it using conda: `conda update scikit-learn`.\n",
+"In case it is an old installation, you can update it using conda: `conda update scikit-learn`.\n",
 "\n",
 "If it is not installed, install it with conda: `conda install scikit-learn`.\n",
 "\n",
@@ -176,7 +176,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.1"
+"version": "3.5.1+"
 }
 },
 "nbformat": 4,
@@ -4,27 +4,12 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"![](files/images/EscUpmPolit_p.gif \"UPM\")"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"# Course Notes for Learning Intelligent Systems"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
+"![](files/images/EscUpmPolit_p.gif \"UPM\")\n",
+"\n",
+"# Course Notes for Learning Intelligent Systems\n",
+"\n",
+"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias\n",
+"\n",
 "## [Introduction to Machine Learning](2_0_0_Intro_ML.ipynb)"
 ]
 },
@@ -51,7 +36,7 @@
 "source": [
 "The goal of this notebook is to learn how to read and load a sample dataset.\n",
 "\n",
-"Scikit-learn come with some bundled [datasets](http://scikit-learn.org/stable/datasets/): iris, digits, boston, etc.\n",
+"Scikit-learn comes with some bundled [datasets](http://scikit-learn.org/stable/datasets/): iris, digits, boston, etc.\n",
 "\n",
 "In this notebook we are going to use the Iris dataset."
 ]
@@ -78,12 +63,12 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In ordert to read the dataset, we import the bundle datasets and then load the Iris dataset. "
+"In ordert to read the dataset, we import the datasets bundle and then load the Iris dataset. "
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": 8,
 "metadata": {
 "collapsed": false
 },
@@ -105,7 +90,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 9,
 "metadata": {
 "collapsed": false
 },
@@ -116,7 +101,7 @@
 "sklearn.datasets.base.Bunch"
 ]
 },
-"execution_count": 2,
+"execution_count": 9,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -128,7 +113,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": 10,
 "metadata": {
 "collapsed": false
 },
@@ -204,12 +189,12 @@
 ],
 "source": [
 "# print descrition of the dataset\n",
-"print (iris.DESCR)"
+"print(iris.DESCR)"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 35,
+"execution_count": 11,
 "metadata": {
 "collapsed": false
 },
@@ -229,7 +214,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 36,
+"execution_count": 12,
 "metadata": {
 "collapsed": false
 },
@@ -249,7 +234,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 33,
+"execution_count": 13,
 "metadata": {
 "collapsed": false
 },
@@ -260,7 +245,7 @@
 "numpy.ndarray"
 ]
 },
-"execution_count": 33,
+"execution_count": 13,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -279,7 +264,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 37,
+"execution_count": 14,
 "metadata": {
 "collapsed": false
 },
@@ -472,7 +457,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 20,
+"execution_count": 16,
 "metadata": {
 "collapsed": false
 },
@@ -493,7 +478,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": 17,
 "metadata": {
 "collapsed": false
 },
@@ -513,7 +498,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 27,
+"execution_count": 18,
 "metadata": {
 "collapsed": false
 },
@@ -533,7 +518,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 28,
+"execution_count": 19,
 "metadata": {
 "collapsed": false
 },
@@ -553,7 +538,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 31,
+"execution_count": 20,
 "metadata": {
 "collapsed": false
 },
@@ -575,7 +560,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In another session, we will learn how to load a dataset from a file (csv, excel, ...). We will use the library pandas for this purpose."
+"In following sessions we will learn how to load a dataset from a file (csv, excel, ...) using the pandas library."
 ]
 },
 {
@@ -625,7 +610,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.1"
+"version": "3.5.1+"
 }
 },
 "nbformat": 4,
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -55,7 +55,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 1,
 "metadata": {
 "collapsed": true
 },
@@ -185,12 +185,12 @@
 "source": [
 "Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
 "\n",
-"The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set so as to be able to later reapply the same transformation on the testing set."
+"The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set. Later, the same transformation will be applied on the testing set."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": 10,
 "metadata": {
 "collapsed": false
 },
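Note on the `StandardScaler` text in the hunk above: the idea it describes (compute the mean and standard deviation on the training set, then reuse them on the testing set) can be sketched without any library. This is only an illustration under stated assumptions: the helper names `fit_scaler` and `transform` and the toy numbers are hypothetical and are not scikit-learn's API. The population standard deviation is used here.

```python
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Return per-feature (means, stds) computed from the training rows."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    means = [fmean(col) for col in columns]
    # Population std; fall back to 1.0 so a constant feature does not divide by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply a previously fitted scaling to any set of rows."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy data: 3 training samples and 1 test sample, 2 features each.
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_test = [[2.0, 40.0]]

means, stds = fit_scaler(X_train)              # statistics come from training data only
X_train_std = transform(X_train, means, stds)
X_test_std = transform(X_test, means, stds)    # the same transformation, reused
```

Fitting on the training rows only keeps information from the testing set out of the preprocessing step, which is the reason the notebook text separates the two phases.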
@@ -205,7 +205,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 7,
+"execution_count": 11,
 "metadata": {
 "collapsed": false
 },
@@ -306,7 +306,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.1"
+"version": "3.5.1+"
 }
 },
 "nbformat": 4,
@@ -36,8 +36,8 @@
 "\n",
 "* [Machine Learning](#Machine-Learning)\n",
 "* [Machine learning algorithms](#Machine-learning-algorithms)\n",
-"\t\t* [Supervised machine learning model](#Supervised-machine-learning-model)\n",
-"\t\t* [Unsupervised machine learning model](#Unsupervised-machine-learning-model)\n",
+" * [Supervised machine learning model](#Supervised-machine-learning-model)\n",
+"\t* [Unsupervised machine learning model](#Unsupervised-machine-learning-model)\n",
 "* [sklearn interface](#sklearn-interface)\n",
 "* [References](#References)"
 ]
@@ -53,7 +53,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"This is an introduction of general ideas about machine learning and the general interface of scikit-learn, taken from the [scikit-learn tutorial](http://www.astroml.org/sklearn_tutorial/general_concepts.html). \n",
+"This is an introduction of general ideas about machine learning and the interface of scikit-learn, taken from the [scikit-learn tutorial](http://www.astroml.org/sklearn_tutorial/general_concepts.html). \n",
 "\n",
 "You can skip it during the lab session and read it later,"
 ]
@@ -75,7 +75,7 @@
 "* **n_samples**: number of samples. Each sample is an item to process (i.e. classify). A sample can be a document, a picture, a sound, a video, a row in database or CSV file, or whatever you can describe with a fixed set of quantitative traits.\n",
 "* **n_features**: The number of features or distinct traits that can be used to describe each item in a quantitative manner.\n",
 "\n",
-"The number of features should be defined in advanced and it can be very high dimensional (e.g. millions of features) with most of them being zeros for a given sample. In this case we may use (scipy.sparse) sparse matrices instead of (numpy) arrays so as to make the data fit in memory.\n",
+"The number of features should be defined in advance. There is a specific type of feature sets that are high dimensional (e.g. millions of features), but most of the values are zero for a given sample. Using (numpy) arrays, all those values that are zero would also take up memory. For this reason, these feature sets are often represented with sparse matrices (scipy.sparse) instead of (numpy) arrays.\n",
 "\n",
 "The first step in machine learning is **identifying the relevant features** from the input data, and the second step is **extracting the features** from the input data. \n",
 "\n",
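Note on the sparse-matrix paragraph in the hunk above: a plain-Python sketch of why storing only the non-zero entries saves memory when most feature values are zero. The dictionary layout and the names `to_sparse` and `to_dense` are illustrative assumptions; this is not the `scipy.sparse` API, which offers several dedicated storage formats.

```python
def to_sparse(row):
    """Keep only the non-zero entries as {feature_index: value}."""
    return {i: v for i, v in enumerate(row) if v != 0}


def to_dense(sparse_row, n_features):
    """Rebuild the full row, filling the missing entries with zero."""
    return [sparse_row.get(i, 0) for i in range(n_features)]


# Hypothetical sample: one million features, only two of them non-zero.
n_features = 1_000_000
dense_row = [0] * n_features
dense_row[3] = 2          # e.g. a word that appears twice in a document
dense_row[70_000] = 1

sparse_row = to_sparse(dense_row)
print(len(dense_row), "stored values in the dense row")
print(len(sparse_row), "stored values in the sparse row")
```

The dense row stores every zero explicitly, while the sparse representation grows only with the number of non-zero features, which is the trade-off the notebook text describes.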
@@ -193,7 +193,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.1"
+"version": "3.5.1+"
 }
 },
 "nbformat": 4,
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
@@ -55,7 +55,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 1,
 "metadata": {
 "collapsed": false
 },
@@ -68,7 +68,7 @@
 " weights='uniform'))])"
 ]
 },
-"execution_count": 2,
+"execution_count": 1,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -108,7 +108,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 5,
+"execution_count": 2,
 "metadata": {
 "collapsed": false
 },
@@ -119,7 +119,7 @@
 "array([0])"
 ]
 },
-"execution_count": 5,
+"execution_count": 2,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -140,7 +140,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": 3,
 "metadata": {
 "collapsed": true
 },
@@ -196,7 +196,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.5.1"
+"version": "3.5.1+"
 }
 },
 "nbformat": 4,
File diff suppressed because one or more lines are too long