mirror of https://github.com/gsi-upm/sitc
synced 2025-06-13 11:42:21 +00:00
Update 2_4_Preprocessing.ipynb
Changed image path
This commit is contained in:
parent b9ecccdeab
commit f82203f371
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-     ""
+     ""
    ]
   },
   {
@@ -76,7 +76,7 @@
    "source": [
     "A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
     "\n",
-     "We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
+     "We will use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
    ]
   },
   {
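For reference, a minimal Python sketch of the split described in the cell above, using scikit-learn's `train_test_split`. The Iris dataset and the `random_state` value are assumptions for illustration, not necessarily what the notebook uses:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Example dataset (an assumption; the notebook may load a different one)
X, y = load_iris(return_X_y=True)

# 75% training / 25% testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

print(X_train.shape, X_test.shape)  # (112, 4) (38, 4) for the 150-sample Iris data
```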
@@ -122,9 +122,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-     "Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
+     "Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might misbehave if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
     "\n",
-     "The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set. Later, the same transformation will be applied on the testing set."
+     "The preprocessing module further provides a utility class `StandardScaler` to compute a training set's mean and standard deviation. Later, the same transformation will be applied on the testing set."
    ]
   },
   {
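A minimal sketch of the scaling step the cell above describes: `StandardScaler` is fitted on the training set only, and the same transformation is then applied to the testing set. The dataset and split parameters are assumptions carried over from the previous sketch:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Compute the per-feature mean and standard deviation on the training set only
scaler = StandardScaler().fit(X_train)

# Apply the same transformation to both sets
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0).round(2))  # ~0 for each feature
print(X_train_scaled.std(axis=0).round(2))   # ~1 for each feature
```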
@@ -173,7 +173,7 @@
    "metadata": {},
    "source": [
     "### Licences\n",
-     "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+     "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
     "\n",
     "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
    ]