1
0
mirror of https://github.com/gsi-upm/sitc synced 2025-06-13 11:42:21 +00:00

Update 2_4_Preprocessing.ipynb

Changed image path
This commit is contained in:
Carlos A. Iglesias 2025-06-02 16:29:26 +03:00 committed by GitHub
parent b9ecccdeab
commit f82203f371
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

View File

@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@ -76,7 +76,7 @@
"source": [
"A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
"\n",
"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
"We will use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
]
},
{
@ -122,9 +122,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
"Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might misbehave if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
"\n",
"The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set. Later, the same transformation will be applied on the testing set."
"The preprocessing module further provides a utility class `StandardScaler` to compute a training set's mean and standard deviation. Later, the same transformation will be applied on the testing set."
]
},
{
@ -173,7 +173,7 @@
"metadata": {},
"source": [
"### Licences\n",
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]