diff --git a/ml1/2_4_Preprocessing.ipynb b/ml1/2_4_Preprocessing.ipynb
index f0ef2ed..703a523 100644
--- a/ml1/2_4_Preprocessing.ipynb
+++ b/ml1/2_4_Preprocessing.ipynb
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
- "![](files/images/EscUpmPolit_p.gif \"UPM\")"
+ "![](./images/EscUpmPolit_p.gif \"UPM\")"
 ]
 },
 {
@@ -76,7 +76,7 @@
 "source": [
 "A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
 "\n",
- "We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
+ "We will use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
 ]
 },
 {
@@ -122,9 +122,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
- "Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
+ "Standardization of datasets is a common requirement for many machine learning estimators implemented in the scikit; they might misbehave if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.\n",
 "\n",
- "The preprocessing module further provides a utility class `StandardScaler` to compute the mean and standard deviation on a training set. Later, the same transformation will be applied on the testing set."
+ "The preprocessing module further provides a utility class `StandardScaler` to compute a training set's mean and standard deviation. Later, the same transformation will be applied on the testing set."
 ]
 },
 {
@@ -173,7 +173,7 @@
 "metadata": {},
 "source": [
 "### Licences\n",
- "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+ "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
 "\n",
 "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
 ]
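
Reviewer note: the markdown cells touched by this diff describe a 75%/25% random split with a fixed `random_state`, followed by standardization with `StandardScaler` fitted on the training set. The notebook's own code cells are not part of the diff, so the following is only a minimal sketch of those two steps. The synthetic data, the seed values, and all variable names are assumptions chosen for illustration, not taken from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 100 samples, 3 features (not the notebook's dataset).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y = rng.integers(0, 2, size=100)

# 75% training / 25% testing; fixing random_state makes the split reproducible.
# The seed value 33 is arbitrary.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=33
)

# Learn mean and standard deviation from the training set only,
# then apply the same transformation to both sets.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
```

Fitting the scaler on the training set alone keeps statistics of the testing set out of the preprocessing step, which matches the order described in the changed text.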