mirror of https://github.com/gsi-upm/sitc synced 2025-12-15 09:38:16 +00:00

Fix visualization section

Stefano
2017-12-11 18:12:06 +01:00
parent fdf696380d
commit 23073b3431
4 changed files with 8 additions and 7 deletions


@@ -50,7 +50,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The goal of this notebook is to learn how separate the dataset into training and test datasets and then preprocess the data."
+"The goal of this notebook is to learn how to split the dataset into a training and a test datasets and then preprocess the data."
 ]
 },
 {
@@ -78,7 +78,7 @@
 "source": [
 "A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
 "\n",
-"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ration 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
+"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
 ]
 },
 {
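
For reference, the 75%/25% split described in the changed cell could look roughly like the sketch below. It is not taken from the notebook: the toy arrays `X` and `y` and the seed value 33 are assumptions made up for illustration, and only `train_test_split` with its `test_size` and `random_state` parameters comes from scikit-learn itself.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data (not the notebook's dataset): 100 samples, 4 features.
X = np.arange(400).reshape(100, 4)
y = np.arange(100) % 2

# 75% training / 25% testing; the seed value 33 is an arbitrary assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=33
)

# Repeating the call with the same random_state reproduces the same split.
X_train2, X_test2, y_train2, y_test2 = train_test_split(
    X, y, test_size=0.25, random_state=33
)

print(X_train.shape, X_test.shape)
print(np.array_equal(X_test, X_test2))
```

Dropping `random_state` would give a different shuffle on each call, which is the non-reproducible behaviour the cell text warns about.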