mirror of
https://github.com/gsi-upm/sitc
synced 2025-12-15 09:38:16 +00:00
Fix visualization section
@@ -50,7 +50,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The goal of this notebook is to learn how separate the dataset into training and test datasets and then preprocess the data."
+"The goal of this notebook is to learn how to split the dataset into a training and a test datasets and then preprocess the data."
 ]
 },
 {
@@ -78,7 +78,7 @@
 "source": [
 "A common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the **training set** on which we learn data properties and one that we call the **testing set** on which we test these properties. \n",
 "\n",
-"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ration 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
+"We are going to use *scikit-learn* to split the data into random training and testing sets. We follow the ratio 75% for training and 25% for testing. We use `random_state` to ensure that the result is always the same and it is reproducible. (Otherwise, we would get different training and testing sets every time)."
 ]
 },
 {
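For readers of this diff, the 75%/25% split and the effect of a fixed seed described in the changed cell can be sketched in plain Python. This is only an illustration, not the notebook's code and not scikit-learn's implementation: the helper `split_train_test`, its parameters, the seed value 33, and the toy `data` list are all hypothetical. In the notebook itself the equivalent step would presumably be a call to scikit-learn's `train_test_split` with `test_size=0.25` and a fixed `random_state`.

```python
import random


def split_train_test(samples, test_ratio=0.25, seed=33):
    """Hypothetical helper: shuffle a copy of `samples` with a fixed seed, then cut it."""
    rng = random.Random(seed)  # private generator, so the global random state is untouched
    shuffled = list(samples)   # copy, so the caller's data keeps its order
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    # The first n_test shuffled items form the test set, the rest the training set
    return shuffled[n_test:], shuffled[:n_test]


# Toy data standing in for the notebook's dataset (assumption)
data = list(range(100))

train, test = split_train_test(data)
train_again, test_again = split_train_test(data)  # same seed, so the same split
```

Because the seed is fixed, both calls return identical training and testing sets; dropping the seed (or changing it) would give a different partition on each run, which is the behaviour the cell's note about `random_state` warns about.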