Mirror of https://github.com/gsi-upm/sitc, synced 2025-07-04 20:02:22 +00:00
Update 2_5_1_kNN_Model.ipynb
Changed image path
parent c5967746ea
commit 7b4d16964d
@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-""
+""
 ]
 },
 {
@@ -55,7 +55,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The goal of this notebook is to learn how to train a model, make predictions with that model and evaluate these predictions.\n",
+"The goal of this notebook is to learn how to train a model, make predictions with that model, and evaluate these predictions.\n",
 "\n",
 "The notebook uses the [kNN (k nearest neighbors) algorithm](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm)."
 ]
@@ -212,14 +212,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Precision, recall and f-score"
+"### Precision, recall, and f-score"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"For evaluating classification algorithms, we usually calculate three metrics: precision, recall and F1-score\n",
+"For evaluating classification algorithms, we usually calculate three metrics: precision, recall, and F1-score\n",
 "\n",
 "* **Precision**: This computes the proportion of instances predicted as positives that were correctly evaluated (it measures how right our classifier is when it says that an instance is positive).\n",
 "* **Recall**: This counts the proportion of positive instances that were correctly evaluated (measuring how right our classifier is when faced with a positive instance).\n",
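Reviewer note: the precision and recall definitions in the hunk above can be sketched in plain Python as follows. This is only an illustration, not code from the notebook. The function name `precision_recall_f1` and the label lists are hypothetical, and F1 is computed as the harmonic mean of precision and recall, which is the standard definition but is not spelled out in the lines shown here.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1) with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually another class.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted as another class.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels, not taken from the notebook's iris run.
y_true = ["virginica", "virginica", "virginica", "versicolor", "setosa", "versicolor"]
y_pred = ["virginica", "versicolor", "virginica", "versicolor", "setosa", "virginica"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="virginica")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

For a multi-class problem such as iris, the same function would be called once per class, each time treating that class as the positive one.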
@@ -246,7 +246,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Another useful metric is the confusion matrix"
+"Another useful metric is the confusion matrix."
 ]
 },
 {
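Reviewer note: as a rough illustration of the confusion matrix mentioned in the hunk above, the sketch below counts (true label, predicted label) pairs in plain Python. The helper name `confusion_matrix` here is a local, hypothetical function, and the label lists are made up. Rows stand for true classes and columns for predicted classes, which is an assumed layout (it matches the convention scikit-learn uses).

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count samples per (true class, predicted class) pair.

    Rows follow the true label, columns the predicted label,
    both in the order given by `labels`.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical predictions, not the notebook's actual output.
labels = ["setosa", "versicolor", "virginica"]
y_true = ["setosa", "setosa", "versicolor", "versicolor",
          "virginica", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "versicolor",
          "virginica", "versicolor", "virginica"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(f"{label:<12}{row}")
```

Off-diagonal entries are the misclassifications; in this made-up data the only one is a 'virginica' sample predicted as 'versicolor'.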
@@ -262,7 +262,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We see we classify well all the 'setosa' and 'versicolor' samples. "
+"We classify all the 'setosa' and 'versicolor' samples well. "
 ]
 },
 {
@@ -276,7 +276,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In order to avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**."
+"To avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**."
 ]
 },
 {
@@ -298,7 +298,7 @@
 "# create a k-fold cross validation iterator of k=10 folds\n",
 "cv = KFold(10, shuffle=True, random_state=33)\n",
 "\n",
-"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
+"# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
 "scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
 "print(scores)"
 ]
@@ -307,7 +307,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure"
+"We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure."
 ]
 },
 {
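Reviewer note: the cell that follows this hunk is not part of the diff, so the sketch below is only one plausible way to summarise the k scores. It assumes the standard error of the mean is the sample standard deviation (n - 1 in the denominator) divided by the square root of the number of folds. The helper name `mean_and_standard_error` and the score values are hypothetical.

```python
import math
import statistics


def mean_and_standard_error(scores):
    """Return the mean of the fold scores and the standard error of that mean."""
    mean = statistics.mean(scores)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem


# Hypothetical accuracies for 10 folds, not the notebook's real output.
fold_scores = [1.0, 0.93, 1.0, 0.87, 0.93, 1.0, 1.0, 0.93, 0.87, 1.0]

mean, sem = mean_and_standard_error(fold_scores)
print(f"Mean score: {mean:.3f} (+/- {sem:.3f})")
```

Reporting the standard error next to the mean indicates how much the estimate might move if the folds were drawn differently.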
@@ -340,7 +340,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We are going to tune the algorithm, and calculate which is the best value for the k hyperparameter."
+"We will tune the algorithm and calculate the best value for the k hyperparameter."
 ]
 },
 {
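Reviewer note: the tuning code itself is outside this diff. As a dependency-free sketch of the idea, the block below implements a tiny kNN classifier in plain Python and compares accuracy on a held-out set for several candidate values of k. The notebook relies on scikit-learn instead. All names (`knn_predict`, `accuracy_for_k`) and the toy data, which includes one deliberately mislabeled training point, are hypothetical.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Predict the majority label among the k training points nearest to query."""
    by_distance = sorted(
        zip(train_x, train_y),
        # Squared Euclidean distance preserves the nearest-neighbour ordering.
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def accuracy_for_k(train_x, train_y, test_x, test_y, k):
    """Fraction of test points classified correctly for a given k."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Hypothetical toy data: two clusters plus one noisy "b" point inside cluster "a".
train_x = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.12, 0.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b", "b"]
test_x = [(0.1, 0.1), (5.1, 5.0)]
test_y = ["a", "b"]

# Evaluate each candidate k and keep the first one with the highest accuracy.
scores = {k: accuracy_for_k(train_x, train_y, test_x, test_y, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

With k=1 the noisy neighbour decides the first test point, while larger values of k outvote it. This is the kind of effect a search over k is meant to expose, and it depends heavily on the data used.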
@@ -365,7 +365,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The result is very dependent of the input data. Execute again the train_test_split and test again how the result changes with k."
+"The result is very dependent on the input data. Execute the train_test_split again and test how the result changes with k."
 ]
 },
 {
@@ -387,7 +387,7 @@
 "metadata": {},
 "source": [
 "## Licence\n",
-"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
 "\n",
 "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
 ]