From 21e7ae2f57ce6ca6345e3dde8f1d6a9c7c6d4ee0 Mon Sep 17 00:00:00 2001
From: "Carlos A. Iglesias"
Date: Mon, 2 Jun 2025 17:13:49 +0300
Subject: [PATCH] Update 2_5_2_Decision_Tree_Model.ipynb

Changed image path
---
 ml1/2_5_2_Decision_Tree_Model.ipynb | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/ml1/2_5_2_Decision_Tree_Model.ipynb b/ml1/2_5_2_Decision_Tree_Model.ipynb
index bf6adf8..c2facd5 100644
--- a/ml1/2_5_2_Decision_Tree_Model.ipynb
+++ b/ml1/2_5_2_Decision_Tree_Model.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "![](files/images/EscUpmPolit_p.gif \"UPM\")"
+    "![](./images/EscUpmPolit_p.gif \"UPM\")"
    ]
   },
   {
@@ -56,9 +56,9 @@
    "source": [
     "The goal of this notebook is to learn how to create a classification object using a [decision tree learning algorithm](https://en.wikipedia.org/wiki/Decision_tree_learning). \n",
     "\n",
-    "There are a number of well known machine learning algorithms for decision tree learning, such as ID3, C4.5, C5.0 and CART. The scikit-learn uses an optimised version of the [CART (Classification and Regression Trees) algorithm](https://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees).\n",
+    "There are several well-known machine learning algorithms for decision tree learning, such as ID3, C4.5, C5.0, and CART. Scikit-learn uses an optimised version of the [CART (Classification and Regression Trees) algorithm](https://en.wikipedia.org/wiki/Predictive_analytics#Classification_and_regression_trees).\n",
     "\n",
-    "This notebook will follow the same steps that the previous notebook for learning using the [kNN Model](2_5_1_kNN_Model.ipynb), and details some peculiarities of the decision tree algorithms.\n",
+    "This notebook follows the same steps as the previous notebook on the [kNN Model](2_5_1_kNN_Model.ipynb), and details some peculiarities of decision tree algorithms.\n",
     "\n",
     "You need to install pydotplus: `conda install pydotplus` for the visualization."
    ]
@@ -69,7 +69,7 @@
    "source": [
     "## Load data and preprocessing\n",
     "\n",
-    "Here we repeat the same operations for loading data and preprocessing than in the previous notebooks."
+    "Here we repeat the same operations for loading data and preprocessing as in the previous notebooks."
    ]
   },
   {
@@ -262,8 +262,8 @@
     "The current version of pydot does not work well in Python 3.\n",
     "For obtaining an image, you need to install `pip install pydotplus` and then `conda install graphviz`.\n",
     "\n",
-    "You can skip this example. Since it can require installing additional packages, we include here the result.\n",
-    "![Decision Tree](files/images/cart.png)"
+    "You can skip this example. Since it can require installing additional packages, we have included the result here.\n",
+    "![Decision Tree](./images/cart.png)"
    ]
   },
   {
@@ -330,7 +330,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Next we are going to export the pseudocode of the the learnt decision tree."
+    "Next, we will export the pseudocode of the learnt decision tree."
    ]
   },
   {
@@ -378,14 +378,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Precision, recall and f-score"
+    "### Precision, recall, and f-score"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "For evaluating classification algorithms, we usually calculate three metrics: precision, recall and F1-score\n",
+    "For evaluating classification algorithms, we usually calculate three metrics: precision, recall, and F1-score.\n",
     "\n",
     "* **Precision**: This computes the proportion of instances predicted as positives that were correctly evaluated (it measures how right our classifier is when it says that an instance is positive).\n",
     "* **Recall**: This counts the proportion of positive instances that were correctly evaluated (measuring how right our classifier is when faced with a positive instance).\n",
@@ -412,7 +412,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Another useful metric is the confusion matrix"
+    "Another useful metric is the confusion matrix."
    ]
   },
   {
@@ -428,7 +428,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We see we classify well all the 'setosa' and 'versicolor' samples. "
+    "We classify all the 'setosa' and 'versicolor' samples correctly. "
    ]
   },
   {
@@ -442,7 +442,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In order to avoid bias in the training and testing dataset partition, it is recommended to use **k-fold validation**.\n",
+    "To avoid bias in the training and testing dataset partition, it is recommended to use **k-fold cross-validation**.\n",
     "\n",
     "Sklearn comes with other strategies for [cross validation](http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation), such as stratified K-fold, label k-fold, Leave-One-Out, Leave-P-Out, Leave-One-Label-Out, Leave-P-Label-Out or Shuffle & Split."
    ]
@@ -466,7 +466,7 @@
     "# create a k-fold cross validation iterator of k=10 folds\n",
     "cv = KFold(10, shuffle=True, random_state=33)\n",
     "\n",
-    "# by default the score used is the one returned by score method of the estimator (accuracy)\n",
+    "# by default the score used is the one returned by the score method of the estimator (accuracy)\n",
     "scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
     "print(scores)"
    ]
@@ -475,7 +475,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure"
+    "We get an array of k scores. We can calculate the mean and the standard error to obtain a final figure."
    ]
   },
   {
@@ -518,7 +518,7 @@
    "metadata": {},
    "source": [
     "## Licence\n",
-    "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
     "\n",
     "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
    ]
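
Note for reviewers: the hunk at @@ -466,7 +466,7 @@ only rewords a comment, but the cross-validation cell it touches is easiest to sanity-check standalone. The sketch below rebuilds that cell under stated assumptions: `model`, `x_iris`, and `y_iris` are defined in earlier notebook cells not shown in this diff, so their definitions here (a DecisionTreeClassifier and scikit-learn's iris data) are reconstructions, not part of the patch.

    # Standalone sketch of the cross-validation cell patched above.
    # Assumed context (defined in earlier notebook cells, not in this diff):
    # model, x_iris and y_iris; rebuilt here from scikit-learn's iris dataset.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    x_iris, y_iris = load_iris(return_X_y=True)
    model = DecisionTreeClassifier(random_state=33)

    # create a k-fold cross validation iterator of k=10 folds
    cv = KFold(10, shuffle=True, random_state=33)

    # by default the score used is the one returned by the score method of the estimator (accuracy)
    scores = cross_val_score(model, x_iris, y_iris, cv=cv)
    print(scores)

    # mean and standard error of the k accuracy scores
    print(scores.mean(), scores.std() / len(scores) ** 0.5)

The shuffle=True argument matters here because the iris samples are ordered by class; without shuffling, each of the 10 folds would be dominated by a single species.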