mirror of https://github.com/gsi-upm/sitc synced 2025-06-13 11:42:21 +00:00

Update 2_3_0_Visualisation.ipynb

Changed image path
Carlos A. Iglesias 2025-06-02 16:06:53 +03:00 committed by GitHub
parent ec02125396
commit ec11ff2d5e

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"![](files/images/EscUpmPolit_p.gif \"UPM\")"
+"![](./images/EscUpmPolit_p.gif \"UPM\")"
]
},
{
@@ -49,7 +49,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"The goal of this notebook is to learn how to analyse a dataset. We will cover other tasks such as cleaning or munging (changing the format) the dataset in other sessions."
+"This notebook aims to learn how to analyse a dataset. We will cover other tasks such as cleaning or munging (changing the format) the dataset in other sessions."
]
},
{
@@ -65,13 +65,13 @@
"source": [
"This section covers different ways to inspect the distribution of samples per feature.\n",
"\n",
-"First of all, let's see how many samples of each class we have, using a [histogram](https://en.wikipedia.org/wiki/Histogram). \n",
+"First of all, let's see how many samples we have in each class using a [histogram](https://en.wikipedia.org/wiki/Histogram). \n",
"\n",
-"A histogram is a graphical representation of the distribution of numerical data. It is an estimation of the probability distribution of a continuous variable (quantitative variable). \n",
+"A histogram is a graphical representation of the distribution of numerical data. It estimates the probability distribution of a continuous variable (quantitative variable). \n",
"\n",
-"For building a histogram, we need first to 'bin' the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. \n",
+"For building a histogram, we need to 'bin' the range of values—that is, divide the entire range of values into a series of intervals—and then count how many values fall into each interval. \n",
"\n",
-"In our case, since the values are not continuous and we have only three values, we do not need to bin them."
+"Since the values are not continuous and we have only three values, we do not need to bin them."
]
},
{
@@ -115,7 +115,7 @@
"metadata": {},
"source": [
"As can be seen, we have the same distribution of samples for every class.\n",
-"The next step is to see the distribution of the features"
+"The next step is to see the distribution of the features."
]
},
{
@@ -184,7 +184,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"As we can see, the Setosa class seems to be linearly separable with these two features.\n",
+"As we can see, the Setosa class seems linearly separable with these two features.\n",
"\n",
"Another nice visualisation is given below."
]
@@ -241,7 +241,7 @@
"source": [
"## Licence\n",
"\n",
-"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
"\n",
"© Carlos A. Iglesias, Universidad Politécnica de Madrid."
]
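
Note on the histogram cell touched by this commit: the text says that continuous values must be binned before counting, while the three class labels can be counted directly. A minimal Python sketch of that distinction follows. It is not code from the notebook; the helper `bin_counts` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def bin_counts(values, n_bins):
    """Split [min, max] into n_bins equal-width intervals and count values per interval.

    Hypothetical helper for illustration; not part of the notebook.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    if width == 0:
        # All values are identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Discrete case: three class labels, so counting per label is enough (no binning).
# Made-up labels, balanced on purpose.
labels = ["setosa", "versicolor", "virginica"] * 3
class_counts = Counter(labels)

# Continuous case: made-up measurements, binned into 4 equal-width intervals.
lengths = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0]
length_counts = bin_counts(lengths, 4)
```

The per-class counts are what a bar chart of the class distribution would plot, and the per-interval counts are the bar heights of a histogram. Plotting libraries normally do the binning themselves, so the helper is only there to make the step explicit.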