sitc/ml2/3_6_Machine_Learning.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](images/EscUpmPolit_p.gif \"UPM\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Course Notes for Learning Intelligent Systems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, ©  Carlos A. Iglesias"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## [Introduction to Machine Learning II](3_0_0_Intro_ML_2.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Machine Learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the previous session, we learnt how to apply machine learning algorithms to the Iris dataset.\n",
    "\n",
    "We are now going to review the full process. As you have probably noticed, data preparation, cleaning, and transformation often account for most of the data mining effort.\n",
    "\n",
    "The phases are:\n",
    "\n",
    "* **Data ingestion**: read the data from the data lake\n",
    "* **Preprocessing**:\n",
    "    * **Data cleaning (munging)**: fill in missing values, smooth noisy data (binning methods), identify or remove outliers, and resolve inconsistencies\n",
    "    * **Data integration**: combine multiple datasets\n",
    "    * **Data transformation**: normalization (rescale numeric values to the range 0 to 1), standardisation (rescale values to have a mean of 0 and a standard deviation of 1), transformations that smooth a variable (e.g., square root), and aggregation of data from several datasets\n",
    "    * **Data reduction**: dimensionality reduction, clustering, and sampling\n",
    "    * **Data discretization**: convert numerical values into discrete bins for algorithms that do not accept continuous variables\n",
    "    * **Feature engineering**: selection of the most relevant features, creation of new features, and deletion of non-relevant features\n",
    "    * **Sampling**: divide the dataset into training and test datasets\n",
    "* **Machine learning**: apply machine learning algorithms to obtain an estimator and tune its parameters\n",
    "* **Evaluation**: assess the performance of the model\n",
    "* **Prediction**: use the model to make predictions on new data"
   ]
  },
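  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a reminder of the two rescaling operations listed under data transformation, the usual definitions are given below. The symbols are assumptions introduced here for illustration: $x$ is a value of a feature, $x_{\\min}$ and $x_{\\max}$ are the minimum and maximum of that feature, and $\\mu$ and $\\sigma$ are its mean and standard deviation.\n",
    "\n",
    "$$x_{\\text{norm}} = \\frac{x - x_{\\min}}{x_{\\max} - x_{\\min}}, \\qquad x_{\\text{std}} = \\frac{x - \\mu}{\\sigma}$$\n",
    "\n",
    "For example, a feature with values 2, 4, and 6 is normalised to 0, 0.5, and 1. Its mean is 4 and its population standard deviation is about 1.63, so it is standardised to roughly -1.22, 0, and 1.22."
   ]
  },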
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "![Machine Learning Process from *Python Machine Learning* book](images/machine-learning-process.jpg)"
   ]
  },
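  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The process shown in the figure above can be sketched end to end with scikit-learn on the Iris dataset from the previous session. This is only a minimal sketch, not a reference solution: the choice of `StandardScaler` and `KNeighborsClassifier`, the parameter grid, the 25% test split, and the sample flower are all assumptions made for illustration.\n",
    "\n",
    "```python\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.model_selection import GridSearchCV, train_test_split\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "# Data ingestion: Iris ships with scikit-learn, so it stands in for the data lake\n",
    "X, y = load_iris(return_X_y=True)\n",
    "\n",
    "# Sampling: hold out 25% of the samples as a test set, keeping class proportions\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    X, y, test_size=0.25, stratify=y, random_state=0\n",
    ")\n",
    "\n",
    "# Preprocessing + machine learning: standardise the features, then fit a classifier\n",
    "pipeline = Pipeline([\n",
    "    ('scaler', StandardScaler()),\n",
    "    ('knn', KNeighborsClassifier()),\n",
    "])\n",
    "\n",
    "# Parameter tuning: 5-fold cross-validated search over the number of neighbours\n",
    "search = GridSearchCV(pipeline, {'knn__n_neighbors': [1, 3, 5, 7]}, cv=5)\n",
    "search.fit(X_train, y_train)\n",
    "\n",
    "# Evaluation: accuracy on data that was not used for training or tuning\n",
    "test_accuracy = search.score(X_test, y_test)\n",
    "print('Best parameters:', search.best_params_)\n",
    "print('Test accuracy:', round(test_accuracy, 3))\n",
    "\n",
    "# Prediction: classify a hypothetical new flower (sepal/petal measurements in cm)\n",
    "new_sample = [[5.0, 3.4, 1.5, 0.2]]\n",
    "predicted_class = search.predict(new_sample)\n",
    "print('Predicted class index:', predicted_class[0])\n",
    "```\n",
    "\n",
    "Wrapping the scaler and the classifier in a single `Pipeline` means the scaling statistics are learnt only from the training folds during cross-validation, so no information from the held-out data leaks into preprocessing."
   ]
  },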
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* [Python Machine Learning](https://learning.oreilly.com/library/view/python-machine-learning/9781789955750/), Sebastian Raschka and Vahid Mirjalili, Packt Publishing, 2019."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Licence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
    "\n",
    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.2"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}