{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](files/images/EscUpmPolit_p.gif \"UPM\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Course Notes for Learning Intelligent Systems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, ©  Carlos A. Iglesias"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## [Introduction to Machine Learning](2_0_0_Intro_ML.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Table of Contents\n",
    "* [Introduction to scikit-learn](#Introduction-to-scikit-learn)\n",
    "* [What is scikit-learn?](#What-is-scikit-learn?)\n",
    "* [Problems that scikit-learn can solve](#Problems-that-scikit-learn-can-solve)\n",
    "* [Helpers for Machine Learning](#Helpers-for-Machine-Learning)\n",
    "* [How to install scikit-learn](#How-to-install-scikit-learn)\n",
    "* [References](#References)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to scikit-learn"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This lecture provides a quick introduction to [scikit-learn](http://scikit-learn.org/stable/), a Python library for machine learning."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is scikit-learn?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Scikit-Learn is a Python library that provides a wealth of machine learning algorithms. \n",
    "\n",
    "The library is built upon SciPy (Scientific Python) that should be installed before using scikit-learn.\n",
    "\n",
    "In particular, scikit-learn uses:\n",
    "* **NumPy**: package for managing n-dimensional arrays (http://www.numpy.org/)\n",
    "* **pandas**: data analysis toolkit (http://pandas.pydata.org/pandas-docs/stable/index.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problems that scikit-learn can solve"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Scikit-learn provides algorithms for solving the following problems:\n",
    "* **Classification**: Identifying to which category an object belongs to. Some of the available [classification algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are decision trees (ID3, C4.5, ...), kNN, SVM, Random forest, Perceptron, etc. \n",
    "* **Clustering**: Automatic grouping of similar objects into sets. Some of the available [clustering algorithms](http://scikit-learn.org/stable/modules/clustering.html#clustering) are k-Means, Affinity propagation, etc.\n",
    "* **Regression**: Predicting a continuous-valued attribute associated with an object. Some of the available [regression algorithms](http://scikit-learn.org/stable/supervised_learning.html#supervised-learning) are linear regression, logistic regression, etc.\n",
    "* **Dimensionality reduction**: Reducing the number of random variables to consider. Some of the available [dimensionality reduction algorithms](http://scikit-learn.org/stable/modules/decomposition.html#decompositions) are SVD, PCA, etc."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Helpers for Machine Learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In addition, scikit-learn helps in several tasks:\n",
    "* **Model selection**: Comparing, validating, choosing parameters and models, and persisting models. Some of the [available functionalities](http://scikit-learn.org/stable/model_selection.html#model-selection) are cross-validation or grid search for optimizing the parameters. \n",
    "* **Preprocessing**: Several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators. Some of the available [preprocessing functions](http://scikit-learn.org/stable/modules/preprocessing.html#preprocessing) are scaling and normalizing data, or imputing missing values."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How to install scikit-learn"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you installed the conda distribution, scikit-learn is already installed! This is the best option.\n",
    "\n",
    "Anyway, before starting, update all the packages: `conda update --all`. \n",
    "\n",
    "In case it is an old installation, you can update it using conda: `conda update scikit-learn`.\n",
    "\n",
    "If it is not installed, install it with conda: `conda install scikit-learn`.\n",
    "\n",
    "If you have installed scipy and numpy, you can also installed using pip: `pip install -U scikit-learn`.\n",
    "\n",
    "It is not recommended to use pip for installing scipy and numpy. Instead, use conda or install the linux package *python-sklearn*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* [Scikit-learn site](http://scikit-learn.org/stable/index.html)\n",
    "* [How to install Scikit-learn](http://scikit-learn.org/stable/install.html/)\n",
    "* [An introduction to NumPy and Scipy](http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf)\n",
    "* [NumPy tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Licence\n",
    "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
    "\n",
    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}