sitc/nlp/4_7_Exercises.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](images/EscUpmPolit_p.gif \"UPM\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Course Notes for Learning Intelligent Systems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Table of Contents\n",
    "\n",
    "* [Exercises](#Exercises)\n",
    "\t* [Exercise 1 - Sentiment classification for Twitter](#Exercise-1---Sentiment-classification-for-Twitter)\n",
    "\t* [Exercise 2 - Spam classification](#Exercise-2---Spam-classification)\n",
    "\t* [Exercise 3 - Automatic essay classification](#Exercise-3---Automatic-essay-classification)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we propose several exercises, it is recommended to work only in one of them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 1 - Sentiment classification for Twitter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The purpose of this exercise is:\n",
    "* Collect geolocated tweets\n",
    "* Analyse their sentiment\n",
    "* Represent the result in a map, so that one can understand the sentiment in a geographic region.\n",
    "\n",
    "The steps (and most of the code) can be found [here](http://pybonacci.org/2015/11/24/como-hacer-analisis-de-sentimiento-en-espanol-2/). \n",
    "\n",
    "You can select the tweets in any language."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 2 - Spam classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The classification of spam is a classical problem. [Here](http://zacstewart.com/2015/04/28/document-classification-with-scikit-learn.html) you can find a detailed example of how to do it using the datasets Enron-Spama and  SpamAssassin. You can try to test yourself the classification."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 3 - Automatic essay classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you have seen, we did not got great results in the previous notebook. You can try to improve them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Licence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
    "\n",
    "© Carlos A. Iglesias, Universidad Politécnica de Madrid."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}