sitc/ml5/2_6_1_Q-Learning_Exercises.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](images/EscUpmPolit_p.gif \"UPM\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Course Notes for Learning Intelligent Systems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos Á. Iglesias"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## [Introduction to Machine Learning V](2_6_0_Intro_RL.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercises\n",
    "\n",
    "\n",
    "## Taxi\n",
    "Analyze the [Taxi problem](https://gymnasium.farama.org/environments/toy_text/taxi/) and solve it applying Q-Learning. You can find a solution as the one previously presented  [here](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym), and the notebook is [here](https://github.com/wagonhelm/Reinforcement-Learning-Introduction/blob/master/Reinforcement%20Learning%20Introduction.ipynb). Take into account that Gymnasium has changed, so you will have to adapt the code.\n",
    "\n",
    "Analyze the impact of not changing the learning rate or changing it in a different way. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Optional exercises\n",
    "Select one of the following exercises.\n",
    "\n",
    "## Blackjack\n",
    "Analyze how to appy Q-Learning for solving Blackjack.\n",
    "You can find information in this [article](https://gymnasium.farama.org/tutorials/training_agents/blackjack_tutorial/).\n",
    "\n",
    "## Doom\n",
    "Read this [article](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8) and execute the companion [notebook](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Deep%20Q%20Learning/Doom/Deep%20Q%20learning%20with%20Doom.ipynb). Analyze the results and provide conclusions about DQN.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "* [Gymnasium documentation](https://gymnasium.farama.org/).\n",
    "* [Diving deeper into Reinforcement Learning with Q-Learning, Thomas Simonini](https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe).\n",
    "* Illustrations by [Thomas Simonini](https://github.com/simoninithomas/Deep_reinforcement_learning_Course) and [Sung Kim](https://www.youtube.com/watch?v=xgoO54qN4lY).\n",
    "* [Frozen Lake solution with TensorFlow](https://analyticsindiamag.com/openai-gym-frozen-lake-beginners-guide-reinforcement-learning/)\n",
    "* [Deep Q-Learning for Doom](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8)\n",
    "* [Intro OpenAI Gym with Random Search and the Cart Pole scenario](http://www.pinchofintelligence.com/getting-started-openai-gym/)\n",
    "* [Q-Learning for the Taxi scenario](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Licence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/).  \n",
    "\n",
    "© Carlos Á. Iglesias, Universidad Politécnica de Madrid."
   ]
  }
 ],
 "metadata": {
  "datacleaner": {
   "position": {
    "top": "50px"
   },
   "python": {
    "varRefreshCmd": "try:\n    print(_datacleaner.dataframe_metadata())\nexcept:\n    print([])"
   },
   "window_display": false
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
Actualizada práctica a gymnasium y extendida 2023-04-27 13:42:01 +00:00			`{`
			`"cells": [`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"![](images/EscUpmPolit_p.gif \"UPM\")"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"# Course Notes for Learning Intelligent Systems"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos Á. Iglesias"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## [Introduction to Machine Learning V](2_6_0_Intro_RL.ipynb)"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"# Exercises\n",`
			`"\n",`
			`"\n",`
			`"## Taxi\n",`
			"Analyze the [Taxi problem](https://gymnasium.farama.org/environments/toy_text/taxi/) and solve it applying Q-Learning. You can find a solution as the one previously presented [here](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym), and the notebook is [here](https://github.com/wagonhelm/Reinforcement-Learning-Introduction/blob/master/Reinforcement%20Learning%20Introduction.ipynb). Take into account that Gymnasium has changed, so you will have to adapt the code.\n",
			`"\n",`
			`"Analyze the impact of not changing the learning rate or changing it in a different way. "`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"# Optional exercises\n",`
			`"Select one of the following exercises.\n",`
			`"\n",`
			`"## Blackjack\n",`
			`"Analyze how to appy Q-Learning for solving Blackjack.\n",`
			`"You can find information in this [article](https://gymnasium.farama.org/tutorials/training_agents/blackjack_tutorial/).\n",`
			`"\n",`
			`"## Doom\n",`
			`"Read this [article](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8) and execute the companion [notebook](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Deep%20Q%20Learning/Doom/Deep%20Q%20learning%20with%20Doom.ipynb). Analyze the results and provide conclusions about DQN.\n",`
			`"\n"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## References\n",`
			`"* [Gymnasium documentation](https://gymnasium.farama.org/).\n",`
			`"* [Diving deeper into Reinforcement Learning with Q-Learning, Thomas Simonini](https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe).\n",`
			`"* Illustrations by [Thomas Simonini](https://github.com/simoninithomas/Deep_reinforcement_learning_Course) and [Sung Kim](https://www.youtube.com/watch?v=xgoO54qN4lY).\n",`
			`"* [Frozen Lake solution with TensorFlow](https://analyticsindiamag.com/openai-gym-frozen-lake-beginners-guide-reinforcement-learning/)\n",`
			`"* [Deep Q-Learning for Doom](https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8)\n",`
			`"* [Intro OpenAI Gym with Random Search and the Cart Pole scenario](http://www.pinchofintelligence.com/getting-started-openai-gym/)\n",`
			`"* [Q-Learning for the Taxi scenario](https://www.oreilly.com/learning/introduction-to-reinforcement-learning-and-openai-gym)"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"## Licence"`
			`]`
			`},`
			`{`
			`"cell_type": "markdown",`
			`"metadata": {},`
			`"source": [`
			`"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",`
			`"\n",`
			`"© Carlos Á. Iglesias, Universidad Politécnica de Madrid."`
			`]`
			`}`
			`],`
			`"metadata": {`
			`"datacleaner": {`
			`"position": {`
			`"top": "50px"`
			`},`
			`"python": {`
			`"varRefreshCmd": "try:\n print(_datacleaner.dataframe_metadata())\nexcept:\n print([])"`
			`},`
			`"window_display": false`
			`},`
			`"kernelspec": {`
			`"display_name": "Python 3 (ipykernel)",`
			`"language": "python",`
			`"name": "python3"`
			`},`
			`"language_info": {`
			`"codemirror_mode": {`
			`"name": "ipython",`
			`"version": 3`
			`},`
			`"file_extension": ".py",`
			`"mimetype": "text/x-python",`
			`"name": "python",`
			`"nbconvert_exporter": "python",`
			`"pygments_lexer": "ipython3",`
			`"version": "3.10.10"`
			`},`
			`"latex_envs": {`
			`"LaTeX_envs_menu_present": true,`
			`"autocomplete": true,`
			`"bibliofile": "biblio.bib",`
			`"cite_by": "apalike",`
			`"current_citInitial": 1,`
			`"eqLabelWithNumbers": true,`
			`"eqNumInitial": 1,`
			`"hotkeys": {`
			`"equation": "Ctrl-E",`
			`"itemize": "Ctrl-I"`
			`},`
			`"labels_anchors": false,`
			`"latex_user_defs": false,`
			`"report_style_numbering": false,`
			`"user_envs_cfg": false`
			`}`
			`},`
			`"nbformat": 4,`
			`"nbformat_minor": 1`
			`}`