mirror of
https://github.com/gsi-upm/sitc
synced 2026-02-08 23:58:17 +00:00
Update 2_6_1_Q-Learning_Visualization.ipynb
committed by GitHub
parent 921eda4c9f
commit d062777922
@@ -39,7 +39,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In this section we are going to visualize Q-Learning based on this [link](https://gymnasium.farama.org/tutorials/training_agents/FrozenLake_tuto/#sphx-glr-tutorials-training-agents-frozenlake-tuto-py). The code has been ported to the last version of Gymnasium.\n",
+"In this section, we are going to visualize Q-Learning based on this [link](https://gymnasium.farama.org/tutorials/training_agents/FrozenLake_tuto/#sphx-glr-tutorials-training-agents-frozenlake-tuto-py). The code has been ported to the last version of Gymnasium.\n",
 "\n",
 "First, we are going to define a class *Params* for the Q-Learning parameters and the environment based on these values."
 ]
@@ -129,7 +129,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Running the environment"
+"## Running the environment."
 ]
 },
 {
@@ -161,7 +161,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We want to plot the policy the agent has learned in the end. To do that the function *qtable_directions_map* perform these actions: 1. extract the best Q-values from the Q-table for each state, 2. get the corresponding best action for those Q-values, 3. map each action to an arrow so we can visualize it."
+"We want to plot the policy the agent has learned in the end. To do that, the function *qtable_directions_map* performs these actions: 1. extract the best Q-values from the Q-table for each state, 2. get the corresponding best action for those Q-values, 3. map each action to an arrow so we can visualize it."
 ]
 },
 {
@@ -182,7 +182,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now we’ll be running our agent on a few increasing maps sizes: \n",
+"Now we’ll be running our agent on a few increasing map sizes: \n",
 "- 4x4\n",
 "- 7x7\n",
 "- 9x9\n",
@@ -312,7 +312,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
 "\n",
 "© Carlos Á. Iglesias, Universidad Politécnica de Madrid."
 ]
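The hunk at line 161 describes three steps performed by *qtable_directions_map*: take the best Q-value per state, find the action that produces it, and map that action to an arrow. The sketch below illustrates those steps in plain Python. It is not the notebook's implementation: the name `directions_map_sketch`, its arguments, and the choice to leave a cell blank when no action has a positive Q-value are assumptions made for illustration. The action encoding (0 = left, 1 = down, 2 = right, 3 = up) is the one used by Gymnasium's FrozenLake.

```python
# Hypothetical illustration; names and signature are not taken from the notebook.
# FrozenLake action encoding in Gymnasium: 0=left, 1=down, 2=right, 3=up.
ARROWS = {0: "←", 1: "↓", 2: "→", 3: "↑"}


def to_grid(flat, map_size):
    """Split a flat, state-indexed list into map_size rows of map_size cells."""
    return [flat[r * map_size:(r + 1) * map_size] for r in range(map_size)]


def directions_map_sketch(qtable, map_size):
    """Return (best_values, arrows) as map_size x map_size nested lists.

    qtable holds one row per state and one Q-value per action.
    """
    best_values, arrows = [], []
    for q_row in qtable:
        best = max(q_row)            # step 1: best Q-value for this state
        action = q_row.index(best)   # step 2: action achieving it (first on ties)
        best_values.append(best)
        # Step 3: map the action to an arrow. Assumption: states where nothing
        # has been learned (no positive Q-value) are left blank.
        arrows.append(ARROWS[action] if best > 0 else "")
    return to_grid(best_values, map_size), to_grid(arrows, map_size)


# Usage on a toy 2x2 map with 4 states and 4 actions per state.
toy_qtable = [
    [0, 0, 0.5, 0],
    [0, 0.9, 0, 0],
    [0, 0, 0, 0],
    [0.1, 0, 0, 0.2],
]
values, arrows = directions_map_sketch(toy_qtable, 2)
for arrow_row in arrows:
    print(arrow_row)
```

The two returned grids have the same shape as the map, so one could be used for heatmap colors and the other for cell annotations when plotting the learned policy.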