"We are going to implement the OneMax problem as seen in class.\n",
"\n",
"First, follow the DEAP package instructions and install DEAP.\n",
"\n",
"Then, follow the following notebook [OneMax](https://github.com/DEAP/notebooks/blob/master/OneMax.ipynb) to understand how DEAP works and solves this problem. Observe that it is requested to register types and functions in the DEAP framework. Observe also how you can execute genetic operators such as mutate.\n",
"\n",
"We have included a simple code that solves the OneMax problem in the following cell (taken from [DEAP](http://deap.readthedocs.io/en/master/examples/ga_onemax.html) and added a line to show the best individual in each generation).\n",
"\n",
"Read tutorial from [DEAP](http://deap.readthedocs.io/en/master/examples/ga_onemax.html) to understand the code."
" for ind, fit in zip(invalid_ind, fitnesses):\n",
" ind.fitness.values = fit\n",
" \n",
" pop[:] = offspring\n",
" \n",
" # Gather all the fitnesses in one list and print the stats\n",
" fits = [ind.fitness.values[0] for ind in pop]\n",
" \n",
" length = len(pop)\n",
" mean = sum(fits) / length\n",
" sum2 = sum(x*x for x in fits)\n",
" std = abs(sum2 / length - mean**2)**0.5\n",
" \n",
" print(\" Min %s\" % min(fits))\n",
" print(\" Max %s\" % max(fits))\n",
" print(\" Avg %s\" % mean)\n",
" print(\" Std %s\" % std)\n",
" best_ind = tools.selBest(pop, 1)[0]\n",
" print(\"Best individual so far is %s, %s\" % (best_ind, best_ind.fitness.values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the genetic algorithm and interpret the results."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"main()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercises"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comparing\n",
"Your task is modify the previous code to canonical GA configuration from Holland (look at the lesson's slides). In addition you should consult the [DEAP API](http://deap.readthedocs.io/en/master/api/tools.html#operators).\n",
"\n",
"Submit your notebook and include a the modified code, and a comparison of the effects of these changes. \n",
"\n",
"Discuss your findings."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optimizing ML hyperparameters\n",
"\n",
"One of the applications of Genetic Algorithms is the optimization of ML hyperparameters. Previously we have used GridSearch from Scikit. Using (sklearn-deap)[#References], optimize the Titatic hyperparameters using both GridSearch and Genetic Algorithms. \n",
"\n",
"The same exercise (using the digits dataset) can be found in this [notebook](https://github.com/rsteca/sklearn-deap/blob/master/test.ipynb).\n",
"\n",
"Submit a notebook where you include well-crafted conclusions about the exercises, discussing the pros and cons of using genetic algorithms for this purpose.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Optional exercises\n",
"\n",
"Here there is a proposed optional exercise."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optimizing a ML pipeline with a genetic algorithm\n",
"\n",
"The library [TPOT](#References) optimizes ML pipelines and comes with a lot of (examples)[https://epistasislab.github.io/tpot/examples/] and even notebooks, for example for the [iris dataset](https://github.com/EpistasisLab/tpot/blob/master/tutorials/IRIS.ipynb).\n",
"\n",
"Your task is to apply TPOT to the intermediate challenge and write a short essay explaining:\n",
"* what TPOT does (with your own words).\n",
"* how you have experimented with TPOT (what you have tried and how long. Take into account that it should be run from hours to days to get good results. Read the documentation, it is not that long!).\n",
"* the results. If TPOT is rather clever or your group got better results."
"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",