{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![](files/images/EscUpmPolit_p.gif \"UPM\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Course Notes for Learning Intelligent Systems" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © Carlos A. Iglesias" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [Introduction to Machine Learning](2_0_0_Intro_ML.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Table of Contents\n", "* [Model Persistence](#Model-Persistence)\n", "* [References](#References)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model Persistence" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of this notebook is to learn how to save a trained scikit-learn model using Python’s built-in persistence module, pickle.\n", "\n", "First we recap the previous tasks: load the data, preprocess it and train the model."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load the iris dataset\n", "from sklearn import datasets\n", "iris = datasets.load_iris()\n", "\n", "# Training and test splitting\n", "from sklearn.model_selection import train_test_split\n", "x_iris, y_iris = iris.data, iris.target\n", "# The test set is a random 25% of the data\n", "x_train, x_test, y_train, y_test = train_test_split(x_iris, y_iris, test_size=0.25, random_state=33)\n", "\n", "# Create the model using the pipeline\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.neighbors import KNeighborsClassifier\n", "\n", "# Create a composite estimator: a pipeline of preprocessing followed by the KNN model\n", "model = Pipeline([\n", "    ('scaler', StandardScaler()),\n", "    ('KNN', KNeighborsClassifier())\n", "])\n", "\n", "# Train the model\n", "model.fit(x_train, y_train)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are going to serialize the model with *pickle*. Pickling converts a Python object into a byte stream, which can be kept in memory as a `bytes` object (`pickle.dumps`) or written to a file (`pickle.dump`); unpickling reconstructs the object from that stream. Only unpickle data from a source you trust, since loading a pickle can execute arbitrary code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pickle\n", "\n", "# Serialize the trained pipeline to a bytes object\n", "s = pickle.dumps(model)\n", "\n", "# Rebuild the pipeline from the bytes and check that it still predicts\n", "model2 = pickle.loads(s)\n", "model2.predict(x_iris[0:1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An alternative to pickle is joblib, which is more efficient for objects that carry large NumPy arrays, as fitted scikit-learn estimators often do. In this case the model is saved to a file rather than to a string."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# joblib is a standalone package; sklearn.externals.joblib was removed in scikit-learn 0.23\n", "import joblib\n", "\n", "# Save the model to a file\n", "joblib.dump(model, 'filename.pkl')\n", "\n", "# Load the model back from the file\n", "model2 = joblib.load('filename.pkl')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* [Tutorial scikit-learn](http://scikit-learn.org/stable/tutorial/basic/tutorial.html)\n", "* [Model persistence in scikit-learn](http://scikit-learn.org/stable/modules/model_persistence.html#model-persistence)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Licence\n", "The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n", "\n", "© Carlos A. Iglesias, Universidad Politécnica de Madrid." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false } }, "nbformat": 4, "nbformat_minor": 1 }