mirror of
https://github.com/gsi-upm/sitc
synced 2024-11-22 06:22:29 +00:00
1297 lines
88 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"![](files/images/EscUpmPolit_p.gif \"UPM\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Course Notes for Learning Intelligent Systems"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Department of Telematic Engineering Systems, Universidad Politécnica de Madrid, © 2016 Carlos A. Iglesias"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## [Introduction to Machine Learning](2_0_0_Intro_ML.ipynb)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Table of Contents\n",
|
|
"\n",
|
|
"* [Model Tuning](#Model-Tuning)\n",
|
|
"* [Load data and preprocessing](#Load-data-and-preprocessing)\n",
|
|
"* [Train classifier](#Train-classifier)\n",
|
|
"* [More about Pipelines](#More-about-Pipelines)\n",
|
|
"* [Tuning the algorithm](#Tuning-the-algorithm)\n",
|
|
"\t* [Grid Search for Parameter optimization](#Grid-Search-for-Parameter-optimization)\n",
|
|
"* [Evaluating the algorithm](#Evaluating-the-algorithm)\n",
|
|
"\t* [K-Fold validation](#K-Fold-validation)\n",
|
|
"* [References](#References)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Model Tuning"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"In the previous [notebook](2_5_2_Decision_Tree_Model.ipynb), we got an accuracy of 9.47. Could we get a better accuracy if we tune the parameters of the estimator?\n",
|
|
"\n",
|
|
"The goal of this notebook is to learn how to tune an algorithm by opimizing its parameters using grid search."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Load data and preprocessing"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# library for displaying plots\n",
|
|
"import matplotlib.pyplot as plt\n",
|
|
"# display plots in the notebook \n",
|
|
"%matplotlib inline\n",
|
|
"\n",
|
|
"## First, we repeat the load and preprocessing steps\n",
|
|
"\n",
|
|
"# Load data\n",
|
|
"from sklearn import datasets\n",
|
|
"iris = datasets.load_iris()\n",
|
|
"\n",
|
|
"# Training and test spliting\n",
|
|
"from sklearn.cross_validation import train_test_split\n",
|
|
"\n",
|
|
"x_iris, y_iris = iris.data, iris.target\n",
|
|
"# Test set will be the 25% taken randomly\n",
|
|
"x_train, x_test, y_train, y_test = train_test_split(x_iris, y_iris, test_size=0.25, random_state=33)\n",
|
|
"\n",
|
|
"# Preprocess: normalize\n",
|
|
"from sklearn import preprocessing\n",
|
|
"scaler = preprocessing.StandardScaler().fit(x_train)\n",
|
|
"x_train = scaler.transform(x_train)\n",
|
|
"x_test = scaler.transform(x_test)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"collapsed": true
|
|
},
|
|
"source": [
|
|
"## Train classifier"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"As previously, we train the model and evaluate the result."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Mean score: 0.947 (+/- 0.022)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"from sklearn.cross_validation import cross_val_score, KFold\n",
|
|
"from sklearn.pipeline import Pipeline\n",
|
|
"from sklearn.preprocessing import StandardScaler\n",
|
|
"from sklearn.tree import DecisionTreeClassifier\n",
|
|
"import numpy as np\n",
|
|
"\n",
|
|
"# create a composite estimator made by a pipeline of preprocessing and the KNN model\n",
|
|
"model = Pipeline([\n",
|
|
" ('scaler', StandardScaler()),\n",
|
|
" ('ds', DecisionTreeClassifier())\n",
|
|
"])\n",
|
|
"\n",
|
|
"# Fit the model\n",
|
|
"model.fit(x_train, y_train) \n",
|
|
"\n",
|
|
"# create a k-fold cross validation iterator of k=10 folds\n",
|
|
"cv = KFold(x_iris.shape[0], 10, shuffle=True, random_state=33)\n",
|
|
"\n",
|
|
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
|
|
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
|
|
"\n",
|
|
"from scipy.stats import sem\n",
|
|
"def mean_score(scores):\n",
|
|
" return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
|
|
"print(mean_score(scores))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We obtain an accuracy of 0.947."
|
|
]
|
|
},
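The mean score and the "+/-" value reported above come from scoring the model on each held-out fold and then summarizing the per-fold scores. The following is a minimal sketch of both ideas, assuming scikit-learn 0.18 or later (where `KFold` is in `sklearn.model_selection`). The names `toy_x`, `fold_sizes`, `toy_scores` and `manual_sem` are illustrative only, and the scores are made-up numbers, not results from this notebook.

```python
import numpy as np
from scipy.stats import sem
from sklearn.model_selection import KFold

# Toy data: 10 samples, used only to make the fold indices visible
toy_x = np.arange(10).reshape(-1, 1)

cv = KFold(n_splits=5, shuffle=True, random_state=33)

fold_sizes = []
for train_idx, test_idx in cv.split(toy_x):
    # Each sample ends up in exactly one test fold
    fold_sizes.append(len(test_idx))
    print("train:", train_idx, "test:", test_idx)

# Standard error of the mean, as used by mean_score():
# sample standard deviation (ddof=1) divided by sqrt(number of folds)
toy_scores = np.array([0.90, 1.00, 0.95, 1.00, 0.90])  # hypothetical fold scores
manual_sem = toy_scores.std(ddof=1) / np.sqrt(len(toy_scores))
print(np.isclose(manual_sem, sem(toy_scores)))
```

With 10 folds on the 150 iris samples, each test fold holds 15 samples, so a single misclassified sample changes the score of that fold by about 0.067.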
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## More about Pipelines"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"When we use a Pipeline, every chained estimator is stored in the dictionary *named_steps* and as a list in *steps*."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"{'ds': DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n",
|
|
" max_features=None, max_leaf_nodes=None, min_samples_leaf=1,\n",
|
|
" min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
|
|
" presort=False, random_state=None, splitter='best'),\n",
|
|
" 'scaler': StandardScaler(copy=True, with_mean=True, with_std=True)}"
|
|
]
|
|
},
|
|
"execution_count": 3,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model.named_steps"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"[('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)),\n",
|
|
" ('ds',\n",
|
|
" DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n",
|
|
" max_features=None, max_leaf_nodes=None, min_samples_leaf=1,\n",
|
|
" min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
|
|
" presort=False, random_state=None, splitter='best'))]"
|
|
]
|
|
},
|
|
"execution_count": 4,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model.steps"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We can get the list of parameters of the model. As you will observe, the parameters of the estimators in the pipeline can be accessed using the <estimator>__<parameter> syntax. We will use this for tuning the parameters."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"dict_keys(['ds__max_depth', 'scaler__with_mean', 'ds__random_state', 'ds__max_features', 'ds__max_leaf_nodes', 'ds', 'ds__min_weight_fraction_leaf', 'ds__splitter', 'ds__presort', 'ds__min_samples_split', 'steps', 'ds__class_weight', 'scaler__copy', 'scaler', 'ds__criterion', 'scaler__with_std', 'ds__min_samples_leaf'])"
|
|
]
|
|
},
|
|
"execution_count": 5,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model.get_params().keys()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's see what happens if we change a parameter"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"Pipeline(steps=[('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)), ('ds', DecisionTreeClassifier(class_weight='balanced', criterion='gini',\n",
|
|
" max_depth=None, max_features=None, max_leaf_nodes=None,\n",
|
|
" min_samples_leaf=1, min_samples_split=2,\n",
|
|
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n",
|
|
" splitter='best'))])"
|
|
]
|
|
},
|
|
"execution_count": 6,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model.set_params(ds__class_weight='balanced')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Another alternative is to create the pipeline with the values we want to set, but it can be useful to access the estimators of the Pipeline."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"Pipeline(steps=[('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)), ('ds', DecisionTreeClassifier(class_weight='balanced', criterion='gini',\n",
|
|
" max_depth=None, max_features=None, max_leaf_nodes=None,\n",
|
|
" min_samples_leaf=1, min_samples_split=2,\n",
|
|
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n",
|
|
" splitter='best'))])"
|
|
]
|
|
},
|
|
"execution_count": 7,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model = Pipeline([\n",
|
|
" ('scaler', StandardScaler()),\n",
|
|
" ('ds', DecisionTreeClassifier(class_weight='balanced'))\n",
|
|
"])\n",
|
|
"model"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"The same approach can be used for accessing attributes such as *feature_importances_* we saw in the previous notebook."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[ 0.01834862 0.01910853 0.06605168 0.89649117]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Fit the model\n",
|
|
"model.fit(x_train, y_train) \n",
|
|
"# Using named_steps\n",
|
|
"my_decision_tree = model.named_steps['ds']\n",
|
|
"print(my_decision_tree.feature_importances_)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"[ 0.01834862 0.01910853 0.06605168 0.89649117]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"#Using steps, we take the last step (-1) or the second step (1)\n",
|
|
"#name, my_desision_tree = model.steps[1]\n",
|
|
"name, my_desision_tree = model.steps[-1]\n",
|
|
"print(my_decision_tree.feature_importances_)"
|
|
]
|
|
},
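The raw importance array is easier to read when each value is paired with its feature name. The following is a minimal sketch, assuming the same pipeline layout as above. The names `pipe`, `tree` and `ranked` are illustrative, `random_state=0` is added here only for reproducibility, and the tree is fitted on the full iris data for brevity, so the exact values can differ from the output above.

```python
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()

# Same two steps as in the notebook; random_state is an assumption of this sketch
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('ds', DecisionTreeClassifier(random_state=0))
])
pipe.fit(iris.data, iris.target)

# Access the fitted tree through named_steps and rank the features
tree = pipe.named_steps['ds']
ranked = sorted(zip(iris.feature_names, tree.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked:
    print("{0:<20} {1:.3f}".format(name, importance))
```

The importances of a fitted tree sum to 1, so each value can be read as the share of the total impurity reduction attributed to that feature.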
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Tuning the algorithm"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We see that the most important feature for this classifier is `petal width`.\n",
|
|
"\n",
|
|
"Look at the [API](http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html) of *scikit-learn* to understand better the algorithm, as well as which parameters can be tuned. As you see, we can change several ones, such as *criterion*, *splitter*, *max_features*, *max_depth*, *min_samples_split*, *class_weight*, etc.\n",
|
|
"\n",
|
|
"We can get the full list parameters of an estimator with the method *get_params()*. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"{'ds': DecisionTreeClassifier(class_weight='balanced', criterion='gini',\n",
|
|
" max_depth=None, max_features=None, max_leaf_nodes=None,\n",
|
|
" min_samples_leaf=1, min_samples_split=2,\n",
|
|
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n",
|
|
" splitter='best'),\n",
|
|
" 'ds__class_weight': 'balanced',\n",
|
|
" 'ds__criterion': 'gini',\n",
|
|
" 'ds__max_depth': None,\n",
|
|
" 'ds__max_features': None,\n",
|
|
" 'ds__max_leaf_nodes': None,\n",
|
|
" 'ds__min_samples_leaf': 1,\n",
|
|
" 'ds__min_samples_split': 2,\n",
|
|
" 'ds__min_weight_fraction_leaf': 0.0,\n",
|
|
" 'ds__presort': False,\n",
|
|
" 'ds__random_state': None,\n",
|
|
" 'ds__splitter': 'best',\n",
|
|
" 'scaler': StandardScaler(copy=True, with_mean=True, with_std=True),\n",
|
|
" 'scaler__copy': True,\n",
|
|
" 'scaler__with_mean': True,\n",
|
|
" 'scaler__with_std': True,\n",
|
|
" 'steps': [('scaler',\n",
|
|
" StandardScaler(copy=True, with_mean=True, with_std=True)),\n",
|
|
" ('ds', DecisionTreeClassifier(class_weight='balanced', criterion='gini',\n",
|
|
" max_depth=None, max_features=None, max_leaf_nodes=None,\n",
|
|
" min_samples_leaf=1, min_samples_split=2,\n",
|
|
" min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n",
|
|
" splitter='best'))]}"
|
|
]
|
|
},
|
|
"execution_count": 10,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model.get_params()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can try different values for these parameters and observe the results."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Grid Search for Parameter optimization"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Changing manually the parameters to find their optimal values is not practical. Instead, we can consider to find the optimal value of the parameters as an *optimization problem*. \n",
|
|
"\n",
|
|
"The sklearn comes with several optimization techniques for this purpose, such as **grid search** and **randomized search**. In this notebook we are going to introduce the former one."
|
|
]
|
|
},
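For comparison, randomized search samples a fixed number of parameter combinations instead of trying every point of a grid. The following is a minimal sketch, assuming scikit-learn 0.18 or later and its `RandomizedSearchCV` class. The distributions, `n_iter=8`, `cv=5` and both `random_state` values are choices made here for illustration, and the search runs on the full iris data for brevity.

```python
from scipy.stats import randint
from sklearn import datasets
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()

# Distributions (or lists) to sample from, instead of an exhaustive grid
param_distributions = {
    'max_depth': randint(3, 10),        # integers from 3 to 9
    'criterion': ['gini', 'entropy'],   # lists are sampled uniformly
}

rs = RandomizedSearchCV(DecisionTreeClassifier(random_state=0),
                        param_distributions,
                        n_iter=8, cv=5, random_state=33)
rs.fit(iris.data, iris.target)

print("Best score: ", rs.best_score_)
print("Best params: ", rs.best_params_)
```

The cost is controlled by `n_iter` rather than by the size of the parameter space, which helps when many parameters are tuned at once.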
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"The sklearn provides an object that, given data, computes the score during the fit of an estimator on a parameter grid and chooses the parameters to maximize the cross-validation score. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Best score: 0.946428571429\n",
|
|
"Best params: {'max_depth': 3}\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"from sklearn.grid_search import GridSearchCV\n",
|
|
"from sklearn.tree import DecisionTreeClassifier\n",
|
|
"import numpy as np\n",
|
|
"\n",
|
|
"param_grid = {'max_depth': np.arange(3, 10)} \n",
|
|
"\n",
|
|
"gs = GridSearchCV(DecisionTreeClassifier(), param_grid)\n",
|
|
"\n",
|
|
"gs.fit(x_train, y_train)\n",
|
|
"\n",
|
|
"# summarize the results of the grid search\n",
|
|
"print(\"Best score: \", gs.best_score_)\n",
|
|
"print(\"Best params: \", gs.best_params_)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"collapsed": true
|
|
},
|
|
"source": [
|
|
"Now we are going to show the results of grid search"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"0.946 (+/-0.075) for {'max_depth': 3}\n",
|
|
"0.938 (+/-0.050) for {'max_depth': 4}\n",
|
|
"0.938 (+/-0.050) for {'max_depth': 5}\n",
|
|
"0.929 (+/-0.025) for {'max_depth': 6}\n",
|
|
"0.946 (+/-0.075) for {'max_depth': 7}\n",
|
|
"0.938 (+/-0.050) for {'max_depth': 8}\n",
|
|
"0.938 (+/-0.050) for {'max_depth': 9}\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# We print the score for each value of max_depth\n",
|
|
"for params, mean_score, scores in gs.grid_scores_:\n",
|
|
" print(\"%0.3f (+/-%0.03f) for %r\" % (mean_score, scores.std() * 2, params))"
|
|
]
|
|
},
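Since a Pipeline is itself an estimator, the same search can also tune a step inside it by using the `<estimator>__<parameter>` syntax as the grid key. The following is a minimal sketch, assuming scikit-learn 0.18 or later. The names `pipe` and `gs_pipe` are illustrative, `random_state=0` and `cv=5` are choices made here, and the search runs on the full iris data for brevity, so its result need not match the output above.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('ds', DecisionTreeClassifier(random_state=0))
])

# The key addresses max_depth of the step named 'ds'
param_grid = {'ds__max_depth': np.arange(3, 10)}

gs_pipe = GridSearchCV(pipe, param_grid, cv=5)
gs_pipe.fit(iris.data, iris.target)

print("Best score: ", gs_pipe.best_score_)
print("Best params: ", gs_pipe.best_params_)
```

Searching over the whole pipeline means the scaler is refitted on the training part of every fold, so no information from the held-out fold leaks into the preprocessing.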
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We can now evaluate the KFold with this optimized parameter as follows."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Mean score: 0.953 (+/- 0.020)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# create a composite estimator made by a pipeline of preprocessing and the KNN model\n",
|
|
"model = Pipeline([\n",
|
|
" ('scaler', StandardScaler()),\n",
|
|
" ('ds', DecisionTreeClassifier(max_depth=3))\n",
|
|
"])\n",
|
|
"\n",
|
|
"# Fit the model\n",
|
|
"model.fit(x_train, y_train) \n",
|
|
"\n",
|
|
"# create a k-fold cross validation iterator of k=10 folds\n",
|
|
"cv = KFold(x_iris.shape[0], 10, shuffle=True, random_state=33)\n",
|
|
"\n",
|
|
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
|
|
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
|
|
"def mean_score(scores):\n",
|
|
" return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
|
|
"print(mean_score(scores))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"We have got an *improvement* from 0.947 to 0.953 with k-fold.\n",
|
|
"\n",
|
|
"We are now to try to fit the best combination of the parameters of the algorithm. It can take some time to compute it."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"# Tuning hyper-parameters for precision\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n",
|
|
"/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.\n",
|
|
" 'precision', 'predicted', average, warn_for)\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Best parameters set found on development set:\n",
|
|
"\n",
|
|
"{'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"\n",
|
|
"Grid scores on development set:\n",
|
|
"\n",
|
|
"0.964 (+/-0.092) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.117) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.947 (+/-0.076) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.959 (+/-0.111) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.935 (+/-0.123) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.960 (+/-0.083) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.935 (+/-0.074) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.964 (+/-0.092) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.954 (+/-0.111) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.115) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.954 (+/-0.099) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.940 (+/-0.121) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.942 (+/-0.150) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.954 (+/-0.112) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.945 (+/-0.102) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.954 (+/-0.112) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.926 (+/-0.146) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.878 (+/-0.345) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.935 (+/-0.107) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.939 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.919 (+/-0.148) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.939 (+/-0.142) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.956 (+/-0.098) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.942 (+/-0.170) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.126) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.924 (+/-0.156) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.954 (+/-0.078) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.941 (+/-0.135) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.968 (+/-0.082) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.915 (+/-0.165) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.943 (+/-0.083) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.964 (+/-0.091) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.947 (+/-0.149) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.899 (+/-0.331) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.966 (+/-0.093) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.948 (+/-0.108) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.190) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.936 (+/-0.107) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.964 (+/-0.091) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.954 (+/-0.112) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.962 (+/-0.113) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.945 (+/-0.125) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.956 (+/-0.121) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.966 (+/-0.069) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.942 (+/-0.114) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.931 (+/-0.107) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.960 (+/-0.082) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.935 (+/-0.151) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.918 (+/-0.151) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.951 (+/-0.137) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.172) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.952 (+/-0.127) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.094) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.928 (+/-0.137) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.953 (+/-0.134) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.968 (+/-0.082) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.844 (+/-0.340) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.948 (+/-0.101) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.081) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.919 (+/-0.252) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.940 (+/-0.185) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.123) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.979 (+/-0.062) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.956 (+/-0.121) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.939 (+/-0.142) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.916 (+/-0.237) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.114) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.968 (+/-0.082) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.957 (+/-0.121) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.923 (+/-0.205) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.965 (+/-0.070) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.947 (+/-0.074) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.939 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.968 (+/-0.081) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.954 (+/-0.111) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.953 (+/-0.080) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.925 (+/-0.128) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.968 (+/-0.082) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.973 (+/-0.068) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.968 (+/-0.081) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.913 (+/-0.251) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.944 (+/-0.125) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.933 (+/-0.141) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.968 (+/-0.082) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.257) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.960 (+/-0.086) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.961 (+/-0.081) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.948 (+/-0.108) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.932 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.935 (+/-0.128) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.970 (+/-0.092) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.134) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.934 (+/-0.123) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.937 (+/-0.128) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.939 (+/-0.142) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.921 (+/-0.166) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.935 (+/-0.145) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.917 (+/-0.241) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.968 (+/-0.082) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.943 (+/-0.113) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.939 (+/-0.158) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.948 (+/-0.108) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.897 (+/-0.243) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.979 (+/-0.062) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.182) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.943 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.928 (+/-0.129) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.941 (+/-0.135) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.926 (+/-0.128) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.954 (+/-0.079) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.957 (+/-0.120) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.953 (+/-0.079) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.975 (+/-0.079) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.950 (+/-0.118) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.943 (+/-0.114) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"\n",
|
|
"Detailed classification report:\n",
|
|
"\n",
|
|
"The model is trained on the full development set.\n",
|
|
"The scores are computed on the full evaluation set.\n",
|
|
"\n",
|
|
" precision recall f1-score support\n",
|
|
"\n",
|
|
" 0 1.00 1.00 1.00 8\n",
|
|
" 1 1.00 1.00 1.00 11\n",
|
|
" 2 1.00 1.00 1.00 19\n",
|
|
"\n",
|
|
"avg / total 1.00 1.00 1.00 38\n",
|
|
"\n",
|
|
"\n",
|
|
"# Tuning hyper-parameters for recall\n",
|
|
"\n",
|
|
"Best parameters set found on development set:\n",
|
|
"\n",
|
|
"{'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"\n",
|
|
"Grid scores on development set:\n",
|
|
"\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.911 (+/-0.146) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.116) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.148) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.973 (+/-0.083) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.153) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.164) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.141) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.110) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.187) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.175) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.964 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.168) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.141) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.955 (+/-0.146) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.263) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.141) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.147) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.955 (+/-0.141) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.193) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.109) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.145) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.911 (+/-0.137) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.168) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.159) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.145) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.117) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.190) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.884 (+/-0.175) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.173) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.089) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.145) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.173) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.121) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.091) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.159) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.128) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.964 (+/-0.166) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.197) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.173) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.179) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.179) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.136) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.168) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.114) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': None, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.167) for {'max_leaf_nodes': 5, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.131) for {'max_leaf_nodes': 10, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.116) for {'max_leaf_nodes': 20, 'class_weight': 'balanced', 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.893 (+/-0.213) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.151) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.893 (+/-0.170) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.139) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.117) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.169) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.115) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.131) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.159) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.190) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.172) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.120) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.219) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.120) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.155) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.137) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.187) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.911 (+/-0.173) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.902 (+/-0.099) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.920 (+/-0.219) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'gini'}\n",
|
|
"0.946 (+/-0.088) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'gini'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.185) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.902 (+/-0.169) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.123) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.964 (+/-0.087) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 3, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.103) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.123) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.964 (+/-0.119) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.082) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 4, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.173) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.219) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.114) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.114) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 5, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.141) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.145) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.115) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 6, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.161) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.257) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.183) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.141) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 7, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.169) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.117) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.929 (+/-0.132) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.179) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 8, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.911 (+/-0.137) for {'max_leaf_nodes': None, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.946 (+/-0.140) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.147) for {'max_leaf_nodes': 5, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.955 (+/-0.115) for {'max_leaf_nodes': 10, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"0.938 (+/-0.138) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'best', 'criterion': 'entropy'}\n",
|
|
"0.920 (+/-0.181) for {'max_leaf_nodes': 20, 'class_weight': None, 'max_depth': 9, 'splitter': 'random', 'criterion': 'entropy'}\n",
|
|
"\n",
|
|
"Detailed classification report:\n",
|
|
"\n",
|
|
"The model is trained on the full development set.\n",
|
|
"The scores are computed on the full evaluation set.\n",
|
|
"\n",
|
|
" precision recall f1-score support\n",
|
|
"\n",
|
|
" 0 1.00 1.00 1.00 8\n",
|
|
" 1 1.00 1.00 1.00 11\n",
|
|
" 2 1.00 1.00 1.00 19\n",
|
|
"\n",
|
|
"avg / total 1.00 1.00 1.00 38\n",
|
|
"\n",
|
|
"\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Set the parameters by cross-validation\n",
|
|
"\n",
|
|
"from sklearn.metrics import classification_report\n",
|
|
"\n",
|
|
"# set of parameters to test\n",
|
|
"tuned_parameters = [{'max_depth': np.arange(3, 10),\n",
|
|
"# 'max_weights': [1, 10, 100, 1000]},\n",
|
|
" 'criterion': ['gini', 'entropy'], \n",
|
|
" 'splitter': ['best', 'random'],\n",
|
|
" # 'min_samples_leaf': [2, 5, 10],\n",
|
|
" 'class_weight':['balanced', None],\n",
|
|
" 'max_leaf_nodes': [None, 5, 10, 20]\n",
|
|
" }]\n",
|
|
"\n",
|
|
"scores = ['precision', 'recall']\n",
|
|
"\n",
|
|
"for score in scores:\n",
|
|
"    print(\"# Tuning hyper-parameters for %s\" % score)\n",
|
|
"    print()\n",
|
|
"\n",
|
|
"    # cv is the number of cross-validation folds (10 here)\n",
|
|
"    gs = GridSearchCV(DecisionTreeClassifier(), tuned_parameters, cv=10, scoring='%s_weighted' % score)\n",
|
|
"    gs.fit(x_train, y_train)\n",
|
|
"\n",
|
|
"    print(\"Best parameters set found on development set:\")\n",
|
|
"    print()\n",
|
|
"    print(gs.best_params_)\n",
|
|
"    print()\n",
|
|
"    print(\"Grid scores on development set:\")\n",
|
|
"    print()\n",
|
|
"    # grid_scores_ exists only in older scikit-learn; it was removed in 0.20 in favour of cv_results_\n",
|
|
"    for params, mean_score, cv_scores in gs.grid_scores_:\n",
|
|
"        print(\"%0.3f (+/-%0.03f) for %r\" % (mean_score, cv_scores.std() * 2, params))\n",
|
|
"    print()\n",
|
|
"\n",
|
|
"    print(\"Detailed classification report:\")\n",
|
|
"    print()\n",
|
|
"    print(\"The model is trained on the full development set.\")\n",
|
|
"    print(\"The scores are computed on the full evaluation set.\")\n",
|
|
"    print()\n",
|
|
"    y_true, y_pred = y_test, gs.predict(x_test)\n",
|
|
"    print(classification_report(y_true, y_pred))\n",
|
|
"    print()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"collapsed": true
|
|
},
|
|
"source": [
|
|
"Let's evaluate the tuned model."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"metadata": {
|
|
"collapsed": false
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Mean score: 0.940 (+/- 0.021)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# create a composite estimator made by a pipeline of preprocessing and the decision tree model\n",
|
|
"model = Pipeline([\n",
|
|
" ('scaler', StandardScaler()),\n",
|
|
" ('ds', DecisionTreeClassifier(max_leaf_nodes=20, criterion='gini', \n",
|
|
" splitter='random', class_weight='balanced', max_depth=3))\n",
|
|
"])\n",
|
|
"\n",
|
|
"# Fit the model\n",
|
|
"model.fit(x_train, y_train) \n",
|
|
"\n",
|
|
"# create a k-fold cross validation iterator of k=10 folds\n",
|
|
"cv = KFold(x_iris.shape[0], 10, shuffle=True, random_state=33)\n",
|
|
"\n",
|
|
"# by default the score used is the one returned by score method of the estimator (accuracy)\n",
|
|
"scores = cross_val_score(model, x_iris, y_iris, cv=cv)\n",
|
|
"def mean_score(scores):\n",
|
|
"    return (\"Mean score: {0:.3f} (+/- {1:.3f})\").format(np.mean(scores), sem(scores))\n",
|
|
"print(mean_score(scores))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
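|
"In the run recorded above, the mean accuracy is 0.940, which does not improve on 0.947 (without tuning) or 0.953 (tuning only *max_depth*). Because the tree is built with `splitter='random'` and no fixed `random_state`, the score changes from run to run, and some runs reach about 0.96."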
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## References"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"* [Plot the decision surface of a decision tree on the iris dataset](http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html)\n",
|
|
"* [Learning scikit-learn: Machine Learning in Python](http://proquest.safaribooksonline.com/book/programming/python/9781783281930/1dot-machine-learning-a-gentle-introduction/ch01s02_html), Raúl Garreta; Guillermo Moncecchi, Packt Publishing, 2013.\n",
|
|
"* [Python Machine Learning](http://proquest.safaribooksonline.com/book/programming/python/9781783555130), Sebastian Raschka, Packt Publishing, 2015.\n",
|
|
"* [Parameter estimation using grid search with cross-validation](http://scikit-learn.org/stable/auto_examples/model_selection/grid_search_digits.html)\n",
|
|
"* [Decision trees in python with scikit-learn and pandas](http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Licence"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
|
|
"\n",
|
|
"© 2016 Carlos A. Iglesias, Universidad Politécnica de Madrid."
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.5.1+"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 0
|
|
}
|