"In this session we will work with the Titanic dataset. This dataset is provided by [Kaggle](http://www.kaggle.com). Kaggle is a crowdsourcing platform that organizes competitions where researchers and companies post their data and users compete to obtain the best models.\n",
"In this session, we will work with the Titanic dataset. This dataset is provided by [Kaggle](http://www.kaggle.com). Kaggle is a crowdsourcing platform that organizes competitions where researchers and companies post their data and users compete to obtain the best models.\n",
"\n",
"![Titanic](images/titanic.jpg)\n",
"\n",
"\n",
"The main objective is predicting which passengers survived the sinking of the Titanic.\n",
"The main objective is to predict which passengers survived the sinking of the Titanic.\n",
"\n",
"The data is available [here](https://www.kaggle.com/c/titanic/data). There are two files, one for training ([train.csv](files/data-titanic/train.csv)) and another file for testing [test.csv](files/data-titanic/test.csv). A local copy has been included in this notebook under the folder *data-titanic*.\n",
"\n",
"\n",
"Here follows a description of the variables.\n",
"\n",
"|Variable | Description| Values|\n",
"|-------------------------------|\n",
"| survival| Survival| (0 = No; 1 = Yes)|\n",
"|Pclass |Name | |\n",
"|Sex |Sex | male, female|\n",
"|Age |Age|\n",
"|SibSp |Number of Siblings/Spouses Aboard||\n",
"|Parch |Number of Parents/Children Aboard||\n",
"|Ticket|Ticket Number||\n",
"|Fare |Passenger Fare||\n",
"|Cabin |Cabin||\n",
"|Embarked |Port of Embarkation| (C = Cherbourg; Q = Queenstown; S = Southampton)|\n",