mirror of https://github.com/gsi-upm/sitc synced 2025-09-18 12:52:20 +00:00

Compare commits

3 Commits

Author SHA1 Message Date
Dani Vera
19ea5dff09 Update 4_1_Lexical_Processing.ipynb 2019-11-26 15:14:40 +01:00
Carlos A. Iglesias
e70689072f Merge pull request #4 from gsi-upm/dveni-patch-1
Update 3_3_Data_Munging_with_Pandas.ipynb
2019-09-19 10:46:19 +02:00
Dani Vera
344e054ba4 Update 3_3_Data_Munging_with_Pandas.ipynb
np.size is used in the last column. This computes the size of the series (of non-null values, I believe), but what I think is intended is to count the number of survivors, for which np.sum could be used.
2019-09-18 15:39:16 +02:00
2 changed files with 2 additions and 2 deletions

View File: 3_3_Data_Munging_with_Pandas.ipynb

@@ -437,7 +437,7 @@
"\n",
"#Show mean Age, mean SibSp, and number of passengers older than 25 that survived, grouped by Passenger Class and Sex\n",
"df[(df.Age > 25 & (df.Survived == 1))].groupby(['Pclass', 'Sex'])['Age','SibSp','Survived'].agg({'Age': np.mean, \n",
" 'SibSp': np.mean, 'Survived': np.size})"
" 'SibSp': np.mean, 'Survived': np.sum})"
]
},
{
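The hunk above swaps `np.size` for `np.sum` in the `'Survived'` aggregation. A minimal sketch with made-up passenger rows of why that matters: `np.size` counts every row in a group, survivors or not, while `np.sum` adds up the 0/1 `Survived` flags and so counts only survivors. Note also that the unchanged context line writes the mask as `df.Age > 25 & (df.Survived == 1)`; since `&` binds tighter than `>` in Python, each comparison needs its own parentheses, as below.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the Titanic DataFrame used in the notebook.
df = pd.DataFrame({
    'Pclass':   [1, 1, 3, 3],
    'Sex':      ['female', 'male', 'female', 'male'],
    'Age':      [30, 40, 28, 22],
    'Survived': [1, 0, 1, 1],
})

# Explicit parentheses around each comparison: & binds tighter than >.
mask = (df.Age > 25) & (df.Survived == 1)

grouped = df.groupby('Pclass')['Survived']
sizes = grouped.agg(np.size)  # rows per class: counts non-survivors too
sums = grouped.agg(np.sum)    # 0/1 flags summed: survivors only
print(sizes.tolist())  # [2, 2]
print(sums.tolist())   # [1, 2]
```

With these toy rows, class 1 has two passengers but only one survivor, so `np.size` reports 2 where `np.sum` correctly reports 1.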

View File: 4_1_Lexical_Processing.ipynb

@@ -326,7 +326,7 @@
"def preprocess(words, type='doc'):\n",
" if (type == 'tweet'):\n",
" tknzr = TweetTokenizer(strip_handles=True, reduce_len=True)\n",
" tokens = tknzr.tokenize(tweet)\n",
" tokens = tknzr.tokenize(words)\n",
" else:\n",
" tokens = nltk.word_tokenize(words.lower())\n",
" porter = nltk.PorterStemmer()\n",