mirror of
https://github.com/gsi-upm/sitc
synced 2026-04-30 14:44:36 +00:00
Compare commits: b83bcf5c2b...master (8 commits)

Commits:
- c361e23c8f
- 7d473dcdf2
- 7562b18968
- d1374320f0
- 1e8dbe70a3
- b3c799e564
- 59badc1df2
- 77ed6c91be
@@ -197,7 +197,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The features are simply the position of each point in the 2 dimension plane.\n",
+"The features are simply the position of each point in the 2-dimensional plane.\n",
 "\n",
 "In other words, a point $\mathbf{x}$ is represented by its values $x_1$ and $x_2$:\n",
 "\n",
@@ -208,14 +208,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Perform the classification task on several classifiers"
+"## Perform the classification task on several classifiers."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Following, the classification on the spiral is done with several classifiers. We can see the performance on each class (each spiral), and their decision surfaces."
+"Following the classification on the spiral is done with several classifiers. We can see the performance on each class (each spiral), and their decision surfaces."
 ]
 },
 {
@@ -266,7 +266,7 @@
 "source": [
 "from sklearn.linear_model import LogisticRegression\n",
 "\n",
-"lr = LogisticRegression(n_jobs=-1)\n",
+"lr = LogisticRegression()\n",
 "lr.fit(X,y)\n",
 "\n",
 "lr_preds = lr.predict(X_test)\n",
@@ -275,8 +275,8 @@
 "print(classification_report(y_test, lr_preds))\n",
 "\n",
 "plt.figure(figsize=(10,7))\n",
-"# This methods outputs a visualization\n",
-"# the h parameter adjusts the precision of the visualization\n",
+"# This method outputs a visualization\n",
+"# The h parameter adjusts the precision of the visualization\n",
 "# if you find memory errors, set h to a higher value (e.g., h=0.1)\n",
 "plot_decision_surface(X, y, lr, h=0.02) "
 ]
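A minimal sketch of the fit-and-report pattern the updated cell uses, assuming scikit-learn is installed. The notebook's spiral data and `plot_decision_surface` helper are not part of this diff, so two Gaussian blobs from `make_blobs` stand in for `X`, `y` and the plotting step is omitted:

```python
# Sketch only: toy blobs stand in for the notebook's spiral dataset.
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

X, y = make_blobs(n_samples=200, centers=2, random_state=42)

# Default constructor, as in the updated cell; n_jobs was dropped because
# it has no effect on a single binary logistic-regression fit.
lr = LogisticRegression()
lr.fit(X, y)

preds = lr.predict(X)
print(classification_report(y, preds))
```

On linearly separable blobs a logistic regression fits cleanly; the spiral in the notebook is precisely the case where it cannot.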
@@ -535,11 +535,11 @@
 "collapsed": true
 },
 "source": [
-"We see that some classifiers (kNN, SVM) successfully learn the spiral problem. They can classify correctly in any part of the plane.\n",
+"We see that some classifiers (kNN, SVM) successfully learn the spiral problem. They can classify correctly at any point in the plane.\n",
 "\n",
 "Nevertheless, some classifiers (Logistic Regression, Gaussian Naive Bayes) are not able to learn the spiral pattern with their default configurations.\n",
 "\n",
-"In particular, the MLP performs very bad: it is not able to learn the spiral function. Nevertheless, it should be able to."
+"In particular, the MLP performs very badly: it is not able to learn the spiral function. Nevertheless, it should be able to."
 ]
 },
 {
@@ -578,7 +578,7 @@
 "- regularization of the network\n",
 "- new features that are passed to the network\n",
 "\n",
-"You can search inspiration on [this playground](http://playground.tensorflow.org)."
+"You can search for inspiration on [this playground](http://playground.tensorflow.org)."
 ]
 },
 {
@@ -621,7 +621,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The notebook is freely licensed under under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
+"The notebook is freely licensed under the [Creative Commons Attribution Share-Alike license](https://creativecommons.org/licenses/by/2.0/). \n",
 "\n",
 "© Óscar Araque, Universidad Politécnica de Madrid."
 ]
File diff suppressed because one or more lines are too long
@@ -239,7 +239,7 @@
 "name": "stderr",
 "output_type": "stream",
 "text": [
-"/home/cif/anaconda3/lib/python3.10/site-packages/sklearn/utils/deprecation.py:87: FutureWarning: Function get_feature_names is deprecated; get_feature_names is deprecated in 1.0 and will be removed in 1.2. Please use get_feature_names_out instead.\n",
+"--",
 " warnings.warn(msg, category=FutureWarning)\n"
 ]
 },
@@ -331,7 +331,7 @@
 "source": [
 "vectorizer = CountVectorizer(analyzer=\"word\", stop_words='english', binary=True) \n",
 "vectors = vectorizer.fit_transform(documents)\n",
-"vectorizer.get_feature_names()"
+"vectorizer.get_feature_names_out()"
 ]
 },
 {
@@ -363,9 +363,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"vectorizer = CountVectorizer(analyzer=\"word\", stop_words='english', ngram_range=[2,2]) \n",
+"vectorizer = CountVectorizer(analyzer=\"word\", stop_words='english', ngram_range=(2,2)) \n",
 "vectors = vectorizer.fit_transform(documents)\n",
-"vectorizer.get_feature_names()"
+"vectorizer.get_feature_names_out()"
 ]
 },
 {
@@ -401,7 +401,7 @@
 "\n",
 "vectorizer = TfidfVectorizer(analyzer=\"word\", stop_words='english')\n",
 "vectors = vectorizer.fit_transform(documents)\n",
-"vectorizer.get_feature_names()"
+"vectorizer.get_feature_names_out()"
 ]
 },
 {
@@ -429,9 +429,9 @@
 "train = [doc1, doc2, doc3]\n",
 "vectorizer = TfidfVectorizer(analyzer=\"word\", stop_words='english')\n",
 "\n",
-"# We learn the vocabulary (fit) and tranform the docs into vectors\n",
+"# We learn the vocabulary (fit) and transform the docs into vectors\n",
 "vectors = vectorizer.fit_transform(train)\n",
-"vectorizer.get_feature_names()"
+"vectorizer.get_feature_names_out()"
 ]
 },
 {
@@ -51,7 +51,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In this session we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
+"In this session, we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
 "\n",
 "The main objectives of this session are:\n",
 "* Understand the models and their differences\n",
@@ -69,9 +69,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We are going to use on of the corpus that come prepackaged with Scikit-learn: the [20 newsgroup datase](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
+"We are going to use one of the corpora that come prepackaged with Scikit-learn: the [20 newsgroup dataset](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
 "\n",
-"We inspect now the corpus using the facilities from Scikit-learn, as explain in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
+"We inspect now the corpus using the facilities from Scikit-learn, as explained in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
 ]
 },
 {
@@ -117,19 +117,19 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Converting Scikit-learn to gensim"
+"# Converting Scikit-learn to gensim."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
+"Although scikit-learn provides an LDA implementation, it is more popular than the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
 "\n",
 "You should install first:\n",
 "\n",
 "* *gensim*. Run 'conda install gensim' in a terminal.\n",
-"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal"
+"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal."
 ]
 },
 {
@@ -183,7 +183,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
+"Although scikit-learn provides an LDA implementation, it is more popular than the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
 ]
 },
 {
@@ -51,7 +51,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Here we propose several exercises, it is recommended to work only in one of them."
+"Here we propose several exercises; it is recommended to work only in one of them."
 ]
 },
 {
@@ -65,8 +65,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"You can try the exercise Exercise 2: Sentiment Analysis on movie reviews of Scikit-Learn https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html. \n",
-"Previously you should follow the installation instructions in the section Tutorial Setup."
+"You can try the exercise Exercise 2: Sentiment Analysis on movie reviews of Scikit-Learn https://scikit-learn.org/1.4/tutorial/text_analytics/working_with_text_data.html. \n",
+"Previously, you should follow the installation instructions in the section Tutorial Setup."
 ]
 },
 {