mirror of https://github.com/gsi-upm/sitc synced 2026-04-16 22:58:16 +00:00

Update 4_5_Semantic_Models.ipynb

Minor typos.
Carlos A. Iglesias
2026-04-16 16:34:36 +02:00
committed by GitHub
parent 1e8dbe70a3
commit d1374320f0

@@ -51,7 +51,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"In this session we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
+"In this session, we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
"\n",
"The main objectives of this session are:\n",
"* Understand the models and their differences\n",
@@ -69,9 +69,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"We are going to use on of the corpus that come prepackaged with Scikit-learn: the [20 newsgroup datase](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
+"We are going to use one of the corpora that come prepackaged with Scikit-learn: the [20 newsgroup dataset](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
"\n",
-"We inspect now the corpus using the facilities from Scikit-learn, as explain in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
+"We inspect now the corpus using the facilities from Scikit-learn, as explained in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
]
},
{
@@ -117,19 +117,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"# Converting Scikit-learn to gensim"
+"# Converting Scikit-learn to gensim."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
-"Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
+"Although scikit-learn provides an LDA implementation, it is more popular than the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
"\n",
"You should install first:\n",
"\n",
"* *gensim*. Run 'conda install gensim' in a terminal.\n",
-"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal"
+"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal."
]
},
{
@@ -183,7 +183,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
+"Although scikit-learn provides an LDA implementation, it is more popular than the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
]
},
{