mirror of
https://github.com/gsi-upm/sitc
synced 2026-04-16 22:58:16 +00:00
Update 4_5_Semantic_Models.ipynb
Minor typos.
committed by GitHub
parent 1e8dbe70a3
commit d1374320f0
@@ -51,7 +51,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"In this session we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
+"In this session, we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
"\n",
"The main objectives of this session are:\n",
"* Understand the models and their differences\n",
@@ -69,9 +69,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"We are going to use on of the corpus that come prepackaged with Scikit-learn: the [20 newsgroup datase](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
+"We are going to use one of the corpora that come prepackaged with Scikit-learn: the [20 newsgroup dataset](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
"\n",
-"We inspect now the corpus using the facilities from Scikit-learn, as explain in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
+"We inspect now the corpus using the facilities from Scikit-learn, as explained in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
]
},
{
@@ -117,19 +117,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"# Converting Scikit-learn to gensim"
+"# Converting Scikit-learn to gensim."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
-"Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
+"Although scikit-learn provides an LDA implementation, it is more popular than the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
"\n",
"You should install first:\n",
"\n",
"* *gensim*. Run 'conda install gensim' in a terminal.\n",
-"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal"
+"* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal."
]
},
{
@@ -183,7 +183,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
+"Although scikit-learn provides an LDA implementation, it is more popular than the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
]
},
{