diff --git a/nlp/4_5_Semantic_Models.ipynb b/nlp/4_5_Semantic_Models.ipynb
index b2e9b9a..1a8851e 100644
--- a/nlp/4_5_Semantic_Models.ipynb
+++ b/nlp/4_5_Semantic_Models.ipynb
@@ -51,7 +51,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "In this session we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
+ "In this session, we provide a quick overview of the semantic models presented during the classes. In this case, we will use a real corpus so that we can extract meaningful patterns.\n",
  "\n",
  "The main objectives of this session are:\n",
  "* Understand the models and their differences\n",
@@ -69,9 +69,9 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "We are going to use on of the corpus that come prepackaged with Scikit-learn: the [20 newsgroup datase](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
+ "We are going to use one of the corpora that come prepackaged with Scikit-learn: the [20 newsgroup dataset](http://qwone.com/~jason/20Newsgroups/). The 20 newsgroup dataset contains 20k documents that belong to 20 topics.\n",
  "\n",
- "We inspect now the corpus using the facilities from Scikit-learn, as explain in [scikit-learn](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
+ "We now inspect the corpus using the facilities from Scikit-learn, as explained in the [scikit-learn documentation](http://scikit-learn.org/stable/datasets/twenty_newsgroups.html#newsgroups)"
  ]
  },
  {
@@ -117,19 +117,19 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "# Converting Scikit-learn to gensim"
+ "# Converting Scikit-learn to gensim."
  ]
  },
  {
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*. Anyway, if you are using intensively LDA,it can be convenient to create the corpus with their functions.\n",
+ "Although scikit-learn provides an LDA implementation, the *gensim* package is more popular; it also provides an LSI implementation, as well as other functionality. Fortunately, scikit-learn sparse matrices can be used in Gensim through the function *matutils.Sparse2Corpus()*. Still, if you use LDA intensively, it can be convenient to create the corpus with gensim's own functions.\n",
  "\n",
  "You should install first:\n",
  "\n",
  "* *gensim*. Run 'conda install gensim' in a terminal.\n",
- "* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal"
+ "* *python-Levenshtein*. Run 'conda install python-Levenshtein' in a terminal."
  ]
  },
  {
@@ -183,7 +183,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Although scikit-learn provides an LDA implementation, it is more popular the package *gensim*, which also provides an LSI implementation, as well as other functionalities. Fortunately, scikit-learn sparse matrices can be used in Gensim using the function *matutils.Sparse2Corpus()*."
+ "Although scikit-learn provides an LDA implementation, the *gensim* package is more popular; it also provides an LSI implementation, as well as other functionality. Fortunately, scikit-learn sparse matrices can be used in Gensim through the function *matutils.Sparse2Corpus()*."
  ]
  },
  {