Mirrored from
https://github.com/gsi-upm/sitc
Synced 2025-09-16 11:52:20 +00:00
Compare commits
3 commits
dveni-patc
...
75f08ea170
Author | SHA1 | Date | |
---|---|---|---|
| 75f08ea170 | | |
| 19ea5dff09 | | |
| e70689072f | | |
@@ -326,7 +326,7 @@
 "def preprocess(words, type='doc'):\n",
 "    if (type == 'tweet'):\n",
 "        tknzr = TweetTokenizer(strip_handles=True, reduce_len=True)\n",
-"        tokens = tknzr.tokenize(tweet)\n",
+"        tokens = tknzr.tokenize(words)\n",
 "    else:\n",
 "        tokens = nltk.word_tokenize(words.lower())\n",
 "    porter = nltk.PorterStemmer()\n",
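The one-line change in this hunk replaces `tweet`, which is not a parameter of `preprocess`, with the actual argument `words`. The sketch below is an illustration only, not the notebook's code: it swaps the NLTK tokenizers for plain `str.split` (an assumption made so the example runs without NLTK), and the names `preprocess_before`, `preprocess_after`, and `demo` are hypothetical. It shows that the pre-change branch fails with a `NameError` when no global `tweet` happens to exist, while the post-change branch tokenizes the text that was passed in.

```python
def preprocess_before(words, type='doc'):
    # Pre-change shape: the 'tweet' branch reads `tweet`, which is neither
    # a parameter nor a local, so Python looks it up as a global.
    if type == 'tweet':
        tokens = tweet.split()  # noqa: F821 (stand-in for tknzr.tokenize(tweet))
    else:
        tokens = words.lower().split()
    return tokens


def preprocess_after(words, type='doc'):
    # Post-change shape: both branches operate on the `words` argument.
    if type == 'tweet':
        tokens = words.split()  # stand-in for tknzr.tokenize(words)
    else:
        tokens = words.lower().split()
    return tokens


def demo():
    text = "Hello @user this is gooood"
    try:
        preprocess_before(text, type='tweet')
        before = "no error"
    except NameError:
        before = "NameError"
    after = preprocess_after(text, type='tweet')
    return before, after


if __name__ == "__main__":
    print(demo())
```

If a global variable named `tweet` were defined in the notebook session, the pre-change code would not raise; it would quietly tokenize that global instead of the argument, which is arguably the harder bug to notice.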