mirror of https://github.com/gsi-upm/sitc synced 2025-08-24 02:22:21 +00:00

added import nltk

cif2cif
2017-04-20 16:07:10 +02:00
parent c55a1c077b
commit b24f866056
12 changed files with 180 additions and 179 deletions

@@ -162,157 +162,13 @@
 },
 {
 "cell_type": "code",
-"execution_count": 29,
+"execution_count": null,
 "metadata": {
 "collapsed": false
 },
-"outputs": [
-{
-"name": "stdout",
-"output_type": "stream",
-"text": [
-"$: dollar\n",
-" $ -$ --$ A$ C$ HK$ M$ NZ$ S$ U.S.$ US$\n",
-"'': closing quotation mark\n",
-" ' ''\n",
-"(: opening parenthesis\n",
-" ( [ {\n",
-"): closing parenthesis\n",
-" ) ] }\n",
-",: comma\n",
-" ,\n",
-"--: dash\n",
-" --\n",
-".: sentence terminator\n",
-" . ! ?\n",
-":: colon or ellipsis\n",
-" : ; ...\n",
-"CC: conjunction, coordinating\n",
-" & 'n and both but either et for less minus neither nor or plus so\n",
-" therefore times v. versus vs. whether yet\n",
-"CD: numeral, cardinal\n",
-" mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one forty-\n",
-" seven 1987 twenty '79 zero two 78-degrees eighty-four IX '60s .025\n",
-" fifteen 271,124 dozen quintillion DM2,000 ...\n",
-"DT: determiner\n",
-" all an another any both del each either every half la many much nary\n",
-" neither no some such that the them these this those\n",
-"EX: existential there\n",
-" there\n",
-"FW: foreign word\n",
-" gemeinschaft hund ich jeux habeas Haementeria Herr K'ang-si vous\n",
-" lutihaw alai je jour objets salutaris fille quibusdam pas trop Monte\n",
-" terram fiche oui corporis ...\n",
-"IN: preposition or conjunction, subordinating\n",
-" astride among uppon whether out inside pro despite on by throughout\n",
-" below within for towards near behind atop around if like until below\n",
-" next into if beside ...\n",
-"JJ: adjective or numeral, ordinal\n",
-" third ill-mannered pre-war regrettable oiled calamitous first separable\n",
-" ectoplasmic battery-powered participatory fourth still-to-be-named\n",
-" multilingual multi-disciplinary ...\n",
-"JJR: adjective, comparative\n",
-" bleaker braver breezier briefer brighter brisker broader bumper busier\n",
-" calmer cheaper choosier cleaner clearer closer colder commoner costlier\n",
-" cozier creamier crunchier cuter ...\n",
-"JJS: adjective, superlative\n",
-" calmest cheapest choicest classiest cleanest clearest closest commonest\n",
-" corniest costliest crassest creepiest crudest cutest darkest deadliest\n",
-" dearest deepest densest dinkiest ...\n",
-"LS: list item marker\n",
-" A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002 SP-44005\n",
-" SP-44007 Second Third Three Two * a b c d first five four one six three\n",
-" two\n",
-"MD: modal auxiliary\n",
-" can cannot could couldn't dare may might must need ought shall should\n",
-" shouldn't will would\n",
-"NN: noun, common, singular or mass\n",
-" common-carrier cabbage knuckle-duster Casino afghan shed thermostat\n",
-" investment slide humour falloff slick wind hyena override subhumanity\n",
-" machinist ...\n",
-"NNP: noun, proper, singular\n",
-" Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos\n",
-" Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA\n",
-" Shannon A.K.C. Meltex Liverpool ...\n",
-"NNPS: noun, proper, plural\n",
-" Americans Americas Amharas Amityvilles Amusements Anarcho-Syndicalists\n",
-" Andalusians Andes Andruses Angels Animals Anthony Antilles Antiques\n",
-" Apache Apaches Apocrypha ...\n",
-"NNS: noun, common, plural\n",
-" undergraduates scotches bric-a-brac products bodyguards facets coasts\n",
-" divestitures storehouses designs clubs fragrances averages\n",
-" subjectivists apprehensions muses factory-jobs ...\n",
-"PDT: pre-determiner\n",
-" all both half many quite such sure this\n",
-"POS: genitive marker\n",
-" ' 's\n",
-"PRP: pronoun, personal\n",
-" hers herself him himself hisself it itself me myself one oneself ours\n",
-" ourselves ownself self she thee theirs them themselves they thou thy us\n",
-"PRP$: pronoun, possessive\n",
-" her his mine my our ours their thy your\n",
-"RB: adverb\n",
-" occasionally unabatingly maddeningly adventurously professedly\n",
-" stirringly prominently technologically magisterially predominately\n",
-" swiftly fiscally pitilessly ...\n",
-"RBR: adverb, comparative\n",
-" further gloomier grander graver greater grimmer harder harsher\n",
-" healthier heavier higher however larger later leaner lengthier less-\n",
-" perfectly lesser lonelier longer louder lower more ...\n",
-"RBS: adverb, superlative\n",
-" best biggest bluntest earliest farthest first furthest hardest\n",
-" heartiest highest largest least less most nearest second tightest worst\n",
-"RP: particle\n",
-" aboard about across along apart around aside at away back before behind\n",
-" by crop down ever fast for forth from go high i.e. in into just later\n",
-" low more off on open out over per pie raising start teeth that through\n",
-" under unto up up-pp upon whole with you\n",
-"SYM: symbol\n",
-" % & ' '' ''. ) ). * + ,. < = > @ A[fj] U.S U.S.S.R * ** ***\n",
-"TO: \"to\" as preposition or infinitive marker\n",
-" to\n",
-"UH: interjection\n",
-" Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey Kee-reist Oops amen\n",
-" huh howdy uh dammit whammo shucks heck anyways whodunnit honey golly\n",
-" man baby diddle hush sonuvabitch ...\n",
-"VB: verb, base form\n",
-" ask assemble assess assign assume atone attention avoid bake balkanize\n",
-" bank begin behold believe bend benefit bevel beware bless boil bomb\n",
-" boost brace break bring broil brush build ...\n",
-"VBD: verb, past tense\n",
-" dipped pleaded swiped regummed soaked tidied convened halted registered\n",
-" cushioned exacted snubbed strode aimed adopted belied figgered\n",
-" speculated wore appreciated contemplated ...\n",
-"VBG: verb, present participle or gerund\n",
-" telegraphing stirring focusing angering judging stalling lactating\n",
-" hankerin' alleging veering capping approaching traveling besieging\n",
-" encrypting interrupting erasing wincing ...\n",
-"VBN: verb, past participle\n",
-" multihulled dilapidated aerosolized chaired languished panelized used\n",
-" experimented flourished imitated reunifed factored condensed sheared\n",
-" unsettled primed dubbed desired ...\n",
-"VBP: verb, present tense, not 3rd person singular\n",
-" predominate wrap resort sue twist spill cure lengthen brush terminate\n",
-" appear tend stray glisten obtain comprise detest tease attract\n",
-" emphasize mold postpone sever return wag ...\n",
-"VBZ: verb, present tense, 3rd person singular\n",
-" bases reconstructs marks mixes displeases seals carps weaves snatches\n",
-" slumps stretches authorizes smolders pictures emerges stockpiles\n",
-" seduces fizzes uses bolsters slaps speaks pleads ...\n",
-"WDT: WH-determiner\n",
-" that what whatever which whichever\n",
-"WP: WH-pronoun\n",
-" that what whatever whatsoever which who whom whosoever\n",
-"WP$: WH-pronoun, possessive\n",
-" whose\n",
-"WRB: Wh-adverb\n",
-" how however whence whenever where whereby whereever wherein whereof why\n",
-"``: opening quotation mark\n",
-" ` ``\n"
-]
-}
-],
+"outputs": [],
 "source": [
+"import nltk\n",
 "nltk.help.upenn_tagset()"
 ]
 },