.. Copyright (C) 2001-2019 NLTK Project
.. For license information, see LICENSE.TXT

============================
Japanese Language Processing
============================

    >>> from nltk import *

-------------
Corpus Access
-------------

KNB Corpus
----------

    >>> from nltk.corpus import knbc

Access the words: this should produce a list of strings:

    >>> type(knbc.words()[0]) is not bytes
    True

Access the sentences: this should produce a list of lists of strings:

    >>> type(knbc.sents()[0][0]) is not bytes
    True

Access the tagged words: this should produce a list of (word, tag) pairs:

    >>> type(knbc.tagged_words()[0])
    <... 'tuple'>

Access the tagged sentences: this should produce a list of lists of (word, tag) pairs:

    >>> type(knbc.tagged_sents()[0][0])
    <... 'tuple'>
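
The four shapes checked above (words, sentences, tagged words, tagged
sentences) nest inside one another. The following is a minimal sketch of that
relationship, assuming hand-written sample data in place of the KNB corpus so
that it runs without any corpus download. The helper ``is_tagged_sents`` and
the ``sample_*`` names are hypothetical and are not part of NLTK:

```python
def is_tagged_sents(obj):
    """Return True if obj is shaped like a list of lists of (word, tag) pairs."""
    return all(
        isinstance(pair, tuple) and len(pair) == 2 and isinstance(pair[0], str)
        for sent in obj
        for pair in sent
    )


# Hypothetical sample data, NOT taken from the KNB corpus.
sample_tagged_sents = [[("猫", "名詞"), ("が", "助詞"), ("鳴く", "動詞")]]

# Dropping the tags gives the shape of sents(): a list of lists of strings.
sample_sents = [[word for word, _tag in sent] for sent in sample_tagged_sents]

# Flattening the sentences gives the shape of words(): a list of strings.
sample_words = [word for sent in sample_sents for word in sent]
```

With the corpus installed, the same check could be applied to a slice such as
``knbc.tagged_sents()[:5]``.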


JEITA Corpus
------------

    >>> from nltk.corpus import jeita

Access the tagged words: this should produce a list of (word, tag) pairs, where each tag is a string:

    >>> type(jeita.tagged_words()[0][1]) is not bytes
    True