.. Copyright (C) 2001-2019 NLTK Project
.. For license information, see LICENSE.TXT

============================
Japanese Language Processing
============================

    >>> from nltk import *

-------------
Corpus Access
-------------

KNB Corpus
----------

    >>> from nltk.corpus import knbc

Access the words: this should produce a list of strings:

    >>> type(knbc.words()[0]) is not bytes
    True

Access the sentences: this should produce a list of lists of strings:

    >>> type(knbc.sents()[0][0]) is not bytes
    True

Access the tagged words: this should produce a list of word, tag pairs:

    >>> type(knbc.tagged_words()[0])
    <... 'tuple'>

Access the tagged sentences: this should produce a list of lists of word, tag pairs:

    >>> type(knbc.tagged_sents()[0][0])
    <... 'tuple'>

JEITA Corpus
------------

    >>> from nltk.corpus import jeita

Access the tagged words: this should produce a list of word, tag pairs, where a tag is a string:

    >>> type(jeita.tagged_words()[0][1]) is not bytes
    True
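
The checks in this file each inspect only the first item. As a rough sketch, the same shape test could be applied to every item with a small helper. The names ``is_tagged_word`` and ``all_tagged_words`` are hypothetical and not part of NLTK, and the sample pairs are stand-in data (an assumption for illustration, not real corpus output) so that the block runs without the corpora installed:

```python
# Hypothetical helpers, not part of NLTK.

def is_tagged_word(item):
    """Return True if item is a 2-tuple whose members are both str (not bytes)."""
    return (
        isinstance(item, tuple)
        and len(item) == 2
        and all(isinstance(part, str) for part in item)
    )


def all_tagged_words(items):
    """Return True if every item in the iterable passes is_tagged_word."""
    return all(is_tagged_word(item) for item in items)


# Stand-in data: assumed for illustration only, not taken from KNB or JEITA.
sample = [("日本語", "名詞"), ("を", "助詞")]
good = all_tagged_words(sample)

# A bytes word should be rejected, mirroring the "is not bytes" checks.
bad = all_tagged_words([(b"bytes", "tag")])
```

With the corpora installed, it should be possible to pass a slice such as ``jeita.tagged_words()[:100]`` to ``all_tagged_words`` instead of the stand-in list.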