r-cran-tokenizers 0.2.3-1 (arm64 binary) in ubuntu lunar
Convert natural language text into tokens. Includes tokenizers for
shingled n-grams, skip n-grams, words, word stems, sentences,
paragraphs, characters, shingled characters, lines, tweets, Penn
Treebank, and regular expressions, as well as functions for counting
characters, words, and sentences, and a function for splitting longer
texts into separate documents, each with the same number of words.
The tokenizers have a consistent interface, and the package is built
on the 'stringi' and 'Rcpp' packages for fast yet correct
tokenization in 'UTF-8'.
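
As a rough illustration of two of the ideas named above, shingled n-grams and splitting a text into fixed-size documents, here is a minimal Python sketch. It is not the package's R interface or its implementation: the function names (tokenize_words, shingled_ngrams, chunk_words) and their parameters are hypothetical, and the whitespace-based word splitting is a simplification assumed only for this example.

```python
def tokenize_words(text):
    """Lowercase the text and split it on whitespace (a deliberate simplification)."""
    return text.lower().split()


def shingled_ngrams(words, n_min, n):
    """Return every run of n_min..n consecutive words, each joined by a space.

    "Shingled" here means the n-grams overlap: a new run starts at
    every word position rather than at every n-th position.
    """
    grams = []
    for start in range(len(words)):
        for size in range(n_min, n + 1):
            if start + size <= len(words):
                grams.append(" ".join(words[start:start + size]))
    return grams


def chunk_words(words, chunk_size):
    """Split a word list into consecutive pieces of chunk_size words.

    In this sketch the final piece may be shorter when the word count
    is not a multiple of chunk_size.
    """
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


if __name__ == "__main__":
    words = tokenize_words("The quick brown fox")
    print(shingled_ngrams(words, 1, 2))
    print(chunk_words(words, 3))
```

The sketch keeps tokenization and n-gram construction as separate steps, so the same word list can feed both the n-gram and the chunking functions.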
Details
- Package version: 0.2.3-1
- Status: Superseded
- Component: universe
- Priority: Optional
Downloadable files
arm64 build of r-cran-tokenizers 0.2.3-1 in ubuntu lunar PROPOSED produced
these files:
- r-cran-tokenizers_0.2.3-1_arm64.deb (645.9 KiB)
Package relationships
- Suggests:
- Recommends: