r-cran-tokenizers binary package in Ubuntu Oracular amd64

Convert natural language text into tokens. Includes tokenizers for
shingled n-grams, skip n-grams, words, word stems, sentences,
paragraphs, characters, shingled characters, lines, tweets, Penn
Treebank, regular expressions, as well as functions for counting
characters, words, and sentences, and a function for splitting longer
texts into separate documents, each with the same number of words.
The tokenizers have a consistent interface, and the package is built
on the 'stringi' and 'Rcpp' packages for fast yet correct
tokenization in 'UTF-8'.
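To make two of the listed operations concrete, here is a minimal illustrative sketch of shingled n-grams and fixed-size word chunking. It is written in Python purely for illustration and is not the package's R API or implementation; the function names `tokenize_words`, `shingled_ngrams`, and `chunk_words` are hypothetical, and the simple `\w+` word rule is an assumption that is far cruder than the 'stringi'-based tokenization described above.

```python
import re


def tokenize_words(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a "word" is any run of \\w characters; the real package
    uses Unicode-aware word boundaries instead.
    """
    return re.findall(r"\w+", text.lower())


def shingled_ngrams(words, n):
    """Return every run of n consecutive words, joined by single spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def chunk_words(words, chunk_size):
    """Split a word list into consecutive chunks of chunk_size words.

    The final chunk is shorter when the word count is not a multiple
    of chunk_size.
    """
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


words = tokenize_words("The quick brown fox jumps.")
bigrams = shingled_ngrams(words, 2)
chunks = chunk_words(words, 2)
```

"Shingled" means the n-grams overlap: each one starts one word after the previous one, so five words yield four bigrams. How the package itself handles a trailing partial chunk is not stated here, so the behaviour of `chunk_words` on the last chunk is only one possible choice.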

Publishing history

Date	Status	Target	Pocket	Component	Section	Priority	Phased updates	Version
2024-04-29 18:55:16 UTC	Published	Ubuntu Oracular amd64	release	universe	gnu-r	Optional	-	0.3.0-1

Published on 2024-04-29. Copied from ubuntu lunar-proposed amd64 in Primary Archive for Ubuntu.

Source package

r-cran-tokenizers package in Ubuntu