Binary package “libboilerpipe-java” in Ubuntu Bionic (18.04)

Boilerplate removal and fulltext extraction from HTML pages

The boilerpipe library provides algorithms to detect and remove the surplus
"clutter" (boilerplate, templates) around the main textual content of a web
page.

The library already provides specific strategies for common tasks (for example,
news article extraction) and can also be easily extended for individual problem
settings.

Content extraction is very fast (on the order of milliseconds), needs only the
input document (no global or site-level information is required), and is
usually quite accurate.
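
To make the idea of boilerplate removal concrete, here is a minimal, self-contained sketch in plain Java. It is not the boilerpipe implementation and does not use the library's API. It only illustrates the general approach of judging each text block by shallow features of the input document alone, here the word count and the share of words inside links. The class name, the method names, and both thresholds are assumptions made for illustration, and the regular expressions only handle flat, non-nested blocks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: NOT boilerpipe's algorithm or API.
final class BoilerplateSketch {
    // Hypothetical thresholds, chosen for this example only.
    static final int MIN_WORDS = 10;
    static final double MAX_LINK_DENSITY = 0.3;

    // Matches flat (non-nested) block elements and captures their inner HTML.
    private static final Pattern BLOCK = Pattern.compile(
            "<(p|div|li|td)\\b[^>]*>(.*?)</\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Matches anchor elements and captures the link text.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Replaces every tag with a space so adjacent words stay separated.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ");
    }

    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Keeps blocks that are long enough and not dominated by link text.
    static String extractMainText(String html) {
        StringBuilder out = new StringBuilder();
        Matcher block = BLOCK.matcher(html);
        while (block.find()) {
            String inner = block.group(2);
            int words = countWords(stripTags(inner));
            if (words < MIN_WORDS) {
                continue; // short blocks are treated as clutter
            }
            int linkWords = 0;
            Matcher anchor = ANCHOR.matcher(inner);
            while (anchor.find()) {
                linkWords += countWords(stripTags(anchor.group(1)));
            }
            double linkDensity = (double) linkWords / words;
            if (linkDensity <= MAX_LINK_DENSITY) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(stripTags(inner).trim().replaceAll("\\s+", " "));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html =
                "<div><a href=\"/\">Home</a> <a href=\"/news\">News</a></div>"
              + "<p>The main article text is long enough to pass the word "
              + "count threshold and contains no links at all.</p>";
        System.out.println(extractMainText(html));
    }
}
```

In this sketch the navigation block is dropped because it is too short, while the paragraph is kept. Real pages need an HTML parser and more robust features than these two thresholds.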

Source package

boilerpipe 1.2.0-1 source package in Ubuntu

Published versions

libboilerpipe-java 1.2.0-1 in amd64 (Release)
libboilerpipe-java 1.2.0-1 in arm64 (Release)
libboilerpipe-java 1.2.0-1 in armhf (Release)
libboilerpipe-java 1.2.0-1 in i386 (Release)
libboilerpipe-java 1.2.0-1 in ppc64el (Release)
libboilerpipe-java 1.2.0-1 in s390x (Release)