Binary package “libboilerpipe-java” in ubuntu bionic

Boilerplate removal and full-text extraction from HTML pages

 The boilerpipe library provides algorithms to detect and remove the surplus
 "clutter" (boilerplate, templates) around the main textual content of a web
 page.
 .
 The library already provides specific strategies for common tasks (for example,
 news article extraction) and can easily be extended for individual problem
 settings.
 .
 Extraction is very fast (milliseconds), requires only the input document
 (no global or site-level information), and is usually quite accurate.
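
To make the idea of separating main content from "clutter" concrete, here is a minimal, self-contained Java sketch. It is not boilerpipe's code or API: the class name BoilerplateSketch, its methods, and both thresholds are hypothetical and chosen only for this illustration. It assumes flat, non-nested block markup and keeps a block only when it has enough words and a low share of link text, working from the single input document alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only: NOT boilerpipe's implementation or API.
final class BoilerplateSketch {

    // Hypothetical thresholds, picked for this example rather than tuned.
    static final int MIN_WORDS = 10;
    static final double MAX_LINK_DENSITY = 0.33;

    // Assumes flat (non-nested) <p>, <div> and <li> blocks.
    private static final Pattern BLOCK =
            Pattern.compile("(?is)<(p|div|li)\\b[^>]*>(.*?)</\\1\\s*>");
    private static final Pattern ANCHOR =
            Pattern.compile("(?is)<a\\b[^>]*>(.*?)</a\\s*>");
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");

    // Counts whitespace-separated tokens.
    static int countWords(String text) {
        String t = text.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    // Replaces tags with spaces and collapses runs of whitespace.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    // A block counts as content if it is long enough and not mostly link text.
    static boolean isContent(String blockHtml) {
        int words = countWords(stripTags(blockHtml));
        if (words < MIN_WORDS) {
            return false;
        }
        int linkWords = 0;
        Matcher a = ANCHOR.matcher(blockHtml);
        while (a.find()) {
            linkWords += countWords(stripTags(a.group(1)));
        }
        return (double) linkWords / words <= MAX_LINK_DENSITY;
    }

    // Returns the text of all blocks classified as content, one per line.
    static String extractMainText(String html) {
        List<String> kept = new ArrayList<>();
        Matcher m = BLOCK.matcher(html);
        while (m.find()) {
            String inner = m.group(2);
            if (isContent(inner)) {
                kept.add(stripTags(inner));
            }
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        String html =
                "<div><a href=\"/\">Home</a> <a href=\"/news\">News</a></div>"
              + "<p>The boilerpipe library provides algorithms to detect and remove"
              + " the surplus clutter around the main textual content of a web page.</p>"
              + "<p><a href=\"/terms\">Terms of use</a></p>";
        // Only the long, link-free paragraph passes both checks.
        System.out.println(extractMainText(html));
    }
}
```

Short navigation and footer blocks are dropped by the word-count check, and long blocks made up mostly of links are dropped by the link-density check. Real pages with nested or malformed markup need an actual HTML parser rather than regular expressions.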