Binary package “libhtmlparser-java” in Ubuntu Trusty

Java library to parse HTML

 HTML Parser is a Java library used to parse HTML in either a linear
 or nested fashion. Primarily used for transformation or extraction,
 it features filters, visitors, custom tags and easy-to-use
 JavaBeans.
 .
 The two fundamental use-cases that are handled by the parser are
 extraction and transformation (the synthesis use-case, where HTML
 pages are created from scratch, is better handled by other tools
 closer to the source of data).
 .
 In general, to use HTML Parser you will need to be able to write
 code in the Java programming language. Although some example programs
 are provided that may be useful as they stand, it's more than likely
 you will need (or want) to create your own programs or modify the
 ones provided to match your intended application.
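
To give a rough idea of what an extraction program looks like, here is a minimal sketch that pulls link targets out of an HTML string and keeps only those accepted by a filter. It deliberately does not use the HTML Parser API: it relies only on the Java standard library and a simple regular expression, so it is far less robust than a real parser (for example, it only handles double-quoted attributes and can be fooled by attributes whose names merely end in "href"). The class name LinkExtractor and the method extractLinks are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the "extraction" use-case.
// This is NOT the HTML Parser API; it is a stdlib-only approximation.
class LinkExtractor {

    // Matches an opening <a ...> tag and captures a double-quoted href value.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+[^>]*?href\\s*=\\s*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns, in document order, every href value accepted by the filter.
    static List<String> extractLinks(String html, Predicate<String> filter) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1);
            if (filter.test(href)) {
                links.add(href);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"http://example.com/a\">A</a> and "
                + "<a class=\"x\" href=\"/b\">B</a></p>";
        // Keep only absolute links; the filter plays the role of a node filter.
        for (String link : extractLinks(html, h -> h.startsWith("http"))) {
            System.out.println(link);
        }
    }
}
```

With the library itself, the same task would normally be expressed through its own parser, filter and visitor classes rather than a regular expression; consult the documentation shipped with the package for the actual class names and signatures.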