Binary package “swish-e” in Ubuntu Lucid
Simple Web Indexing System for Humans - Enhanced
SWISH-Enhanced is a fast, powerful, flexible, and easy-to-use system
for indexing collections of HTML web pages, or any XML or text files such as
OpenOffice documents, OpenDocument files, emails, and so on.
* Quickly index a large number of text, HTML, and XML documents
* Use filters to index any type of files such as PDF, OpenOffice, DOC, XLS,
* Includes a web spider for indexing remote documents over HTTP
* Can use an external program to supply documents, including
records from a relational database.
* Word stemming, soundex, metaphone, and double-metaphone indexing for
* Powerful regular expressions to select documents for indexing or exclusion
* Limit searches to parts of documents such as certain HTML tags or to
* Index file is portable between platforms.
* A Swish-e library is provided to allow embedding Swish-e into your
applications for very fast searching.
You'll find ready-to-use examples for indexing the Debian documentation, PDF,
OpenOffice and MSOffice files, a whole Maildir, and more.
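
As an illustration of the "external program" feature listed above, here is a minimal sketch of a helper that formats one document the way an external feeder program might hand it to the indexer on standard output. It assumes the header-block convention used by Swish-e's program input method (a Path-Name line and a Content-Length line in bytes, an optional Last-Mtime line, a blank line, then the document body); these header names are quoted from memory and should be checked against the Swish-e documentation installed with the package. The function name format_record and the record path in the usage note are hypothetical.

```python
from typing import Optional


def format_record(path: str, body: str, mtime: Optional[int] = None) -> bytes:
    """Build one document record: header lines, a blank line, then the body.

    Assumption: the header names (Path-Name, Content-Length, Last-Mtime)
    follow the convention Swish-e expects from an external feeder program.
    Verify them against the Swish-e documentation before relying on this.
    """
    data = body.encode("utf-8")
    # Content-Length counts bytes, not characters, so measure the encoded body.
    headers = [f"Path-Name: {path}", f"Content-Length: {len(data)}"]
    if mtime is not None:
        # Last modification time as seconds since the Unix epoch.
        headers.append(f"Last-Mtime: {mtime}")
    head = "\n".join(headers) + "\n\n"
    return head.encode("utf-8") + data
```

A feeder script could, for example, loop over rows fetched from a database and write format_record("db/record/1", row_text) to sys.stdout.buffer for each row, so that each row is indexed as its own document.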