Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files. Swish-e is ideally suited for collections of a million documents or smaller.

Swish-e stands for Simple Web Indexing System for Humans - Enhanced. Swish-e can quickly and easily index directories of files or remote web sites and search the generated indexes.

Swish-e is extremely fast in both indexing and searching, highly configurable, and can be seamlessly integrated with existing web sites to maintain a consistent design. Swish-e can index web pages, but can just as easily index text files, mailing list archives, or data stored in a relational database.

Swish-e is designed to index small- to medium-sized collections of documents. Although a few users are indexing over a million documents, typical usage is more often in the tens of thousands. Currently, Swish-e only indexes 8-bit character encodings.

Swish-e version 2.2 was a major rewrite of the code that added many new features. Memory requirements for indexing were reduced, and indexing is significantly faster than in previous versions. New features allow more control over indexing, better document parsing, improved indexing and searching logic, better filter code, and the ability to index from any data source.

Swish-e version 2.4 includes a major rewrite of the C API and a new Perl module for accessing the Swish-e C library.

Swish-e is not a "turn-key" indexing and searching solution. The Swish-e distribution contains most of the parts to create such a system, but you need to put the parts together as best meets your needs. This gives you the power to index and search your documents the way you wish and to seamlessly integrate a search engine into your web site or application.

To use Swish-e, you will need to configure Swish-e to index your documents, create an index by running Swish-e, and set up an interface such as a CGI script (a script is included) to search the index and display results. Swish-e uses helper programs to index documents of types that it cannot index natively. These programs may need to be installed separately from Swish-e.
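The configure, index, and search steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document directory, the index file name, and the helper function names are hypothetical, and the directives (IndexDir, IndexFile, IndexOnly) and command-line switches (-c, -f, -w, -m) should be checked against the documentation for your installed Swish-e version. The script only builds the configuration text and the command lines; it does not run Swish-e itself.

```python
import shlex


def build_config(doc_dir: str, index_file: str, suffixes: list[str]) -> str:
    """Return the text of a minimal Swish-e configuration file."""
    lines = [
        f"IndexDir {doc_dir}",              # directory tree to index
        f"IndexFile {index_file}",          # file the index is written to
        "IndexOnly " + " ".join(suffixes),  # only index files with these suffixes
    ]
    return "\n".join(lines) + "\n"


def index_command(config_path: str) -> list[str]:
    """Command line that creates an index from a configuration file."""
    return ["swish-e", "-c", config_path]


def search_command(index_file: str, query: str, max_results: int = 10) -> list[str]:
    """Command line that searches an existing index for a query."""
    return ["swish-e", "-f", index_file, "-w", query, "-m", str(max_results)]


if __name__ == "__main__":
    # Hypothetical paths and file names, chosen only for illustration.
    print(build_config("/var/www/html", "site.index", [".html", ".htm", ".txt"]), end="")
    print(shlex.join(index_command("swish.conf")))
    print(shlex.join(search_command("site.index", "apples AND oranges")))
```

In practice you would save the generated text to a configuration file, run the indexing command once (or on a schedule), and have your search interface issue the search command for each user query.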

Swish-e is an Open Source program supported by developers and a large group of users. Please take time to join the Swish-e discussion list.

Using the GNOME™ libxml2 parser and a collection of filters, Swish-e can index plain text, e-mail, PDF, HTML, XML, Microsoft® Word/PowerPoint/Excel and just about any file that can be converted to XML or HTML text. Swish-e is also often used to supplement databases like the MySQL® DBMS for very fast full-text searching.

