Full-Text Search Engines
Implementing a full-text search engine is becoming more and more important, at least as far as my work is concerned.
A while ago I implemented the TSearch2 full-text search engine on one of our PostgreSQL databases [1]. It wasn’t easy (mostly because I wanted it to support the Romanian language), but I did get it working properly, with quite impressive results [2].
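To give a rough idea of what an engine like this does under the hood, here is a toy sketch in Python. It is not TSearch2's actual code: `ToyIndex`, `tokenize`, `add` and `search` are names made up for illustration, and the only assumption is that the engine keeps an inverted index (term to document ids) that is updated as each document is added.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Hypothetical in-memory inverted index; not how TSearch2 is really built."""

    def __init__(self):
        # Maps each term to the set of document ids that contain it.
        self.postings = defaultdict(set)

    @staticmethod
    def tokenize(text):
        # Lowercase, then split on runs of non-word characters.
        # In Python 3, \W is Unicode-aware for str, so letters such as
        # "ă" or "ș" are kept inside words.
        return [t for t in re.split(r"\W+", text.lower()) if t]

    def add(self, doc_id, text):
        # The index is updated immediately, so the document is
        # searchable as soon as this call returns.
        for term in self.tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: ids of documents containing every query term.
        terms = self.tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = ToyIndex()
index.add(1, "PostgreSQL full-text search with TSearch2")
index.add(2, "Sphinx works with MySQL and PostgreSQL")
print(index.search("postgresql search"))  # {1}
```

A real engine adds stemming, stop words and ranking on top of this (the language-specific part is what made Romanian support painful), none of which the sketch attempts.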
Since then I have stumbled upon many more free full-text search engines that claim all sorts of features, and I seem to find myself on the eve of a great need for a high-performance full-text search implementation. So this post is something of a preliminary list of the tools out there that could help me. I plan (as you can see in the Pending list at the right) to run extensive tests on the subject pretty soon, and I will of course post the results here in a more coherent (and hopefully much more meaningful) report.
So here they are, in all their glory [3]:
- TSearch2 (PostgreSQL)
- Sphinx (MySQL and PostgreSQL - provides server daemon)
- OpenFTS (PostgreSQL)
- Namazu (CGI based)
- Apache Lucene (Java based) and servers (Solr)
- PyLucene (Python extension wrapping Java Lucene)
- Plucene (Perl port of Lucene)
- Egothor (Java based)
- MG4J (Java based)
- Ioda (CGI and server daemon)
- Zend_Search (PHP5 based)
- Xapian (C++ based with bindings for many languages)
- Swish-e (libxml2 based)
- Hyper Estraier (for communities - supports CGI)
Feel free to correct me where necessary. I hope readers will complete this post with other search systems; that would help me in my attempt to find “the right one”.
Gotta go now, bye!
[1] I have yet to use my TSearch2 implementation in a heavy-traffic environment.
[2] From my experience so far, and from reading other reviews, tests and opinions, I know that TSearch2 updates its indexes in real time, while both searching and indexing become increasingly slower as the indexes grow large.
[3] Bear in mind that this is only a listing, and only free or open-source systems are being considered!