
History: Unified Index Comparison

Source of version: 62

! Unified Index Comparison
The ((Search and List from Unified Index)) supports multiple engines. All of them offer the same general functionality and plug into features such as the content search, ((PluginList)), ((PluginCustomSearch)) and others, but they have different performance characteristics, and some offer additional features.

As a general rule, the engine can simply be switched and the index rebuilt without any additional change to the configuration.

!! Overview
The unified index engines are:
* ((MariaDB)) / ((MySQL)) / ((Percona Server for MySQL)) Full Text Search __MyISAM__
** __First version: ((Tiki12)) Last version: ((Tiki27))__ (the default engine for all these versions)
** Additional memory required
** Fast indexing (can be 10 times faster than the now-removed Zend_Search_Lucene), but slower/unstable query speed
** No configuration required
** Not customizable
** Index data is stored in database tables with a prefix of index_, which use ((MyISAM)) even if the actual data is in ((InnoDB))
* ((MariaDB)) / ((MySQL)) / ((Percona Server for MySQL)) Full Text Search __INNODB__
** ''[https://gitlab.com/tikiwiki/tiki/-/merge_requests/5297|introduced] in __((Tiki28))__'' (new default engine from now on)
** No more limit on the number of columns (we now work around InnoDB limits)
** Index data is stored in database tables with a prefix of index_, which use ((InnoDB)). All the data tables should also be in InnoDB, but the data may still be in ((MyISAM)), e.g. a Tiki instance that was started when MyISAM was the default and was never converted.
* ((Elasticsearch))
** ''introduced in ((Tiki12))''
** Independent Java server(s), horizontally scalable
** Feature-rich
** Fast indexing, fast/stable query speed, decent/good results
** Typically, Elasticsearch is set up as a cluster on different servers than Tiki (or using a third-party service), but it is also possible to install on the same server.
** Customizable
* ((Manticore Search))
** ''introduced in ((Tiki25))''
** Feature-rich
** Very fast
** Written in C++ with ((Manticore Buddy)) in PHP
** Customizable
** Can be set up as a cluster
** Requires little RAM (compared to Elasticsearch)
*** It runs fine on a virtual machine with 1 GB of RAM
** Great support for [https://packagist.org/packages/manticoresoftware/|PHP]
** [https://forum.manticoresearch.com/t/install-and-start-on-centos-7-without-root-access/511|Can be installed without root access] so shared hosting should be OK (SSH access is still needed).
** [https://wikisuite.org/Manticore-Search|This is the default setup for WikiSuite] once ((Tiki26)) is released, and it is an option of the [http://wikisuite.org/How-to-install-WikiSuite|installer].

The system is designed to stay independent of any particular engine, so more can be added later. No long-term data is stored in the indexes, and it's fairly easy to switch from one to another. The next logical addition is ((OpenSearch)). Please contact Marc Laporte if you have specific needs.

!! Limitations
!!! MySQL/MariaDB MYISAM
* Words with fewer than 3 or 4 characters will not be indexed unless the server configuration is modified. Variable names: ft_min_word_len (MyISAM, default 4) and innodb_ft_min_token_size (InnoDB, default 3)
* Comes with an extensive list of English stop words, preventing many queries from working.
* Can use a single index at a time. Depending on the query, performance can vary significantly.
* Several limitations on the number of columns and indexes it can contain. Complex sites with many different query patterns may hit those limitations.
* No support for field boosting, such as providing more relevance for hits on the title.
* There is a limit on the number of tracker fields. The limit is quite high (2000+), but when you hit it, you need to move to another engine because MySQL/MariaDB has a [https://dev.mysql.com/doc/mysql-reslimits-excerpt/5.7/en/column-count-limit.html#:~:text=MySQL%20has%20hard%20limit%20of,columns%20cannot%20exceed%20this%20size.|hard limit]. It is not possible to know the exact maximum number of fields in advance because some tracker field types require more than one column.
** It is possible to ((exclude some fields from the unified index)) to stay within the limit.
* ((How to search currency amounts)) likely produces bad results (to be tested)
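
The first two limitations above (minimum word length and stop words) can be illustrated with a small sketch. It is not how MySQL/MariaDB tokenizes text internally. It only approximates which query words would survive, assuming the default minimum lengths (4 for ft_min_word_len, 3 for innodb_ft_min_token_size). The function name and the tiny stop word sample are hypothetical and chosen for illustration; the real server lists are much longer.

```python
import re

# Hypothetical sample only: the real server stop word list is far longer.
SAMPLE_STOPWORDS = frozenset({"the", "and", "about", "which", "from"})


def indexable_terms(query, min_len=4, stopwords=SAMPLE_STOPWORDS):
    """Approximate which words of a query a full-text index would keep.

    min_len mirrors ft_min_word_len (default 4) or
    innodb_ft_min_token_size (default 3). Tokenization here is a simple
    regex split, which is an assumption, not the server's actual parser.
    """
    words = re.findall(r"\w+", query.lower())
    return [w for w in words if len(w) >= min_len and w not in stopwords]


if __name__ == "__main__":
    query = "The PHP API from Tiki"
    print(indexable_terms(query, min_len=4))
    print(indexable_terms(query, min_len=3))
```

With a minimum length of 4, short terms such as "PHP" and "API" are dropped, which is why searches for acronyms can return nothing until the server variable is lowered and the index rebuilt.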

!!! MySQL/MariaDB INNODB
* This is new in ((Tiki28)) so we will discover limitations as we use it, but already [https://gitlab.com/tikiwiki/tiki/-/merge_requests/5297|we will no longer be limited by the number of tracker fields] (we now work around InnoDB limits).

!!! Elasticsearch
* [https://www.elastic.co/pricing/faq/licensing|No longer Open Source]
* Requires a dedicated environment to be installed and works better with multiple instances running in a cluster.
** Requires Java and a lot of RAM/CPU {QUOTE(source_url="https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_memory")}A machine with 64 GB of RAM is the ideal sweet spot, but 32 GB and 16 GB machines are also common.{QUOTE}

!!! Manticore
* There is a hard limit of 256 full text fields per index. Additional fields will be slower.
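
To check whether a site is near the 256-field limit, a rough count is enough. The sketch below is only an illustration: the function name is hypothetical, and it assumes fields are taken in list order, which is not necessarily how Tiki decides which fields become full text fields.

```python
MANTICORE_MAX_FULLTEXT_FIELDS = 256  # hard limit per index, as stated above


def split_fulltext_fields(field_names, limit=MANTICORE_MAX_FULLTEXT_FIELDS):
    """Split field names into those within the limit and the overflow.

    Assumption: fields are taken in the order given. The actual
    selection order used by Tiki is not described here.
    """
    fields = list(field_names)
    return fields[:limit], fields[limit:]


if __name__ == "__main__":
    # Hypothetical field names for a site with 300 indexed fields.
    names = [f"tracker_field_{i}" for i in range(300)]
    within, overflow = split_fulltext_fields(names)
    print(len(within), "full text fields,", len(overflow), "beyond the limit")
```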

!! Extra features
* ((Stored Search))
** Only supported by Elasticsearch and Manticore
* ((Faceted search)) (dynamic filters applicable on search results)
** Only supported by Elasticsearch and Manticore
* ((Module More Like This))
** Only supported by Elasticsearch (Manticore on roadmap)
* ((Federated Search))
** Only supported by Elasticsearch (Manticore on roadmap)
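
The support list above can be read as a small lookup table. The following sketch encodes it as data and returns the engines that cover a set of required features. The engine keys ("mysql", "elasticsearch", "manticore") and the function name are hypothetical labels for this example, not Tiki preference values.

```python
# Feature support as listed above; roadmap items count as unsupported.
FEATURE_SUPPORT = {
    "Stored Search": {"elasticsearch", "manticore"},
    "Faceted search": {"elasticsearch", "manticore"},
    "Module More Like This": {"elasticsearch"},
    "Federated Search": {"elasticsearch"},
}

ALL_ENGINES = {"mysql", "elasticsearch", "manticore"}


def engines_supporting(required_features):
    """Return the sorted engines that support every required feature."""
    candidates = set(ALL_ENGINES)
    for feature in required_features:
        candidates &= FEATURE_SUPPORT[feature]
    return sorted(candidates)


if __name__ == "__main__":
    print(engines_supporting(["Faceted search"]))
    print(engines_supporting(["Faceted search", "Federated Search"]))
```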

!! Selection guidelines
!!! Tiki 25 and before
* Small sites, simple functionality: MariaDB/MySQL Full Text Search
* Medium or large sites, advanced functionality: Elasticsearch

!!! Tiki 26 and up
* Small sites, simple functionality: MariaDB/MySQL Full Text Search
* If you are already using Elasticsearch and are happy with it: Elasticsearch
* Medium or large sites, advanced functionality: Manticore
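
The guidelines above can be written as a short decision function. This is one possible reading of them, not an official rule: the function and parameter names are hypothetical, and it assumes that an existing Elasticsearch setup you are happy with takes priority on Tiki 26 and up.

```python
def recommend_engine(tiki_version, large_or_advanced, happy_with_elasticsearch=False):
    """Suggest a unified index engine following the guidelines above.

    tiki_version: major Tiki version as an integer (e.g. 25, 26).
    large_or_advanced: True for medium/large sites or advanced functionality.
    happy_with_elasticsearch: True if Elasticsearch is already in use and
        working well (only considered for Tiki 26 and up).
    """
    if tiki_version <= 25:
        return "Elasticsearch" if large_or_advanced else "MariaDB/MySQL Full Text Search"
    if happy_with_elasticsearch:
        return "Elasticsearch"
    return "Manticore" if large_or_advanced else "MariaDB/MySQL Full Text Search"


if __name__ == "__main__":
    print(recommend_engine(24, large_or_advanced=True))
    print(recommend_engine(27, large_or_advanced=True))
    print(recommend_engine(27, large_or_advanced=False))
```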

!! Speed comparison
* Executive summary: Manticore Search is super fast and requires fewer resources (even just 1 GB of RAM can manage a surprisingly large data set)
* https://db-benchmarks.com/?cache=fast_avg&engines=clickhouse_21.8.11.4%2Celasticsearch_7.15.2%2Cmanticoresearch_6.0.2%2Cmeilisearch_1.1.1%2Cmysql_8.0.28%2Cmysql_percona_8.0.28-19%2Cpostgres_15.2+%28Debian+15.2-1.pgdg110%2B1%29%2Ctypesense_26.0&tests=hn_small&memory=110000&queries=0%2C1%2C2%2C5%2C6%2C7%2C8%2C16%2C17%2C18%2C19%2C20%2C21%2C22%2C26%2C27

!! Differences of results between engines
((index:compare-engines))

!! Legacy
Zend_Search_Lucene (PHP implementation) was introduced in ((Tiki7)) and later removed. Last version: ((Tiki21)).


-=alias=-
* (alias(Search Engine Comparison))
        

History

Information Version
Marc Laporte 64
Marc Laporte 63
Marc Laporte 62
Marc Laporte 61
Marc Laporte 60
Marc Laporte 59
Marc Laporte Full Text Search MyISAM vs Full Text Search INNODB 58
Marc Laporte Important info 57
Marc Laporte 56
Bernard Sfez / Tiki Specialist Improved some information 55
Marc Laporte Manticore Buddy 54
Marc Laporte Adding Percona Server for MySQL, which is a "a free, fully compatible drop in replacement for Oracle MySQL." 53
Marc Laporte 52
Marc Laporte How to search currency amounts 51
Marc Laporte 50
Marc Laporte 49
Marc Laporte 48
Marc Laporte 47
Marc Laporte 46
Marc Laporte 45
Marc Laporte 44
Marc Laporte 43
Marc Laporte 42
Marc Laporte 41
Marc Laporte 40
Marc Laporte 39
Marc Laporte 38
Marc Laporte 37
Marc Laporte 36
Marc Laporte Remove info about Zend_Search_Lucene 35
Marc Laporte 34
Marc Laporte 33
Marc Laporte 32
Marc Laporte 31
Marc Laporte 30
Marc Laporte 29
Marc Laporte 28
Marc Laporte 27
Marc Laporte Zend_Search_Lucene was removed so doesn't make sense to recommend 26
Marc Laporte 25
Marc Laporte A bit of info about Manticore Search 24
Marc Laporte 23
Marc Laporte 22
Marc Laporte 21
Marc Laporte 20
Marc Laporte 19
Marc Laporte 18
Marc Laporte 17
Marc Laporte Zend Leach Lucene: no more releases: https://github.com/zendframework/ZendSearch/pull/23 16
Philippe Cloutier fix/simplify 15