Optimization
- Tune PostgreSQL itself for performance first.
- Keep the number of word stems low.
- The practical limit is 100,000 DISTINCT word stems after parsing, stop-word removal, and dictionary processing.
- Use the stat() function (ts_stat() in PostgreSQL 8.3 and later) to find stop words and patterns of word stems you can do without.
- Look at the most frequent words for stop-word candidates.
- Look at the least frequent words for patterns that include too much (a bloated tail).
- Use dictionaries to stem as many words as possible.
- Use contrib/btree_gist to create multicolumn indexes where needed.
- Avoid superfluous updates to non-tsvector columns in a table.
- Under PostgreSQL MVCC, an UPDATE writes a new row version, which generally means new entries in every index on the table, even if the tsvector does not change.
- OR queries ('|') take a lot longer than AND queries ('&').
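
The stem-count and stop-word checks above can be sketched roughly as follows. This is a minimal sketch, not a definitive recipe: the table documents and its tsvector column body_tsv are hypothetical names, and it assumes the built-in ts_stat() of PostgreSQL 8.3 or later (on an older contrib/tsearch2 install the equivalent function is stat()).

```sql
-- Hypothetical schema (an assumption): documents(id, body_tsv tsvector).

-- Count the distinct word stems stored in the column.
SELECT count(*) AS distinct_stems
FROM ts_stat('SELECT body_tsv FROM documents');

-- Most frequent stems: candidates for the stop-word list.
SELECT word, ndoc, nentry
FROM ts_stat('SELECT body_tsv FROM documents')
ORDER BY ndoc DESC, nentry DESC, word
LIMIT 20;

-- Least frequent stems: look for patterns that bloat the tail.
SELECT word, ndoc, nentry
FROM ts_stat('SELECT body_tsv FROM documents')
ORDER BY ndoc ASC, nentry ASC, word
LIMIT 20;
```

Here ndoc is the number of documents a stem occurs in and nentry is its total number of occurrences. ts_stat() scans every row returned by the inner query, so on a large table it may be worth running it against a sample.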
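
A multicolumn index built with btree_gist, as recommended above, might look like the following. This is a sketch under assumptions: the table documents, the integer column owner_id, and the tsvector column body_tsv are hypothetical, and CREATE EXTENSION requires PostgreSQL 9.1 or later (older releases load the contrib SQL script instead).

```sql
-- btree_gist supplies GiST operator classes for scalar types such as integer.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical schema (an assumption):
--   documents(id, owner_id integer, body_tsv tsvector).
-- One GiST index covering both the scalar column and the tsvector.
CREATE INDEX documents_owner_tsv_idx
    ON documents USING gist (owner_id, body_tsv);

-- A query that can use both index columns; it uses AND ('&') rather than OR ('|').
SELECT id
FROM documents
WHERE owner_id = 42
  AND body_tsv @@ to_tsquery('english', 'index & tune');
```

Whether the planner actually prefers this index over two separate ones depends on the data, so it is worth confirming with EXPLAIN ANALYZE.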