
Optimization

  • Tune PostgreSQL itself for performance first.
  • Keep the number of word stems low.
  • The practical limit is 100,000 DISTINCT word stems after parsing, stop-word removal, and dictionary processing.
  • Use the stat() function to find stop words and patterns of word stems you can do without.
  • Look at the most frequent words for stop-word candidates.
  • Look at the least frequent words for patterns that show you are indexing too much (a bloated tail).
  • Use dictionaries to stem as many words as possible.
  • Use contrib/btree_gist to create multicolumn indexes where needed.
  • Avoid superfluous updates on non-tsvector columns in a table.
  • Under PostgreSQL MVCC an update creates a new row version, so index entries are rewritten even if the tsvector does not change.
  • OR queries ('|') take a lot longer than AND queries ('&').
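
The advice on word-stem statistics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the rows have already been fetched from stat() as (word, ndoc, nentry) tuples, and the function name summarize_stems, its parameters, and the sample data are hypothetical, not part of PostgreSQL or Tsearch2.

```python
from typing import NamedTuple


class StemStat(NamedTuple):
    word: str    # the word stem
    ndoc: int    # number of documents containing the stem
    nentry: int  # total occurrences across all documents


def summarize_stems(rows, top_n=3, rare_ndoc=1, stem_limit=100_000):
    """Summarize stem statistics (hypothetical helper, not a PostgreSQL API).

    rows: iterable of (word, ndoc, nentry) tuples, assumed to come from stat().
    """
    stats = [StemStat(*r) for r in rows]
    # Most frequent stems first: these are the stop-word candidates.
    by_freq = sorted(stats, key=lambda s: (-s.ndoc, -s.nentry, s.word))
    frequent = [s.word for s in by_freq[:top_n]]
    # Stems seen in very few documents: inspect these for a bloated tail.
    rare = sorted(s.word for s in stats if s.ndoc <= rare_ndoc)
    return {
        "distinct_stems": len(stats),
        "over_limit": len(stats) > stem_limit,
        "stop_word_candidates": frequent,
        "rare_tail": rare,
    }


# Made-up sample rows, for illustration only.
sample = [
    ("the", 980, 5400),
    ("and", 950, 3100),
    ("postgresql", 400, 900),
    ("index", 310, 620),
    ("a1b2c3", 1, 1),
    ("x9f00e", 1, 1),
]
report = summarize_stems(sample, top_n=2)
print(report["stop_word_candidates"])  # ['the', 'and']
print(report["rare_tail"])             # ['a1b2c3', 'x9f00e']
```

In this sample the rare tail consists of identifier-like tokens, the kind of pattern that might be filtered out before indexing to keep the number of distinct stems down.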