|Full-Text Search in PostgreSQL: A Gentle Introduction|
|Prev||Fast Backward||Chapter 2. FTS Operators and Functions||Fast Forward||Next|
to_tsquery( [configuration,] querytext text) RETURNS TSQUERY
Accepts querytext, which should be a single tokens
separated by the boolean operators & and, |
or, and ! not, which can be grouped using parenthesis.
In other words,
to_tsquery expects already parsed text.
Each token is reduced to a lexeme using the current or specified configuration.
Weight class can be assigned to each lexeme entry to restrict search region
setweight for explanation), for example
'fat:a & rats'
to_tsquery function could accept
text string. In this case querytext
should be quoted. This may be useful, for example, to use with thesaurus
dictionary. In example below, thesaurus contains rule
supernovae stars : sn.
=# select to_tsquery('''supernovae stars'' & !crab'); to_tsquery ---------------- 'sn' & !'crab'
to_tsquery will complain about syntax error.
plainto_tsquery( [configuration,] querytext text) RETURNS TSQUERY
Transforms unformatted text querytext to tsquery.
It is the same as
to_tsquery, but accepts text
and will call parser to break it onto tokens.
plainto_tsquery assumes & boolean
operator between words and doesn't recognizes weight classes.
querytree(query TSQUERY) RETURNS text
This returns a query which actually used in searching in index. It could be used to test for an empty query. Select below returns 'T', which corresponds to empty query, since GIN index doesn't supports negate query and full index scan is very ineffective.
=# select querytree( to_tsquery('!defined') ); querytree ----------- T
text::TSQUERY RETURNS TSQUERY
Directly casting text to a tsquery allows you to directly inject lexemes into a query, with whatever positions and position weight flags you choose to specify. The text should be formatted like the query would be printed by the output of a SELECT.
numnode(query TSQUERY) RETURNS INTEGER
This returns the number of nodes in query tree. This function could be used to resolve if query is meaningful ( returns > 0) , or contains only stop-words (returns 0).
=# select numnode( plainto_tsquery('the any') ); NOTICE: query contains only stopword(s) or doesn't contain lexeme(s), ignored numnode --------- 0 =# select numnode( plainto_tsquery('the table') ); numnode --------- 1 =# select numnode( plainto_tsquery('long table') ); numnode --------- 3
TSQUERY && TSQUERY RETURNS TSQUERY
Returns AND-ed TSQUERY
TSQUERY || TSQUERY RETURNS TSQUERY
Returns OR-ed TSQUERY
!! TSQUERY RETURNS TSQUERY
negation of TSQUERY
TSQUERY < TSQUERY TSQUERY <= TSQUERY TSQUERY = TSQUERY TSQUERY >= TSQUERY TSQUERY > TSQUERY
All btree operations defined for tsquery type. tsqueries compares with each other using lexicographical order.
Query rewriting is a set of functions and operators for tsquery type. It allows to control search at query time without reindexing (opposite to thesaurus), for example, expand search using synonyms (new york, big apple, nyc, gotham) or narrow search directing user to some hot topic. Notice, that rewriting rules can be added online.
rewrite() function changes original query by replacing
part of the query by sample string of type tsquery, as it defined by rewrite
rule. Arguments of
rewrite() function can be
column names of type tsquery.
CREATE TABLE aliases (t TSQUERY primary key, s TSQUERY); INSERT INTO aliases values('a', 'c');
rewrite (query TSQUERY, target TSQUERY, sample TSQUERY) RETURNS TSQUERY
=# select rewrite('a & b'::TSQUERY, 'a'::TSQUERY, 'c'::TSQUERY); rewrite ----------- 'b' & 'c'
rewrite (ARRAY[query TSQUERY, target TSQUERY, sample TSQUERY]) RETURNS TSQUERY
=# select rewrite(ARRAY['a & b'::TSQUERY, t,s]) from aliases; rewrite ----------- 'b' & 'c'
rewrite (query TSQUERY,'select target ,sample from test'::text) RETURNS TSQUERY
=# select rewrite('a & b'::TSQUERY, 'select t,s from aliases'); rewrite ----------- 'b' & 'c'
What if there are several variants of rewriting ? For example, query 'a & b' can be rewritten as 'b & c' and 'cc'.
=# select * from aliases; t | s -----------+------ 'a' | 'c' 'x' | 'z' 'a' & 'b' | 'cc'
This ambiguity can be resolved specifying sort order.
=# select rewrite('a & b', 'select t,s from aliases order by t desc'); rewrite --------- 'cc' =# select rewrite('a & b', 'select t,s from aliases order by t asc'); rewrite ----------- 'b' & 'c'
Let's consider real-life astronomical example. We'll expand query supernovae using table-driven rewriting rules.
=# create table aliases (t tsquery primary key, s tsquery); =# insert into aliases values(to_tsquery('supernovae'), to_tsquery('supernovae|sn')); =# select rewrite(to_tsquery('supernovae'), 'select * from aliases') && to_tsquery('crab'); ?column? --------------------------------- ( 'supernova' | 'sn' ) & 'crab'
Notice, that we can change rewriting rule online !
=# update aliases set s=to_tsquery('supernovae|sn&!nebulae') where t=to_tsquery('supernovae'); =# select rewrite(to_tsquery('supernovae'), 'select * from aliases') && to_tsquery('crab'); ?column? --------------------------------------------- ( 'supernova' | 'sn' & !'nebula' ) & 'crab'
Rewriting can be slow in case of many rewriting rules, since it checks every rule for possible hit. To filter out obvious non-candidate rules there are containment operators for tsquery type. In example below, we select only those rules, which might contains in the original query.
=# select rewrite(ARRAY['a & b'::TSQUERY, t,s]) from aliases where 'a&b' @> t; rewrite ----------- 'b' & 'c'
Two operators defined for tsquery type:
TSQUERY @> TSQUERY
Returns TRUE if right agrument might contained in left argument.
TSQUERY <@ TSQUERY
Returns TRUE if left agrument might contained in right argument.
To speed up operators <@,@> for tsquery one can use GiST index with tsquery_ops opclass.
create index t_idx on aliases using gist (t tsquery_ops);