# 2009-08-12

## Algebra for full-text queries

We introduce the phrase operator ?[n], also called the phrase conjunction operator. It is similar to the logical conjunction operator (AND, &), but it preserves the order of its operands (it is non-commutative) and constrains the distance between them (<= n).

The logical conjunction operator (AND, &) is associative, commutative, distributive, and idempotent. In set theory, the intersection operator is an example of a logical conjunction operator.
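
To make the contrast concrete, here is a minimal Python sketch of the two operators over a toy document. It assumes that a document is a mapping from lexeme to a set of positions (as in a tsvector), that a bare `?` means n = 1, and that the right operand must occur strictly after the left one. The names `and_match` and `phrase_match` are illustrative only and are not part of PostgreSQL.

```python
def and_match(doc, a, b):
    """Logical AND: both lexemes occur somewhere; order and distance are ignored."""
    return bool(doc.get(a)) and bool(doc.get(b))


def phrase_match(doc, a, b, n=1):
    """Phrase operator a ?[n] b (assumed semantics):
    b occurs after a, at most n positions later."""
    return any(
        0 < pos_b - pos_a <= n
        for pos_a in doc.get(a, ())
        for pos_b in doc.get(b, ())
    )


# Hypothetical document, comparable to the tsvector 'cat:1 fat:2 rat:5'.
doc = {"cat": {1}, "fat": {2}, "rat": {5}}

print(and_match(doc, "fat", "cat"))          # AND does not care about order
print(phrase_match(doc, "cat", "fat"))       # adjacent and in order
print(phrase_match(doc, "fat", "cat"))       # wrong order
print(phrase_match(doc, "cat", "rat", n=4))  # within the allowed distance
```

Swapping the operands changes the result of `phrase_match` but not of `and_match`, which is the non-commutativity described in this section.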

• The ? operator is non-commutative, so 'A ? B' ≠ 'B ? A'
• The ? operator is not associative; a chain of ? operators is parsed left-associatively, i.e. evaluated from left to right.
```
=# select '1 ? 2 ? 3'::tsquery = '(1 ? 2) ? 3'::tsquery;
?column?
----------
t
```

but

```
=# select '1 ? 2 ? 3'::tsquery = '1 ? (2 ? 3)'::tsquery;
?column?
----------
f
```

The function *phraseto_tsquery()* makes it easy to construct phrase queries:

```
=# select phraseto_tsquery('1 2 3');
phraseto_tsquery
---------------------
( '1' ? '2' ) ? '3'
```
• The ? operator distributes across OR and AND:
```
=# select '1 ? ( 2 | 3)'::tsquery = '( 1 ? 2 ) | ( 1 ? 3 )'::tsquery;
?column?
----------
t
=# select '1 ? ( 2 & 3)'::tsquery = '( 1 ? 2 ) & ( 1 ? 3 )'::tsquery;
?column?
----------
t
```

'1 ? ( 2 & 3)'::tsquery may look problematic, but consider the situation where a dictionary returns two lexemes for a single word: in the tsvector they will have the same position.

```
=# select '1:1 2:2 3:2'::tsvector @@ '1 ? ( 2 & 3)'::tsquery;
?column?
----------
t
```
• The ? operator is non-idempotent, i.e. 'A ? A' ≠ 'A' (unlike AND, for which A & A ≡ A)
```
=# select '1 ? 1'::tsquery;
tsquery
-----------
'1' ? '1'
```
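
The properties listed above can be checked in a small toy model. The sketch below is not the PostgreSQL implementation; it assumes that queries are nested tuples, that every sub-query evaluates to the set of positions at which it matches, that `&` inside a phrase requires both operands at the same position, and that a bare `?` means a distance of 1. All function names are hypothetical.

```python
def lex(word):
    return ("lex", word)


def phrase(left, right):
    return ("phrase", left, right)


def and_(left, right):
    return ("and", left, right)


def or_(left, right):
    return ("or", left, right)


def positions(query, doc, n=1):
    """Return the set of positions at which `query` matches in `doc`."""
    kind = query[0]
    if kind == "lex":
        return set(doc.get(query[1], ()))
    left = positions(query[1], doc, n)
    right = positions(query[2], doc, n)
    if kind == "or":
        return left | right
    if kind == "and":
        # Assumption: positional AND, both operands at the same position.
        return left & right
    if kind == "phrase":
        # Right operand must follow the left one within distance n.
        return {r for r in right if any(0 < r - l <= n for l in left)}
    raise ValueError(f"unknown node: {kind}")


def matches(query, doc, n=1):
    return bool(positions(query, doc, n))


one, two, three = lex("1"), lex("2"), lex("3")

# Same data as the tsvector '1:1 2:2 3:2' used above.
shared = {"1": {1}, "2": {2}, "3": {2}}
# A plain sequence, comparable to '1:1 2:2 3:3'.
sequence = {"1": {1}, "2": {2}, "3": {3}}

print(matches(phrase(one, and_(two, three)), shared))
print(matches(and_(phrase(one, two), phrase(one, three)), shared))
print(matches(phrase(two, one), shared))
print(matches(phrase(one, one), {"1": {1}}))
print(matches(phrase(phrase(one, two), three), sequence))
print(matches(phrase(one, phrase(two, three)), sequence))
```

In this model, '1 ? (2 & 3)' and its distributed form both match the shared-position document, '2 ? 1' does not match, '1 ? 1' does not match a document that contains '1' only once, and the two groupings of '1 ? 2 ? 3' give different results on the plain sequence.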

## Compound word

```
=# CREATE TEXT SEARCH DICTIONARY nb_no_ispell ( TEMPLATE = ispell,
DictFile = nb_no, AffFile = nb_no );
=# select ts_lexize('nb_no_ispell', 'telefonsvarer');
ts_lexize
------------------------------
{telefonsvarer,telefon,svar}
=# CREATE TEXT SEARCH CONFIGURATION public.no ( COPY=pg_catalog.norwegian);
=# ALTER TEXT SEARCH CONFIGURATION  no ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,word,
hword, hword_part WITH nb_no_ispell, norwegian_stem;

=# select to_tsquery('no','telefonsvarer & device');
to_tsquery
----------------------------------------------------
( 'telefonsvarer' | 'telefon' & 'svar' ) & 'devic'
=# select to_tsvector('no','telefonsvarer  device');
to_tsvector
--------------------------------------------------
'devic':2 'svar':1 'telefon':1 'telefonsvarer':1
```

Now, see how phraseto_tsquery works:

```
=# select phraseto_tsquery('no','telefonsvarer device');
phraseto_tsquery
----------------------------------------------------------------------------
'telefonsvarer' ? 'devic' | ( 'telefon' ? 'devic' ) & ( 'svar' ? 'devic' )
```

Casting produces the same result:

```
=# select '(telefonsvarer | telefon & svar ) ? devic'::tsquery;
tsquery
----------------------------------------------------------------------------
'telefonsvarer' ? 'devic' | ( 'telefon' ? 'devic' ) & ( 'svar' ? 'devic' )
```

More complex phrase:

```
=# select phraseto_tsquery('no','telefonsvarer device ok');
phraseto_tsquery
-------------------------------------------------------------------------------------------------------------
( 'telefonsvarer' ? 'devic' ) ? 'ok' | ( ( 'telefon' ? 'devic' ) ? 'ok' ) & ( ( 'svar' ? 'devic' ) ? 'ok' )

=# select '(telefonsvarer |  telefon & svar ) ? devic ? ok'::tsquery;
tsquery
-------------------------------------------------------------------------------------------------------------
( 'telefonsvarer' ? 'devic' ) ? 'ok' | ( ( 'telefon' ? 'devic' ) ? 'ok' ) & ( ( 'svar' ? 'devic' ) ? 'ok' )

(1 row)

```
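
The rewritten queries shown above are consistent with folding the words from left to right and then pushing ? below | and &. The following Python sketch illustrates that idea under stated assumptions: queries are nested tuples, lexemes are plain strings, and `phrase_chain` and `distribute` are hypothetical helper names. It does not reproduce PostgreSQL's parser or its output formatting.

```python
from functools import reduce


def phrase_chain(operands):
    """Left-fold operands with '?': [a, b, c] -> (a ? b) ? c."""
    return reduce(lambda left, right: ("phrase", left, right), operands)


def is_bool(node):
    """True for an '|' or '&' node."""
    return not isinstance(node, str) and node[0] in ("or", "and")


def distribute(query):
    """Push '?' below '|' and '&' using the distributive property."""
    if isinstance(query, str):
        return query
    op, left, right = query[0], distribute(query[1]), distribute(query[2])
    if op == "phrase":
        if is_bool(left):
            return (left[0],
                    distribute(("phrase", left[1], right)),
                    distribute(("phrase", left[2], right)))
        if is_bool(right):
            return (right[0],
                    distribute(("phrase", left, right[1])),
                    distribute(("phrase", left, right[2])))
    return (op, left, right)


# Lexemes the dictionary returned for 'telefonsvarer', combined as in
# the to_tsquery output above: 'telefonsvarer' | 'telefon' & 'svar'.
compound = ("or", "telefonsvarer", ("and", "telefon", "svar"))

two_words = distribute(phrase_chain([compound, "devic"]))
three_words = distribute(phrase_chain([compound, "devic", "ok"]))

print(two_words)
print(three_words)
```

The resulting trees have the same shape as the two phraseto_tsquery results above: an | at the top, with the whole-word phrase on one side and an & of the per-part phrases on the other.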