ts changes

Summary of changes

TEXT SEARCH CONFIGURATION

1. The GUC variable default_text_search_config is set at initdb time:

  • explicitly, using the -T or --tsearch-config option
  • implicitly, from the language part of the locale name, for example, ru_RU.UTF-8 → default_text_search_config=russian

2. At runtime, if a text search configuration is not specified, it is taken from the GUC variable default_text_search_config. If that variable is not defined either, an error is raised.

3. CREATE TEXT SEARCH CONFIGURATION no longer has the LOCALE option or the DEFAULT flag.

4. The CREATE/ALTER/DROP … MAPPING commands were removed. A mapping is now defined through ALTER TEXT SEARCH CONFIGURATION, for example:

ALTER TEXT SEARCH CONFIGURATION public.pg ALTER MAPPING FOR lword, lhword, lpart_hword WITH english;

5. There are now 16 default text search configurations

postgres=# \dF
                 List of fulltext configurations
   Schema   |    Name    |              Description
------------+------------+---------------------------------------
 pg_catalog | danish     | Configuration for danish language
 pg_catalog | dutch      | Configuration for dutch language
 pg_catalog | english    | Configuration for english language
 pg_catalog | finnish    | Configuration for finnish language
 pg_catalog | french     | Configuration for french language
 pg_catalog | german     | Configuration for german language
 pg_catalog | hungarian  | Configuration for hungarian language
 pg_catalog | italian    | Configuration for italian language
 pg_catalog | norwegian  | Configuration for norwegian language
 pg_catalog | portuguese | Configuration for portuguese language
 pg_catalog | romanian   | Configuration for romanian language
 pg_catalog | russian    | Configuration for russian language
 pg_catalog | simple     | Simple configuration
 pg_catalog | spanish    | Configuration for spanish language
 pg_catalog | swedish    | Configuration for swedish language
 pg_catalog | turkish    | Configuration for turkish language
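
Items 1 and 2 above can be illustrated with a small Python sketch. It is not PostgreSQL code and only models the described behaviour: a configuration name is derived from the language part of a locale name, and at runtime an explicit configuration wins over default_text_search_config, with an error if neither is available. The function names and the mapping table are hypothetical. Only the ru → russian pair comes from the text above.

```python
from typing import Optional

# Hypothetical mapping from locale language codes to configuration names.
# Only "ru" -> "russian" is taken from the text; the rest are assumptions.
LANGUAGE_TO_CONFIG = {
    "ru": "russian",
    "en": "english",
    "de": "german",
    "fr": "french",
}


def config_from_locale(locale_name: str) -> Optional[str]:
    """Return a configuration name for a locale such as 'ru_RU.UTF-8'."""
    # Drop the encoding ('.UTF-8'), the modifier ('@euro') and the
    # territory ('_RU'), leaving only the language part.
    language = locale_name.split(".")[0].split("@")[0].split("_")[0].lower()
    return LANGUAGE_TO_CONFIG.get(language)


def resolve_config(explicit: Optional[str],
                   default_text_search_config: Optional[str]) -> str:
    """Pick the configuration to use at runtime, as described in item 2."""
    if explicit:
        return explicit
    if default_text_search_config:
        return default_text_search_config
    raise LookupError(
        "no text search configuration given and "
        "default_text_search_config is not set"
    )
```

For example, config_from_locale("ru_RU.UTF-8") yields "russian", while a locale such as "C" has no language part in the table and yields None, which corresponds to the case where the default must be set explicitly.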

DICTIONARIES

Note: dictionaries expect their files to be in UTF-8 encoding.

Snowball dictionaries were renamed: english, russian, …

postgres=# \dFd
                            List of fulltext dictionaries
   Schema   |    Name    |                        Description
------------+------------+-----------------------------------------------------------
 pg_catalog | danish     | Snowball stemmer for danish language
 pg_catalog | dutch      | Snowball stemmer for dutch language
 pg_catalog | english    | Snowball stemmer for english language
 pg_catalog | finnish    | Snowball stemmer for finnish language
 pg_catalog | french     | Snowball stemmer for french language
 pg_catalog | german     | Snowball stemmer for german language
 pg_catalog | hungarian  | Snowball stemmer for hungarian language
 pg_catalog | italian    | Snowball stemmer for italian language
 pg_catalog | norwegian  | Snowball stemmer for norwegian language
 pg_catalog | portuguese | Snowball stemmer for portuguese language
 pg_catalog | romanian   | Snowball stemmer for romanian language
 pg_catalog | russian    | Snowball stemmer for russian language
 pg_catalog | simple     | simple dictionary: just lower case and check for stopword
 pg_catalog | spanish    | Snowball stemmer for spanish language
 pg_catalog | swedish    | Snowball stemmer for swedish language
 pg_catalog | turkish    | Snowball stemmer for turkish language

For security reasons there is now a DICTIONARY TEMPLATE object, which can be changed only by a superuser. An ordinary user can define a DICTIONARY on top of an existing DICTIONARY TEMPLATE.

CREATE TEXT SEARCH DICTIONARY TEMPLATE dict_template LEXIZE lexize_function [INIT init_function] ;

DROP TEXT SEARCH DICTIONARY TEMPLATE [IF EXISTS] dict_template [CASCADE] ;

ALTER TEXT SEARCH DICTIONARY TEMPLATE dict_template RENAME TO newname;

CREATE TEXT SEARCH DICTIONARY dictname TEMPLATE dict_template [OPTION opt_text] ;

\dFt[+] [dict_template] - shows dictionary templates

postgres=# \dFt
                                                    List of fulltext dictionary's template
   Schema   |   Name    |           Init            |           Lexize            |                        Description
------------+-----------+---------------------------+-----------------------------+-----------------------------------------------------------
 pg_catalog | ispell    | pg_catalog.dispell_init   | pg_catalog.dispell_lexize   | Ispell dictionary template
 pg_catalog | simple    | pg_catalog.dsimple_init   | pg_catalog.dsimple_lexize   | simple dictionary: just lower case and check for stopword
 pg_catalog | snowball  | pg_catalog.dsnowball_init | pg_catalog.dsnowball_lexize | Snowball stemmer
 pg_catalog | synonym   | pg_catalog.dsynonym_init  | pg_catalog.dsynonym_lexize  | synonym dictionary: replace word by its synonym
 pg_catalog | thesaurus | pg_catalog.thesaurus_init | pg_catalog.thesaurus_lexize | Thesaurus template. Phrase by phrase substitution

Snowball options were changed. They are now StopFile and Language. Examples:

CREATE TEXT SEARCH DICTIONARY qq.stem_en TEMPLATE snowball 
OPTION '
 StopFile=dicts_data/english.stop,
 Language=russian'
;

ALTER TEXT SEARCH DICTIONARY qq.stem_en SET OPTION '
StopFile=dicts_data/english.stop,
Language=english'
;

CREATE TEXT SEARCH DICTIONARY qq.ispell_english TEMPLATE ispell
OPTION '
StopFile=dicts_data/english.stop,
AffFile=dicts_data/english-utf8.aff,
DictFile=dicts_data/english-utf8.dict
'
;
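
The option text in the examples above looks like a comma-separated list of Key=value pairs. Assuming that format, a client-side helper along the following lines could be used to sanity-check an option string before sending it to the server. This is only a sketch: parse_options is a hypothetical name, and the server's own parsing rules may differ.

```python
def parse_options(option_text: str) -> dict:
    """Split 'Key=value, Key=value' text into a dict keyed by option name."""
    options = {}
    for item in option_text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or surrounding blank lines
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected Key=value, got {item!r}")
        options[key.strip()] = value.strip()
    return options


# Option text copied from the ALTER TEXT SEARCH DICTIONARY example above.
snowball_options = parse_options("""
StopFile=dicts_data/english.stop,
Language=english""")
```

Here snowball_options maps StopFile to dicts_data/english.stop and Language to english, so a mismatch such as an English stop file combined with Language=russian is easy to spot.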