ts changes

Summary of changes

patch v. 0.58

TEXT SEARCH CONFIGURATION

1. The GUC variable default_text_search_config is set at initdb time:

  • explicitly, using the -T or --text-search-config option
  • implicitly, from the language part of the locale name. For example, ru_RU.UTF-8 → default_text_search_config=russian
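
The implicit rule can be illustrated with a small Python sketch. This is not the initdb code: the function name config_from_locale, the two-letter code table, and the fallback to "simple" are all assumptions made for illustration. Only the ru_RU.UTF-8 → russian pair is taken from these notes.

```python
# Hypothetical illustration of "language part of locale name -> configuration".
# The table and the "simple" fallback are assumptions, not the initdb logic;
# only the ru -> russian pair is stated in the notes.
LANG_TO_CONFIG = {
    "da": "danish", "nl": "dutch", "en": "english", "fi": "finnish",
    "fr": "french", "de": "german", "hu": "hungarian", "it": "italian",
    "no": "norwegian", "pt": "portuguese", "ro": "romanian", "ru": "russian",
    "es": "spanish", "sv": "swedish", "tr": "turkish",
}


def config_from_locale(locale_name, fallback="simple"):
    """Guess a text search configuration name from a locale such as 'ru_RU.UTF-8'."""
    # Drop an optional "@modifier", then ".encoding", then "_TERRITORY".
    lang = locale_name.split("@", 1)[0].split(".", 1)[0].split("_", 1)[0].lower()
    return LANG_TO_CONFIG.get(lang, fallback)


if __name__ == "__main__":
    print(config_from_locale("ru_RU.UTF-8"))
```

Passing -T or --text-search-config would bypass a lookup like this entirely, since the configuration is then given explicitly.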

2. At runtime, if a text search configuration is not specified, it is taken from the GUC variable default_text_search_config; if that is not defined either, an error is raised.
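
The lookup order in item 2 can be sketched in Python as follows. The names resolve_config and ConfigNotSpecified are hypothetical, and the error text is invented for the example; the server's real message may differ.

```python
class ConfigNotSpecified(Exception):
    """Raised when neither an explicit nor a default configuration is available."""


def resolve_config(explicit=None, default_text_search_config=None):
    """Pick the text search configuration in the order described in item 2."""
    # 1. A configuration named explicitly by the caller wins.
    if explicit:
        return explicit
    # 2. Otherwise use the value of the default_text_search_config setting.
    if default_text_search_config:
        return default_text_search_config
    # 3. Nothing to fall back on: report an error (message is illustrative).
    raise ConfigNotSpecified("no text search configuration specified")


if __name__ == "__main__":
    print(resolve_config(None, "russian"))
```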

3. CREATE TEXT SEARCH CONFIGURATION no longer has the LOCALE option or the DEFAULT flag (see above), so the syntax is simpler:

CREATE TEXT SEARCH CONFIGURATION cfgname [PARSER prsname ] [ LIKE template_cfg [WITH MAP] ];

CREATE TEXT SEARCH CONFIGURATION public.pg LIKE english WITH MAP;

4. The CREATE/ALTER/DROP … MAPPING commands were removed; MAPPING is now part of the ALTER TEXT SEARCH CONFIGURATION command:

ALTER TEXT SEARCH CONFIGURATION public.pg (ADD|ALTER) MAPPING FOR lword, lhword, lpart_hword WITH english;

ALTER TEXT SEARCH CONFIGURATION public.pg DROP MAPPING FOR lword, lhword, lpart_hword;

ALTER TEXT SEARCH CONFIGURATION cfgname RENAME TO newcfgname;

ALTER TEXT SEARCH CONFIGURATION cfgname SET PARSER prsname;

5. There are now 16 default text search configurations

postgres=# \dF
                 List of fulltext configurations
   Schema   |    Name    |              Description
------------+------------+---------------------------------------
 pg_catalog | danish     | Configuration for danish language
 pg_catalog | dutch      | Configuration for dutch language
 pg_catalog | english    | Configuration for english language
 pg_catalog | finnish    | Configuration for finnish language
 pg_catalog | french     | Configuration for french language
 pg_catalog | german     | Configuration for german language
 pg_catalog | hungarian  | Configuration for hungarian language
 pg_catalog | italian    | Configuration for italian language
 pg_catalog | norwegian  | Configuration for norwegian language
 pg_catalog | portuguese | Configuration for portuguese language
 pg_catalog | romanian   | Configuration for romanian language
 pg_catalog | russian    | Configuration for russian language
 pg_catalog | simple     | Simple configuration
 pg_catalog | spanish    | Configuration for spanish language
 pg_catalog | swedish    | Configuration for swedish language
 pg_catalog | turkish    | Configuration for turkish language

DICTIONARIES

Dictionaries expect their files to be in UTF-8 encoding!

Snowball dictionaries were renamed: english, russian, …

postgres=# \dFd
                            List of fulltext dictionaries
   Schema   |    Name    |                        Description
------------+------------+-----------------------------------------------------------
 pg_catalog | danish     | Snowball stemmer for danish language
 pg_catalog | dutch      | Snowball stemmer for dutch language
 pg_catalog | english    | Snowball stemmer for english language
 pg_catalog | finnish    | Snowball stemmer for finnish language
 pg_catalog | french     | Snowball stemmer for french language
 pg_catalog | german     | Snowball stemmer for german language
 pg_catalog | hungarian  | Snowball stemmer for hungarian language
 pg_catalog | italian    | Snowball stemmer for italian language
 pg_catalog | norwegian  | Snowball stemmer for norwegian language
 pg_catalog | portuguese | Snowball stemmer for portuguese language
 pg_catalog | romanian   | Snowball stemmer for romanian language
 pg_catalog | russian    | Snowball stemmer for russian language
 pg_catalog | simple     | simple dictionary: just lower case and check for stopword
 pg_catalog | spanish    | Snowball stemmer for spanish language
 pg_catalog | swedish    | Snowball stemmer for swedish language
 pg_catalog | turkish    | Snowball stemmer for turkish language

For security reasons there is now a DICTIONARY TEMPLATE object, which can be changed only by a superuser but is visible to everyone. Ordinary users create a DICTIONARY from a template.

CREATE TEXT SEARCH DICTIONARY TEMPLATE dict_template LEXIZE lexize_function [INIT init_function] ;

DROP TEXT SEARCH DICTIONARY TEMPLATE [IF EXISTS] dict_template [CASCADE] ;

ALTER TEXT SEARCH DICTIONARY TEMPLATE dict_template RENAME TO newname;

CREATE TEXT SEARCH DICTIONARY dictname TEMPLATE dict_template [OPTION opt_text] ;

\dFt[+] [dict_template] - shows dictionary templates

postgres=# \dFt
                                                    List of fulltext dictionary's template
   Schema   |   Name    |           Init            |           Lexize            |                        Description
------------+-----------+---------------------------+-----------------------------+-----------------------------------------------------------
 pg_catalog | ispell    | pg_catalog.dispell_init   | pg_catalog.dispell_lexize   | Ispell dictionary template
 pg_catalog | simple    | pg_catalog.dsimple_init   | pg_catalog.dsimple_lexize   | simple dictionary: just lower case and check for stopword
 pg_catalog | snowball  | pg_catalog.dsnowball_init | pg_catalog.dsnowball_lexize | Snowball stemmer
 pg_catalog | synonym   | pg_catalog.dsynonym_init  | pg_catalog.dsynonym_lexize  | synonym dictionary: replace word by its synonym
 pg_catalog | thesaurus | pg_catalog.thesaurus_init | pg_catalog.thesaurus_lexize | Thesaurus template. Phrase by phrase substitution

The Snowball options were changed; they are now StopFile and Language.

Examples:

CREATE TEXT SEARCH DICTIONARY qq.stem_en TEMPLATE snowball 
OPTION '
 StopFile=dicts_data/english.stop,
 Language=russian'
;

ALTER TEXT SEARCH DICTIONARY qq.stem_en SET OPTION '
StopFile=dicts_data/english.stop,
Language=english'
;

CREATE TEXT SEARCH DICTIONARY qq.ispell_english TEMPLATE ispell
OPTION '
StopFile=dicts_data/english.stop,
AffFile=dicts_data/english-utf8.aff,
DictFile=dicts_data/english-utf8.dict
'
;
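
The OPTION strings in the examples above are comma-separated Name=Value pairs. The Python sketch below shows one way such a string could be split into a dictionary. It is an illustration only: parse_dict_options is a hypothetical helper, and the server's actual parser may treat quoting, letter case, or whitespace differently.

```python
def parse_dict_options(opt_text):
    """Split an OPTION string such as 'StopFile=a.stop, Language=english' into a dict."""
    options = {}
    for item in opt_text.split(","):
        item = item.strip()  # tolerate newlines and spaces around each pair
        if not item:
            continue  # skip empty pieces, e.g. after a trailing comma
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected Name=Value, got: %r" % item)
        options[name.strip()] = value.strip()
    return options


if __name__ == "__main__":
    text = """
    StopFile=dicts_data/english.stop,
    AffFile=dicts_data/english-utf8.aff,
    DictFile=dicts_data/english-utf8.dict
    """
    for name, value in parse_dict_options(text).items():
        print(name, "=", value)
```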