5.7 全文检索语法

HybridDB的计算规格支持在SQL语法中便捷的分词+检索能力。需要说明的是,全文检索能力和SQL中的like是有区别的,这里的全文检索是指在SQL中搜索需求(需要分词,搜索匹配),而sql中的like是模糊匹配需求,只需要字符串列建索引就可以,目前HybridDB的like模糊匹配性能可以打到亿级别毫秒级响应。

全文检索列,底层实现是多值列,只是分词器为nlp。

1)建表

CREATE TABLE test3 (
    user_id BIGINT,
    city VARCHAR,
    text MULTIVALUE nlp_tokenizer 'ik' value_type 'varchar'
)
DISTRIBUTE BY HASH (user_id)
ENGINE = 'CSTORE'
INDEX_ALL='Y';

> 说明:
- 全文检索列,底层实现是多值列,只是分词器类型为nlp_tokenizer
- 默认的分词器为ElasticSearch的ik分词器

2)写入数据

insert into test3 values(1, 'HZ', '中华人民共和国');
insert into test3 values(2, 'BJ', 'HybridDB是新型HTAP数据库');
insert into test3 values(3, 'SH', 'hello, world');


mysql> select * from test3 order by user_id;
+---------+------+--------------------------------+
| user_id | city | text                           |
+---------+------+--------------------------------+
|       1 | HZ   | 中华人民共和国                 |
|       2 | BJ   | HybridDB是新型HTAP数据库       |
|       3 | SH   | hello, world                   |
+---------+------+--------------------------------+
3 rows in set (0.02 sec)

3)检索查询

mysql> select * from test3 where text in ('中华', '数据库');
+---------+------+--------------------------------+
| user_id | city | text                           |
+---------+------+--------------------------------+
|       1 | HZ   | 中华人民共和国                 |
|       2 | BJ   | HybridDB是新型HTAP数据库       |
+---------+------+--------------------------------+
2 rows in set (0.30 sec)


mysql> select * from test3 where text in ('hello') and city != 'HZ';
+---------+------+--------------+
| user_id | city | text         |
+---------+------+--------------+
|       3 | SH   | hello, world |
+---------+------+--------------+
1 row in set (0.93 sec)


mysql> select * from test3 where ref(text,0) in ('hybriddb');
+---------+------+-----------------------------------+
| user_id | city | text                              |
+---------+------+-----------------------------------+
|       2 | BJ   | HybridDB是新型HTAP数据库          |
+---------+------+-----------------------------------+
4 rows in set (0.02 sec)

# 普通字符串列也可以做like查找
mysql> select * from test3 where city like '%J%';
+---------+------+-----------------------------------+
| user_id | city | text                              |
+---------+------+-----------------------------------+
|       2 | BJ   | HybridDB是新型HTAP数据库          |
+---------+------+-----------------------------------+
4 rows in set (0.02 sec)

4)注意

  • 全文检索的英文单词,默认都是小写归一化处理。

results matching ""

    No results matching ""