Adding an index in Redis: how Redis indexing is implemented

Reposted

梦里忧郁 2023-08-15 14:06:52

Tags: adding an index in Redis, redis, java, spring, full-text search. Category: Redis, databases

Contents

Preface
1. Inverted index
2. Implementation

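Preface

Full-text search over articles stored in a database rarely relies on LIKE queries. A containment query has to walk, for every search term, through every word of every article, so the cost of a search grows to roughly O(n³). The usual choice is to use Elasticsearch for full-text search, but that convenience means we seldom look at how full-text search actually works. Here we use Redis to build a full-text index component with basic search capability, bringing the cost down to about O(n log n).
As a supplement, Redis Labs provides a full-text search module, RediSearch: https://docs.redislabs.com/latest/modules/redisearch/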

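1. Inverted index

Before implementing the full-text index component, we need to understand the inverted index data structure, because the search is built on it.
A forward index maps a document to the words it contains. An inverted index does the opposite: it maps a word to the documents that contain it.
For example, using a different number for each document, the following is a forward index: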

0: "hello world"
1: "hello men"
2: "better world"

An inverted index uses the words as keys and the document numbers as the indexed elements. The following is the corresponding inverted index:

"hello": {0,1}
"world": {0,2}
"men": {1}
"better": {2}

To search for "hello men", intersect {0,1} and {1}; the result is document number 1.
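The lookup described above can be sketched in plain Java without Redis. This is only an illustrative sketch: the class name InvertedIndexSketch and the methods build and search are hypothetical names that do not appear in the original article, and documents are numbered by their position in the input list.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvertedIndexSketch {

    // Build word -> set of document numbers; a document's number is its position in the list.
    public static Map<String, Set<Integer>> build(List<String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String word : docs.get(docId).split(" ")) {
                index.computeIfAbsent(word, w -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    // Intersect the document sets of every word in the query.
    public static Set<Integer> search(Map<String, Set<Integer>> index, String query) {
        Set<Integer> result = new TreeSet<>();
        boolean first = true;
        for (String word : query.split(" ")) {
            Set<Integer> postings = index.getOrDefault(word, Set.of());
            if (first) {
                result.addAll(postings);    // start from the first word's documents
                first = false;
            } else {
                result.retainAll(postings); // keep only documents that also contain this word
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index =
                build(List.of("hello world", "hello men", "better world"));
        System.out.println(index.get("hello"));         // prints [0, 1]
        System.out.println(search(index, "hello men")); // prints [1]
    }
}
```

The Redis version in the next section has the same shape: each word becomes a SET key, and the retainAll loop is replaced by a single intersection command executed on the server.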

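2. Implementation

Note: the code below is written in Java but uses nothing specific to the Java language. It relies mainly on the Redis SET and ZSET structures and the set intersection commands SINTER and SINTERSTORE (the listing itself only uses SET and SINTER), so it can serve as a reference for other languages.

First, spring-boot-starter-data-redis keeps the configuration simple so that we can focus on the business logic. The code is as follows: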

package com.ch.demo.redis.search;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.data.redis.core.StringRedisTemplate;

import java.util.*;
import java.util.stream.Collectors;

@SpringBootApplication
public class Application {

    private static ConfigurableApplicationContext context;
    private static StringRedisTemplate redisTemplate;

    private static Map<String, String> articles = new HashMap<>();

    static {
        // Initialize the articles
        articles.put("article_id_1", "there are moments in life when you miss someone so much that you just want to pick them from your dreams and hug them for real");
        articles.put("article_id_2", "when you were born,you were crying and everyone around you was smiling.Live your life so that when you die,you're the one who is smiling and everyone around you is crying");
    }

    public static void main(String[] args) {
        context = SpringApplication.run(Application.class, args);
        redisTemplate = context.getBean(StringRedisTemplate.class);
        // Build the inverted index
        buildIndex();
        System.out.println(search("you were crying"));
        System.out.println(search("you"));
    }

    private static Set<String> search(String txt) {
        // Tokenize the search text and map each token to its Redis key
        List<String> keys = Arrays.stream(txtSplit(txt))
                .map(Application::generateKey)
                .collect(Collectors.toList());
        // Intersect the sets stored under those keys (SINTER)
        return redisTemplate.opsForSet().intersect(keys);
    }

    /**
     * Builds the full-text index
     * <p>An inverted index is used here</p>
     */
    private static void buildIndex() {
        // Tokenize every article; for each token, keep a Redis set whose key is the token and whose members are article ids
        articles.forEach((id, txt) -> {
            String[] keys = txtSplit(txt);
            for (String key : keys) {
                // If results need to be ordered, a zset can be used for storage instead
                redisTemplate.opsForSet().add(generateKey(key), id);
            }
        });
    }

    /**
     * Tokenizer
     *
     * @return String[]
     */
    private static String[] txtSplit(String txt) {
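        // Lowercase, then split on whitespace, commas and periods so that "born,you" yields "born" and "you"
        return txt.toLowerCase(Locale.ROOT).split("[\\s,.]+");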
    }

    /**
     * Builds the Redis key for a token
     *
     * @return String
     */
    private static String generateKey(String key) {
        return "SEARCH:" + key;
    }

}

Output of the code above:

[article_id_2]
[article_id_2, article_id_1]

This means the query "you were crying" matches article_id_2, while the query "you" matches both article_id_2 and article_id_1.
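The comment in buildIndex mentions that a zset can be used when results need to be ordered. The following is a pure-Java sketch of that idea and does not call Redis. It assumes the score of an article under a word is the number of times the word occurs in the article (a choice made here for illustration, not stated in the original), and it sums the scores across the query words, which is what ZINTERSTORE does with its default SUM aggregation. The class name RankedSearchSketch, its methods, and the sample articles are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RankedSearchSketch {

    // word -> (article id -> number of occurrences of the word in that article)
    public static Map<String, Map<String, Integer>> build(Map<String, String> articles) {
        Map<String, Map<String, Integer>> index = new HashMap<>();
        articles.forEach((id, text) -> {
            for (String word : text.split(" ")) {
                index.computeIfAbsent(word, w -> new HashMap<>()).merge(id, 1, Integer::sum);
            }
        });
        return index;
    }

    // Keep articles that contain every query word, sum their per-word counts, highest score first.
    public static List<String> search(Map<String, Map<String, Integer>> index, String query) {
        Map<String, Integer> scores = new HashMap<>();
        boolean first = true;
        for (String word : query.split(" ")) {
            Map<String, Integer> postings = index.getOrDefault(word, Map.of());
            if (first) {
                scores.putAll(postings);
                first = false;
            } else {
                scores.keySet().retainAll(postings.keySet());               // intersection
                scores.replaceAll((id, score) -> score + postings.get(id)); // SUM aggregation
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b); // break ties by article id
        });
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> index =
                build(Map.of("a1", "redis index", "a2", "redis redis index search"));
        System.out.println(search(index, "redis index"));  // prints [a2, a1]
        System.out.println(search(index, "search redis")); // prints [a2]
    }
}
```

In the Redis version, buildIndex would write to a sorted set per word and the search would store the intersection under a temporary key before reading it back in score order.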
