ZomboDB scoring and highlighting

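Scoring and highlighting play an important role in a search engine: the score ranks matching rows by relevance, and highlighting shows which terms in a field matched the query.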

Scoring: zdb.score

  • Usage
zdb.score(tid)
  • Example
SELECT zdb.score(ctid), * 
     FROM products 
    WHERE products ==> 'sports box' 
 ORDER BY score desc;

Result:

score | id | name | keywords | short_summary | long_description | price | inventory_count | discontinued | availability_date
----------+----+----------+--------------------------------------+--------------------------------+-------------------------------------------------------------------------------------+-------+-----------------+--------------+-------------------
  1.06561 | 4 | Box | {wooden,box,"negative space",square} | Just an empty box made of wood | A wooden container that will eventually rot away. Put stuff it in (but not a cat). | 17000 | 0 | t | 2015-07-01
 0.723777 | 2 | Baseball | {baseball,sports,round} | It's a baseball | Throw it at a person with a big wooden stick and hope they don't hit it | 1249 | 2 | f | 2015-08-21
(2 rows)
  • Note
    To restrict results by score, use dsl.min_score() from ZomboDB's query DSL.
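
As a minimal sketch of the note above, the query below wraps the earlier search in dsl.min_score(). It assumes the signature dsl.min_score(min_score real, query zdbquery) from the ZomboDB query DSL, and the threshold 1.0 is an arbitrary value chosen for illustration.

```sql
-- Sketch: keep only rows whose score is at least 1.0.
-- Assumes dsl.min_score(min_score real, query zdbquery); the threshold is illustrative.
SELECT zdb.score(ctid), *
     FROM products
    WHERE products ==> dsl.min_score(1.0, 'sports box')
 ORDER BY score desc;
```

Given the scores in the result above, a threshold of 1.0 would be expected to keep the row scoring 1.06561 and drop the one scoring 0.723777, though actual scores depend on the index contents.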

Highlighting: zdb.highlight

  • Function signature
zdb.highlight(tid, fieldname [, json_highlight_descriptor]) RETURNS text[]
  • Usage
SELECT zdb.score(ctid), zdb.highlight(ctid, 'long_description'), long_description 
      FROM products 
     WHERE products ==> 'wooden person' 
  ORDER BY score desc;

Result:

 score | highlight | long_description
----------+--------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------
 0.914156 | {"Throw it at a <em>person</em> with a big <em>wooden</em> stick and hope they don't hit it"} | Throw it at a person with a big wooden stick and hope they don't hit it
 0.243605 | {"A <em>wooden</em> container that will eventually rot away. Put stuff it in (but not a cat)."} | A wooden container that will eventually rot away. Put stuff it in (but not a cat).
(2 rows)

Customizing highlighting

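Customization is based mainly on functions that zdb provides. The third argument of zdb.highlight(tid, fieldname, ...) accepts a JSON highlight descriptor, which the function below builds.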

  • The highlight descriptor function provided by ZomboDB
CREATE TYPE esqdsl_highlight_type AS ENUM ('unified', 'plain', 'fvh');
CREATE TYPE esqdsl_fragmenter_type AS ENUM ('simple', 'span');
CREATE TYPE esqdsl_encoder_type AS ENUM ('default', 'html');
CREATE TYPE esqdsl_boundary_scanner_type AS ENUM ('chars', 'sentence', 'word');

FUNCTION highlight(
    type zdb.esqdsl_highlight_type DEFAULT NULL,
    require_field_match boolean DEFAULT false,
    number_of_fragments int DEFAULT NULL,
    highlight_query zdbquery DEFAULT NULL,
    pre_tags text[] DEFAULT NULL,
    post_tags text[] DEFAULT NULL,
    tags_schema text DEFAULT NULL,
    no_match_size int DEFAULT NULL,

    fragmenter zdb.esqdsl_fragmenter_type DEFAULT NULL,
    fragment_size int DEFAULT NULL,
    fragment_offset int DEFAULT NULL,
    force_source boolean DEFAULT true,
    encoder zdb.esqdsl_encoder_type DEFAULT NULL,
    boundary_scanner_locale text DEFAULT NULL,
    boundary_scan_max int DEFAULT NULL,
    boundary_chars text DEFAULT NULL,
    phrase_limit int DEFAULT NULL,

    matched_fields boolean DEFAULT NULL,
    "order" text DEFAULT NULL) 
RETURNS json
  • Usage
SELECT zdb.score(ctid), 
       zdb.highlight(ctid, 
                     'long_description', 
                     zdb.highlight(pre_tags=>'{<b>}', post_tags=>'{</b>}')
                    ),
       long_description                                             
 FROM products
WHERE products ==> 'wooden person'
ORDER BY score desc;

  score | highlight | long_description
----------+------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------
 0.914156 | {"Throw it at a <b>person</b> with a big <b>wooden</b> stick and hope they don't hit it"} | Throw it at a person with a big wooden stick and hope they don't hit it
 0.243605 | {"A <b>wooden</b> container that will eventually rot away. Put stuff it in (but not a cat)."} | A wooden container that will eventually rot away. Put stuff it in (but not a cat).
(2 rows)
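
Other parameters from the signature above can be combined in the same way using named notation. The following is a sketch only: the values for number_of_fragments and fragment_size and the <mark> tags are illustrative assumptions, and the resulting fragments depend on how Elasticsearch splits the text, so no output is shown.

```sql
-- Sketch: return at most one fragment of roughly 40 characters per row,
-- wrapping matched terms in <mark> tags. All values are illustrative.
SELECT zdb.score(ctid),
       zdb.highlight(ctid,
                     'long_description',
                     zdb.highlight(number_of_fragments => 1,
                                   fragment_size => 40,
                                   pre_tags => '{<mark>}',
                                   post_tags => '{</mark>}')
                    ),
       long_description
  FROM products
 WHERE products ==> 'wooden person'
 ORDER BY score desc;
```

Parameters left out keep their defaults from the function definition, so only the options that differ need to be named.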

References

https://github.com/zombodb/zombodb/blob/master/SCORING-HIGHLIGHTING.md
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-highlighting.html