Notes on importing data
In Note 2, running the import may fail with an error. This happens because `mysql-connector-java-xxx.jar` also has to be placed in the `solr-xxx/server/lib` folder.
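Before retrying the import, it can help to confirm that the driver jar really is where Solr will look for it. The following is only a minimal sketch: the class `DriverJarCheck` and its method are hypothetical names introduced for illustration, and the default path is the placeholder path from above, which must be replaced with the real install location.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: lists MySQL driver jars in a lib folder.
class DriverJarCheck {

    // Returns the file names matching mysql-connector-java-*.jar in libDir
    // (an empty list if the folder does not exist or holds no such jar).
    static List<String> findDriverJars(Path libDir) {
        List<String> found = new ArrayList<>();
        if (!Files.isDirectory(libDir)) {
            return found;
        }
        try (DirectoryStream<Path> jars =
                 Files.newDirectoryStream(libDir, "mysql-connector-java-*.jar")) {
            for (Path jar : jars) {
                found.add(jar.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Placeholder path taken from the text; pass the real one as an argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "solr-xxx/server/lib");
        List<String> jars = findDriverJars(libDir);
        if (jars.isEmpty()) {
            System.out.println("No MySQL driver jar found in " + libDir);
        } else {
            System.out.println("Found driver jar(s): " + jars);
        }
    }
}
```

If no jar is reported, copy the driver into the folder and restart Solr before running the import again.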
Automatic incremental updates
- Place `solr-dataimport-scheduler.jar` in the `solr-xxx/server/solr-webapp/webapp/WEB-INF/lib` folder;
- Configure the listener in `solr-xxx/server/solr-webapp/webapp/WEB-INF/web.xml`:
<listener>
<listener-class>
org.apache.solr.handler.dataimport.scheduler.ApplicationListener
</listener-class>
</listener>
- Create a new folder named `conf` under `solr-xxx/server/solr/`. Note that this is not the `conf` folder inside `solr-xxx/server/solr/weibo/`;
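- Extract `dataimport.properties` from `solr-dataimport-scheduler.jar`, put it in the `conf` folder created in the previous step, and adjust it to your needs. For example, my configuration is as follows: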
# dataimport.properties example
#
# From this example, copy everything below "dataimport scheduler properties" to your
# dataimport.properties file and then change params to fit your needs
#
# IMPORTANT:
# Regardless of whether you have single or multiple-core Solr,
# use dataimport.properties located in your solr.home/conf (NOT solr.home/core/conf)
# For more info and context see here:
# http://wiki.apache.org/solr/DataImportHandler#dataimport.properties_example
#Tue Jul 21 12:10:50 CEST 2010
# metadataObject.last_index_time=2010-09-20 11\:12\:47
# last_index_time=2010-09-20 11\:12\:47
#################################################
# #
# dataimport scheduler properties #
# #
#################################################
# to sync or not to sync
# 1 - active; anything else - inactive
syncEnabled=1
# which cores to schedule
# in a multi-core environment you can decide which cores you want synchronized
# leave empty or comment it out if using single-core deployment
syncCores=weibo
# solr server name or IP address
# [defaults to localhost if empty]
server=localhost
# solr server port
# [defaults to 80 if empty]
port=8983
# application name/context
# [defaults to current ServletContextListener's context (app) name]
webapp=solr
# URL params [mandatory]
# remainder of URL
params=/dataimport?command=delta-import&clean=false&commit=true
# schedule interval
# number of minutes between two runs
# [defaults to 30 if empty]
# interval between automatic delta imports, in minutes; defaults to 30
interval=5
# interval between full index rebuilds, in minutes; defaults to 7200 (5 days)
reBuildIndexInterval = 7200
# URL params used for the full index rebuild
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true
# start time for the index rebuild schedule
reBuildIndexBeginTime=1:30:00
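To make the settings above more concrete, here is a rough sketch of how they could combine into the request URLs that get called on each run. It is not the scheduler's actual code: the class `SchedulerUrlSketch`, its methods, and the URL layout `http://<server>:<port>/<webapp>/<core><params>` are assumptions made for illustration. The fallback `solr` for `webapp` is also an assumption (the file says the real default is the listener's own context name); the `localhost` and `80` fallbacks come from the comments in the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration of how the properties might be turned into URLs.
class SchedulerUrlSketch {

    // Returns the trimmed property value, or the fallback when missing or blank.
    static String valueOr(Properties p, String key, String fallback) {
        String v = p.getProperty(key);
        return (v == null || v.trim().isEmpty()) ? fallback : v.trim();
    }

    // Assumed layout: http://<server>:<port>/<webapp>/<core><params>
    static String buildUrl(Properties p, String core, String paramsKey) {
        String server = valueOr(p, "server", "localhost");
        String port = valueOr(p, "port", "80");
        String webapp = valueOr(p, "webapp", "solr"); // assumed fallback
        String params = valueOr(p, paramsKey, "");
        return "http://" + server + ":" + port + "/" + webapp + "/" + core + params;
    }

    // Parses properties-file text with the standard java.util.Properties rules.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        String config = String.join("\n",
            "syncCores=weibo",
            "server=localhost",
            "port=8983",
            "webapp=solr",
            "params=/dataimport?command=delta-import&clean=false&commit=true",
            "interval=5",
            "reBuildIndexInterval = 7200",
            "reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true");
        Properties p = parse(config);

        for (String core : valueOr(p, "syncCores", "").split(",")) {
            System.out.println("delta: " + buildUrl(p, core.trim(), "params"));
            System.out.println("full:  " + buildUrl(p, core.trim(), "reBuildIndexParams"));
        }

        // 7200 minutes / (60 * 24) minutes per day = 5 days between full rebuilds.
        long minutes = Long.parseLong(valueOr(p, "reBuildIndexInterval", "7200"));
        System.out.println("full rebuild every " + minutes / (60 * 24) + " days");
    }
}
```

Under these assumptions, the delta-import URL for the `weibo` core can be pasted into a browser to check that the handler responds before relying on the schedule.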
Summary
With this in place, data from the database is imported into Solr automatically and incrementally.