
Using Sunspot (Part 2)

2013-11-23 
        <filter synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    </analyzer>
    <analyzer type='query'>
        <charFilter isMaxWordLength="true"/>
        <filter synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    </analyzer>
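The configuration above defines one analyzer for indexing (index) and one for querying (query). Terms are processed in five steps:
(1) <charFilter isMaxWordLength="true"/>: IK word segmentation; isMaxWordLength sets whether maximum-word-length segmentation is used;
(4) <filter synonyms="synonyms.txt" ignoreCase="true" expand="false"/>: sets the synonym dictionary;
    synonyms format: pvc => 聚氯乙烯 (聚氯乙烯 is Chinese for polyvinyl chloride)
For more filter and analysis options, see: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters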
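
To make the synonym rule format concrete, here is a minimal Ruby sketch of how explicit `a => b` mappings could be applied to a list of tokens with case-insensitive matching. It is a toy illustration and not Solr's implementation: `parse_synonym_rules` and `apply_synonyms` are hypothetical helper names, and only single-token left-hand sides are handled.

```ruby
# Toy model of explicit synonym mappings such as "pvc => 聚氯乙烯".
# Not Solr's implementation; the helper names are hypothetical.

# Parse rules of the form "lhs1, lhs2 => rhs1, rhs2" into a Hash.
# Keys are downcased to mimic ignoreCase="true".
def parse_synonym_rules(text)
  rules = {}
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#') || !line.include?('=>')
    left, right = line.split('=>', 2)
    replacements = right.split(',').map(&:strip).reject(&:empty?)
    left.split(',').map(&:strip).reject(&:empty?).each do |term|
      rules[term.downcase] = replacements
    end
  end
  rules
end

# Replace each token that has a rule; leave all other tokens untouched.
def apply_synonyms(tokens, rules)
  tokens.flat_map { |token| rules.fetch(token.downcase, [token]) }
end

RULES = parse_synonym_rules("pvc => 聚氯乙烯\n")
puts apply_synonyms(%w[PVC pipe], RULES).inspect
```

Because the same filter line appears in both the index and the query analyzer, a mapping like this would be applied to documents and to search keywords alike.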

IKAnalyzer.cfg.xml configuration:
File path (on the test machine): /opt/ruby-enterprise-1.8.7/lib/ruby/gems/1.8/gems/sunspot-1.2.1/solr/webapps/solr/WEB-INF/classes
Documentation: http://ik-analyzer.googlecode.com/files/IKAnalyzer%E4%B8%AD%E6%96%87%E5%88%86%E8%AF%8D%E5%99%A8V3.2.8%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C.pdf
http://linliangyi2007.iteye.com/blog/501228
Custom dictionaries:
<properties>
    <entry key="ext_dict">/mydict.dic;/com/mycompany/dic/mydict2.dic;</entry>
</properties>
Note: mydict.dic is in the same directory as IKAnalyzer.cfg.xml.
Stop-word configuration:
<properties>
    <entry key="ext_stopwords">/ext_stopword.dic</entry>
</properties>
Note: ext_stopword.dic is in the same directory as IKAnalyzer.cfg.xml.
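
As a quick sanity check of such a file, the following Ruby sketch reads the `ext_dict` and `ext_stopwords` entries and splits them into individual paths. It assumes both entries live in one `<properties>` element and that the `rexml` library is available; `dictionary_paths` is a hypothetical helper, not part of IK Analyzer or Sunspot.

```ruby
require 'rexml/document'

# Sample configuration; combining both entries in one file is an assumption.
CONFIG_XML = <<~XML
  <properties>
    <entry key="ext_dict">/mydict.dic;/com/mycompany/dic/mydict2.dic;</entry>
    <entry key="ext_stopwords">/ext_stopword.dic</entry>
  </properties>
XML

# Return the semicolon-separated paths stored under the given entry key.
def dictionary_paths(xml, key)
  doc = REXML::Document.new(xml)
  entry = REXML::XPath.first(doc, "/properties/entry[@key='#{key}']")
  return [] if entry.nil? || entry.text.nil?
  entry.text.split(';').map(&:strip).reject(&:empty?)
end

puts dictionary_paths(CONFIG_XML, 'ext_dict').inspect
puts dictionary_paths(CONFIG_XML, 'ext_stopwords').inspect
```

The trailing semicolon in the `ext_dict` value yields no empty path here, because blank segments are dropped.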


3. Indexing:
Model index definition (using Price as an example):
searchable(:auto_index => true, :auto_remove => true) do
  text :full_name, :stored => true
  string :company_name, :stored => true
  string :brand_name, :stored => true do
    self.try(:product).try(:brand).try(:name)
  end
  integer :brand_id, :references => Brand, :stored => true
  integer :company_id, :references => Company, :stored => true
  integer :attr_value_ids, :references => BaseMaterialTypeAttrValue, :multiple => true, :stored => true
  time :updated_at, :trie => true, :stored => true
end

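Here full_name, company_name and so on are custom methods that return the corresponding values. Use try throughout custom index methods wherever a value in the chain may be nil, as in brand_name; otherwise errors will be raised during indexing.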
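
The reason for the nil-safe chain can be shown in plain Ruby. The sketch below uses the safe navigation operator `&.` (Ruby 2.3+) as a stand-in for ActiveSupport's `try`, which is an assumption made to keep the example free of Rails; the `Brand`, `Product` and `Price` structs are hypothetical stand-ins for the real models.

```ruby
# Hypothetical stand-ins for the ActiveRecord models in the article.
Brand   = Struct.new(:name)
Product = Struct.new(:brand)
Price   = Struct.new(:product) do
  # Unsafe: raises NoMethodError when product or brand is nil.
  def brand_name_unsafe
    product.brand.name
  end

  # Safe: returns nil as soon as any link in the chain is nil.
  def brand_name
    product&.brand&.name
  end
end

complete = Price.new(Product.new(Brand.new('Acme')))
orphan   = Price.new(nil)

puts complete.brand_name.inspect
puts orphan.brand_name.inspect

begin
  orphan.brand_name_unsafe
rescue NoMethodError => e
  puts "indexing would fail here: #{e.class}"
end
```

A record with a missing association is then indexed with a nil brand_name instead of aborting the indexing run.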

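Field types in an index definition:
(1) text: supports partial matching; the value is tokenized at index time
(2) string: exact match; the value is indexed whole
(3) integer, long: integers
(4) float, double: floating-point numbers
(5) time: date/time, corresponding to Ruby's Time class
(6) boolean: true or false
Field options:
(1) multiple: index multiple values for the field
(2) references: the class the field refers to
(3) stored: whether the value is stored in the index
(4) trie: Boolean; numeric and time fields only; makes range searches faster.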

Indexing in real time:
(1) Sunspot.index(objects)
    Sunspot.commit
(2) Sunspot.index!(objects)
(3) Remove from the index with Sunspot.remove(*objects), then call Sunspot.commit
(4) Remove from the index with Sunspot.remove!(*objects)
(5) Declaring searchable(:auto_index => true, :auto_remove => true) do in the model also indexes in real time.
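
The difference between the plain and the bang variants is when changes become visible to searches. The following toy in-memory class models that behaviour under the assumption that the bang methods are simply "queue, then commit"; `ToyIndex` is a hypothetical illustration and shares no code with Sunspot or Solr.

```ruby
# Toy in-memory model of deferred commits. Not Sunspot's code.
class ToyIndex
  def initialize
    @committed = {}   # documents visible to searches
    @pending   = []   # queued [operation, id, document] changes
  end

  # Queue documents; they stay invisible until commit.
  def index(*docs)
    docs.each { |doc| @pending << [:add, doc.fetch(:id), doc] }
  end

  # Queue removals; they also wait for commit.
  def remove(*docs)
    docs.each { |doc| @pending << [:delete, doc.fetch(:id), nil] }
  end

  # Apply all queued changes in order.
  def commit
    @pending.each do |operation, id, doc|
      if operation == :add
        @committed[id] = doc
      else
        @committed.delete(id)
      end
    end
    @pending.clear
  end

  # Bang variants: queue the change and commit immediately.
  def index!(*docs)
    index(*docs)
    commit
  end

  def remove!(*docs)
    remove(*docs)
    commit
  end

  def visible_ids
    @committed.keys.sort
  end
end

idx = ToyIndex.new
idx.index({ id: 1 }, { id: 2 })
before_commit = idx.visible_ids
idx.commit
after_commit = idx.visible_ids
idx.remove!({ id: 1 })
after_remove = idx.visible_ids
puts [before_commit, after_commit, after_remove].inspect
```

Batching several index calls before one commit keeps the number of commits low, while the bang variants trade that for immediate visibility.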

4. Searching:
There are two ways to write a search, for example:
(1) @results = Sunspot.new_search(Price)
@results.build do
  keywords(params[:search][:value],
           :query_phrase_slop => 1000,
           :phrase_slop => 1000,
           :highlight => true,
           :exclude_fields => [:search_key, :notes],
           :boost_fields => [:full_name => 5, :description => 4, :base_material_type_description => 3])
end
@results.build { with(:status).equal_to("released") }
@results.build { without(:product_id).equal_to(nil) }
@results.build { order_by(:score, :desc) }
@results.build { order_by(:updated_at, :desc) }
@results.build { facet(:brand_id) }
@results.build { facet(:attr_value_ids) }
@results.execute!
(2) Sunspot.search(Price) do
  keywords(params[:search][:value], :fields => [:full_name, :description], :highlight => true, :minimum_match => 0)
  with(:updated_at).greater_than(30.days.ago.to_time)
  paginate(:page => params[:page] || 1, :per_page => session[:per_page])
  facet(:attr_value_ids)
end
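
The paginate call in form (2) takes a page number and a page size, whereas Solr itself is queried with a `start` offset and a `rows` count. A minimal sketch of that arithmetic follows; `pagination_window` is a hypothetical helper, and the defaulting rules in it are assumptions rather than Sunspot's exact behaviour.

```ruby
# Translate a 1-based page number and a page size into an offset and a row count.
def pagination_window(page, per_page)
  page     = [page.to_i, 1].max       # treat nil, "" and 0 as page 1
  per_page = [per_page.to_i, 1].max   # always request at least one row
  { start: (page - 1) * per_page, rows: per_page }
end

puts pagination_window(nil, 30).inspect
puts pagination_window('3', 30).inspect
```

Request parameters arrive as strings, which is why the sketch converts with `to_i` before doing the arithmetic.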

Restriction methods:
(1) with
(2) without
(3) order_by
(4) facet: faceted search
(5) paginate: takes the same paging parameters as will_paginate in Rails
Restriction conditions:
(1) equal_to
(2) less_than
(3) greater_than
(4) between
(5) any_of: matches any one item; takes an array, e.g. with(:attr_value_ids).any_of(params[:search][:attr_value_ids])
(6) all_of: matches all items; takes an array
    any_of and all_of express OR and AND logic respectively and can be nested inside each other, e.g.:
    any_of do
      with(:expired_at).greater_than(Time.now)
      with(:expired_at, nil)
      all_of do
        with(:published_at).less_than(Time.now)
        with(:author_id).equal_to(999)
      end
    end
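
Read as boolean logic, the nested block above means: not yet expired, OR no expiry date, OR (already published AND written by author 999). The plain-Ruby predicate below spells that out; the `Post` struct and `matches?` method are hypothetical and exist only to illustrate the logic, and the filtering in a real search happens inside Solr.

```ruby
# Hypothetical record type for illustration.
Post = Struct.new(:expired_at, :published_at, :author_id)

# Plain-Ruby equivalent of the nested any_of / all_of block.
def matches?(post, now)
  not_expired   = !post.expired_at.nil? && post.expired_at > now
  never_expires = post.expired_at.nil?
  own_published = !post.published_at.nil? &&
                  post.published_at < now &&
                  post.author_id == 999
  not_expired || never_expires || own_published
end

now = Time.at(1_000_000)
posts = {
  still_valid:       Post.new(now + 60, nil, 1),
  no_expiry:         Post.new(nil, nil, 1),
  expired_by_999:    Post.new(now - 60, now - 120, 999),
  expired_by_others: Post.new(now - 60, now - 120, 1)
}
posts.each { |label, post| puts "#{label}: #{matches?(post, now)}" }
```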

For the other keywords options, see: http://outoftime.github.com/sunspot/docs/classes/Sunspot/DSL/Fulltext.html



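5. Displaying results in the view:
Accessing the search results:
    (1) @search.hits: the indexed data, read directly from the index documents
    (2) @search.results: the records loaded back from the database
    (3) @search.each_hit_with_result do |hit, result|: yields each hit together with its result, as in (1) and (2)
    (4) hit.result: loads the object for a hit from the database
    (5) hit.score: relevance score; searches can be ordered by score
    (6) hit.stored(:name): reads a stored field directly from the index document
    (7) Integration with will_paginate: will_paginate(@search.hits)
    (8) Highlighting keywords, e.g.:
        In the search, add :highlight => true to the keywords call
        In the view: hit.highlight(:body).format { |fragment| content_tag(:em, fragment) }
        Problem encountered: when the body field does not contain the searched keyword, hit.highlight(:body) is nil and calling format on it raises an error. Check highlight(:body) first and, if it is nil, display hit.stored(:body) instead.
    (9) Faceted search, called as follows:
        @search.facet(:attr_value_ids).rows.each do |row|
          row.value   # the facet value
          row.count   # number of matching records
        end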
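
The nil check described in item (8) can be wrapped in a small helper. The sketch below is self-contained: `FakeHit` and `FakeHighlight` are hypothetical stand-ins that mimic only the `highlight`, `stored` and `format` calls used above, plain string interpolation replaces the Rails `content_tag` helper, and the stand-in `format` is simpler than the real one.

```ruby
# Hypothetical stand-ins that mimic only the calls used in the article.
FakeHighlight = Struct.new(:fragments) do
  # Yield each matched fragment and join the rewritten pieces.
  def format
    fragments.map { |fragment| yield(fragment) }.join(' ')
  end
end

FakeHit = Struct.new(:highlights, :stored_values) do
  def highlight(field)
    highlights[field]
  end

  def stored(field)
    stored_values[field]
  end
end

# Prefer the highlighted text; fall back to the stored value when
# the field produced no highlight.
def display_text(hit, field)
  highlight = hit.highlight(field)
  return hit.stored(field) if highlight.nil?
  highlight.format { |fragment| "<em>#{fragment}</em>" }
end

with_match    = FakeHit.new({ body: FakeHighlight.new(['pvc']) }, { body: 'pvc pipe' })
without_match = FakeHit.new({}, { body: 'steel pipe' })

puts display_text(with_match, :body)
puts display_text(without_match, :body)
```

In a Rails view the same guard would sit in a helper method so that templates never call format on nil.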






Solr filter documentation:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

sunspot:
http://outoftime.github.com/sunspot/docs/classes/Sunspot/DSL/Fulltext.html
http://outoftime.github.com/sunspot/docs/classes/Sunspot/Search/AbstractSearch.html

Solr dismax format
