Solr configuration

March 22, 2007

Solr is quite easy to set up once you understand it. It is much like any other database setup. Given the following table, we have to build something similar in Solr. Solr needs to know which fields are “stored” and which are “indexed”, as well as which are used for facets, among many other options. I will explain more later. Here is my MySQL table representing ‘things’.

[Image: things-table.gif]

Okay. Simple enough, right? What we want to do here is store most of this data; the facet field, however, doesn't really need to be stored. What does that mean? We want Solr to index the field so we can search on it, but when we ask for all the fields of a given document, Solr won't return this one. This will come into play later when I discuss faceting. Here is the schema.xml file for the above MySQL table:

<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="text" indexed="true" stored="true"/>
<field name="fileName" type="text" indexed="false" stored="true"/>
<field name="tags" type="text" indexed="true" stored="true" multiValued="true"/>
<field name="isBackground" type="text" indexed="false" stored="true"/>
<field name="dateCreated" type="text" indexed="false" stored="true"/>
<field name="dateModified" type="date" indexed="false" stored="true"/>
<!-- for faceting -->
<field name="tagsFaceted" type="string" indexed="true" stored="false" multiValued="true"/>
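To make the field definitions concrete, here is a rough sketch of a document that could be posted to Solr's XML update handler for this schema. All of the values (the id, the names, the tags and the dates) are made up for illustration and do not come from the real table. Note that a Solr date field expects ISO 8601 UTC timestamps, while dateCreated is declared as text above and so accepts any string.

```xml
<add>
  <doc>
    <field name="id">1</field>
    <field name="name">Kids playing in the park</field>
    <field name="fileName">kids-park.jpg</field>
    <!-- multiValued: repeat the field once per tag -->
    <field name="tags">kids</field>
    <field name="tags">park</field>
    <field name="isBackground">false</field>
    <!-- declared as type "text", so the format is free-form -->
    <field name="dateCreated">2007-03-22 10:15:00</field>
    <!-- declared as type "date", so it must be ISO 8601 in UTC -->
    <field name="dateModified">2007-03-22T10:15:00Z</field>
    <!-- tagsFaceted is deliberately not sent; the copyField rule in schema.xml fills it from tags -->
  </doc>
</add>
```

The document only becomes searchable after a `<commit/>` message is sent to the same update handler.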


Now that our basic table structure is complete, we have to tell Solr a few things about our index. It wants to know the primary key or, in Solr terms, the unique key. We also have to tell it our default search field.


<uniqueKey>id</uniqueKey>
<defaultSearchField>tags</defaultSearchField>


Now we have one more very important setting to tackle if we want proper faceting on tags. We have to make sure that any time we write to the ‘tags’ text field we also write to the ‘tagsFaceted’ string field. The difference is that the ‘tags’ field is stemmed, i.e. searching for ‘kids’ also matches ‘kid’ and so forth. The ‘tagsFaceted’ field is not analyzed, so it returns the whole words exactly as they were entered. The unstemmed field is the human-readable one and the stemmed field is for the machines.


<copyField source="tags" dest="tagsFaceted"/>
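The stemmed versus whole-word behaviour comes from how the two field types are defined in the `<types>` section of schema.xml, which this post does not show. The following is only a minimal sketch of what those definitions might look like. The analyzer chain is an assumption, and the tokenizer and filter class names are taken from recent Solr releases, so they may differ in the version used here.

```xml
<types>
  <!-- string: stored and indexed verbatim, no analysis, so facet values stay whole -->
  <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

  <!-- text: tokenized, lowercased and stemmed, so "kids" and "kid" index to the same term -->
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PorterStemFilterFactory"/>
    </analyzer>
  </fieldType>
</types>
```

With definitions along these lines, a request that includes `facet=true&facet.field=tagsFaceted` counts the original tag values, while ordinary searches run against the stemmed ‘tags’ field.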


Categories: Computers, Software



Copyright © 2005-2011 John Clarke Mills
