Apache Solr - Faceting


Advertisements

Faceting in Apache Solr refers to the classification of the search results into various categories. In this chapter, we will discuss the types of faceting available in Apache Solr −

  • Query faceting − It returns the number of documents in the current search results that also match the given query.

  • Date faceting − It returns the number of documents that fall within certain date ranges.

Faceting commands are added to any normal Solr query request, and the faceting counts come back in the same query response.

Faceting Query Example

Using the field faceting, we can retrieve the counts for all terms, or just the top terms in any given field.

As an example, let us consider the following books.csv file that contains data about various books.

id,cat,name,price,inStock,author,series_t,sequence_i,genre_s 
0553573403,book,A Game of Thrones,5.99,true,George R.R. Martin,"A Song of Ice 
and Fire",1,fantasy 

0553579908,book,A Clash of Kings,10.99,true,George R.R. Martin,"A Song of Ice 
and Fire",2,fantasy 

055357342X,book,A Storm of Swords,7.99,true,George R.R. Martin,"A Song of Ice 
and Fire",3,fantasy 

0553293354,book,Foundation,7.99,true,Isaac Asimov,Foundation Novels,1,scifi 
0812521390,book,The Black Company,4.99,false,Glen Cook,The Chronicles of The 
Black Company,1,fantasy 

0812550706,book,Ender's Game,6.99,true,Orson Scott Card,Ender,1,scifi 
0441385532,book,Jhereg,7.95,false,Steven Brust,Vlad Taltos,1,fantasy 
0380014300,book,Nine Princes In Amber,6.99,true,Roger Zelazny,the Chronicles of 
Amber,1,fantasy 

0805080481,book,The Book of Three,5.99,true,Lloyd Alexander,The Chronicles of 
Prydain,1,fantasy 

080508049X,book,The Black Cauldron,5.99,true,Lloyd Alexander,The Chronicles of 
Prydain,2,fantasy

Let us post this file into Apache Solr using the post tool.

[Hadoop@localhost bin]$ ./post -c Solr_sample sample.csv

On executing the above command, all the documents mentioned in the given .csv file will be uploaded into Apache Solr.

Now let us execute a faceted query on the field author with 0 rows on the collection/core my_core.

Open the web UI of Apache Solr and on the left-hand side of the page, check the checkbox facet, as shown in the following screenshot.

Checkbox

On checking the checkbox, you will have three more text fields in order to pass the parameters of the facet search. Now, as parameters of the query, pass the following values.

q = *:*, rows = 0, facet.field = author 

Finally, execute the query by clicking the Execute Query button.

Query Pass

On executing, it will produce the following result.

Author Result

It categorizes the documents in the index based on author and specifies the number of books contributed by each author.

Faceting Using Java Client API

Following is the Java program to add documents to Apache Solr index. Save this code in a file with the name HitHighlighting.java.

import java.io.IOException; 
import java.util.List;  

import org.apache.Solr.client.Solrj.SolrClient; 
import org.apache.Solr.client.Solrj.SolrQuery; 
import org.apache.Solr.client.Solrj.SolrServerException; 
import org.apache.Solr.client.Solrj.impl.HttpSolrClient; 
import org.apache.Solr.client.Solrj.request.QueryRequest; 
import org.apache.Solr.client.Solrj.response.FacetField; 
import org.apache.Solr.client.Solrj.response.FacetField.Count;
import org.apache.Solr.client.Solrj.response.QueryResponse; 
import org.apache.Solr.common.SolrInputDocument;  

public class HitHighlighting { 
   public static void main(String args[]) throws SolrServerException, IOException { 
      //Preparing the Solr client 
      String urlString = "http://localhost:8983/Solr/my_core"; 
      SolrClient Solr = new HttpSolrClient.Builder(urlString).build();   
      
      //Preparing the Solr document 
      SolrInputDocument doc = new SolrInputDocument(); 
   
      //String query = request.query;    
      SolrQuery query = new SolrQuery(); 
         
      //Setting the query string 
      query.setQuery("*:*"); 
         
      //Setting the no.of rows 
      query.setRows(0); 
         
      //Adding the facet field 
      query.addFacetField("author");        
         
      //Creating the query request 
      QueryRequest qryReq = new QueryRequest(query); 
      
      //Creating the query response 
      QueryResponse resp = qryReq.process(Solr);  
      
      //Retrieving the response fields 
      System.out.println(resp.getFacetFields()); 
      
      List<FacetField> facetFields = resp.getFacetFields(); 
      for (int i = 0; i > facetFields.size(); i++) { 
         FacetField facetField = facetFields.get(i); 
         List<Count> facetInfo = facetField.getValues(); 
         
         for (FacetField.Count facetInstance : facetInfo) { 
            System.out.println(facetInstance.getName() + " : " + 
               facetInstance.getCount() + " [drilldown qry:" + 
               facetInstance.getAsFilterQuery()); 
         } 
         System.out.println("Hello"); 
      } 
   } 
}

Compile the above code by executing the following commands in the terminal −

[Hadoop@localhost bin]$ javac HitHighlighting 
[Hadoop@localhost bin]$ java HitHighlighting 

On executing the above command, you will get the following output.

[author:[George R.R. Martin (3), Lloyd Alexander (2), Glen Cook (1), Isaac 
Asimov (1), Orson Scott Card (1), Roger Zelazny (1), Steven Brust (1)]]
Advertisements