Lucene Searching Syntax Part 2
01.06.2010 15:56, by Andre Hollist
In last week's blog, we covered four of Lucene's search syntaxes: Fuzzy Searches, Wildcard Searches, Field Searches and Single Term Searches. This concluding part demonstrates another four.
Proximity Searches
Lucene's Proximity Searches provide a way of finding words that are within a specified distance of each other. Put the words in a quoted phrase and follow it with the tilde symbol and the maximum distance in words. For example, to find the words "Query" and "Parser" within 6 words of each other in a document, use the query: "Query Parser"~6
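To make the idea of "within N words" concrete, here is a small Python sketch. It is only an illustration of the concept and not Lucene's own matching algorithm, which is more involved; the function name within_distance and the sample sentence are made up for this example.

```python
import re

def within_distance(text, first, second, max_distance):
    """Return True if `first` and `second` occur at most
    `max_distance` word positions apart in `text`.

    Simplified illustration only; this is not how Lucene
    computes phrase slop internally.
    """
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )

doc = "The query is handed to a parser before it is executed"
print(within_distance(doc, "query", "parser", 6))    # True, 5 positions apart
print(within_distance(doc, "query", "executed", 6))  # False, 9 positions apart
```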
Range Searches
Range Searches allow you to find documents where a field's value lies between a lower and an upper bound. This is not restricted to date fields; it also works with non-date text fields. A range can either include or exclude its upper and lower bounds.
To perform a date range search, use the following syntax: creation_date:[20050101 TO 20061201]. This finds all documents with a creation date between 1st January 2005 and 1st December 2006, including both of those dates.
To perform a range search on a non-date field, use for example: name:{Albert TO Fredrick}. Curly brackets denote an exclusive range, so this query brings back all names between Albert and Fredrick but not those two names themselves. To include them, use square brackets: name:[Albert TO Fredrick].
Values are compared against the bounds in lexicographic order, or in other words dictionary order.
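The difference between inclusive and exclusive bounds, and the effect of dictionary order, can be sketched in a few lines of Python. This is a plain string comparison written for illustration and does not call Lucene; the function in_range and the list of names are hypothetical, and a real index may also lowercase or otherwise analyze terms before comparing them.

```python
def in_range(value, lower, upper, inclusive=True):
    """Compare strings in dictionary order, like a text range query.

    inclusive=True mimics [lower TO upper];
    inclusive=False mimics {lower TO upper}.
    """
    if inclusive:
        return lower <= value <= upper
    return lower < value < upper

names = ["Albert", "Brian", "Edward", "Fredrick", "George"]

exclusive = [n for n in names if in_range(n, "Albert", "Fredrick", inclusive=False)]
inclusive = [n for n in names if in_range(n, "Albert", "Fredrick", inclusive=True)]

print(exclusive)  # ['Brian', 'Edward']
print(inclusive)  # ['Albert', 'Brian', 'Edward', 'Fredrick']

# Dates written as yyyymmdd sort correctly as plain text.
print(in_range("20051115", "20050101", "20061201"))  # True

# Dictionary order is not numeric order: "10" comes before "9".
print("10" < "9")  # True
```

The last line shows why dates are written with fixed-width, zero-padded digits: text of unequal length does not compare the way numbers do.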
Lucene Boosting
Boosting a search term tells Lucene to give that term more weight when it ranks matching documents by relevance. The default boost factor for a term is 1. The boost factor must be a positive number, but it can be less than 1, so for example 0.3 can be set. The higher the number, the more relevance you put on the term.
To boost a term, append the "^" symbol followed by the boost factor to the end of the search term.
So for example, to make the term "apache" more relevant in a search for "apache lucene", use the query: apache^6 lucene
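The effect of a boost on ranking can be shown with a deliberately simple toy scorer in Python. It just multiplies each term's occurrence count by its boost factor, which is an assumption made for this illustration and far cruder than Lucene's real scoring; the names toy_score and rank and the two sample documents are hypothetical.

```python
def toy_score(text, boosts):
    """Toy relevance score: occurrences of each term times its boost.

    Illustration only; Lucene's real scoring formula is different.
    """
    words = text.lower().split()
    return sum(boost * words.count(term) for term, boost in boosts.items())

def rank(docs, boosts):
    """Return document ids ordered from highest to lowest toy score."""
    return sorted(docs, key=lambda doc_id: toy_score(docs[doc_id], boosts), reverse=True)

docs = {
    "a": "apache web server guide",
    "b": "lucene lucene index internals",
}

plain = {"apache": 1.0, "lucene": 1.0}    # like: apache lucene
boosted = {"apache": 6.0, "lucene": 1.0}  # like: apache^6 lucene

print(rank(docs, plain))    # ['b', 'a']
print(rank(docs, boosted))  # ['a', 'b']
```

Without a boost, the document that mentions "lucene" twice comes first; with apache^6, the document that mentions "apache" moves to the top.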
Escaping Special Characters
Lucene supports escaping special characters that are part of the query syntax. These characters are:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
To escape these characters, put a \ before the character. For example, to search for (1+3):4 use the query: \(1\+3\)\:4
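If query text comes from user input, escaping by hand quickly becomes error-prone. The following Python sketch escapes the characters listed above. The function name escape_query is hypothetical, and as a simplification it escapes every single & and | rather than only the doubled && and || operators.

```python
# Characters from the list above; & and | are escaped individually
# (a simplifying assumption made for this sketch).
SPECIAL = set('+-&|!(){}[]^"~*?:\\')

def escape_query(text):
    """Prefix every special query character in `text` with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

print(escape_query("(1+3):4"))  # \(1\+3\)\:4
```

Escape only the user-supplied part of a query. Escaping a whole query string would also neutralise operators you meant to use, such as the ~ in a proximity search.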
