[siren-user] Parsing User-Input Query Text
renaud.delbru at deri.org
Tue Jun 29 14:50:28 IST 2010
Sorry for the late reply; see my comments below.
On 23/06/10 17:11, Adam McLellan wrote:
> This project will almost certainly be released as open source once
> complete. I was looking through the available objects in Siren and
> didn't see any extension of Lucene's QueryParser, but I thought I
> should ask since the SirenPhraseQuery's docs made mention of automatic
> use via the QueryParser.
Indeed, this is a small problem in the doc; thanks for reporting it.
> What I am primarily hoping to support are queries such as "(tim AND
> berners AND lee) OR timbl OR http://www.w3.org/People/Berners-Lee/card"
> which you have as an example for Sindice. The exact search terms
> would differ when trying to locate a Google Gadget, but the idea is
> the same.
OK. The first question I will ask you is: do you really need SIREn
features to do this?
I am asking this because the query you are providing is something that
Lucene supports. If your queries will be keyword-based only or if your
data schema is relatively small, then Lucene could be a better choice.
SIREn adds features to Lucene for managing large amounts of
heterogeneous data in an efficient way. However, if your data is not that
heterogeneous, or is relatively small, then Lucene will do the job. In
addition, you'll be able to use Lucene's original QueryParser out
of the box.
Also, SIREn will be a better choice if you are expecting a group of
keywords to match a specific value. For example, if you want to restrict
(tim AND berners AND lee) to match only one value (or object in an RDF
triple), then SIREn could be a better choice.
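To make the distinction concrete, here is a toy sketch in plain Java. It uses neither the Lucene nor the SIREn API; the class name, method names and sample values are all made up for illustration. It only contrasts a keyword conjunction evaluated over a whole entity with the same conjunction restricted to a single value.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: not part of Lucene or SIREn.
public class ValueScopedMatch {

    // Lower-case a value and split it on whitespace into a set of tokens.
    public static Set<String> tokens(String value) {
        String[] parts = value.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return new HashSet<>(Arrays.asList(parts));
    }

    // Entity-level AND: every keyword occurs somewhere in the entity,
    // possibly spread over several different values.
    public static boolean matchesAnywhere(List<String> values, List<String> keywords) {
        Set<String> all = new HashSet<>();
        for (String v : values) {
            all.addAll(tokens(v));
        }
        return all.containsAll(keywords);
    }

    // Value-level AND: one single value must contain every keyword.
    public static boolean matchesSingleValue(List<String> values, List<String> keywords) {
        for (String v : values) {
            if (tokens(v).containsAll(keywords)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("tim", "berners", "lee");

        // Made-up entities, each described by a list of values (e.g. RDF objects).
        List<String> entityA = Arrays.asList("Tim Berners Lee", "web inventor");
        List<String> entityB = Arrays.asList("Tim Smith", "Berners Street", "Lee Valley");

        System.out.println("A anywhere:     " + matchesAnywhere(entityA, keywords));
        System.out.println("A single value: " + matchesSingleValue(entityA, keywords));
        System.out.println("B anywhere:     " + matchesAnywhere(entityB, keywords));
        System.out.println("B single value: " + matchesSingleValue(entityB, keywords));
    }
}
```

In this sketch both entities satisfy the entity-level conjunction, but only entityA satisfies it within a single value. The second behaviour is the kind of restriction where SIREn could be the better fit.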
If you need some pointers on how to index RDF data using Lucene, just
ask. I can help. We can also discuss your use cases or
scenario to see whether Lucene or SIREn fits better.