[siren-user] Parsing User-Input Query Text
akmclell at lakeheadu.ca
Wed Jun 23 17:11:14 IST 2010
Thanks very much for your quick reply.
I'm currently performing undergraduate research as part of the NSERC USRA
program, and part of my work is to crawl publicly available metadata for
Google Gadgets, transform it into a semantic form, and index it for
searching. This is all sort of a prerequisite for later portions of the
project to be feasibly accessible. Google provides a search engine for
publicly listed gadgets, but private gadget servers such as WSO2's don't
provide any real search capability for users to find locally-listed
gadgets. This presents a problem for organizations who want to provide more
than a token quantity of internal gadgets which for various reasons cannot
be publicly listed.
This project will almost certainly be released as open source once
complete. I was looking through the available objects in Siren and didn't
see any extension of Lucene's QueryParser, but I thought I should ask since
the SirenPhraseQuery's docs made mention of automatic use via the
QueryParser. What I am primarily hoping to support are queries such as "(tim
AND berners AND lee) OR timbl OR
which you have as an example for Sindice. The exact search terms would
differ when trying to locate a Google Gadget, but the idea is the same.
Right now I just have a super-basic stand-in which turns "term1 term2 term3"
into term1 OR term2 OR term3. If you're willing, I'd be very happy to take
a look at a snapshot of your Query Parser. I don't really need all of the
most advanced features Sindice provides since I'm working with a somewhat
more domain-specific dataset, but it sounds much better than starting from
scratch =). On the other hand, if the advanced features work on my dataset
without too much fiddling, there's certainly no harm in extra functionality
=). I would of course provide any feedback/bug reports/etc that I could.
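To make the kind of input concrete, here is a minimal sketch of a parser for the boolean syntax described above. It is purely illustrative: it is not SIREn's or Sindice's query parser and it uses no Lucene or SIREn classes. The class name GadgetQuerySketch and its Node, Term and Bool types are hypothetical names invented for this example. It assumes case-sensitive AND/OR keywords, parentheses for grouping, and that adjacent terms with no operator are OR-ed together (the same behaviour as the stand-in). Turning the resulting tree into Siren*Query objects is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of SIREn or Lucene. */
final class GadgetQuerySketch {

    /** A node of the parsed query tree. */
    interface Node {}

    /** A single search term. */
    static final class Term implements Node {
        final String text;
        Term(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    /** A boolean combination ("AND" or "OR") of child nodes. */
    static final class Bool implements Node {
        final String op;
        final List<Node> children;
        Bool(String op, List<Node> children) { this.op = op; this.children = children; }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(' ').append(op).append(' ');
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    // A token is "(", ")" or a run of characters without whitespace or parentheses.
    private static final Pattern TOKEN = Pattern.compile("\\(|\\)|[^\\s()]+");

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private GadgetQuerySketch(String input) {
        Matcher m = TOKEN.matcher(input);
        while (m.find()) tokens.add(m.group());
    }

    /** Parses the query text into a tree; throws IllegalArgumentException on bad input. */
    static Node parse(String input) {
        GadgetQuerySketch p = new GadgetQuerySketch(input);
        if (p.tokens.isEmpty()) throw new IllegalArgumentException("empty query");
        Node root = p.parseOr();
        if (p.pos < p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.tokens.get(p.pos));
        }
        return root;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : null; }

    // or := and ( ["OR"] and )*   (a missing operator is treated as OR)
    private Node parseOr() {
        List<Node> parts = new ArrayList<>();
        parts.add(parseAnd());
        while (peek() != null && !")".equals(peek())) {
            if ("OR".equals(peek())) pos++;
            parts.add(parseAnd());
        }
        return parts.size() == 1 ? parts.get(0) : new Bool("OR", parts);
    }

    // and := atom ( "AND" atom )*
    private Node parseAnd() {
        List<Node> parts = new ArrayList<>();
        parts.add(parseAtom());
        while ("AND".equals(peek())) {
            pos++;
            parts.add(parseAtom());
        }
        return parts.size() == 1 ? parts.get(0) : new Bool("AND", parts);
    }

    // atom := TERM | "(" or ")"
    private Node parseAtom() {
        String t = peek();
        if (t == null) throw new IllegalArgumentException("unexpected end of query");
        pos++;
        if ("(".equals(t)) {
            Node inner = parseOr();
            if (!")".equals(peek())) throw new IllegalArgumentException("missing ')'");
            pos++;
            return inner;
        }
        if (")".equals(t) || "AND".equals(t) || "OR".equals(t)) {
            throw new IllegalArgumentException("unexpected token: " + t);
        }
        return new Term(t);
    }
}
```

Under these assumptions, GadgetQuerySketch.parse("term1 term2 term3") yields a single OR node over the three terms, and a query such as (tim AND berners AND lee) OR timbl yields an OR node whose first child is the AND group. A real implementation would walk this tree and build the corresponding query objects.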
Please let me know,
On Wed, Jun 23, 2010 at 10:55 AM, Renaud Delbru <renaud.delbru at deri.org> wrote:
> Hi Adam,
> On 23/06/10 13:51, Adam McLellan wrote:
>> Good morning,
>> Since there was a time lag between my submission of this message and my
>> list subscription approval, I just wanted to check if this has already been
>> responded to, but I missed the response. Sorry if this is not the case.
> Sorry, I didn't receive your previous mail. See my answers below.
>> I am performing undergraduate research work involving semantic
>> data, and I am currently working to implement SIREn as a backend
>> for searching some RDF data I have generated. I am interested in
>> how best to process user-input search query text. I see from the
>> Sindice page
>> that fairly complex support for this has been implemented, and I
>> see a reference in the SirenPhraseQuery comments to Lucene's
>> QueryParser. However, when I tried to simply use a QueryParser
>> built with a TupleAnalyzer, I received no results for any
>> attempted query.
> Lucene's Query Parser is not aware of the existence of the Siren*Query
> classes, so it cannot translate the user query into Siren*Query objects,
> only into Lucene's Query objects.
> You need to write your own query parser (or extend the one from Lucene) to
> build Siren*Query objects from the user's query input. For the moment, we
> don't have publicly available code for this. We have a query parser, but
> it needs a bit of work before we can publish it as open source.
> Moreover, this query parser will be tied to the Sindice query syntax (if you
> need other functionality, you will have to implement it yourself).
> It may be possible to provide you with a snapshot of the query parser
> if you are interested (in exchange for some debugging feedback ;o))
> Maybe you could tell me more about what kind of queries you want to
> support. Do you want the user to type queries using a syntax such as the
> "advanced query" of Sindice? This is normally reserved for machines, not
> really for humans. Or do you want to support only full-text search queries?
> In that case, maybe a simpler solution is possible.
> Renaud Delbru