[siren-user] Exact cell query?
Jeroen Steggink
jeroen at stegg-inc.com
Mon Nov 22 13:53:29 GMT 2010
Hi Renaud,
Adding hidden tokens does not solve the problem in this case. I still want
to be able to find the other matches too, but with a lower score. If that
query can do that, then that's exactly what I need.
Using Solr (I'm using the 0.2.1 SNAPSHOT), how do I execute the query? What
query parser do I use?
tuple(label, exact_cell(a)) OR tuple(label, a)
Cheers,
Jeroen
>Hi Jeroen,
>
>On 21/11/10 21:49, Jeroen Steggink wrote:
>> Hi Mike and Renaud,
>>
>> I experienced a problem similar to yours, Mike.
>>
>> When searching using the Solr implementation several questions have
>> arisen.
>>
>> Let's say I have the following 3 documents.
>>
>> doc1:
>> url1 label "a"
>> url1 label "b"
>> url1 label "c"
>>
>> doc2:
>> url2 label "a b"
>> url2 label "a c"
>> url2 label "a a c b"
>>
>> doc3:
>> url3 label "a c"
>>
>> When searching for the term "a", doc2 will get a higher score than
>> doc1 and doc3 will have the same score as doc1.
>>
>> Firstly, when searching for a term, documents with multiple
>> occurrences of that term get a higher score than documents that
>> have only one occurrence and one term in total. Normally I would
>> prefer this. However, in this case I'm not interested in the matches
>> over the whole document, but in the match within one triple.
>>
>> Secondly, giving an exact match a higher score than a non-exact
>> match is not possible. I would like doc1 to have a higher score than
>> doc2 and doc3, since doc1 has an exact match in the first triple.
>Yes, since exact match is not implemented yet. As I explained previously,
>it would be overkill to add to the index the information necessary to
>answer such queries (in detail, I would need to store one integer per
>term occurrence).
>In the future it will be possible, once I move from Lucene's
>Payload interface (not very efficient or compact) to our own index data
>structure.
>For the moment, you could try the trick of adding hidden tokens at the
>beginning and end of the cell. That way, you will be able to emulate
>an exact cell query, and in addition, if you query:
>tuple(label, exact_cell(a)) OR tuple(label, a)
>then the scoring mechanism will rank doc 1 higher than doc 3. However, I
>am not sure whether doc 1 will be ranked higher than doc 2. If doc 2 is
>ranked higher, you could add a negative boost to the clause tuple(label, a).
>
>hope this helps,
>cheers
>--
>Renaud Delbru
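
The hidden-token trick discussed above can be sketched outside of SIREn as a
toy simulation in Python. Everything here is an assumption made for
illustration: the boundary token names, the helper functions, and the scoring
(a plain sum of boosted match counts) are hypothetical and are not SIREn's or
Lucene's actual API or ranking formula. The sketch only shows why doc1 always
outranks doc3 for the combined query, while the order of doc1 and doc2 depends
on how the two clauses are boosted.

```python
# Hypothetical hidden tokens marking the start and end of a cell.
START, END = "__cell_start__", "__cell_end__"


def index_cell(text):
    """Tokenize a cell on whitespace and wrap it in the boundary tokens."""
    return [START, *text.split(), END]


def count_phrase(tokens, phrase):
    """Count the positions at which `phrase` occurs contiguously in `tokens`."""
    n = len(phrase)
    return sum(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def score(cells, term, exact_boost=1.0, plain_boost=1.0):
    """Toy score for 'exact_cell(term) OR term' over the label cells of a doc.

    The exact clause is emulated as the phrase START term END; the plain
    clause counts every occurrence of the term inside a cell.
    """
    total = 0.0
    for cell in cells:
        tokens = index_cell(cell)
        total += exact_boost * count_phrase(tokens, [START, term, END])
        total += plain_boost * count_phrase(tokens, [term])
    return total


# The three documents from the example, reduced to their label cells.
docs = {
    "doc1": ["a", "b", "c"],
    "doc2": ["a b", "a c", "a a c b"],
    "doc3": ["a c"],
}


def rank(**boosts):
    """Return document names ordered by descending toy score for term 'a'."""
    return sorted(docs, key=lambda d: score(docs[d], "a", **boosts), reverse=True)


if __name__ == "__main__":
    print(rank())                  # equal boosts on both clauses
    print(rank(plain_boost=0.25))  # plain clause weighted down
```

With equal boosts, the repeated occurrences of "a" in doc2 outweigh the single
exact match in doc1; weighting the plain clause down (here to 0.25, an
arbitrary value) puts doc1 first. A real engine normalizes term frequency and
field length, so the boost needed in practice would have to be found by
experiment.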