Issue with binding values from sub selection in Jena ARQ

I want to run the following simple testing query:

PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT  ?givenName ?name_count ?temp
WHERE
  { BIND(if(( ?name_count = 2 ), "just two", "definitely not 2") AS ?temp)
    { SELECT DISTINCT  ?givenName (COUNT(?givenName) AS ?name_count)
      WHERE
        { ?y  vcard:Family  ?givenName }
      GROUP BY ?givenName
    }
  }

The graph I am querying is the one from the tutorial at https://jena.apache.org/tutorials/sparql_data.html:

@prefix vCard:   <http://www.w3.org/2001/vcard-rdf/3.0#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://somewhere/MattJones/>  vCard:FN   "Matt Jones" .
<http://somewhere/MattJones/>  vCard:N    _:b0 .
_:b0  vCard:Family "Jones" .
_:b0  vCard:Given  "Matthew" .


<http://somewhere/RebeccaSmith/> vCard:FN    "Becky Smith" .
<http://somewhere/RebeccaSmith/> vCard:N     _:b1 .
_:b1 vCard:Family "Smith" .
_:b1 vCard:Given  "Rebecca" .

<http://somewhere/JohnSmith/>    vCard:FN    "John Smith" .
<http://somewhere/JohnSmith/>    vCard:N     _:b2 .
_:b2 vCard:Family "Smith" .
_:b2 vCard:Given  "John"  .

<http://somewhere/SarahJones/>   vCard:FN    "Sarah Jones" .
<http://somewhere/SarahJones/>   vCard:N     _:b3 .
_:b3 vCard:Family  "Jones" .
_:b3 vCard:Given   "Sarah" .

Now the problem is that running it with Jena:

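// All of the classes used here live in org.apache.jena.query.
import org.apache.jena.query.*;

Query query = QueryFactory.create(theAboveQueryAsString);
QueryExecution qexec = QueryExecutionFactory.create(query, theAboveGraphmodel);
ResultSet execSel = qexec.execSelect();
// Copy into a rewindable result set so it can be read more than once.
ResultSetRewindable results = ResultSetFactory.copyResults(execSel);
ResultSetFormatter.out(System.out, results, query);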

produces this result in the console:

----------------------------------
| givenName | name_count | temp  |
==================================
| "Smith"   | 2          |       |
| "Jones"   | 2          |       |
----------------------------------

with the ?temp values left unbound (null).

On the other hand, running the same query on the same graph in the Ontotext GraphDB environment, I get the result I expected (saved as CSV):

givenName  |  name_count  |  temp
------------------------------------
Jones      |       2      |  just two
Smith      |       2      |  just two

Could there be a bug in the ARQ engine or am I missing something? Thanks in advance.

I am using jena-arq 3.12.0, Java(TM) SE Runtime Environment (build 1.8.0_181-b13), and Eclipse version 2019-06 (4.12.0).

Answer by AndyS

There is a join between the BIND and the sub-select. The arguments to a join are evaluated independently before the join is done, so the BIND is evaluated on its own, the sub-select is evaluated separately, and the two results are then joined. When the BIND is evaluated, ?name_count is not yet bound, so the IF expression raises an error and ?temp is left unbound. If you move the BIND after the sub-select, it will apply to the results of the sub-select.
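As a rough sketch of that reordering, the query string from the question could be rebuilt with the BIND placed after the sub-select. The class name ReorderedQuery and the method build() are made up for illustration; only the query text itself comes from the question, with the two patterns swapped.

```java
public class ReorderedQuery {

    // Returns the question's query with the BIND moved after the sub-select.
    // Hypothetical helper: the name and shape are assumptions, not Jena API.
    static String build() {
        return String.join("\n",
            "PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
            "PREFIX  vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>",
            "",
            "SELECT  ?givenName ?name_count ?temp",
            "WHERE",
            "  { { SELECT DISTINCT  ?givenName (COUNT(?givenName) AS ?name_count)",
            "      WHERE",
            "        { ?y  vcard:Family  ?givenName }",
            "      GROUP BY ?givenName",
            "    }",
            "    # The BIND now follows the sub-select, so ?name_count is in scope.",
            "    BIND(if(( ?name_count = 2 ), \"just two\", \"definitely not 2\") AS ?temp)",
            "  }");
    }

    public static void main(String[] args) {
        // Print the query text; it can be passed to QueryFactory.create(...)
        // in place of theAboveQueryAsString from the question.
        System.out.println(build());
    }
}
```

The string returned by build() should be usable wherever the question passes theAboveQueryAsString, so the rest of the execution code would not need to change.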

BIND adds a binding to the result of the pattern before it. With the BIND written first, the algebra is:

(base <http://example/base/>
  (prefix ((rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
           (vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>))
    (project (?givenName ?name_count ?temp)
      (join
        (extend ((?temp (if (= ?name_count 2) "just two" "definitely not 2")))
          (table unit))
        (distinct
          (project (?givenName ?name_count)
           (extend ((?name_count ?.0))
             (group (?givenName) ((?.0 (count ?givenName)))
               (bgp (triple ?y vcard:Family ?givenName))))))))))

Here, the (extend ...) is one of the two arguments to the (join ...). (table unit) is the "nothing" before the BIND.

If the BIND is put after the sub-select, the algebra is:

(base <http://example/base/>
  (prefix ((rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
           (vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>))
    (project (?givenName ?name_count ?temp)
      (extend ((?temp (if (= ?name_count 2) "just two" "definitely not 2")))
        (distinct
          (project (?givenName ?name_count)
            (extend ((?name_count ?.0))
              (group (?givenName) ((?.0 (count ?givenName)))
                (bgp (triple ?y vcard:Family ?givenName))))))))))

and the extend (which comes from the BIND syntax) now works on the (distinct ...) of the sub-query.
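To make the difference between the two algebra shapes concrete, here is a toy model in plain Java. It is not how ARQ is implemented: rows are simple maps, the sub-select result is hard-coded from the data in the question, and all names (BindOrderSketch, extend, join, bindFirst, bindLast) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BindOrderSketch {

    // IF(?name_count = 2, ...). An unbound ?name_count is an expression
    // error, which this toy model signals by returning null.
    static String evalIf(Map<String, Object> row) {
        Object count = row.get("name_count");
        if (count == null) {
            return null;
        }
        return count.equals(2) ? "just two" : "definitely not 2";
    }

    // extend: try to add ?temp to each row; on error the row is kept as-is.
    static List<Map<String, Object>> extend(List<Map<String, Object>> rows) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> copy = new LinkedHashMap<>(row);
            String temp = evalIf(copy);
            if (temp != null) {
                copy.put("temp", temp);
            }
            out.add(copy);
        }
        return out;
    }

    // join: merge every pair of rows that agree on their shared variables.
    static List<Map<String, Object>> join(List<Map<String, Object>> left,
                                          List<Map<String, Object>> right) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : right) {
                boolean compatible = true;
                for (String var : l.keySet()) {
                    if (r.containsKey(var) && !r.get(var).equals(l.get(var))) {
                        compatible = false;
                    }
                }
                if (compatible) {
                    Map<String, Object> merged = new LinkedHashMap<>(l);
                    merged.putAll(r);
                    out.add(merged);
                }
            }
        }
        return out;
    }

    static Map<String, Object> row(String name, int count) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("givenName", name);
        r.put("name_count", count);
        return r;
    }

    // Hard-coded stand-in for the sub-select over the question's data.
    static List<Map<String, Object>> subSelect() {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("Smith", 2));
        rows.add(row("Jones", 2));
        return rows;
    }

    // (join (extend ... (table unit)) (distinct ...)): BIND written first.
    static List<Map<String, Object>> bindFirst() {
        List<Map<String, Object>> unit = new ArrayList<>();
        unit.add(new LinkedHashMap<>()); // one row with no bindings
        return join(extend(unit), subSelect());
    }

    // (extend ... (distinct ...)): BIND written after the sub-select.
    static List<Map<String, Object>> bindLast() {
        return extend(subSelect());
    }

    public static void main(String[] args) {
        System.out.println("BIND first: " + bindFirst());
        System.out.println("BIND last:  " + bindLast());
    }
}
```

In this model, bindFirst() yields two rows with no temp entry, matching the empty ?temp column in the Jena output, while bindLast() yields the same rows with temp set to "just two".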