
Is it possible to run a SPARQL query directly against webpages with JSON-LD data?


For example, the page source of https://www.bobdc.com/blog/json-ld/ contains:

<html>
    <head>
      <script type="application/ld+json">
    {
        "@context" : "http://schema.org",
        "@type" : "BlogPosting",
        "mainEntityOfPage": {
             "@type": "WebPage",
             "@id": "https:\/\/www.bobdc.com\/"
        },
        "articleSection" : "blog",
        "name" : "Exploring JSON-LD",
        "headline" : "Exploring JSON-LD",
        "description" : "And of course, querying it with SPARQL.",
        "inLanguage" : "en",
        "author" : "Bob DuCharme",
        "creator" : "",
        "publisher": "",
        "accountablePerson" : "",
        "copyrightHolder" : "",
        "copyrightYear" : "2019",
        "datePublished": "2019-04-21 11:20:00 \u002b0000 UTC",
        "dateModified" : "2019-04-21 11:20:00 \u002b0000 UTC",
        "url" : "https:\/\/www.bobdc.com\/blog\/json-ld\/",
        "wordCount" : "1283",
        "keywords" : [ "RDF","JSON","SPARQL","Blog" ]
    }
    </script>
......

Can we run a SPARQL query against the page directly? If not, are there any elegant workarounds?

I searched online without finding a satisfying answer. Thank you in advance!


There is 1 answer

Jan Martin Keil

This is not possible with plain SPARQL. One needs to preprocess the page and load the JSON-LD into some kind of in-memory triplestore, as suggested by @UninformedUser in the comments. However, one does not need to do that manually: ready-made tools exist for this.
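For comparison, the first half of the manual route (pulling the JSON-LD out of the HTML) might look like the sketch below. It uses only the Python standard library; the class name `JsonLdExtractor`, the helper `extract_jsonld`, and the trimmed `SAMPLE_HTML` are illustrative names, not part of any tool mentioned here, and the sample is a shortened copy of the markup from the question.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects every <script type="application/ld+json"> payload in a page."""

    def __init__(self):
        super().__init__()
        self.documents = []   # parsed JSON-LD objects, in page order
        self._buffer = None   # text chunks while inside a JSON-LD script, else None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.documents.append(json.loads("".join(self._buffer)))
            self._buffer = None


def extract_jsonld(html):
    """Return the JSON-LD documents embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Trimmed copy of the markup from the question, used as sample input.
SAMPLE_HTML = """
<html>
  <head>
    <script type="application/ld+json">
    {
        "@context": "http://schema.org",
        "@type": "BlogPosting",
        "name": "Exploring JSON-LD",
        "author": "Bob DuCharme"
    }
    </script>
  </head>
</html>
"""

for doc in extract_jsonld(SAMPLE_HTML):
    print(doc["name"], "by", doc["author"])
```

The extracted objects are still plain JSON at this point. To query them with SPARQL, they would have to be serialized again (for example with `json.dumps`) and parsed by an RDF library that understands JSON-LD, which is the part the tool described in this answer takes care of.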

SPARQL Anything

It overloads the SPARQL SERVICE operator to parse many kinds of files from the web or local storage. In your case, create the following query file json-ld-in-html.rq:

# vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# vvv prefixes for your query vvv
# vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

# e.g.
PREFIX schema: <http://schema.org/>

# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# ^^^ prefixes for your query ^^^
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SELECT *
WHERE {
    SERVICE <x-sparql-anything:location=https://www.bobdc.com/blog/json-ld/,triplifier=io.github.sparqlanything.html.HTMLTriplifier,html.metadata=true> {
        # vvvvvvvvvvvvvvvvvv
        # vvv your query vvv
        # vvvvvvvvvvvvvvvvvv
        
        # e.g.
        [] schema:name ?title .
        [] schema:author ?author .
        
        # ^^^^^^^^^^^^^^^^^^
        # ^^^ your query ^^^
        # ^^^^^^^^^^^^^^^^^^
    }
}

Then execute the query:

java -jar sparql-anything-0.9.0.jar -q json-ld-in-html.rq -f TEXT

Result:

----------------------------------------
| title               | author         |
========================================
| "Exploring JSON-LD" | "Bob DuCharme" |
----------------------------------------

With a few changes, it is also possible to provide the URL as a parameter or to return the output in another format.

It is also possible to run SPARQL Anything as a web service and send the query via HTTP using the SPARQL protocol.
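Sending the query over HTTP could look roughly like the following sketch, which uses the standard SPARQL 1.1 Protocol "query via POST directly" form and only the Python standard library. The endpoint address in `ENDPOINT` is an assumption about a locally running server and must be adjusted to your setup; `build_sparql_request` and `run_query` are illustrative helper names.

```python
import urllib.request

# Assumption: address of a locally running SPARQL Anything server; adjust as needed.
ENDPOINT = "http://localhost:3000/sparql.anything"

# Same query as in json-ld-in-html.rq above, without the placeholder comments.
QUERY = """\
PREFIX schema: <http://schema.org/>
SELECT * WHERE {
    SERVICE <x-sparql-anything:location=https://www.bobdc.com/blog/json-ld/,triplifier=io.github.sparqlanything.html.HTMLTriplifier,html.metadata=true> {
        [] schema:name ?title .
        [] schema:author ?author .
    }
}
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol request that posts the query text directly."""
    return urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_query(endpoint, query):
    """Send the query and return the raw response body as text."""
    with urllib.request.urlopen(build_sparql_request(endpoint, query)) as response:
        return response.read().decode("utf-8")
```

With the server running, `print(run_query(ENDPOINT, QUERY))` would print the result set as SPARQL JSON results, provided the endpoint honours the `Accept` header.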