Lab: Web APIs and JSON-LD: Difference between revisions

From info216
(Created page with " =Lab 8: Accessing and lifting Web APIs (RESTful web services)= ==Topics== Programming regular (non-semantic) as well as semantic Web APIs (RESTful web services) with Jena,...")
 
No edit summary
(46 intermediate revisions by the same user not shown)
Line 1: Line 1:


=Lab 8: Accessing and lifting Web APIs (RESTful web services)=
=Lab 12: Accessing and lifting Web APIs (RESTful web services)=


==Topics==  
==Topics==  
Programming regular (non-semantic) as well as semantic Web APIs (RESTful web services) with Jena, JSON and JSON-LD.
Programming regular (non-semantic) as well as semantic Web APIs (RESTful web services) with JSON and JSON-LD.


''Tip:'' Newer versions of JSON-LD have been released since INFO216 started. apache-jena-3.2.0 uses a newer JSON-LD library (0.9.0) than the previous verion. Because JSON-LD is so new, this may make a difference, so please consider upgrading!
We will use Web APIs to retrieve regular JSON data, and then append it with a semantic context (@context).  
Finally we will parse it with RDFlib.  


==Classes/interfaces==
@context: signifies a JSON object that contains the
* Object, Map HashMap (put, get, remove), List, Vector
context (or semantic mapping) for the other objects in
* Model (read)
the same JSON array. (Similar to namespaces)
* JsonUtils (toPrettyString)
* JsonLdOptions (setExpandContext)
* JsonLdProcessor (compact, expand, flatten)
* IOUtils (toInputStream)


Object, Map HashMap (add, get), List, Vector are parts of the basic Java API (JavaDoc here: https://docs.oracle.com/javase/8/docs/api). The other classes are all available through Jena, but Jena does not include JavaDoc for all of them. [[:File:jsonld-java-0.9.0-javadoc.zip | This ZIP-archive]] contains JavaDoc for the Java implementation of JSON-LD (unpack it, for example to your Jena-folder, and open the file index.html in a browser).


Also, because JSON-LD is quite new, there are not yet many good tutorials available. This lab outline is therefore a little more detailed than the previous ones!
==Imports==
* import requests
* import json
* import pprint
* from rdflib import Graph, Namespace, RDFS
 


==Tasks==
==Tasks==
===Regular JSON web APIs===
===Regular JSON web APIs===
Write a small program that accesses a regular (non-semantic) web API. The GeoNames web API (http://www.geonames.org/export/ws-overview.html) offers many services. and download the result. For example, you can use this URL to access more information about Ines' neighbourhood in Valencia: http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo (register to get your own username instead of "demo").  
Write a small program that accesses a regular (non-semantic) web API and download the result. The "json" library in python can be used to load a json string as a json object (json.loads(data)).
Use the the prettyprint import to print a readable version of the json object.


You can use the getJsonBody method (attached to the end of this message) to write this program. (If you call getJsonBody from the static main method in your program, you must define getJsonBody as static too). The getJsonBody method returns a JSON object, which is either a Java List or a Map. Use the toPrettyString method in the JsonUtils class to format and then print your JSON object.
The GeoNames web API (http://www.geonames.org/export/ws-overview.html) offers many services. For example, you can use this URL to access more information about Ines' neighbourhood in Valencia: http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo (You might need to register a username instead of using "demo"). You can register here if you want to: https://www.geonames.org/login.  
You also need to enable the webservice here: https://www.geonames.org/manageaccount.


You do not have to use the GeoNames web API. There are lots and lots of other web APIs out there. But we want something simple that does not require registration (HTTPS can also make things more complex when the certificates are outdated). Here are some examples to get you started if you want to try out other APIs: http://opendata.app.uib.no/ , http://data.ssb.no/api , http://ws.audioscrobbler.com/2.0/ , http://www.last.fm/api /intro , http://wiki.musicbrainz.org/Development/JSON_Web_Service .
You do not have to use the GeoNames web API. There are lots and lots of other web APIs out there. But we preferably want something simple that does not require extensive registration (HTTPS can also make things more complex when the certificates are outdated). Here are some examples to get you started if you want to try out other APIs: http://opendata.app.uib.no/ , http://data.ssb.no/api , http://ws.audioscrobbler.com/2.0/ , http://www.last.fm/api /intro , http://wiki.musicbrainz.org/Development/JSON_Web_Service .


Be nice! While you are testing things, write a new method getJsonBodyProxy. This method takes a URL parameter just like the original getJsonBody. But it never connects to that URL. Instead, it returns a jsonObject created locally from a results string you have copied into your program. By letting the rest of your program call the new getJsonBodyProxy instead of getJsonBody while you are debugging your code, you do not need to call the GeoNames or other API over and over.
While you are testing and debugging things, it is good to make measures so that you do not need to call the GeoNames or other API over and over. A solution can be writing the returned data to a file, or copying it into a variable.  


Here is an example of a results string you can use, if you have trouble connecting to GeoNames (note that you have to escape all the quotation marks inside the Java string):
Here is an example of a results string you can use, if you have trouble connecting to GeoNames (note that you have to escape all the quotation marks inside the Java string):
Line 33: Line 36:


===Lifting JSON to JSON-LD===
===Lifting JSON to JSON-LD===
So far we have only used plain JSON. Now we want to move to JSON-LD. Make a new HashMap (and therefore also a JSON object) called context. Put a single entry into this map, with "@context" as the key and another HashMap as the value. It is this second map that contains the actual mappings. Put at least one pair of strings into it. For example, if you used the postcode API, the pair "lat" and "http://www.w3.org/2003/01/geo/wgs84_pos#lat". You can also put the pair "lng" and "http://www.w3.org/2003/01/geo/wgs84_pos#long".


Create a JsonLdOptions object and set its expand context to be the context object with the pair of strings in. Use the JsonLdProcessor to expand your jsonObject and pretty print the result. Has anything happened? Why/why not?!
In python we can represent JSON objects as dictionaries ({}) and JSON Arrays as lists ([]).
 
So far we have only used plain JSON. Now we want to move to JSON-LD, the semantic version of JSON. Make a new JSON object (dictionary/{} in python) that will contain the context key-value pairs (context_data). This data has to eventually be added to out JSON data, with "@context" as the key and context_data as the value.  


Add this pair too to the context object: "postalcodes" and "http://dbpedia.org/ontology/postalCode". Rerun. Has anything happened now? Why/why not?!
Put at least one pair of strings into it. For example, if you used the postcode API, the pair "lat" and "http://www.w3.org/2003/01/geo/wgs84_pos#lat". You can also put the pair "lng" and "http://www.w3.org/2003/01/geo/wgs84_pos#long".


''Explanation:'' Did you JSON object contain other (nested) objects as values? If you try to map the names inside such a nested object, the expansion will only work if you map the name of the nested object itself too.
Add this pair too to the context object: "postalcodes" and "http://dbpedia.org/ontology/postalCode".  


Add more string pairs, using existing or inventing new terms as you go along, to the context object and rerun expand. The expanded JSON object lifts the data from the web API. It can be used to provide a semantic version of the original web API.
Add more string pairs, using existing or inventing new terms as you go along, to the context object.


In addition to expand, try the compact and flatten operations on the JSON object. What do they do?
We will now make a RDFlib Graph from the JSON-LD object.


Go back to the RDF/RDFS programs your wrote in labs 2 and 3. Extend the program so that it adds further information about the post codes of every person in your graph.
First you need to pip install the json-ld portion of rdflib if you have not already:
<syntaxhighlight>
pip install rdflib-jsonld
</syntaxhighlight>


We will now make a Jena model from the JSON-LD object. To do this, first create a new default Jena model. Then convert the JSON-LD object to a string (use JsonUtils.toPrettyString). Then turn the string into an input stream (use IOUtils.toInputStream, with "UTF-8" as character set). Then read the input stream into your Jena model (use model.read). (There may be other ways to move from JSON object to Jena models, but this is a simple and straightforward way to start.)
Now, create a new Graph. Then convert the JSON-LD object to a string (use json.dumps() and write it to a file). Then parse the file with Rdflib (g.parse()).


Congratulations - you have now gone through the steps of accessing a web API over the net, lifting the results using JSON-LD, manipulating the in JSON-LD and reading them into a Jena RDF model. Of course, it is easy to convert the Jena model back into JSON-LD using model.write(..., "JSON-LD") ...
Congratulations - you have now gone through the steps of accessing a web API over the net, lifting the results using JSON-LD, manipulating the in JSON-LD and reading them into a RDF Graph. Of course, it is easy to convert the RDFlib graph back into JSON-LD using g.serialize("json-ld")


<nowiki>
===If You have more time===
  /**    We access the web APIs in a rather simple way, because our focus in INFO216
Try to download a new JSON from a different API and lift its data to the rdflib Graph, without making a context. This mean you must iterate/access each data point that you need with the json library.
        is on Jena and JSONLD, not on web APIs in themselves.  
e.g http://api.geonames.org/weatherJSON?formatted=true&north=44.1&south=-9.9&east=-22.4&west=55.2&username=demo&style=full
  */


    static Object getJsonBody(URL serverAddress) {
Which approach do you find to be easiest?
        Object jsonObject = null;
        HttpURLConnection connection = null;


        try {
==Code to get started==
            // send GET request
<syntaxhighlight>
            connection = null;
import requests
            connection = (HttpURLConnection)serverAddress.openConnection();
import json
            connection.setRequestMethod("GET");
import pprint
            connection.setDoOutput(true);
from rdflib import Graph, Namespace, RDFS
            connection.setReadTimeout(10000);
            connection.connect();


            // parse JSON reponse
g = Graph()
            jsonObject = JsonUtils.fromInputStream(connection.getInputStream());
dbp = Namespace("http://example.org/")
ex = Namespace("http://example.org/")
geo = Namespace("http://www.w3.org/2003/01/geo/wgs84_pos#")
g.bind("ex", ex)
g.bind("dbp", ex)
g.bind("geo", ex)


        } catch (MalformedURLException e) {
#start here with making API requests:
            e.printStackTrace();
</syntaxhighlight>
        } catch (ProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // close the connection
            connection.disconnect();
            connection = null;
        }


        return jsonObject;
===Useful Reading===
    }</nowiki>
* [https://stackabuse.com/reading-and-writing-json-to-a-file-in-python/ - Reading and writing with JSON - stackabuse.com]
* [https://wiki.uib.no/info216/index.php/Python_Examples - Examples]
* [https://realpython.com/python-requests/ Requests - realpython.com]

Revision as of 10:15, 22 April 2020

Lab 12: Accessing and lifting Web APIs (RESTful web services)

Topics

Programming regular (non-semantic) as well as semantic Web APIs (RESTful web services) with JSON and JSON-LD.

We will use Web APIs to retrieve regular JSON data, and then extend it with a semantic context (@context). Finally we will parse it with RDFlib.

@context: a key whose value is a JSON object containing the context (the semantic mapping from terms to IRIs) for the other keys in the same JSON document. (Similar to namespaces.)


Imports

  • import requests
  • import json
  • import pprint
  • from rdflib import Graph, Namespace, RDFS


Tasks

Regular JSON web APIs

Write a small program that accesses a regular (non-semantic) web API and downloads the result. The json library in Python can load a JSON string into a Python object (json.loads(data)). Use the pprint module to print a readable version of the JSON object.

The GeoNames web API (http://www.geonames.org/export/ws-overview.html) offers many services. For example, you can use this URL to access more information about Ines' neighbourhood in Valencia: http://api.geonames.org/postalCodeLookupJSON?postalcode=46020&country=ES&username=demo (You might need to register a username instead of using "demo"). You can register here if you want to: https://www.geonames.org/login. You also need to enable the webservice here: https://www.geonames.org/manageaccount.
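A minimal sketch of such a request with the requests library is shown here. The function name fetch_postal_codes and the 10-second timeout are illustrative choices, not part of the lab; the URL and query parameters are the ones given above.

```python
import requests

# Endpoint taken from the example URL above.
GEONAMES_URL = "http://api.geonames.org/postalCodeLookupJSON"


def fetch_postal_codes(postalcode, country, username="demo"):
    """Call the GeoNames postal code lookup and return the decoded JSON.

    The function name and the timeout value are assumptions made for
    this sketch; replace "demo" with your own GeoNames username.
    """
    params = {"postalcode": postalcode, "country": country, "username": username}
    response = requests.get(GEONAMES_URL, params=params, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()       # same result as json.loads(response.text)
```

Calling pprint.pprint(fetch_postal_codes("46020", "ES")) should then print the decoded result in a readable form, provided the username is accepted by GeoNames.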

You do not have to use the GeoNames web API. There are many other web APIs out there. Preferably, pick something simple that does not require extensive registration (HTTPS can also make things more complex when the certificates are outdated). Here are some examples to get you started if you want to try out other APIs: http://opendata.app.uib.no/ , http://data.ssb.no/api , http://ws.audioscrobbler.com/2.0/ , http://www.last.fm/api/intro , http://wiki.musicbrainz.org/Development/JSON_Web_Service .

While you are testing and debugging, it is good to take measures so that you do not need to call the GeoNames or other API over and over. One solution is to write the returned data to a file, or to copy it into a variable.
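One possible way to do this is a small file cache, sketched below. The helper name load_or_fetch and the idea of passing the API call in as a function are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_or_fetch(cache_path, fetch):
    """Return cached JSON data if the file exists, otherwise call fetch().

    `fetch` is any function without arguments that returns JSON-compatible
    data, for example a function that wraps your API request.
    (Hypothetical helper, not part of the lab material.)
    """
    path = Path(cache_path)
    if path.exists():
        # Reuse the stored answer instead of calling the API again.
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch()
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

While debugging, the rest of the program calls load_or_fetch instead of the API directly; deleting the cache file forces a fresh request.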

Here is an example of a results string you can use if you have trouble connecting to GeoNames (in Python, wrap it in single quotes so that the quotation marks inside it do not need escaping): {"postalcodes":[{"adminCode2":"V","adminCode1":"VC","adminName2":"Valencia","lng":-0.377386808395386,"countryCode":"ES","postalcode":"46020","adminName1":"Comunidad Valenciana","placeName":"Valencia","lat":39.4697524227712}]}
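As a sketch, the results string can be loaded and inspected like this. The variable names are illustrative only.

```python
import json
import pprint

# The sample result from GeoNames, wrapped in single quotes so that the
# double quotes inside the JSON text do not need escaping.
results = (
    '{"postalcodes":[{"adminCode2":"V","adminCode1":"VC",'
    '"adminName2":"Valencia","lng":-0.377386808395386,"countryCode":"ES",'
    '"postalcode":"46020","adminName1":"Comunidad Valenciana",'
    '"placeName":"Valencia","lat":39.4697524227712}]}'
)

data = json.loads(results)   # JSON object -> dict, JSON array -> list
pprint.pprint(data)          # readable, indented output

# Nested values are reached with ordinary dict and list indexing.
first = data["postalcodes"][0]
print(first["placeName"], first["lat"], first["lng"])
```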

Lifting JSON to JSON-LD

In Python we can represent JSON objects as dictionaries ({}) and JSON arrays as lists ([]).

So far we have only used plain JSON. Now we want to move to JSON-LD, the semantic version of JSON. Make a new JSON object (a dictionary, {}, in Python) called context_data that will contain the context key-value pairs. This object eventually has to be added to our JSON data, with "@context" as the key and context_data as the value.

Put at least one pair of strings into it. For example, if you used the postcode API, the pair "lat" and "http://www.w3.org/2003/01/geo/wgs84_pos#lat". You can also put the pair "lng" and "http://www.w3.org/2003/01/geo/wgs84_pos#long".

Also add this pair to the context object: "postalcodes" and "http://dbpedia.org/ontology/postalCode".

Add more string pairs, using existing or inventing new terms as you go along, to the context object.
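The context steps above can be sketched as follows. The name context_data comes from the text; the helper add_context and the shortened sample data are assumptions for illustration.

```python
import json

# Term-to-IRI mappings named in the text above.
context_data = {
    "lat": "http://www.w3.org/2003/01/geo/wgs84_pos#lat",
    "lng": "http://www.w3.org/2003/01/geo/wgs84_pos#long",
    "postalcodes": "http://dbpedia.org/ontology/postalCode",
}


def add_context(json_data, context):
    """Return a copy of json_data with the context under the "@context" key.

    (Hypothetical helper name; the original dictionary is left unchanged.)
    """
    return {"@context": context, **json_data}


# Shortened version of the GeoNames sample result, for illustration only.
json_data = {
    "postalcodes": [
        {"postalcode": "46020", "lat": 39.4697524227712, "lng": -0.377386808395386}
    ]
}

json_ld_data = add_context(json_data, context_data)
print(json.dumps(json_ld_data, indent=2))
```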

We will now make an RDFlib Graph from the JSON-LD object.

First, install the JSON-LD plugin for RDFlib with pip if you have not already (RDFlib 6.0 and newer include JSON-LD support, so this step is only needed for older versions):

pip install rdflib-jsonld

Now, create a new Graph. Then convert the JSON-LD object to a string with json.dumps() and write it to a file. Finally, parse the file with RDFlib (g.parse(), with format="json-ld").

Congratulations - you have now gone through the steps of accessing a web API over the net, lifting the results using JSON-LD, manipulating them in JSON-LD and reading them into an RDF graph. Of course, it is easy to convert the RDFlib graph back into JSON-LD using g.serialize(format="json-ld").

If you have more time

Try to download new JSON data from a different API and lift it into the RDFlib Graph without making a context. This means you must iterate over the JSON data and access each data point that you need yourself. For example: http://api.geonames.org/weatherJSON?formatted=true&north=44.1&south=-9.9&east=-22.4&west=55.2&username=demo&style=full

Which approach do you find to be easiest?

Code to get started

import requests
import json
import pprint
from rdflib import Graph, Namespace, RDFS

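g = Graph()

# Namespaces for our own terms, the DBpedia ontology and WGS84 geo positions
ex = Namespace("http://example.org/")
dbp = Namespace("http://dbpedia.org/ontology/")
geo = Namespace("http://www.w3.org/2003/01/geo/wgs84_pos#")

# Bind each prefix to its own namespace
g.bind("ex", ex)
g.bind("dbp", dbp)
g.bind("geo", geo)

# Start here with making API requests: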

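Useful Reading

  • Reading and writing with JSON - stackabuse.com: https://stackabuse.com/reading-and-writing-json-to-a-file-in-python/
  • Examples: https://wiki.uib.no/info216/index.php/Python_Examples
  • Requests - realpython.com: https://realpython.com/python-requests/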