Quick Start Guide - Lens Writer v2.0

This is a quick start guide to get the Lens Writer up and running in the quickest and simplest possible way so you can start publishing data to your Knowledge or Property Graph straight away. For a more in-depth set of instructions go to the User Guide.

In this guide we will be setting up and running the Writer as a docker image deployed to your local machine or as the AWS Marketplace offering run on ECS. Once deployed, you can utilise any of our ready-made sample output NQuads files to test your Writer.

 

 

1. Configuring the Writer

Like the Lenses supplied by Data Lens, the Lens Writer is configured through environment variables. The following config options are required for running the Writer:

  • License - LICENSE

    • This is the license key required to operate the Lens when it is run on a local machine outside of AWS. Request your new unique license key here.

  • Graph Database Endpoint - GRAPH_DATABASE_ENDPOINT

    • This is the endpoint for your Graph Database that you wish to upload your source data to. It is therefore required for the Lens Writer to operate.

    • For this quick start, we will use an example of a GraphDB endpoint.

  • Graph Database Type - GRAPH_DATABASE_TYPE

    • This is your Graph Database type. Some graphs support the default sparql type (e.g. AllegroGraph); however, certain graphs require a specific type declaration. These include graphdb, stardog, blazegraph, neptune-sparql, and rdfox.

    • If you are using a Property Graph, you can set a specific Graph provider (neo4j, neptune-cypher, or neptune-gremlin) or the traversal language (cypher or gremlin).

    • For this quick start, we specify the Graph Database type graphdb.

  • Graph Database Username and Password - GRAPH_DATABASE_USERNAME and GRAPH_DATABASE_PASSWORD

    • These are the username and password for your Graph. You can leave these fields blank if your Graph does not require any authentication.

  • Run Standalone Mode - LENS_RUN_STANDALONE

    • The Lens Writer and each of the Lenses are designed to run as part of a larger end-to-end system, with the Lens providing the Writer with RDF or CSV files to write to a Knowledge Graph. As part of this process, Kafka message queues are used to communicate between services.

    • For this quick start, we are going to ensure that standalone mode is left at its default value (true) so that the Writer won't attempt to connect to external services.
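Before starting the container, it can help to confirm that the settings listed above are present in your shell. The following is a minimal sketch, not part of the Writer itself: the function name check_writer_env is hypothetical, and it assumes a local (non-AWS) run where LICENSE must be set. The username, password, and standalone options are left out because they may be blank or keep their defaults.

```shell
#!/bin/sh
# Hypothetical helper: report which required Writer settings are unset.
# The variable names come from the list above; the function name is illustrative.
# Assumption: a local run outside AWS, so LICENSE is treated as required.
check_writer_env() {
    missing=""
    for name in LICENSE GRAPH_DATABASE_ENDPOINT GRAPH_DATABASE_TYPE; do
        # Look up the variable whose name is stored in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing required settings:$missing"
        return 1
    fi
    echo "All required settings are present"
}
```

Calling check_writer_env before docker run prints the names of any unset variables and returns a non-zero status, so a wrapper script could stop early with "check_writer_env || exit 1".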

 

2. Running the Writer

The Writer and all of our Lenses are designed and built to be versatile, allowing them to be set up and run in a number of environments, including in the cloud or on-premise. This is achieved through the use of Docker containers. For this quick start guide, we are going to use the simplest method of deployment, which is to run the Writer's Docker image locally. To do this, please first ensure you have Docker installed. Once it is installed, running a command with the following structure will start the container and run the Lens from your downloaded image.

 

For UNIX based machines (macOS and Linux):

docker run \
    --env LICENSE=$LICENSE \
    --env GRAPH_DATABASE_ENDPOINT=https://graphdb.example.com:443/repositories/test \
    --env GRAPH_DATABASE_TYPE=graphdb \
    --env GRAPH_DATABASE_USERNAME=test \
    --env GRAPH_DATABASE_PASSWORD=test \
    -p 8080:8080 \
    -v /var/local/:/var/local/ \
    lens-writer-api:latest

For Windows:

docker run ^
    --env LICENSE=%LICENSE% ^
    --env GRAPH_DATABASE_ENDPOINT=https://graphdb.example.com:443/repositories/test ^
    --env GRAPH_DATABASE_TYPE=graphdb ^
    --env GRAPH_DATABASE_USERNAME=test ^
    --env GRAPH_DATABASE_PASSWORD=test ^
    -p 8080:8080 ^
    -v /data/:/data/ ^
    lens-writer-api:latest

The above examples demonstrate how to override configuration options using environment variables in your Lens Writer. Each command spans nine lines. Line 2 passes in an environment variable already saved on the machine, whereas lines 3-6 each pass a literal string value. Given that the Writer runs on port 8080, line 7 publishes that port and binds it to the same port on the host machine so that the APIs can be triggered. The -v flag on line 8 mounts the host directory given before the colon into the container at the path given after it; when the host directory of a bind-mounted volume doesn't exist, Docker will automatically create this directory on the host for you. Finally, line 9 is the name and version of the Docker image you wish to run.

For more information on running Docker images, see the official Docs.

 

3. Ingesting RDF / Triggering the Writer

The easiest way to ingest a source data file into the Lens Writer is to use the built-in process endpoint. With it, you specify the URL of an RDF or CSV file to ingest and, in return, you are provided with the success status of the operation.

For example, using a GET request:

http://<writer-ip>:<writer-port>/process?inputRdfURL=<input-rdf-file-url>

http://127.0.0.1:8080/process?inputRdfURL=/var/local/input/input-rdf-data.nq
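The GET request above can also be issued from a shell. The sketch below assumes curl is installed and that the Writer is reachable at the host and port you pass in; the function names build_process_url and trigger_writer are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: assemble the process URL.
# $1 = Writer host or IP, $2 = Writer port, $3 = URL or path of the input file.
build_process_url() {
    printf 'http://%s:%s/process?inputRdfURL=%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: send the GET request and print the response body.
# --fail returns a non-zero status on HTTP errors, --silent hides the
# progress meter, and --show-error still prints an error message on failure.
trigger_writer() {
    curl --fail --silent --show-error "$(build_process_url "$1" "$2" "$3")"
}
```

For the local example in this guide, the call would be: trigger_writer 127.0.0.1 8080 /var/local/input/input-rdf-data.nq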

Once an input file has been processed successfully, the Writer returns a JSON object containing both the input data URL and the URL of the target Triple Store.

Sample output:

{
    "input": "/var/local/input/input-rdf-data.nq",
    "graphDatabaseEndpoint": "https://graphdb.example.com:443/repositories/test",
    "graphDatabaseProvider": "graphdb",
    "databaseType": "SEMANTIC_GRAPH"
}
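A script that triggers the Writer may want a quick sanity check on this response. The following is a rough sketch that assumes the response has the shape of the sample output above; the function name response_has_provider is hypothetical, and the check is a plain-text match rather than real JSON parsing, so a dedicated JSON tool is the better choice for anything beyond a smoke test.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the response names the expected provider.
# $1 = response body, $2 = expected provider (for example graphdb).
# This is a text match on the "graphDatabaseProvider" field, not a JSON parser.
response_has_provider() {
    # Join the body onto one line so that pretty-printed responses also match.
    printf '%s' "$1" | tr -d '\n' |
        grep -q "\"graphDatabaseProvider\"[[:space:]]*:[[:space:]]*\"$2\""
}
```

For example, "response_has_provider "$response" graphdb" returns a zero status for the sample output shown above and a non-zero status if the field is absent or names another provider.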

Now, by logging in to your Knowledge Graph and running the necessary queries, you will be able to see the newly inserted source data.