This quick start guide gets the Document Lens up and running in the simplest possible way, so you can start ingesting and transforming data straight away. For more in-depth instructions, see the User Guide.

In this guide we set up and run the Lens as a Docker image deployed to your local machine. However, we also support a number of cloud deployment technologies, including full support for AWS.

...

All Lenses supplied by Data Lens are configurable through environment variables. How you set these variables depends on how you choose to run the Lens; see Running the Lens for more information.

Mandatory Configuration

...

Environment Variable

Description

[LICENSE]

The license key required to operate the Lens. Request your new unique license key here.

[OUTPUT_DIR_URL]

The directory where all generated RDF files are saved. Both local and remote URLs are supported.

[PROV_OUTPUT_DIR_URL]

The directory where all generated provenance files are saved. Both local and remote URLs are supported. If you do not wish to generate provenance, you can turn it off by setting the [RECORD_PROVO] variable to false.

[LENS_RUN_STANDALONE]

Each Lens is designed to run as part of a larger end-to-end system, in which the resulting data is uploaded into Semantic Knowledge Graphs or Property Graphs. As part of this process, Apache Kafka message queues are used for communication between services.

Although this is not a compulsory configuration option, for this quick start we enable standalone mode by setting this value to true, so that the Lens does not attempt to connect to external services.
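As a rough illustration of how the variables above fit together, the following Python sketch checks that the mandatory values are present and assembles a docker run command that passes them to the container. It assumes the square brackets in the table are formatting only, so the variable names are LICENSE, OUTPUT_DIR_URL and so on. The image name, license value and directory URLs are hypothetical placeholders, not values supplied by Data Lens; see Running the Lens for the supported way to start the container.

```python
import shlex

# Variables listed as mandatory in the table above.
MANDATORY = ("LICENSE", "OUTPUT_DIR_URL", "PROV_OUTPUT_DIR_URL")


def build_docker_run_args(env: dict, image: str) -> list:
    """Validate the Lens environment and return a `docker run` argument list."""
    required = list(MANDATORY)
    # The provenance directory is only needed while provenance is recorded.
    if env.get("RECORD_PROVO", "").lower() == "false":
        required.remove("PROV_OUTPUT_DIR_URL")

    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing mandatory variables: " + ", ".join(missing))

    args = ["docker", "run"]
    for name, value in env.items():
        # `--env NAME=value` sets one environment variable in the container.
        args += ["--env", f"{name}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    lens_env = {
        "LICENSE": "<your-license-key>",                        # placeholder
        "OUTPUT_DIR_URL": "file:///var/local/output/",          # hypothetical path
        "PROV_OUTPUT_DIR_URL": "file:///var/local/prov-output/",  # hypothetical path
        "LENS_RUN_STANDALONE": "true",
    }
    # "example/document-lens:latest" is a made-up image name for illustration.
    print(shlex.join(build_docker_run_args(lens_env, "example/document-lens:latest")))
```

Building the command as a list rather than a single string avoids shell quoting problems if a value contains spaces or special characters.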

...

The GET request has the following structure and parameters: http://<lens-ip>:<lens-port>/process?inputFileURL=<input-file-url> (for example, http://127.0.0.1:8080/process?inputFileURL=file:///var/local/input-document.pdf). The response is returned as JSON.
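The request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. The function names build_process_url and process_document are illustrative and not part of the Lens, and the timeout value is an arbitrary assumption. The structure of the returned JSON is not described here, so the sketch simply parses and returns it.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_process_url(lens_ip: str, lens_port: int, input_file_url: str) -> str:
    """Build the /process GET URL in the form shown in this guide."""
    # Leave ':' and '/' unescaped so the result matches the example above;
    # other special characters (such as spaces) are percent-encoded.
    query = urlencode({"inputFileURL": input_file_url}, safe=":/", quote_via=quote)
    return f"http://{lens_ip}:{lens_port}/process?{query}"


def process_document(lens_ip: str, lens_port: int, input_file_url: str,
                     timeout: float = 60.0):
    """Send the GET request to a running Lens and return the parsed JSON response."""
    url = build_process_url(lens_ip, lens_port, input_file_url)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call process_document(...) once the Lens is running.
    print(build_process_url("127.0.0.1", 8080, "file:///var/local/input-document.pdf"))
```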

...

To learn more about the content of the output file, see the Endpoint subsection of the Input/Output Data Example section of the User Guide.