Local Docker Deployment

All of our Lenses are designed and built to be versatile, allowing them to be set up and run in a range of environments, whether in the cloud or on-premise. This is achieved through the use of Docker containers. For the local on-premise deployment strategies described here, all containers run on the same machine, and it is assumed that all files are read from and written to local locations accessible from the host machine. The following prerequisites are therefore required:

  1. Docker Desktop must be installed

  2. Get the IP address of your local machine (not localhost/127.0.0.1)
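On macOS you can usually discover this address with `ipconfig getifaddr en0`, on Linux with `hostname -I`, and on Windows with `ipconfig`. As a quick sanity check, a small helper like the following can reject loopback values, which will not work as `ADVERTISED_HOST` (this is a sketch; `is_usable_ip` is our own naming, not part of the Lens tooling):

```shell
# Sketch: reject loopback/empty values, which containers cannot use
# to reach the host via ADVERTISED_HOST.
#
# Ways to discover a candidate address first:
#   macOS:   ipconfig getifaddr en0
#   Linux:   hostname -I
#   Windows: ipconfig   (look for the IPv4 address of the active adapter)
is_usable_ip() {
  case "$1" in
    ""|localhost|127.*) return 1 ;;   # empty or loopback: not usable
    *)                  return 0 ;;
  esac
}

is_usable_ip 127.0.0.1 || echo "loopback rejected: use your LAN IP instead"
```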

 

 

1. Running Kafka

To run Kafka, we are going to use the johnnypark/kafka-zookeeper Docker image. This image provides a single Kafka message broker and also includes ZooKeeper.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  -p 2181:2181 \
  -p 9092:9092 \
  -e ADVERTISED_HOST=<local-ip-address> \
  johnnypark/kafka-zookeeper:latest
```

On Windows:

```shell
docker run ^
  -p 2181:2181 ^
  -p 9092:9092 ^
  -e ADVERTISED_HOST=<local-ip-address> ^
  johnnypark/kafka-zookeeper:latest
```

Once your Kafka cluster is up and running, you will be able to access it from your local machine. If you are well versed in Kafka, this can be done through the command line interface; if you are not an advanced user, we recommend using the tool Conduktor to communicate with your cluster.
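Whichever tool you use, it can be worth first confirming that the broker port is reachable from your machine. The sketch below uses bash's built-in `/dev/tcp` redirection; the `check_port` helper is our own name, not part of Kafka or the Lens tooling:

```shell
# Sketch: succeed if a TCP connection to host:port can be opened.
# Uses bash's /dev/tcp redirection, so this requires bash (not plain sh).
check_port() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Point this at the <local-ip-address> you passed as ADVERTISED_HOST, e.g.:
#   check_port 192.168.0.42 9092 && echo "broker reachable"
check_port 127.0.0.1 9092 || echo "nothing answering on localhost:9092"
```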

 

 

2. Running your Lens

If you are running multiple Lenses and Writers on the same machine, please ensure your port numbers do not clash.
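If, for example, you want to run two Lenses side by side, one way to avoid a clash is to publish each container's internal port 8080 on a different host port. The host ports 8081 and 8082 below are arbitrary examples, not values mandated by the Lenses, and each command still needs the `--env` and `-v` options shown in the relevant section:

```shell
# First Lens: host port 8081 -> container port 8080
docker run -d -p 8081:8080 lens-static:latest

# Second Lens: host port 8082 -> container port 8080
docker run -d -p 8082:8080 lens-sql:latest
```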

Structured File Lens

Further instructions on running the Structured File Lens can be found in the user guide.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  --env LICENSE=<license-key> \
  --env MAPPINGS_DIR_URL=/var/local/mapping-files/ \
  --env OUTPUT_DIR_URL=/var/local/output/ \
  --env PROV_OUTPUT_DIR_URL=/var/local/prov-output/ \
  --env KAFKA_BROKERS=<local-ip-address>:9092 \
  --env PROV_KAFKA_BROKERS=<local-ip-address>:9092 \
  -p 8080:8080 \
  -v /var/local/:/var/local/ \
  lens-static:latest
```

On Windows:

```shell
docker run ^
  --env LICENSE=<license-key> ^
  --env MAPPINGS_DIR_URL="/data/mapping-files/" ^
  --env OUTPUT_DIR_URL="/data/output/" ^
  --env PROV_OUTPUT_DIR_URL="/data/prov-output/" ^
  --env KAFKA_BROKERS=<local-ip-address>:9092 ^
  --env PROV_KAFKA_BROKERS=<local-ip-address>:9092 ^
  -p 8080:8080 ^
  -v C:\data\:/data/ ^
  lens-static:latest
```

SQL Lens

Further instructions on running the SQL Lens can be found in the user guide.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  -e LICENSE=<license-key> \
  -e CRON_EXPRESSION="*/50 * * ? * * *" \
  -e SQL_LIMIT=20000 \
  -e MAPPINGS_DIR_URL=/data/mapping-files/ \
  -e OUTPUT_DIR_URL=/data/output/ \
  -e PROV_OUTPUT_DIR_URL=/data/prov-output/ \
  -e KAFKA_BROKERS=<local-ip-address>:9092 \
  -e PROV_KAFKA_BROKERS=<local-ip-address>:9092 \
  -p 8080:8080 \
  -v /var/data:/data \
  lens-sql:latest
```

On Windows:

```shell
docker run ^
  -e LICENSE=<license-key> ^
  -e CRON_EXPRESSION="*/50 * * ? * * *" ^
  -e SQL_LIMIT=20000 ^
  -e MAPPINGS_DIR_URL=/data/mapping-files/ ^
  -e OUTPUT_DIR_URL=/data/output/ ^
  -e PROV_OUTPUT_DIR_URL=/data/prov-output/ ^
  -e KAFKA_BROKERS=<local-ip-address>:9092 ^
  -e PROV_KAFKA_BROKERS=<local-ip-address>:9092 ^
  -p 8080:8080 ^
  -v C:\data\:/data/ ^
  lens-sql:latest
```
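The `CRON_EXPRESSION` value appears to use the seven-field Quartz cron format (seconds first) rather than the five-field UNIX one. As a rough annotation of the example value (our breakdown, not taken from the Lens documentation):

```
*/50  *  *  ?  *  *  *
 │    │  │  │  │  │  └─ year: any
 │    │  │  │  │  └──── day of week: any
 │    │  │  │  └─────── month: any
 │    │  │  └────────── day of month: "?" (no specific value)
 │    │  └───────────── hour: any
 │    └──────────────── minute: any
 └───────────────────── second: */50, i.e. at seconds 0 and 50 of each minute
```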

RESTful Lens

Further instructions on running the RESTful Lens can be found in the user guide.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  -e LICENSE=<license-key> \
  -e JSON_API_CONFIG_URL=/data/api-config/multipage-config.json \
  -e MAPPINGS_DIR_URL=/data/mapping-files/ \
  -e OUTPUT_DIR_URL=/data/output/ \
  -e PROV_OUTPUT_DIR_URL=/data/prov-output/ \
  -e KAFKA_BROKERS=<local-ip-address>:9092 \
  -e PROV_KAFKA_BROKERS=<local-ip-address>:9092 \
  -p 8080:8080 \
  -v /var/data:/data \
  lens-restful:latest
```

On Windows:

```shell
docker run ^
  -e LICENSE=<license-key> ^
  -e JSON_API_CONFIG_URL=/data/api-config/multipage-config.json ^
  -e MAPPINGS_DIR_URL=/data/mapping-files/ ^
  -e OUTPUT_DIR_URL=/data/output/ ^
  -e PROV_OUTPUT_DIR_URL=/data/prov-output/ ^
  -e KAFKA_BROKERS=<local-ip-address>:9092 ^
  -e PROV_KAFKA_BROKERS=<local-ip-address>:9092 ^
  -p 8080:8080 ^
  -v C:\data\:/data/ ^
  lens-restful:latest
```

Document Lens

Further instructions on running the Document Lens can be found in the user guide.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  --env LICENSE=<license-key> \
  --env OUTPUT_DIR_URL=/var/local/output/ \
  --env PROV_OUTPUT_DIR_URL=/var/local/prov-output/ \
  --env KAFKA_BROKERS=<local-ip-address>:9092 \
  --env PROV_KAFKA_BROKERS=<local-ip-address>:9092 \
  -p 8080:8080 \
  -v /var/local/:/var/local/ \
  lens-unstructured:latest
```

On Windows:

```shell
docker run ^
  --env LICENSE=<license-key> ^
  --env OUTPUT_DIR_URL="/data/output/" ^
  --env PROV_OUTPUT_DIR_URL="/data/prov-output/" ^
  --env KAFKA_BROKERS=<local-ip-address>:9092 ^
  --env PROV_KAFKA_BROKERS=<local-ip-address>:9092 ^
  -p 8080:8080 ^
  -v C:\data\:/data/ ^
  lens-unstructured:latest
```

 

 

3. Running your Writer

Further instructions on running the Writer can be found in the user guide.

Lens Writer

This Writer will pick up the messages from the success queue and write the RDF to the specified triplestore.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  --env LICENSE=<license-key> \
  --env TRIPLESTORE_ENDPOINT=<The URL endpoint of the triplestore to write to> \
  --env TRIPLESTORE_TYPE=<one of graphdb,stardog,blazegraph,neptune,neo4j,sparql> \
  --env TRIPLESTORE_USERNAME=<triplestore-username> \
  --env TRIPLESTORE_PASSWORD=<triplestore-password> \
  --env KAFKA_BROKERS=<local-ip-address>:9092 \
  -p 8080:8081 \
  -v /var/local/:/var/local/ \
  lens-writer-api:latest
```

On Windows:

```shell
docker run ^
  --env LICENSE=<license-key> ^
  --env TRIPLESTORE_ENDPOINT=<The URL endpoint of the triplestore to write to> ^
  --env TRIPLESTORE_TYPE=<one of graphdb,stardog,blazegraph,neptune,neo4j,sparql> ^
  --env TRIPLESTORE_USERNAME=<triplestore-username> ^
  --env TRIPLESTORE_PASSWORD=<triplestore-password> ^
  --env KAFKA_BROKERS=<local-ip-address>:9092 ^
  -p 8080:8081 ^
  -v C:\data\:/data/ ^
  lens-writer-api:latest
```

Provenance Writer

This second Writer picks up messages from the (default-named) provenance success queue and writes the RDF to the specified triplestore.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  --env LICENSE=<license-key> \
  --env TRIPLESTORE_ENDPOINT=<The URL endpoint of the triplestore to write to> \
  --env TRIPLESTORE_TYPE=<one of graphdb,stardog,blazegraph,neptune,neo4j,sparql> \
  --env TRIPLESTORE_USERNAME=<triplestore-username> \
  --env TRIPLESTORE_PASSWORD=<triplestore-password> \
  --env KAFKA_BROKERS=<local-ip-address>:9092 \
  --env KAFKA_TOPIC_NAME_SUCCESS=prov_success_queue \
  -p 8080:8082 \
  -v /var/local/:/var/local/ \
  lens-writer-api:latest
```

On Windows:

```shell
docker run ^
  --env LICENSE=<license-key> ^
  --env TRIPLESTORE_ENDPOINT=<The URL endpoint of the triplestore to write to> ^
  --env TRIPLESTORE_TYPE=<one of graphdb,stardog,blazegraph,neptune,neo4j,sparql> ^
  --env TRIPLESTORE_USERNAME=<triplestore-username> ^
  --env TRIPLESTORE_PASSWORD=<triplestore-password> ^
  --env KAFKA_BROKERS=<local-ip-address>:9092 ^
  --env KAFKA_TOPIC_NAME_SUCCESS=prov_success_queue ^
  -p 8080:8082 ^
  -v C:\data\:/data/ ^
  lens-writer-api:latest
```

 

 

4. Running the Mapping Config Tool

As described in the Mapping Tool user guide, you can deploy this tool to your own instance. This enables additional functionality, such as the ability to update mapping files on a running Lens.

For UNIX based machines (macOS and Linux):

```shell
docker run \
  -p 3000:3000 \
  config-app:latest
```

On Windows:

```shell
docker run ^
  -p 3000:3000 ^
  config-app:latest
```