Below are tables containing all of the configurable options within the Lens Writer. To see how to set config variables, see the Quick Start Guide or the full User Guide. Mandatory variables are highlighted in red.

With version 2.0 and beyond, no configuration is required at startup, as config can be updated on a running Writer. To do this, simply call the endpoint /updateConfig?configEntry=<entry>&configValue=<value>, where entry is the config item as listed below and value is the new value you wish to set. Any configuration changed while the Lens is running can also be backed up and restored.
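The update call can be scripted. Below is a minimal sketch of building and issuing the documented /updateConfig request; the host and port (localhost:8080) are illustrative placeholders, not part of the documented endpoint, and only the path and query parameter names come from this page.

```python
from urllib.parse import urlencode
from urllib.request import urlopen  # used only when applying against a live Writer

# Placeholder base URL; substitute your Writer's actual host and port.
WRITER_BASE_URL = "http://localhost:8080"

def build_update_config_url(entry: str, value: str) -> str:
    """Build the /updateConfig URL for a single config entry."""
    query = urlencode({"configEntry": entry, "configValue": value})
    return f"{WRITER_BASE_URL}/updateConfig?{query}"

# Example: point the Writer at a new graph database endpoint.
url = build_update_config_url("graphDatabaseEndpoint", "http://my-graph:7200")
# urlopen(url)  # uncomment to apply the change against a running Writer
print(url)
```

Note that urlencode percent-encodes the value, so endpoints containing `://` are passed safely in the query string.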

 


Lens Writer Configuration

Environment Variable

Entry

Default Value

Description

Version

FRIENDLY_NAME

friendlyName

Lens-Writer

The name you wish to set your Writer up with.

v1.3+

LICENSE

license

 

The license key required to run the Writer. Only required when running a non-AWS-Marketplace version of the Writer.

v1.3+

GRAPH_DATABASE_ENDPOINT

graphDatabaseEndpoint

 

The endpoint for the Knowledge Graph you wish to upload your RDF data to.

v2.0+

GRAPH_DATABASE_TYPE

graphDatabaseType

sparql

The Knowledge Graph type. Some Semantic Graphs will support the default sparql type (e.g. AllegroGraph); however, certain graphs require a specific type declaration, including graphdb, stardog, blazegraph, neptune-sparql and rdfox. If you are using a Property Graph, you can set a specific graph provider, including neo4j, neptune-cypher, neptune-gremlin, or the traversal language cypher or gremlin.

v2.0+

GRAPH_DATABASE_REASONING

graphDatabaseReasoning

false

Whether you want reasoning enabled or disabled. This only applies to Semantic Graphs.

v2.0+

GRAPH_DATABASE_USERNAME

graphDatabaseUsername

The username of your Knowledge Graph. Leave blank if your Knowledge Graph does not require any authentication.

v2.0+

GRAPH_DATABASE_PASSWORD

graphDatabasePassword

The password of your Knowledge Graph. Leave blank if your Knowledge Graph does not require any authentication.

v2.0+

CONFIG_REGION

us-east-1

The region in AWS where your files and services reside. Note: all services must be in the same region.

v1.3+

AWS_ACCESS_KEY

 

Your access key for AWS.

v1.3+

AWS_SECRET_KEY

 

Your secret key for AWS.

v1.3+

CONFIG_BACKUP

configBackup

file:///var/local/config-backup/

The URL of the directory to which the config will be backed up when calling the upload config endpoint.

v2.0+

DELETE_SOURCE

deleteSourceFile

false

Whether you wish to delete the source data file after it has been written to the Knowledge Graph.

v1.3+

LENS_RUN_STANDALONE

runStandalone

true

The Lens Writer is designed to run as part of a larger end-to-end system, with the Lens providing the Writer with RDF or CSV files to write to a Knowledge Graph. As part of this process, Kafka is used to communicate between services. This is turned off by default; however, if you want to run the Lens Writer with connected services, set this property to false.

v1.3+

INGESTION_MODE

ingestionMode

insert

How to process the ingested data.

  • 'insert': the new data is ingested in full and does not replace existing data. The new dataset adds new values to already existing subject-predicates.

  • 'update': the new data is used for updating the existing data. The new dataset replaces values in existing subject-predicates.

Please note this only applies to Semantic Graphs; in Property Graphs the Writer defaults to an upsert pattern, updating the properties for a given node or edge.

v1.4+
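The difference between the two ingestion modes can be sketched with a toy in-memory model mapping subject-predicate pairs to value sets. This is illustrative only; the Writer itself operates on the graph store, and the function and data names here are invented for the example.

```python
from collections import defaultdict

def ingest(store, new_data, mode="insert"):
    """Toy model of the Writer's ingestion modes.

    'insert' adds new values alongside any existing ones for a
    subject-predicate pair; 'update' replaces the values held for
    each subject-predicate pair present in the new data.
    """
    for (subj, pred), values in new_data.items():
        if mode == "insert":
            store[(subj, pred)].update(values)   # accumulate values
        elif mode == "update":
            store[(subj, pred)] = set(values)    # replace values
        else:
            raise ValueError(f"unknown ingestion mode: {mode!r}")
    return store

# In 'insert' mode, both the old and new values are kept.
store = defaultdict(set, {("ex:alice", "ex:email"): {"a@old.org"}})
ingest(store, {("ex:alice", "ex:email"): {"a@new.org"}}, mode="insert")
print(sorted(store[("ex:alice", "ex:email")]))  # ['a@new.org', 'a@old.org']
```

In 'update' mode the same call would leave only the new value for the subject-predicate pair, mirroring the replacement semantics described above.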

 

Kafka Configuration

Environment Variable

Entry

Default Value

Description

Version

KAFKA_BROKERS

kafkaBrokers

localhost:9092

The Kafka broker setting tells the Writer where to find your Kafka cluster. Set it with the following structure: <kafka-ip>:<kafka-port>. The recommended port is 9092.

v1.3+

KAFKA_TOPIC_NAME_SOURCE

topicNameSource

success_queue

The topic used for the Consumer to read messages from, containing the URLs of the source data files to ingest.

v1.3+

KAFKA_TOPIC_NAME_DLQ

topicNameDLQ

dead_letter_queue

The topic used to push messages containing reasons for failure within the Writer. These messages are represented as JSON.

v1.3+

KAFKA_TOPIC_NAME_SUCCESS

success_queue

The topic used for the messages sent containing the file URLs of the successfully transformed RDF data files.

v1.3+

KAFKA_GROUP_ID_CONFIG

groupIdConfig

consumerGroup1

The identifier of the group this consumer belongs to.

v1.3+

KAFKA_AUTO_OFFSET_RESET_CONFIG

autoOffsetResetConfig

earliest

What to do when there is no initial offset in Kafka or if an offset is out of range.

earliest: automatically reset the offset to the earliest offset

latest: automatically reset the offset to the latest offset

v1.3+

KAFKA_MAX_POLL_RECORDS

maxPollRecords

100

The maximum number of records returned in a single call to poll.

v1.3+

KAFKA_TIMEOUT

timeout

1000

Kafka consumer polling timeout.

v1.3+
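Since a malformed kafkaBrokers value only surfaces once the Writer tries to connect, it can help to sanity-check the <kafka-ip>:<kafka-port> shape up front. The helper below is a hypothetical convenience, not part of the Writer:

```python
def parse_broker(broker: str) -> tuple:
    """Split a kafkaBrokers entry of the form <kafka-ip>:<kafka-port>.

    Uses rpartition so an IPv6-style host containing colons still
    yields the trailing port segment.
    """
    host, sep, port = broker.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected <kafka-ip>:<kafka-port>, got {broker!r}")
    return host, int(port)

# The default value: localhost on the recommended port 9092.
print(parse_broker("localhost:9092"))  # ('localhost', 9092)
```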

 

...

Neo4j Configuration

Environment Variable

Default Value

Description

Version

NEO4J_HANDLE_VOCAB_URIS

KEEP

  • 'SHORTEN': Full URIs are shortened using prefixes for property names, relationship names and labels

  • 'IGNORE': URIs are ignored and only local names are kept

  • 'MAP': Vocabulary element mappings are applied on import

  • 'KEEP': URIs are kept unchanged

v1.3+

NEO4J_APPLY_NEO4J_NAMING

false

When set to true and in combination with handleVocabUris: 'IGNORE', Neo4j capitalisation is applied to vocabulary elements (all caps for relationship types, capital first for labels, etc.)

v1.3+

NEO4J_HANDLE_MULTIVAL

ARRAY

  • 'OVERWRITE': Property values are kept single-valued. Multiple values in the imported RDF are overwritten (only the last one is kept).

  • 'ARRAY': Properties are stored in an array enabling storage of multiple values.

v1.3+

NEO4J_KEEP_LANG_TAG

true

When set to true, the language tag is kept along with the property value. Useful for multilingual datasets.

v1.3+

NEO4J_TYPES_TO_LABEL

false

When set to true, rdf:type statements are imported as node labels in Neo4j.

v1.3+

NEO4J_VERIFY_URI_SYNTAX

true

By default, URI syntax is checked. This can be disabled by setting this parameter to false.

v1.3+

NEO4J_KEEP_CUSTOM_DATA_TYPES

true

When set to true, all properties containing a custom data type will be saved as a string followed by their custom data type IRIs.

v1.3+
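The effect of NEO4J_HANDLE_VOCAB_URIS on a vocabulary URI can be sketched as follows. This is an illustrative model of the three simpler strategies only ('MAP' needs a mapping table and is omitted); the example prefix table and the `__` separator in the shortened form are assumptions, not documented Writer behaviour.

```python
def handle_vocab_uri(uri: str, strategy: str, prefixes: dict) -> str:
    """Toy model of the SHORTEN / IGNORE / KEEP strategies for a URI."""
    if strategy == "KEEP":
        return uri                                  # URI kept unchanged
    # Local name: the part after the last '#' or '/'.
    sep = "#" if "#" in uri else "/"
    namespace, local = uri.rsplit(sep, 1)
    namespace += sep
    if strategy == "IGNORE":
        return local                                # only the local name kept
    if strategy == "SHORTEN":
        prefix = prefixes[namespace]                # e.g. 'foaf' (assumed table)
        return f"{prefix}__{local}"                 # prefixed short form
    raise ValueError(f"unsupported strategy: {strategy!r}")

prefixes = {"http://xmlns.com/foaf/0.1/": "foaf"}
uri = "http://xmlns.com/foaf/0.1/name"
print(handle_vocab_uri(uri, "SHORTEN", prefixes))  # foaf__name
```

With 'IGNORE' the same URI collapses to just `name`, and with 'KEEP' (the default above) it is stored unchanged.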

Provenance Configuration

Environment Variable

Entry

Default Value

Description

Version

RECORD_PROVO

recordProvo

false

Currently, the Lens Writer does not generate its own provenance metadata, so this option is set to false.

v1.3+

 

Logging Configuration

Environment Variable

Default Value

Description

Version

LOG_LEVEL

WARN

Global log level

v1.3+

LOG_LEVEL_DATALENS

INFO

Log level for Data Lens loggers - change to DEBUG to see more in-depth logs, or to WARN or ERROR to quiet the logging.

v1.3+

LOG_LEVEL_DROPWIZARD

INFO

Log level for Dropwizard loggers

 

Additional Logging Configuration

Environment Variable

Default Value

Description

Version

LOGGING_LEVEL

WARN

Global log level

v1.3+

LOGGING_APPENDERS_CONSOLE_TIMEZONE

UTC

Timezone for console logging

v1.3+

LOGGING_APPENDERS_TXT_FILE_THRESHOLD

ALL

Threshold for text logging

v1.3+

Log Format (not overridable)

%-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n

Pattern for logging messages

v1.3+

Current Log Filename (not overridable)

/var/log/datalens/text/current/application_${applicationName}_${timeStamp}.txt.log

Pattern for log file name

v1.3+

LOGGING_APPENDERS_TXT_FILE_ARCHIVE

true

Archive log text files

v1.3+

Archived Log Filename Pattern (not overridable)

/var/log/datalens/text/archive/application_${applicationName}_${timeStamp}_to_%d{yyyy-MM-dd}.txt.log

Log file rollover frequency depends on the date pattern in the archived filename pattern. For example, %d{yyyy-MM-ww} declares weekly rollover.

v1.3+

LOGGING_APPENDERS_TXT_FILE_ARCHIVED_TXT_FILE_COUNT

7

Max number of archived text files

v1.3+

LOGGING_APPENDERS_TXT_FILE_TIMEZONE

UTC

Timezone for text file logging

v1.3+

LOGGING_APPENDERS_JSON_FILE_THRESHOLD

ALL

Threshold for JSON logging

v1.3+

Log Format (not overridable)

%-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n

Pattern for logging messages

v1.3+

Current Log Filename (not overridable)

/var/log/datalens/json/current/application_${applicationName}_${timeStamp}.json.log

Pattern for log file name

v1.3+

LOGGING_APPENDERS_JSON_FILE_ARCHIVE

true

Archive log JSON files

v1.3+

Archived Log Filename Pattern (not overridable)

/var/log/datalens/json/archive/application_${applicationName}_${timeStamp}_to_%d{yyyy-MM-dd}.json.log

Log file rollover frequency depends on the date pattern in the archived filename pattern. For example, %d{yyyy-MM-ww} declares weekly rollover.

v1.3+

LOGGING_APPENDERS_JSON_FILE_ARCHIVED_FILE_COUNT

7

Max number of archived JSON files

v1.3+

LOGGING_APPENDERS_JSON_FILE_TIMEZONE

UTC

Timezone for JSON file logging

v1.3+

LOGGING_APPENDERS_JSON_FILE_LAYOUT_TYPE

json

The layout type for the JSON logger.

v1.3+
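To make the non-overridable log format concrete, the sketch below renders a sample event the way the pattern %-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n lays it out. All field values are invented for illustration, and the %logger{5} name-abbreviation logic is omitted (an already-abbreviated logger name is passed in).

```python
from datetime import datetime

def format_event(level, ts, thread, logger, code, msg):
    """Render one log event following the documented pattern:
    %-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n
    """
    # HH:mm:ss.SSS — time of day with millisecond precision.
    time_part = ts.strftime("%H:%M:%S.") + f"{ts.microsecond // 1000:03d}"
    # %-6level left-pads the level name to six characters.
    return f"{level:<6} [{time_part}] [{thread}] {logger} - {code} {msg}\n"

line = format_event("INFO", datetime(2024, 1, 1, 12, 30, 5, 123000),
                    "main", "c.d.l.w.Writer", "W-100", "Write complete")
print(line, end="")  # INFO   [12:30:05.123] [main] c.d.l.w.Writer - W-100 Write complete
```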