Configurable Options - SQL Lens v2.0
The tables below list all of the configurable options within the SQL Lens. To see how to set config variables, see the Quick Start Guide or the full User Guide. From version 2.0 onwards, less configuration is required to get your Lens operational: setting the Lens Directory is enough. In addition, no configuration is required at startup, as config can be updated on a running Lens.
To do this, call the endpoint /updateConfig?configEntry=<entry>&configValue=<value>
where entry is the config item as listed in the Entry column below, and value is the new value you wish to set. Any configuration changed while the Lens is running can also be backed up and restored.
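As an illustration of the endpoint format above, the following minimal Python sketch builds an /updateConfig URL with the value safely URL-encoded. The base URL (host and port) is an assumption and must be replaced with the address of your own Lens; the helper name is hypothetical. The sketch only constructs the URL and does not send a request, since the HTTP method is not stated here.

```python
from urllib.parse import urlencode

# Assumption: hypothetical address of a running Lens; replace with your own.
LENS_BASE_URL = "http://localhost:8080"


def build_update_config_url(entry: str, value: str, base_url: str = LENS_BASE_URL) -> str:
    """Return the /updateConfig URL for one config entry.

    `entry` is a name from the Entry column (for example "sqlLimit"),
    `value` is the new value. Both are URL-encoded so that characters
    such as ":" and "/" in directory URLs survive the query string.
    """
    query = urlencode({"configEntry": entry, "configValue": value})
    return f"{base_url.rstrip('/')}/updateConfig?{query}"


if __name__ == "__main__":
    print(build_update_config_url("sqlLimit", "1000"))
    print(build_update_config_url("lensDirectory", "file:///var/local/"))
```

Encoding the value matters mainly for directory URLs and cron expressions, which contain characters that are not safe in a raw query string.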
Lens Configuration
Environment Variable | Entry | Default Value | Description |
---|---|---|---|
FRIENDLY_NAME | friendlyName | SQL-Lens | The name you wish to set your Lens up with. |
LICENSE | license | | The License key required for running the Lens. Only required when running a non-AWS Marketplace version of the Lens. |
CRON_EXPRESSION | cronExpression | Switched off by default | The Quartz Cron Expression. Used by the Lens to set up a time-based job scheduler which will schedule the Lens to ingest your specified data from your database(s) periodically at fixed times, dates, or intervals. For example, this cron expression value 0 */30 * ? * * * translates to triggering the Lens every 30 minutes starting at :00 or :30 minutes after the hour. |
SQL_LIMIT | sqlLimit | 0 | The SQL Limit sets the maximum number of records that can be processed in any one query. If your database contains more records than this value, the Lens will batch process the records from the query and output multiple RDF files. To enable batching, set this to an integer greater than zero. The default of zero means that iterative queries are switched off. |
SQL_OFFSET | sqlOffset | 0 | The SQL Offset provides the ability to offset the start index of the iterative processing. |
CONCURRENT_THREADS | concurrentThreads | 4 | When the Lens is executing an iterative query, the iterations will be run in parallel. This option sets how many threads will be used for concurrent iteration executions. |
CONTINUE_ON_ERROR | continueOnError | false | If there is an error during an iteration, this determines whether the execution will continue or halt. |
LENS_DIRECTORY | lensDirectory | file:///var/local/ | This is the directory where all Lens files are stored (assuming the individual directory configs haven’t been edited). On Lens startup, if this has been declared, the Lens will create folders at the specified location for mapping, output, yaml-mapping, prov output, and config backup. |
MAPPINGS_DIR_URL | mappingsDirUrl | file:///var/local/mapping/ | The URL of the directory containing the mapping file(s). Can be local or remote, see here for more details. |
MASTER_MAPPING_FILE | masterMappingFile | mapping.ttl | The filename of the master mapping file. |
OUTPUT_DIR_URL | outputDirUrl | file:///var/local/output/ | The URL of the directory you wish the generated RDF to be output to. Can be local or remote, see here for more details. |
OUTPUT_FILE_FORMAT | outputFileFormat | nquads | The file type that will be constructed when the RDF is created. The options are: |
CONFIG_BACKUP | configBackup | file:///var/local/config-backup/ | The URL of the directory where the config will be backed up to when calling the upload config endpoint. |
LENS_RUN_STANDALONE | runStandalone | true | Each of the Lenses is designed to run as part of a larger end-to-end system, with the end result being that the data is uploaded to a Knowledge or Property Graph. As part of this process, Kafka is used to communicate between services. This is enabled by default; however, if you want to run the Lens as standalone without communicating with other services, set this property to true. |
CUSTOM_FUNCTION_JAR_URL | customFunctionJarUrl | | If the built-in functions do not perform the operation you require, it is possible to create and use your own. To do this, set this variable to the URL of your jar file containing the functions (S3 is also supported), and follow the instructions laid out in this guide. |
CUSTOM_FUNCTION_TTL_URL | customFunctionTtlUrl | | If the built-in functions do not perform the operation you require, it is possible to create and use your own. To do this, set this variable to the URL of your ttl file containing the mappings to your functions (S3 is also supported), and follow the instructions laid out in this guide. |
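To make the interaction of sqlLimit, sqlOffset, and concurrentThreads more concrete, the following Python sketch shows the batching arithmetic only. It is not the Lens's implementation: the function names are hypothetical, it assumes sqlOffset is a record index at which iteration begins, and `process` stands in for whatever work is done per batch.

```python
from concurrent.futures import ThreadPoolExecutor


def batch_windows(total_records, sql_limit, sql_offset=0):
    """Return (offset, limit) pairs covering records from sql_offset onwards.

    A limit of zero mirrors the documented default: iterative queries are
    switched off, so a single unbatched window (limit=None) is returned.
    """
    if sql_limit <= 0:
        return [(sql_offset, None)]
    return [(start, sql_limit) for start in range(sql_offset, total_records, sql_limit)]


def run_batches(windows, process, concurrent_threads=4):
    """Run `process` on every window using a fixed-size thread pool.

    Results come back in window order, even though execution overlaps.
    """
    with ThreadPoolExecutor(max_workers=concurrent_threads) as pool:
        return list(pool.map(process, windows))


if __name__ == "__main__":
    windows = batch_windows(total_records=2500, sql_limit=1000)
    print(windows)  # [(0, 1000), (1000, 1000), (2000, 1000)]
    # Hypothetical per-batch work: here we just echo the starting offset.
    print(run_batches(windows, lambda w: w[0]))  # [0, 1000, 2000]
```

Under these assumptions, 2,500 records with a limit of 1,000 yield three batches, and so three output files, processed by up to four threads at once.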
AWS Configuration
One of the methods for connecting to AWS and S3 from a locally running Lens is to use environment variables to set the region, access key, and secret key. These must be set when running your Docker container. Alternative methods can be found in the AWS documentation.
Environment Variable | Description |
---|---|
AWS_REGION | The region in AWS where your S3 buckets and files reside, for example “us-east-1”. |
AWS_ACCESS_KEY_ID | Your access key for AWS. |
AWS_SECRET_ACCESS_KEY | Your secret key for AWS. |
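As a sketch of how these variables might be passed to a container, the Python helper below assembles a `docker run` argument list with one `-e NAME=value` flag per variable. The image name is a placeholder, not the real Lens image, and the helper name is hypothetical; the credential values shown are dummies and should come from a secure source in practice.

```python
import shlex


def docker_run_command(image, env):
    """Build a `docker run` argument list with one -e flag per variable."""
    args = ["docker", "run"]
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


if __name__ == "__main__":
    aws_env = {
        "AWS_REGION": "us-east-1",
        "AWS_ACCESS_KEY_ID": "<your-access-key>",      # placeholder value
        "AWS_SECRET_ACCESS_KEY": "<your-secret-key>",  # placeholder value
    }
    # Assumption: "example/sql-lens:latest" is a made-up image name.
    command = docker_run_command("example/sql-lens:latest", aws_env)
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting mistakes if it is later handed to `subprocess.run`.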
Property Graph Configuration
Environment Variable | Entry | Default Value | Description |
---|---|---|---|
PROPERTY_GRAPH_MODE | propertyGraphMode | false | If you are using a Property Graph as your target graph type, set this configuration to true, otherwise leave as false. When set to true, the Lens will output Nodes and Edges CSV files instead of RDF files. Please ensure you have constructed the correct mapping file to support property graphs. |
PG_GRAPH | pgGraph | default | Set this property to the Property Graph provider you wish to use: |
Kafka Configuration
Environment Variable | Entry | Default Value | Description |
---|---|---|---|
KAFKA_BROKERS | kafkaBrokers | localhost:9092 | The Kafka Broker is what tells the Lens where to look for your Kafka Cluster. Set with the following structure |
KAFKA_TOPIC_NAME_SOURCE | topicNameSource | source_urls | The topic from which the Consumer reads messages in order to ingest data. Any message can trigger the Lens, including an empty message. |
KAFKA_TOPIC_NAME_DLQ | topicNameDLQ | dead_letter_queue | The topic used to push messages containing reasons for failure within the Lens. These messages are represented as a JSON. |
KAFKA_TOPIC_NAME_SUCCESS | topicNameSuccess | success_queue | The topic used for the messages sent containing the file URLs of the successfully transformed RDF data files. |
KAFKA_GROUP_ID_CONFIG | groupIdConfig | consumerGroup1 | The identifier of the group this consumer belongs to. |
KAFKA_AUTO_OFFSET_RESET_CONFIG | autoOffsetResetConfig | earliest | What to do when there is no initial offset in Kafka or if an offset is out of range. |
KAFKA_MAX_POLL_RECORDS | maxPollRecords | 100 | The maximum number of records returned in a single call to poll. |
KAFKA_TIMEOUT | timeout | 1000 | Kafka consumer polling time out. |
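Several of these options share names with standard Kafka consumer properties. The Python sketch below resolves them from environment variables, falling back to the defaults in the table. The pairing of each variable with a Kafka property (`bootstrap.servers`, `group.id`, `auto.offset.reset`, `max.poll.records`) is an assumption inferred from the names; how the Lens wires them internally is not documented here, and the function name is hypothetical.

```python
import os

# (environment variable, assumed Kafka consumer property, default from the table)
KAFKA_SETTINGS = [
    ("KAFKA_BROKERS", "bootstrap.servers", "localhost:9092"),
    ("KAFKA_GROUP_ID_CONFIG", "group.id", "consumerGroup1"),
    ("KAFKA_AUTO_OFFSET_RESET_CONFIG", "auto.offset.reset", "earliest"),
    ("KAFKA_MAX_POLL_RECORDS", "max.poll.records", "100"),
]


def consumer_properties(environ=None):
    """Resolve consumer properties from the environment, using table defaults."""
    environ = os.environ if environ is None else environ
    return {prop: environ.get(var, default) for var, prop, default in KAFKA_SETTINGS}


if __name__ == "__main__":
    # Override only the broker address; everything else keeps its default.
    for key, value in consumer_properties({"KAFKA_BROKERS": "broker-1:9092"}).items():
        print(f"{key}={value}")
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a planned configuration without touching the real environment.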
Provenance Configuration
Environment Variable | Entry | Default Value | Description |
---|---|---|---|
RECORD_PROVO | recordProvo | true | Parameter indicating if the provenance meta-data should be generated. |
PROV_OUTPUT_DIR_URL | provOutputDirUrl | file:///var/local/prov-output/ | The URL of the directory for the provenance meta-data. |
PROV_KAFKA_BROKERS | provKafkaBrokers | localhost:9092 | This is the location of your Kafka Cluster for provenance. This can be the same as or different from your broker for the Lens. |
PROV_KAFKA_TOPIC_NAME_DLQ | provTopicNameDLQ | prov_dead_letter_queue | The topic used for your dead letter queue provenance messages. This can be the same as or different from your DLQ topic for the Lens. |
PROV_KAFKA_TOPIC_NAME_SUCCESS | provTopicNameSuccess | prov_success_queue | The topic used for the messages sent containing the file URLs of the successfully generated provenance files. This can be the same as or different from your success queue topic for the Lens. |
SWITCHED_OFF_ACTIVITIES | switchedOffActivities | | A comma-separated list of the provenance processes you wish to turn off. The Lens contains the following processes: |
Logging Configuration
Environment Variable | Default Value | Description |
---|---|---|
LOG_LEVEL_DATALENS | INFO | Log level for Data Lens loggers. Change to DEBUG to see more in-depth logs, or to WARN or ERROR to quiet the logging. |
LOG_LEVEL_RMLMAPPER | INFO | Log level for RML Mapping loggers |
LOG_LEVEL_DROPWIZARD | INFO | Log level for Dropwizard loggers |
Additional Logging Configuration
Environment Variable | Default Value | Description |
---|---|---|
LOGGING_LEVEL | WARN | Global log level |
LOGGING_APPENDERS_CONSOLE_TIMEZONE | UTC | Timezone for console logging |
LOGGING_APPENDERS_TXT_FILE_THRESHOLD | ALL | Threshold for text logging |
Log Format (not overridable) | %-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n | Pattern for logging messages |
Current Log Filename (not overridable) | /var/log/datalens/text/current/application_${applicationName}_${timeStamp}.txt.log | Pattern for log file name |
LOGGING_APPENDERS_TXT_FILE_ARCHIVE | true | Archive log text files |
Archived Log Filename Pattern (not overridable) | /var/log/datalens/text/archive/application_${applicationName}_${timeStamp}_to_%d{yyyy-MM-dd}.txt.log | Pattern for archived log file names. Log file rollover frequency depends on the date pattern used; for example, %d{yyyy-MM-ww} declares weekly rollover |
LOGGING_APPENDERS_TXT_FILE_ARCHIVED_TXT_FILE_COUNT | 7 | Max number of archived text files |
LOGGING_APPENDERS_TXT_FILE_TIMEZONE | UTC | Timezone for text file logging |
LOGGING_APPENDERS_JSON_FILE_THRESHOLD | ALL | Threshold for JSON logging |
Log Format (not overridable) | %-6level [%d{HH:mm:ss.SSS}] [%t] %logger{5} - %X{code} %msg %n | Pattern for logging messages |
Current Log Filename (not overridable) | /var/log/datalens/json/current/application_${applicationName}_${timeStamp}.json.log | Pattern for log file name |
LOGGING_APPENDERS_JSON_FILE_ARCHIVE | true | Archive log JSON files |
Archived Log Filename Pattern (not overridable) | /var/log/datalens/json/archive/application_${applicationName}_${timeStamp}_to_%d{yyyy-MM-dd}.json.log | Pattern for archived log file names. Log file rollover frequency depends on the date pattern used; for example, %d{yyyy-MM-ww} declares weekly rollover |
LOGGING_APPENDERS_JSON_FILE_ARCHIVED_FILE_COUNT | 7 | Max number of archived JSON files |
LOGGING_APPENDERS_JSON_FILE_TIMEZONE | UTC | Timezone for JSON file logging |