To enhance your DataSync integration for Parquet, you can configure the Parquet Bulk Load Meshlet with the directives listed below:

WARNING:

  • If you also set up the DataSync Agent to connect to the same database, configure the Agent to use the same column case. By default, the Parquet Bulk Load Meshlet uses lowercase column names. Also, keep the QUOTED_IDENTIFIERS_IGNORE_CASE setting in Parquet at its default value of false.
  • The meshlet uses the TIMEZONE session parameter when connecting to Parquet in order to save all timestamps in UTC. By default, a glide_date_time field is mapped to the TIMESTAMP_LTZ(9) column type. This allows you to query using a local timezone as needed.

Directive | Default Value | Description

maxFileSize


Required. Specifies the maximum size of the temporary files created as the meshlet pushes records to Parquet. If the value is over 10000, 10000 is used instead to prevent possible performance and memory issues. A suggested value is 5000.

perspectium:
  filesubscriber:
    maxFileSize: 15000
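
The cap described above can be sketched as follows. This is only an illustration of the documented rule, not the meshlet's actual code; the function name effective_max_file_size and the constant MAX_FILE_SIZE_CAP are hypothetical.

```python
# Hypothetical sketch of the documented maxFileSize cap.
MAX_FILE_SIZE_CAP = 10000  # upper bound stated in the directive description


def effective_max_file_size(configured: int) -> int:
    """Return the value that would be used for a configured maxFileSize."""
    # Anything above the cap falls back to the cap itself.
    return min(configured, MAX_FILE_SIZE_CAP)


print(effective_max_file_size(15000))  # value from the sample config
print(effective_max_file_size(5000))   # the suggested value
```

Under this reading, the sample value of 15000 would behave the same as 10000.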

customFileName

$table-$randomid

Naming pattern for the created files. The following keywords are replaced with their values when a file is created. File names MUST be unique.

Key | Description
$table | Name of the table for the file
$zonedatetime | Time the file is written. Default format: yyyy-MM-dd'T'HH:mm:ss.SSSZ. NOTE: The format can be changed (see the dateFormat configuration).
$randomID | Random ID to ensure unique file naming

perspectium:
  filesubscriber:
    customFileName: $table-$randomid
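
As a rough illustration of how such a pattern might be expanded, the sketch below substitutes the keywords in Python. It is an assumption-laden approximation, not the meshlet's implementation: the function name expand_file_name is hypothetical, the random ID is assumed to be a UUID-style hex string, the keyword is matched in the lowercase form $randomid used in the sample config, and the timestamp only approximates the default yyyy-MM-dd'T'HH:mm:ss.SSSZ pattern.

```python
import uuid
from datetime import datetime, timezone


def expand_file_name(template: str, table: str, now: datetime) -> str:
    """Replace $table, $zonedatetime and $randomid in a file name pattern.

    `now` must be timezone-aware so that the UTC offset can be rendered.
    """
    # Approximation of yyyy-MM-dd'T'HH:mm:ss.SSSZ: milliseconds, then offset.
    millis = now.strftime("%f")[:3]
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S.") + millis + now.strftime("%z")
    replacements = {
        "$zonedatetime": stamp,
        "$randomid": uuid.uuid4().hex,  # assumed ID format, for illustration
        "$table": table,
    }
    for keyword, value in replacements.items():
        template = template.replace(keyword, value)
    return template


when = datetime(2024, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
print(expand_file_name("$table-$randomid", "incident", when))
print(expand_file_name("$table_$zonedatetime", "incident", when))
```

A pattern that omits $randomid relies on the table name and timestamp alone to keep file names unique.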

fileDirectory

/files

Directory where the locally created files are written, relative to where the application is running.

perspectium:
  filesubscriber:
    fileDirectory: /files

postInterval

2

Interval, in minutes, at which to check for dead periods. Every x minutes, the meshlet checks whether the in-memory collection is the same as it was x minutes earlier. If so, it writes the records to a file and pushes them to Parquet.

perspectium:
  parquet:
    postInterval: 2
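
The dead-period check described above could look roughly like the sketch below. This is a simplified model, not the meshlet's code: the class name DeadPeriodChecker is hypothetical, should_flush is assumed to be called once per postInterval, and skipping the flush for an empty collection is an added assumption.

```python
class DeadPeriodChecker:
    """Detects when the in-memory collection stopped changing between checks."""

    def __init__(self):
        self._last_snapshot = None  # collection contents at the previous check

    def should_flush(self, records: list) -> bool:
        """Return True if records are non-empty and unchanged since last check."""
        snapshot = tuple(records)
        unchanged = bool(records) and snapshot == self._last_snapshot
        self._last_snapshot = snapshot
        return unchanged


checker = DeadPeriodChecker()
buffer = ["record-1"]
print(checker.should_flush(buffer))  # first check, nothing to compare against
buffer.append("record-2")
print(checker.should_flush(buffer))  # collection changed since the last check
print(checker.should_flush(buffer))  # unchanged for a full interval
```

In this model, a True result is the point at which the records would be written to a file and pushed, after which the caller would clear the buffer.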

dateFormat

yyyy-MM-dd'T'HH:mm:ss.SSSZ

Date format used to create the file name. A valid SimpleDateFormat pattern is required.

timeZone

GMT

ID of the time zone to be used in $zonedatetime.

file_prefix

Prefix used for file naming.

file_suffix

Suffix used for file naming.