To enhance your DataSync integration for Parquet, you can configure the Parquet Bulk Load Meshlet with the directives listed below.
For general meshlet configurations, see General Meshlet Configurations for DataSync.
| Directive | Default Value | Description |
|---|---|---|
| maxFileSize | (required) | Required configuration. Maximum number of records per Parquet file. Values over 10000 are capped at 10000 to prevent possible performance and memory issues. A suggested value is 5000. Example: `perspectium: filesubscriber: maxFileSize: 5000` |
| customFileName | $table-$randomid | Dynamic naming convention whose keywords are replaced when creating files. File names MUST be unique. Example: `perspectium: filesubscriber: customFileName: $table-$randomid` |
| fileDirectory | /files | Directory where the locally created files are written, relative to where the application is running. Example: `perspectium: filesubscriber: fileDirectory: /files` |
| postInterval | 2 | Interval in minutes for checking dead periods: every x minutes, the in-memory collection is compared against that of the previous x minutes; if they are the same, records are written to file and pushed to Parquet. Example: `perspectium: parquet: postInterval: 2` |
| dateFormat | yyyy-MM-dd'T'HH:mm:ss.SSSZ | Date format used to create the file name. Must be a valid SimpleDateFormat pattern. Example: `perspectium: parquet: dateFormat: yyyy-MM-dd'T'HH:mm:ss.SSSZ` |
| timeZone | GMT | ID of the time zone to be used in $zonedatetime. Example: `perspectium: parquet: timeZone: GMT` |
| file_prefix | | Prefix used for file naming. Example: `perspectium: parquet: file_prefix: psp` |
| file_suffix | | Suffix used for file naming. Example: `perspectium: parquet: file_suffix: bt` |
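The per-directive examples above are written on one line; in the meshlet's YAML configuration they nest under a shared `perspectium` key. Below is a minimal sketch combining the directives from the table, using the defaults or suggested values shown there. The exact nesting is inferred from the dotted examples, so consult the configuration file shipped with your meshlet for the authoritative layout.

```yaml
perspectium:
  filesubscriber:
    maxFileSize: 5000                 # required; values over 10000 are capped at 10000
    customFileName: $table-$randomid  # file names must be unique
    fileDirectory: /files             # relative to where the application runs
  parquet:
    postInterval: 2                   # minutes between dead-period checks
    dateFormat: yyyy-MM-dd'T'HH:mm:ss.SSSZ
    timeZone: GMT
    file_prefix: psp
    file_suffix: bt
```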
Azure External Storage
To enable sharing Parquet files to Azure, use the following directives:
| Directive | Default Value | Description |
|---|---|---|
| connectionString | | Connection string for your Azure storage account. To access it, go to Azure Portal > Storage Account > Access Keys > Show Keys > Connection String. Example: `perspectium: azure: connectionString: DefaultEndpointsProtocol=.....EndpointSuffix=core.windows.net` |
| destinationContainer | | Name of your Azure Blob Storage container, including subdirectories if desired, specifying where the records will be uploaded, e.g. container/folder1/folder2. For example, `perspectium: azure: destinationContainer: pspcontainer` saves records into the pspcontainer blob storage container. If `perspectium: azure: destinationContainer: pspcontainer/tables/$table` is configured and an incident record is processed and uploaded, the record is saved in the pspcontainer container under the /tables/incident directory, with the incident directory created automatically. |
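Combining both directives, the Azure block of the meshlet's YAML configuration might look like the following sketch. The connection string below is the truncated placeholder from the table, not a working value, and `$table` resolves to the table of the record being processed:

```yaml
perspectium:
  azure:
    # Copied from Azure Portal > Storage Account > Access Keys > Connection String
    connectionString: DefaultEndpointsProtocol=.....EndpointSuffix=core.windows.net
    # Records land in pspcontainer under /tables/<table name of the record>
    destinationContainer: pspcontainer/tables/$table
```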
To enable the sharing of Parquet files to Azure, the active Spring profiles used to run the meshlet must include `azure`:

`java -jar -Dspring.profiles.active=dev,azure`