Page History

Versions Compared

Old Version 1

changes.mady.by.user Paul Nguyen

Saved on Jan 14, 2025

compared with

New Version Current

changes.mady.by.user Paul Nguyen

Saved on Jan 16, 2025

Key

This line was added.
This line was removed.
Formatting was changed.

HTML
<style> .release-box { height: 30px; width: 100px; padding-top: 8px; text-align: center; border-radius: 5px; font-weight: bold; background-color: #828995; border-color: #FCE28A;} .release-box:hover { cursor: hand; cursor: pointer; opacity: .9; } </style> <meta name="robots" content="noindex"> <div class="release-box"> <a href="https://docs.perspectium.com/display/Lithium" style="text-decoration: none; color: #FFFFFF; display: block;"> Lithium </a> </div>

HTML

<style>
.release-box {
	height: 30px; 
	width: 100px; 
	padding-top: 8px;
	text-align: center; 	border-radius: 5px; 
	font-weight: bold; 
	background-color: #828995;  
	border-color: #FCE28A;}

.release-box:hover {
  	cursor: hand;
    cursor: pointer;
    opacity: .9; 
}
</style>
<meta name="robots" content="noindex">

<div class="release-box">
<a href="https://docs.perspectium.com/display/Lithium" style="text-decoration: none; color: #FFFFFF; display: block;">
Lithium
</a>
</div>

Perspectium DataSync Agents support the replication of data from your app to an Amazon Web Services (AWS) S3 bucket, or an AWS S3 Subscriber Agent. By configuring your AWS S3 Subscriber Agent, data from your app can be replicated and then saved as either .json or .xml file(s) in your AWS S3 bucket.

Prerequisites

First, you will need to set up the Perspectium DataSync Agent.

You should also stop running your DataSync Agent before making any Agent configuration changes.

Procedure

To configure your The DataSync Agent to run as an AWS S3 Subscriber Agent, follow these steps:supports saving records into files that can be pushed to AWS S3 buckets as follows:

One Record Per File - Each record is saved into its own file
Multiple Records Per File - Multiple records are saved into a file before a new file is created and you can control the max number of records or the maximum file size along with separating the files by table
One File for All Records - All the records of a batch, regardless of which ServiceNow table it belongs to, will be saved into one file for pushing to an AWS S3 bucket

See the Sample Configurations section for sample configurations for each.

To configure your DataSync Agent to run as an AWS S3 Subscriber Agent, follow these steps:

UI Steps

size	small

UI Step

Add the following .jar files to your DataSync Agent's extlib directory:

joda-time (v2.10.3

UI Steps

size	small

UI Step

Add the following .jar files to your DataSync Agent's extlib directory:

joda-time (v2.10.3)
aws-java-sdk-core (v1.11.729)
aws-java-sdk-kms (v1.11.729)
aws-java-sdk-s3core (v1.11.729)
aws-java-sdk-kms (v1.11.729)
aws-java-sdk-s3 (v1.11.729)

While newer versions While newer versions may work, they have not been tested and it is suggested you use the versions listed above which have been confirmed to work with this release.

UI Step
Navigate to the directory where you saved your agent.xml file when installing your DataSync Agent.

UI Step

Open your agent.xml file in a text editing application and delete the following directives nested within the <task> tag:

<database_type>
<database_server>
<database_port>
<database_user>
<database_password>
<database_parms>
<database_column_max_size>
<database>

UI Step

Locate the <task_name> and <handler> directives nested within the <task> tag and update their values as follows:

Directive	Update value to...
<task_name>	s3_agent_subscribe
<handler>	com.perspectium.replicator.file.S3Subscriber

Save the changes you've made to your agent.xml and close the file. An example agent.xml configuration for an AWS S3 Subscriber Agent is shown below:

UI Step

Within the <task> tag, nest the following directives :for how you want to configure saving to AWS S3. You can configure to save one record per file, multiple records per file and all records in one file. See the Sample Configurations below for a sample configuration for each option.

Directive	Description	Required?
<access_key>	Access Key associated with your AWS
Directive	Description	Required?
<access_key>	Access Key associated with your AWS account NOTE: Cannot be used with <use_instance_credentials/> directive	Yes, ONLY when the DataSync Agent is not installed on an EC2 instance.
<secret_access_key>	Secret Access Key associated with your AWS account NOTE: Cannot be used with <use_instance_credentials/> directive	Yes, ONLY when the DataSync Agent is not installed on an EC2 instance.
<use_instance_credentials/>	Checks the IAM roles on an EC2 instance and instance and uploads to the S3 bucket if the instance has the correct permissions NOTE: Cannot be used with <access_key> or <secret_access_key> directive	No
<region>	Region that your AWS S3 bucket resides in NOTE: If a region is specified for an Agent on an EC2 instance, then the region must match the region of the EC2 instance and the region of the S3 bucket	No
<s3_bucket>	Name of your AWS S3 bucket, including subdirectories if desired, to specify where the records will be uploaded e.g. bucketName/folder1/folder2. For example, with <s3_bucket>psp-bucket</s3_bucket> will save records into the psp-bucket S3 bucket. With <s3_bucket>psp-bucket/datasync-agent/tables/$table</s3_bucket> configured, if an incident record is being processed and uploaded to the AWS S3 bucket, then the record will be saved in the psp-bucket S3 bucket and in the /datasync-agent/tables/incident directory in that bucket, creating the directories datasync-agent, tables and incident automatically. NOTE: Adding the $table token indicates this token will be replaced by the table name of the record.	Yes
<file_format>	Format you want to save your data records in e.g., json or xml	Yes
<s3_bucket_encryption/>	To use AWS server-side encryption when pushing files to the S3 bucket. Configuring this option will have the Agent request S3 to encrypt the data at rest using S3's built-in functionality.	No

Additional directives supported:

Directive

Example

Use

Setups Supported

Required?

<file_prefix>

<file_prefix>record_</file_prefix>

NOTE: Use the value $table_$d{yyyyMMdd}_ to set a dynamic file name where table will be the record's table and yyyyMMdd will be the date format i.e. problem_20200530.json._00b470b7-901c-4447-9316-023a265d632f.json where 00b470b7-901c-4447-9316-023a265d632f is the <randomized_unique_identifier> value to ensure there are no file naming collisions when saving to the S3 bucket

You can modify yyyyMMdd with another date format of your choice. For example, hourly will need a yyyyMMddHH value. For other date format, see Date Format.

Code Block

language	xml

<file_prefix>$table_$d{yyyyMMdd}_</file_prefix>

A prefix for the file name of each record. If this directive is not specified, “psp.replicator.” will be used as the prefix. NOTE: The time period will be configured in this directory.

One Record Per File

Multiple Records Per File

<file_suffix>

<file_suffix>.xml</file_suffix>

A suffix for the file name of each record. If this directive is not specified, “.json” will be used as the suffix.

One Record Per File

Multiple Records Per File

UI Step

Code Block

language	xml
theme	Eclipse

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <agent>
      <share />
      <subscribe>
         <task>
            <task_name>s3_agent_subscribe</task_name>
            <message_connection password="password" user="user">https://mesh.perspectium.net</message_connection>
    	    <instance_connection password="password" user="user">https://myinstance.service-now.com</instance_connection>   
            <handler>com.perspectium.replicator.file.S3Subscriber</handler>
            <decryption_key>The cow jumped over the moon</decryption_key>
            <access_key>AccessKey</access_key>
            <secret_access_key>SecretAccessKey</secret_access_key>
            <region>us-west-2</region>
            <s3_bucket>examples3bucket</s3_bucket>
            <file_format>json</file_format>
         </task>
      </subscribe>
      <polling_interval>40</polling_interval>
   </agent>
</config>

Files saved in the AWS S3 bucket will be named <task_name>.<randomized_unique_identifier>.<file_format>. A randomized unique identifier is used to ensure there are no file naming collisions when saving to the S3 bucket. Using the above configuration example, a file would be named s3_agent_subscribe.00b470b7-901c-4447-9316-023a265d632f.json.

NOTE: In this configuration example, your data records will be saved in your AWS S3 bucket as one file. To save each record from your app as an individual file in your AWS S3 bucket, use the following agent.xml configuration example as a guide:

<files_directory>

For Linux:

<files_directory>/Downloads/subscribefiles/</files_directory>

For Windows:

<files_directory>Users\Downloads\subscribefiles\</files_directory>

NOTE: It is important to have the slash at the end of the directory path entered ('/' for Linux and '\' for Windows) i.e. /Downloads/subscribefiles/

The directory that contains where files will be saved locally before they are pushed to Azure Blob Storage.

If not specified, files will be saved in the files directory where the Agent is installed i.e. <Agent_Installed_Directory>/files

One Record Per File

Multiple Records Per File

One File for All Records

<buffered_writes>

<buffered_writes>250</buffered_writes>

NOTE:

The <file_max_size> directive (see below) takes precedence over the <buffered_writes> directive. If you have both directives configured, <file_max_size> will be used instead of <buffered_writes>.
Once this max has been reached, this will close the current file and create the next file for writing data.
Each batch of records the Agent pulls down from the Perspectium Integration Mesh (MBS), which is controlled by your <polling_interval> configuration for how often the Agent polls MBS for another batch of records, will be pushed to the AWS S3 bucket. So this may result in files being pushed that may not have the number of records specified in this configuration. For example, if you have <buffered_writes>250</buffered_writes> and the Agent pulls down a batch of 900 records, it will create four files with 250, 250, 250 and 150 records (900 total) and then push all four files to the AWS S3 bucket once it has finished processing the batch.

The maximum number of records to write in one file. Once this max has been reached, this will close the current file and create the next file for writing data.

This configuration is used only in the multiple records per file and one file for all records setups.

Multiple Records Per File

One File for All Records

Either <buffered_writes> or <file_max_size> is required

<file_max_size>

<file_max_size>50KB</file_max_size>

NOTE:

The size can be in KB, MB, or GB, i.e. 50KB, 250MB, 1GB. Make sure to have NO space in between the number and the unit.
The minimum value for this directive is 25KB. If you input a value less than 25KB, the value will be set to 25KB.
Conversely, the maximum value for this directive is 10GB. If you input a value greater than 10GB, the value will be set to 10GB.
The <file_max_size> directive (this configuration) takes precedence over the <buffered_writes> directive. If you have both directives configured, <file_max_size> will be used instead of <buffered_writes>.
Once this max has been reached, this will close the current file and create the next file for writing data.
Each batch of records the Agent pulls down from the Perspectium Integration Mesh (MBS), which is controlled by your <polling_interval> configuration for how often the Agent polls MBS for another batch of records, will be pushed to the AWS S3 bucket. So this may result in files being pushed that are not at the maximum file size specified in this configuration. For example, if you have <file_max_size>50KB</file_max_size> and the Agent pulls down a batch of records that end up being 125KB, it will create three files of 50KB, 50KB and 25KB (125KB total) and then push all three records to the AWS S3 bucket once it has finished processing the batch.

Sets the maximum size for each file. Once the maximum size has been reached, a new file will be created using the current timestamp as specified in the <file_prefix> directive.

This configuration is used only in the multiple records per file and one file for all records setups.

NOTE: If the next record to be saved will cause the file to be over the max size specified, then the current file is closed and a new file is created. For example, if <file_max_size> is 100KB, the current file size is 99KB and the next record is 2KB, then the current file will be closed at 99KB and a new file will be created to hold this next record.

Multiple Records Per File

One File for All Records

Either <buffered_writes> or <file_max_size> is required

<valid_json_file/>

Available in Lithium 9.0.2 and newer

Info
This feature will only work when the <file_format> configuration is specified as json

Add this directive if you want to create JSON files that are valid JSON files i.e. JSON files that are JSON arrays with records separated by commas such as:

Code Block
[ {"field1":"record1val1", "field2","record1val2"} ,{"field1":"record2val1", "field2","record2val2"} ,{"field1":"record3val1", "field2","record3val2"} ]

By default (if this directive is not specified), the Agent creates JSON files so each JSON record is on its own line, similar to how CSV files contain a CSV record on each line.

Using this feature will allow you to create the JSON file as a JSON array of the records so the file will be considered a valid JSON file.

Multiple Records Per File

UI Step
Save the changes you've made to your agent.xml and close the file.

UI Step
After configuring your agent.xml file to support your AWS S3 Subscriber Agent, start running your DataSync Agent again.

Anchor
sampleconfigurations
sampleconfigurations

Sample Configurations

You have three different ways of saving files that are then pushed to AWS S3 buckets. Below is a sample configurations for each along with details. In all examples, files will be named with a <randomized_unique_identifier> value (such as 00b470b7-901c-4447-9316-023a265d632f)to ensure there are no file naming collisions when saving to the S3 bucket.

One Record Per File

In this setup, each record is saved into its own file that is then pushed into the AWS S3 bucket.

Code Block

language	xml
theme	Eclipse

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <agent>
      <share />
      <subscribe>
         <task>
            <task_name>s3_agent_subscribe</task_name>
            <message_connection password="password" user="user">https://<customer>.perspectium.net</message_connection>
    	    <instance_connection password="password" user="user">https://<instance>.service-now.com</instance_connection>   
            <handler>com.perspectium.replicator.file.S3Subscriber</handler>
            <decryption_key>My special key for decrypting the data</decryption_key>
            <access_key>AccessKey</access_key>
            <secret_access_key>SecretAccessKey</secret_access_key>
            <region>us-west-2</region>
            <s3_bucket>examples3bucket</s3_bucket>
            <file_format>json</file_format>
            <file_prefix>record_</file_prefix>
			<file_suffix>.json</file_suffix>
		 	<files_directory>datasyncfile/azureblobfiles</files_directory>
            <one_record_per_file/>
         </task>
      </subscribe>
      <polling_interval>40</polling_interval>
   </agent>
</config>

In this case, each record will be saved in its own file named <file_prefix><randomized_unique_identifier><file_suffix>. Using the above configuration example, a file would be named record_00b470b7-901c-4447-9316-023a265d632f.xml.

Multiple Records Per File

In this setup, multiple records are saved into a file before a new file is created. This way we're creating batches of records in one file so we're not creating too many files while also not creating files that are too large as well. This option also allows you to specify multiple records of one table into their own files so each file can only contain multiple records of the same table i.e. one file only contains incident records, one file only contains ticket records, etc.

Code Block

language	xml
theme	Eclipse

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <agent>
      <share />
      <subscribe>
         <task>
            <task_name>s3_agent_subscribe</task_name>
            <message_connection password="password" user="user">https://<customer>.perspectium.net</message_connection>
    	    <instance_connection password="password" user="user">https://<instance>.service-now.com</instance_connection>   
            <handler>com.perspectium.replicator.file.S3Subscriber</handler>
            <decryption_key>My special key for decrypting the data</decryption_key>
            <access_key>AccessKey</access_key>
            <secret_access_key>SecretAccessKey</secret_access_key>
            <region>us-west-2</region>
            <s3_bucket>examples3bucket</s3_bucket>
            <file_format>json</file_format>			
			<file_prefix>$table_$d{yyyyMMddHHmm}_</file_prefix>
            <file_suffix>.json</file_suffix>	
		 	<files_directory>datasyncfile/azureblobfiles</files_directory>					
			<buffered_writes>300</buffered_writes>			
         </task>
      </subscribe>
      <polling_interval>40</polling_interval>
   </agent>
</config>

In the above example, we specify the files to be named <file_prefix><randomized_unique_identifier><file_suffix> which in this case will be based on the table and a datetime value that is today's date with the current hour and minutes, while saving up to 300 records per file using the <buffered_writes> configuration. Once the <buffered_writes> maximum has been reached, a new file at that datetime with a new randomized_unique identifier will be created.

NOTE: <buffered_writes> and <file_max_size> are used while processing each batch of records of records the Agent pulls down from MBS. As the Agent processes each batch, it will push the files created from this batch of records to the AWS S3 bucket and then repeat the process with the next batch of records it pulls down from MBS, creating new files with new <file_prefix><randomized_unique_identifier><file_suffix> names. So this means that there may be files that have less records than <buffered_writes> or smaller in size than <file_max_size> that are pushed to the AWS S3 bucket. For example, if incident records are shared and the Agent processes them in two different batches, this will result in files named:

incident_202501132122_4d747761-62d0-49ae-87fc-4998e259727d.json

incident_202501132122_5af73396-5bd9-49e1-8ac5-da8fb8a12047.json

incident_202501132124_6445a8ab-f694-482c-a315-01192cc84d7a.json

incident_202501132124_e4343ccb-a344-212c-b213-00055cc34a3b.json

See the <buffered_writes> and <file_max_size>configurations above for more information.

One File for All Records

In this setup, when the Agent receives a batch of records from MBS, all the records, regardless of which ServiceNow table it belongs to, will be saved into one file that is then pushed to AWS S3. This process is repeated each time the Agent receives a new batch of records from MBS (which is controlled by your <polling_interval> configuration for how often the Agent polls MBS for another batch of records).

Code Block

language	xml
theme	Eclipse

<?xml version="1.0" encoding="UTF-8"?>
<config>
   <agent>
      <share />
      <subscribe>
         <task>
            <task_name>s3_agent_subscribe</task_name>
            <message_connection password="password" user="user">https://

mesh.perspectium.net</message_connection>
    	    <instance_connection password="password" user="user">https://

myinstance.service-now.com</instance_connection>   
            <handler>com.perspectium.replicator.file.S3Subscriber</handler>
            <decryption_

key>The

key>My special

cow

key

jumped

for

over

decrypting the

moon<

data</decryption_key>
            <access_key>AccessKey</access_key>
            <secret_access_key>SecretAccessKey</secret_access_key>
            <region>us-west-2</region>

<s3_bucket>examples3bucket</s3_bucket> <file_format>json</file_format>

<file

<s3_

prefix>record_<

bucket>examples3bucket</

file_prefix> <file_suffix>.json</file_suffix>

s3_bucket>

<one

<file_

record_per_file/>

format>json</file_format>
		 	<files_directory>datasyncfile/azureblobfiles</files_directory>
         </task>
      </subscribe>
      <polling_interval>40</polling_interval>
   </agent>
</config>

In this case, each record will be saved in its own file named<file_prefix><randomized_unique_identifier><file_suffix>

The files saved in the AWS S3 bucket will be named <task_name>.<randomized_unique_identifier>.<file_format>. Using the above configuration example, a file would be named

record
s3_agent_subscribe.00b470b7-901c-4447-9316-023a265d632f.
xml. UI StepAfter configuring your agent.xml file to support your AWS S3 Subscriber Agent, start running your DataSync Agent again
json and then this file would be pushed into your AWS S3 bucket.

Content

Space Tools

Versions Compared

Old Version 1

New Version Current

Key

Prerequisites

Procedure

Anchor
sampleconfigurations
sampleconfigurations

Sample Configurations

One File for All Records

Content

Space Tools

Page History

Versions Compared

Old Version 1

New Version Current

Key

Prerequisites

Procedure

Anchorsampleconfigurationssampleconfigurations

Sample Configurations

One File for All Records

Anchor
sampleconfigurations
sampleconfigurations