Set up the DataSync Agent to share to files

Lithium

The DataSync Agent can be set up to replicate table records from a ServiceNow instance to one or more local files on the machine where the DataSync Agent is running. This is useful when you have a separate application that imports data by reading files.

Records can be saved in CSV, JSON, or XML format. Each time the Agent processes a message, the record is added to the file as a new entry (i.e. any previously saved version of the record is not updated).

Prerequisites

First, you will need to set up the Perspectium DataSync Agent.

You should also stop running your DataSync Agent before making any Agent configuration changes.

Procedure

To enable file replication for the DataSync Agent, follow these steps:

Navigate to the directory where you saved your agent.xml file when installing your DataSync Agent.

Within the <task> tag, nest the directives that correspond to how you want to save your records:

All Records in One File
One Record per File
Records to Multiple Files

These three options are detailed below:

All Records in One File

If you want to save all records in one file, use the following directives:

Directive	Example	Use	Required?

<handler>

Example: <handler>com.perspectium.replicator.file.XMLFileSubscriber</handler>

File Type	Value
CSV	com.perspectium.replicator.file.CSVFileSubscriber
JSON	com.perspectium.replicator.file.JSONFileSubscriber
XML	com.perspectium.replicator.file.XMLFileSubscriber

NOTE: Invalid JSON messages, such as those whose contents are not properly escaped, will be skipped. An error will be logged when this occurs.

Use: The name of the file handler class.

Required? Yes

<file_name>

Example: <file_name>records.csv</file_name>

Use: The name of the file to which you want to save the records.

Required? Yes

<files_directory>

Example:

For Linux:

<files_directory>/Downloads/subscribefiles/</files_directory>

For Windows:

<files_directory>Users\Downloads\subscribefiles\</files_directory>

Use: The directory in which the file of saved records is stored.

NOTE: The directory path must end with a slash ('/' for Linux and '\' for Windows), e.g. /Downloads/subscribefiles/

Required? Yes

<buffered_writes>

Example: <buffered_writes>250</buffered_writes>

Use: The number of records to buffer before writing to the file. Buffering improves performance by avoiding a write to the file for every record read.

Required? No

<exclude_xml_header>

Example: <exclude_xml_header/>

Use: For use with the XMLFileSubscriber handler. This directive outputs the XML declaration (i.e. <?xml version="1.0" encoding="UTF-8"?>) only once, at the top of the file. That way you can treat the entire file as one XML file with multiple elements for parsing.

For example, with this directive, the file will be:
<?xml version="1.0" encoding="UTF-8"?>
<incident></incident>
<incident></incident>
<cmdb_ci></cmdb_ci>

versus, without this directive:
<?xml version="1.0" encoding="UTF-8"?><incident></incident>
<?xml version="1.0" encoding="UTF-8"?><incident></incident>
<?xml version="1.0" encoding="UTF-8"?><cmdb_ci></cmdb_ci>

Required? No
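
As a minimal sketch, the directives in the table above might be nested in a <task> as follows to save all records to a single CSV file. The path, file name, and buffer size are simply the example values from the table, and the connection settings that a real <task> needs are left out.

```xml
<task>
    <!-- task_name, connection, and decryption settings omitted for brevity -->
    <handler>com.perspectium.replicator.file.CSVFileSubscriber</handler>
    <file_name>records.csv</file_name>
    <!-- the directory path ends with a slash, as the NOTE above requires -->
    <files_directory>/Downloads/subscribefiles/</files_directory>
    <!-- optional: buffer 250 records before each write to the file -->
    <buffered_writes>250</buffered_writes>
</task>
```

Going by the handler table above, switching to JSON or XML output should only require changing the <handler> value and giving the file a matching name.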

One Record per File

If you want to save one record per file, use the following directives:

Directive	Example	Use	Required?

<handler>

Example: <handler>com.perspectium.replicator.file.XMLFileSubscriber</handler>

File Type	Value
CSV	com.perspectium.replicator.file.CSVFileSubscriber
JSON	com.perspectium.replicator.file.JSONFileSubscriber
XML	com.perspectium.replicator.file.XMLFileSubscriber

NOTE: Invalid JSON messages, such as those whose contents are not properly escaped, will be skipped. An error will be logged when this occurs.

Use: The name of the file handler class.

Required? Yes

<one_record_per_file>

Example: <one_record_per_file/>

Use: Tells the Agent to save each record into its own file instead of saving all records together in a single file.

Required? Yes

<files_directory>

Example:

For Linux:

<files_directory>/Downloads/subscribefiles/</files_directory>

For Windows:

<files_directory>Users\Downloads\subscribefiles\</files_directory>

Use: The directory in which the files of saved records are stored.

NOTE: The directory path must end with a slash ('/' for Linux and '\' for Windows), e.g. /Downloads/subscribefiles/

Required? Yes

<file_prefix>

Example: <file_prefix>record</file_prefix>

Use: A prefix for the file name of each record. If this directive is not specified, “psp.replicator.” will be used as the prefix.

Required? Varies (required if using <file_suffix>)

<file_suffix>

Example: <file_suffix>.xml</file_suffix>

Use: A suffix for the file name of each record. If this directive is not specified, “.xml” will be used as the suffix.

Required? Varies (required if using <file_prefix>)

<translate_newline>

Example: <translate_newline>nbsp</translate_newline>

Use: Replaces newline entries in the record content with a non-breaking space.

Required? No
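
As a rough illustration, the directives in the table above might be combined in a <task> like this to write each record to its own JSON file. The directory and prefix are the example values from the table, the .json suffix is an assumption chosen to match the JSON handler, and the connection settings are left out.

```xml
<task>
    <!-- task_name, connection, and decryption settings omitted for brevity -->
    <handler>com.perspectium.replicator.file.JSONFileSubscriber</handler>
    <!-- save each record into its own file -->
    <one_record_per_file/>
    <!-- the directory path ends with a slash, as the NOTE above requires -->
    <files_directory>/Downloads/subscribefiles/</files_directory>
    <!-- file_prefix and file_suffix are specified together -->
    <file_prefix>record</file_prefix>
    <file_suffix>.json</file_suffix>
</task>
```

Because the default suffix is “.xml”, setting <file_suffix> explicitly is likely what you want whenever the handler is not XMLFileSubscriber.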

Records to Multiple Files

If you want to save your records to multiple files, use the following directives.

NOTE: By default, records are saved with one record on each line of the file, the way CSV files are normally written. For XML and JSON, the resulting file is not a valid XML or JSON file. If you want each JSON file to be valid JSON (an array of JSON objects), see the <valid_json_file/> directive below.

Directive	Example	Use	Required?

<handler>

Example: <handler>com.perspectium.replicator.file.XMLFileSubscriber</handler>

File Type	Value
CSV	com.perspectium.replicator.file.CSVFileSubscriber
JSON	com.perspectium.replicator.file.JSONFileSubscriber
XML	com.perspectium.replicator.file.XMLFileSubscriber

NOTE: Invalid JSON messages, such as those whose contents are not properly escaped, will be skipped. An error will be logged when this occurs.

Use: The name of the file handler class.

Required? Yes

<buffered_writes>

Example: <buffered_writes>250</buffered_writes>

NOTE: The <file_max_size> directive (see below) takes precedence over the <buffered_writes> directive. If you have both directives configured, <file_max_size> will be used instead of <buffered_writes>.

Use: The maximum number of records to buffer before writing to a file. Buffering improves performance by avoiding a write to the file for every record read. Once this maximum has been reached, the Agent closes the current file and creates the next file for writing data.

Required? Either <buffered_writes> or <file_max_size> is required

<file_max_size>

Example: <file_max_size>50KB</file_max_size>

NOTE:

The size can be in KB, MB, or GB, e.g. 50KB, 250MB, 1GB. Do not put a space between the number and the unit.
The minimum value for this directive is 25KB. If you enter a value less than 25KB, the value will be set to 25KB.
The maximum value for this directive is 10GB. If you enter a value greater than 10GB, the value will be set to 10GB.
The <file_max_size> directive takes precedence over the <buffered_writes> directive. If you have both directives configured, <file_max_size> will be used instead of <buffered_writes>.

Use: Sets the maximum size for each file. Once the maximum size has been reached, the Agent closes the current file and creates a new file for writing data, using the current timestamp as specified in the <file_prefix> directive.

NOTE: If saving the next record would push the file over the specified maximum size, the current file is closed and a new file is created. For example, if <file_max_size> is 100KB, the current file size is 99KB, and the next record is 2KB, then the current file will be closed at 99KB and a new file will be created to hold the next record.

Required? Either <buffered_writes> or <file_max_size> is required

<files_directory>

Example:

For Linux:

<files_directory>/Downloads/subscribefiles/</files_directory>

For Windows:

<files_directory>Users\Downloads\subscribefiles\</files_directory>

Use: The directory in which the files of saved records are stored.

NOTE: The directory path must end with a slash ('/' for Linux and '\' for Windows), e.g. /Downloads/subscribefiles/

Required? Yes

<file_prefix>

Example: <file_prefix>record</file_prefix>

NOTE: Use the value $table_$d{yyyyMMdd}_$i to set a dynamic file name, where table is the record's table, yyyyMMdd is the date format, and i is the file number, e.g. problem_20200530_1.json.

You can replace yyyyMMdd with another date format of your choice. For example, hourly files need a yyyyMMddHH value. For other date formats, see Date Format.

<file_prefix>$table_$d{yyyyMMdd}_$i</file_prefix>

Use: A prefix for the file name of each record. If this directive is not specified, “psp.replicator.” will be used as the prefix.

NOTE: The time period will be configured in this directive.

Required? Varies (required if using <file_suffix>)

<file_suffix>

Example: <file_suffix>.xml</file_suffix>

File Type	Value
CSV	.csv
JSON	.json
XML	.xml

Use: A suffix for the file name of each record. If this directive is not specified, “.xml” will be used as the suffix.

Required? Varies (required if using <file_prefix>)

<file_name>

Example: <file_name>records.csv</file_name>

Use: The name of the file to which you want to save the records.

Required? Varies (required if NOT using <file_prefix> and <file_suffix>)

<separate_files>

Example: <separate_files>table</separate_files>

Use: Indicates that the files will be separated by table.

Required? Yes

<enable_audit_log/>

Example: <enable_audit_log/>

Use: A self-closing directive that generates an audit file. The audit file records when the records were processed, the name of the file, and the number of records processed.

Required? No

<translate_newline>

Example: <translate_newline>%13</translate_newline>

Use: Replaces newline entries in the record content with the value you specify.

Required? Varies

<close_file_interval>

Example: <close_file_interval>180</close_file_interval>

Use: Sets the amount of time (in seconds) that the Agent will wait for a tmp file to be modified before closing it.

If this directive is not specified, the default interval is 1800 seconds (30 minutes).

While the DataSync Agent is writing records to a file, the file name has the prefix “tmp” to indicate that the file is still being written, until file_max_size or buffered_writes is reached. Once the DataSync Agent finishes the file because file_max_size or buffered_writes has been reached, the “tmp” prefix is removed from the file name.

If the tmp file has not been modified for the configured interval, the “tmp” prefix is removed from the file name and the file is closed.

Required? No

<translations>

This feature only works with the JSON handler.

Example:

<translations>
    <translation>
        <search_value>\</search_value>
        <replacement_value>\\\</replacement_value>
        <order>7</order>
    </translation>
    <translation>
        <search_value>\r</search_value>
        <replacement_value>\ r</replacement_value>
        <order>3</order>
    </translation>
    <translation>
        <search_value>\b</search_value>
        <replacement_value>\ b</replacement_value>
        <order>2</order>
    </translation>
    <translation>
        <search_value>\n</search_value>
        <replacement_value>\ n</replacement_value>
        <order>1</order>
    </translation>
    <translation>
        <search_value>\t</search_value>
        <replacement_value>\ t</replacement_value>
        <order>4</order>
    </translation>
    <translation>
        <search_value>\"</search_value>
        <replacement_value>\ "</replacement_value>
        <order>5</order>
    </translation>
    <translation>
        <search_value>\f</search_value>
        <replacement_value>\ f</replacement_value>
        <order>6</order>
    </translation>
</translations>

For example, to translate the value \n into \ n (with a space between \ and n) and have this be the first translation processed, you would specify:

<translation>
    <search_value>\n</search_value>
    <replacement_value>\ n</replacement_value>
    <order>1</order>
</translation>

Use: Allows you to specify how values in each record should be "translated" when they are saved into the file. This is useful when you want to do your own escaping of the content before it's saved into the file. The translation is done on every value in every field of the record.

This feature is intended for translating the content as seen in the ServiceNow UI. For example, suppose you enter the value abc\ndp\prs in the UI for a field in ServiceNow (such as the Article Body field in a Knowledge Base Article record). If you have a <translation> for \n and one for \ as in the configuration above, the translations run on this value and translate the content to abc\\ ndp\\prs.

Note that when the content is actually saved to the JSON file, additional escaping is done to ensure that the content is properly escaped for JSON. For example, if you have <p style="line-height: normal; margin: 0in 0in 8pt; font-size: 10pt; font-family: Arial, sans-serif;"> in your content, this will become <p style=\"line-height: normal; margin: 0in 0in 8pt; font-size: 10pt; font-family: Arial, sans-serif;\"> to properly escape " when saving it as a value in JSON. The example content translated above would become abc\\\\ ndp\\\\prs, since an additional \ is added for each \ to escape it for JSON as well.

To configure this feature, specify a <translation> under <translations> for each translation you want to perform. Each <translation> should have the following directives:

<search_value> - The value in the record that should be searched for and replaced when the record is saved in the file.

<replacement_value> - The value that should be saved as the replacement when the record is saved in the file.

<order> - The order in which to process the translations. This value is an integer (whole number). The <translation> with the lowest order number is processed first (e.g. if you have translations with the orders 1, 2, and 3, 1 is processed first, followed by 2 and then 3).

NOTE:

This feature executes before the <translate_newline> feature runs, so if you have a <translation> for \n here, <translate_newline> may not find anything when it runs because \n was already changed by this feature.
To translate \ to \\ (one backslash becomes two), specify the <replacement_value> as \\\ (three backslashes) because of how Java escapes \ when multiple are together.

Required? No

<valid_json_file/>

Available in Lithium 9.0.2 and newer

This feature only works with the JSON handler.

Use: Add this directive if you want to create JSON files that are valid JSON, i.e. files that contain a JSON array with records separated by commas, such as:

[
{"field1":"record1val1", "field2":"record1val2"}
,{"field1":"record2val1", "field2":"record2val2"}
,{"field1":"record3val1", "field2":"record3val2"}
]

By default (if this directive is not specified), the Agent creates JSON files with each JSON record on its own line, similar to how CSV files contain one CSV record per line.

With this directive, the Agent writes the file as a JSON array of the records, so the file is a valid JSON file.

NOTE: This feature uses <buffered_writes>/<file_max_size> and <close_file_interval> to close each file the Agent creates (at which time the Agent won't write any additional records to the file) and add the closing ] that makes the file valid JSON. JSON files created by the Agent are not valid JSON until the file is closed.

Required? No
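
To tie several of these directives together, here is a hedged sketch of a <task> that writes hourly, per-table files as valid JSON arrays. It uses only directives from the table above; the directory, the 250MB limit, and the 180-second interval are illustrative values rather than recommendations, and the connection settings are left out.

```xml
<task>
    <!-- task_name, connection, and decryption settings omitted for brevity -->
    <handler>com.perspectium.replicator.file.JSONFileSubscriber</handler>
    <!-- the directory path ends with a slash, as the NOTE above requires -->
    <files_directory>/Downloads/subscribefiles/</files_directory>
    <!-- dynamic name: table, date with hour (yyyyMMddHH), and file number -->
    <file_prefix>$table_$d{yyyyMMddHH}_$i</file_prefix>
    <file_suffix>.json</file_suffix>
    <!-- start a new file once the next record would exceed 250MB -->
    <file_max_size>250MB</file_max_size>
    <separate_files>table</separate_files>
    <!-- close a tmp file that has not been modified for 180 seconds -->
    <close_file_interval>180</close_file_interval>
    <!-- write each file as a JSON array instead of one record per line -->
    <valid_json_file/>
</task>
```

With this setup a file only becomes valid JSON once it is closed, so a downstream reader would want to skip files that still carry the “tmp” prefix.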

Save the changes you've made to your agent.xml and close the file.

An example agent.xml configuration for saving all records to a single file is shown below:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<config>
    <agent>
        <max_reads_per_connect>10</max_reads_per_connect>
        <polling_interval>20</polling_interval>
        <subscribe>
            <task>
                <task_name>file_subscribe</task_name>
                <message_connection password="password" user="user">https://<customer>.perspectium.net</message_connection>
                <instance_connection password="password" user="user">https://<instance>.service-now.com</instance_connection>
                <decryption_key>The cow jumped over the moon</decryption_key>
                <handler>com.perspectium.replicator.file.XMLFileSubscriber</handler>

                <file_name>records.xml</file_name>
                <files_directory>/Users/user/Downloads/</files_directory>
                <exclude_xml_header/>
                <buffered_writes>250</buffered_writes>
            </task>
        </subscribe>
    </agent>
</config>

An example agent.xml configuration for saving one record per file is shown below:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<config>
    <agent>
        <max_reads_per_connect>10</max_reads_per_connect>
        <polling_interval>20</polling_interval>
        <subscribe>
            <task>
                <task_name>file_subscribe</task_name>
                <message_connection password="password" user="user">https://<customer>.perspectium.net</message_connection>
                <instance_connection password="password" user="user">https://<instance>.service-now.com</instance_connection>
                <decryption_key>The cow jumped over the moon</decryption_key>
                <handler>com.perspectium.replicator.file.XMLFileSubscriber</handler>

                <one_record_per_file/>
                <files_directory>/tmp/</files_directory>
                <file_prefix>records</file_prefix>
                <file_suffix>.xml</file_suffix>
            </task>
        </subscribe>
    </agent>
</config>

An example agent.xml configuration for saving records to multiple files is shown below:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<config>
    <agent>
        <!-- the following subscribe fragment defines subscribing class -->
        <!-- and its arguments -->
        <subscribe>
            <task>
                <task_name>test_file_subscriber</task_name>
                <message_connection password="password_here" user="admin" queue="psp.in.meshlet.example">https://<customer>.perspectium.net</message_connection>
                <instance_connection password="Adminadmin1" user="admin">https://<instance>.service-now.com</instance_connection>
                <max_reads_per_connect>1</max_reads_per_connect>
                <polling_interval>3</polling_interval>
                <decryption_key>Example_decryption_key_here</decryption_key>
                <handler>com.perspectium.replicator.file.JSONFileSubscriber</handler>

                <buffered_writes>10</buffered_writes>
                <files_directory>/Users/You/Downloads/Example/</files_directory>
                <file_prefix>$table_$d{yyyyMMdd}_$i</file_prefix>
                <file_suffix>.json</file_suffix>
                <file_max_size>50KB</file_max_size>
                <translate_newline>%13</translate_newline>
                <separate_files>table</separate_files>
                <enable_audit_log/>
                <close_file_interval>180</close_file_interval>
            </task>
        </subscribe>
    </agent>
</config>

After configuring your agent.xml file to enable file replication, start running your DataSync Agent again.
