Delta Data Processing with Dell Boomi

Delta data synchronization is a technique for keeping data consistent across all the systems that have access to it, so that the same record does not diverge from one system to another.

Data synchronization has become a growing concern with the increased use of mobile devices, which must keep users’ data, such as email and other operational data, in sync.

Data synchronization prevents data conflicts, which can result in errors and low-quality, low-trust data. Synchronized, reliable data is necessary for security, compliance, and a wide variety of operational functions. Organizations that trust the quality of their data can achieve better performance, reputation, and cost-efficiency.

What is Delta data?

Delta, or incremental, data processing handles only the data records that have changed (been created or modified) since the last time the integration ran, which avoids processing the entire data set every time.

Significance of processing Delta:

  • To limit the records being processed to only those that have been created or modified.
  • To allow the integration to run quickly and efficiently.
  • To make it practical to run the integration more frequently, which keeps systems up to date.

Techniques we use in Dell Boomi to process Delta

  1. Extract records by a flag field in the source system and process them.
  2. Extract records by the last-modified timestamp of the record in the source system.
  3. Use the Dell Boomi CDC (Change Data Capture) mechanism with the Find Changes shape: extract the whole data set every time, identify the delta outside the source system, and process it.
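
The first two techniques can be sketched outside Boomi as simple filters. The Python below is only an illustration under stated assumptions: the field names needs_sync and last_modified, the sample rows, and the hard-coded last_run value are all hypothetical and do not come from any Boomi connector.

```python
from datetime import datetime, timezone

# Hypothetical source rows; the field names are illustrative, not a Boomi schema.
SOURCE_ROWS = [
    {"EmpId": 1, "Name": "Asha", "needs_sync": False,
     "last_modified": datetime(2024, 1, 10, 9, 0, tzinfo=timezone.utc)},
    {"EmpId": 2, "Name": "Ben", "needs_sync": True,
     "last_modified": datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc)},
    {"EmpId": 3, "Name": "Chen", "needs_sync": True,
     "last_modified": datetime(2024, 1, 12, 9, 0, tzinfo=timezone.utc)},
]


def extract_by_flag(rows):
    """Technique 1: keep only the rows the source system has flagged."""
    return [row for row in rows if row["needs_sync"]]


def extract_by_timestamp(rows, last_run):
    """Technique 2: keep only the rows modified after the previous run."""
    return [row for row in rows if row["last_modified"] > last_run]


# Assumed time of the previous run; hard-coded here for the example.
last_run = datetime(2024, 1, 8, 0, 0, tzinfo=timezone.utc)

flagged_ids = [row["EmpId"] for row in extract_by_flag(SOURCE_ROWS)]
recent_ids = [row["EmpId"] for row in extract_by_timestamp(SOURCE_ROWS, last_run)]

print(flagged_ids)  # [2, 3]
print(recent_ids)   # [1, 3]
```

Both filters depend on the source system: one needs a flag field, the other a last-modified value, and the timestamp variant also needs the previous run time to be remembered between executions.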

Dell Boomi CDC mechanism using the Find Changes shape

  • This approach can be used when no flag field is available and the data carries no last-modified date.
  • In this approach, the integration must remember the entire data set from the last execution in a “cache” to be able to compare the current data set against it.
  • This approach can also handle deleted records, because the comparison takes place between the cached data and the current data.

Advantages:

  • No end-application customizations or logic necessary

Disadvantages:

  • Makes the integration very stateful because the entire data set is stored in the integration.
  • Requires the extraction and processing of the entire data set, which could be a very large number of records. This will typically take a longer time to run.
  • The cache can be “corrupted” if a “bad” data set is processed, with unintended results. For example, if you accidentally synced a small “test” data set against your full production cache, it would look like a large number of records had been deleted, which could be very bad for the destination application.
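
The last disadvantage is easy to see with numbers. The sketch below is an illustration only: it assumes a cache of 1,000 keys and a three-record test file, and the looks_suspicious check with its max_delete_ratio threshold is a hypothetical safeguard of our own, not a Boomi feature.

```python
def count_deletes(cached_keys, current_keys):
    """Count keys present in the cache but missing from the current extract."""
    return len(set(cached_keys) - set(current_keys))


def looks_suspicious(cached_keys, current_keys, max_delete_ratio=0.5):
    """Return True when deletes exceed a chosen share of the cached keys."""
    unique_cached = set(cached_keys)
    if not unique_cached:
        return False  # nothing cached yet, so nothing can be deleted
    ratio = count_deletes(unique_cached, current_keys) / len(unique_cached)
    return ratio > max_delete_ratio


production_cache = list(range(1, 1001))  # 1,000 cached EmpId values (assumed)
test_extract = [1, 2, 3]                 # small test file synced by mistake

deletes = count_deletes(production_cache, test_extract)
print(deletes)                                           # 997
print(looks_suspicious(production_cache, test_extract))  # True
```

A comparison like this would route 997 records to the delete path, which is exactly the situation described above.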

Implementation in Dell Boomi:

  • Configure the start shape operation to retrieve the entire data set every time, whether this is an application connector query or a file-based connector.
  • Use the Find Changes shape to determine which records should be added, updated, or deleted in the destination application.
  • The Find Changes shape compares the cached data with the current data set and routes each record to the respective path (Add, Update, or Delete).
  • If the target is a database, use a separate database operation for each path to perform the insert, update, or delete on the database table.
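
To make the last point concrete, here is a rough Python sketch of one database operation per path. It is written under assumptions: sqlite3 stands in for MSSQL so that the example is self-contained, and the employee table, its Name and Dept columns, and the sample records are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real target database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (EmpId INTEGER PRIMARY KEY, Name TEXT, Dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee (EmpId, Name, Dept) VALUES (?, ?, ?)",
    [(1, "Asha", "HR"), (2, "Ben", "IT")],
)


def apply_changes(connection, adds, updates, deletes):
    """Run one statement type per path: insert, update, then delete."""
    connection.executemany(
        "INSERT INTO employee (EmpId, Name, Dept) VALUES (:EmpId, :Name, :Dept)",
        adds,
    )
    connection.executemany(
        "UPDATE employee SET Name = :Name, Dept = :Dept WHERE EmpId = :EmpId",
        updates,
    )
    connection.executemany("DELETE FROM employee WHERE EmpId = :EmpId", deletes)
    connection.commit()


apply_changes(
    conn,
    adds=[{"EmpId": 3, "Name": "Chen", "Dept": "Finance"}],
    updates=[{"EmpId": 1, "Name": "Asha", "Dept": "Finance"}],
    deletes=[{"EmpId": 2}],
)

rows = conn.execute(
    "SELECT EmpId, Name, Dept FROM employee ORDER BY EmpId"
).fetchall()
print(rows)  # [(1, 'Asha', 'Finance'), (3, 'Chen', 'Finance')]
```

Keeping the three statements separate mirrors the three paths: each path only ever needs one kind of operation.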

Pre-requisites

  • Install MSSQL (or any other database) on the local machine and create a table to write the data into.
  • Create a local directory in which to place the files for Boomi to pick up and process.
  • Create a dummy CSV file and place it in the local directory.
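
If you prefer to script the dummy file, a short Python sketch such as the one below could generate it. Only the EmpId column is taken from this walkthrough; the Name and Dept columns, the file name employees.csv, and the temporary directory are assumptions made for the example.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical columns; only EmpId is named later in this walkthrough.
FIELDS = ["EmpId", "Name", "Dept"]
ROWS = [
    {"EmpId": 1, "Name": "Asha", "Dept": "HR"},
    {"EmpId": 2, "Name": "Ben", "Dept": "IT"},
    {"EmpId": 3, "Name": "Chen", "Dept": "Finance"},
]


def write_dummy_csv(directory):
    """Write a small CSV with a header row and return its path."""
    path = Path(directory) / "employees.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(ROWS)
    return path


# A temporary directory stands in for the local directory Boomi reads from.
pickup_dir = tempfile.mkdtemp()
csv_path = write_dummy_csv(pickup_dir)
print(csv_path.read_text(encoding="utf-8"))
```

In practice you would point write_dummy_csv at the local directory created in the previous prerequisite.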

Step by step Implementation in Boomi:

  1. Log in to Boomi at https://platform.boomi.com with your user credentials.
  2. Click “Process” under the Create panel on the welcome page of the Dell Boomi platform.
  3. You will be directed to a process canvas with a pop-up for configuring the connection. Since we are fetching data from a disk location, select Disk as the connector from the drop-down, as shown in the screenshot below.
  4. Select “GET” as the action.
  5. Since no Disk connection is configured yet, click the green plus (+) symbol in the Connection column. (If a connection is already configured, you can browse to it by clicking the magnifying glass in the Connection column.)
  6. You will now be redirected to the connection configuration window.
  7. Configure the Disk connector by entering the required connection parameters, as shown in the screenshot below.
  8. Give the connection a meaningful name and click the Save and Close button.
  9. To configure the connection operation, click the green plus (+) icon in the Operation column shown in the first screenshot.
  10. You will be directed to the connection operation window, as shown below.
  11. Enter the appropriate values for the expected parameters in the operation configuration window, as shown below.
  12. Give the operation an appropriate name and click the Save and Close button.
  13. We are now set to extract the data from the local disk into Boomi.
  14. Take the structure of the CSV file and import it into Boomi to create a Flat File profile for use in later steps (Data Process, Find Changes, and Maps).
  15. Drag a “Data Process” shape onto the process canvas and configure it to split the records for processing, as shown below.
  16. Drag a “Find Changes” shape from the shapes panel onto the process canvas and configure it to reference “EmpId” as the key field for the comparison, as shown in the screenshot below.
  17. The “Find Changes” shape now appears with three branches named “Add”, “Update”, and “Delete”.

How Find Changes works:

  • During the initial execution of the process, all documents pass through the Add path, because Find Changes has not yet stored any file in the CDC directory of the Boomi Atom.
  • After the first execution completes, Find Changes stores the source file received by Boomi in the CDC directory of the local Atom.
  • From the second execution onward, Find Changes compares the source data against the data stored in the CDC directory and identifies whether any records are new, whether any existing records have been updated, and whether any records are missing from the source file.
  • After this comparison, new records are directed to the “Add” path, updated records to the “Update” path, and records that are present in the CDC file but missing from the source file to the “Delete” path.
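
The behaviour described above can be approximated in a few lines of Python. This is a sketch of the general snapshot-comparison idea, not Boomi's actual implementation: the find_changes function, the JSON snapshot file, and the sample records are assumptions, with a temporary file standing in for the CDC directory.

```python
import json
import tempfile
from pathlib import Path


def find_changes(current, snapshot_path, key="EmpId"):
    """Compare current records with the stored snapshot, then replace it."""
    previous = {}
    if snapshot_path.exists():
        stored = json.loads(snapshot_path.read_text(encoding="utf-8"))
        previous = {record[key]: record for record in stored}
    latest = {record[key]: record for record in current}

    adds = [rec for k, rec in latest.items() if k not in previous]
    updates = [rec for k, rec in latest.items()
               if k in previous and rec != previous[k]]
    deletes = [rec for k, rec in previous.items() if k not in latest]

    # The current data set becomes the baseline for the next execution.
    snapshot_path.write_text(json.dumps(current), encoding="utf-8")
    return adds, updates, deletes


# Temporary file used as a stand-in for the CDC directory of the Atom.
snapshot = Path(tempfile.mkdtemp()) / "cdc_snapshot.json"

run1 = [{"EmpId": 1, "Name": "Asha"}, {"EmpId": 2, "Name": "Ben"}]
run2 = [{"EmpId": 1, "Name": "Asha K"}, {"EmpId": 3, "Name": "Chen"}]

first = find_changes(run1, snapshot)   # no snapshot yet: everything is an add
second = find_changes(run2, snapshot)  # compared against the run1 snapshot

print(first)
print(second)
```

On the first call every record lands in the add list; on the second call record 3 is an add, record 1 is an update, and record 2 is a delete.
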
  18. Now configure a database operation for Insert (DynamicInsert) to insert the data into the database table.
  19. Repeat step 18 for the Update (use DynamicUpdate) and Delete actions, using the database connection.
  20. Now configure three maps to convert the flat file data into database data, one for each action (Insert, Update, and Delete).
  21. To configure a map, drag the Map icon from the panel onto the process canvas and click the green plus icon, as shown below.
  22. Clicking the + icon takes you to the map configuration page, as shown below.
  23. Name the map by clicking “New Map”, then click “Choose” on the left-hand side and select the CSV Flat File profile created earlier.
  24. Click “Choose” on the right-hand side and browse to the database insert operation configured earlier.
  25. Map the fields by dragging each field on the left-hand side to its corresponding field on the right-hand side, as shown below.
  26. Repeat steps 21 to 25 to configure the maps for the “Update” and “Delete” actions, and connect them to the database connections.
  27. The final process for handling delta data using Find Changes (the CDC mechanism) looks like the one below.

Royal Cyber with Dynamic Data Processing

Dell Boomi is one of the leading iPaaS platforms. Royal Cyber’s Dell Boomi experts regularly provide content and context to help our clients achieve their integration goals with Dell Boomi. For more information on delta data processing with Dell Boomi, you can email us at info@royalcyber.com or visit www.royalcyber.com.
