All About Diff

Diff

The Diff node lets you identify values that have been added, removed, or updated in a source.

This article covers:

  • How Do I Add The Diff Node?
  • How Do I Use the Diff Node?
  • FAQ

How Do I Add The Diff Node?

You can add a Diff node to any node in your Dataflow. Select a node, click the Transform (+) button to open the list of options, and select Diff. You can then work with the node in the Dataflow tab or in the Properties panel.

How Do I Use the Diff Node?

Follow these easy steps:

  1. Create a copy of the same source node.
  2. Select which versions to compare.
  3. Select a primary key.
  4. Optional: Rename the indicators for added, removed, and updated.
  5. Choose data overlay.

Create a copy of the same source node.

After importing your source in a Dataflow, click the node, select the Duplicate option in the node toolbar, and place the two nodes side by side.

Select which versions to compare.

On the left-hand node, select the previous or original version of the file. On the right-hand node, choose a more recent version.

Select a primary key.

Use the Primary Key dropdown to select the field that the Diff node uses as the basis for comparison. We recommend choosing a field with unique values.
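To illustrate what a comparison on a primary key involves, here is a minimal Python sketch. It is not the Diff node's actual implementation: the function name diff_by_key and the sample records are hypothetical, and the logic simply assumes that records are matched across the two versions by the chosen key field.

```python
def diff_by_key(old_rows, new_rows, key):
    """Classify records as added, removed, or updated by matching on a key field.

    Hypothetical helper for illustration only; not the product's API.
    """
    # Index each version by the key. If the key is not unique,
    # later rows silently overwrite earlier ones in this sketch.
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}

    added = [new_by_key[k] for k in new_by_key if k not in old_by_key]
    removed = [old_by_key[k] for k in old_by_key if k not in new_by_key]
    updated = [
        new_by_key[k]
        for k in new_by_key
        if k in old_by_key and new_by_key[k] != old_by_key[k]
    ]
    return {"added": added, "removed": removed, "updated": updated}


# Sample data (made up): the older version and a more recent version.
old_version = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Grace", "city": "New York"},
    {"id": 3, "name": "Alan", "city": "Manchester"},
]
new_version = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Grace", "city": "Arlington"},
    {"id": 4, "name": "Edsger", "city": "Austin"},
]

result = diff_by_key(old_version, new_version, "id")
```

In this sketch, a key with duplicate values would cause rows to be dropped when the versions are indexed, which is one reason a field with unique values is the safer choice.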

Optional: Rename the indicators for added, removed, and updated.

There are three indicators of changes made to the data:

  1. (+) identifies how many records have been added.
  2. (-) identifies how many records have been removed.
  3. (0) identifies how many records have been updated.

You can rename these labels to something more readable, such as created, deleted, and updated.
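The counting and relabeling described above can be sketched in Python as follows. This is an illustration under assumptions, not the product's behavior: the function count_changes, the DEFAULT_LABELS mapping, and the sample records are all hypothetical.

```python
# Default indicator labels as listed in this article; the dictionary itself is hypothetical.
DEFAULT_LABELS = {"added": "+", "removed": "-", "updated": "0"}


def count_changes(old_rows, new_rows, key, labels=DEFAULT_LABELS):
    """Count added, removed, and updated records, reported under the given labels."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}
    return {
        labels["added"]: sum(1 for k in new_by_key if k not in old_by_key),
        labels["removed"]: sum(1 for k in old_by_key if k not in new_by_key),
        labels["updated"]: sum(
            1
            for k in new_by_key
            if k in old_by_key and new_by_key[k] != old_by_key[k]
        ),
    }


# Sample data (made up).
old_version = [{"sku": "A", "price": 10}, {"sku": "B", "price": 20}]
new_version = [{"sku": "A", "price": 12}, {"sku": "C", "price": 5}]

default_counts = count_changes(old_version, new_version, "sku")

# Renaming the indicators to more readable labels.
readable = {"added": "created", "removed": "deleted", "updated": "updated"}
readable_counts = count_changes(old_version, new_version, "sku", readable)
```

Only the labels change between the two calls; the underlying counts are the same.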

Choose data overlay.

The last dropdown has three options to overlay data between the two sources:

  1. none - no overlay; only the differences between the two sources are shown.
  2. old - the changes are shown on the older source (i.e., the left side).
  3. new - the changes are shown on the newer source (i.e., the right side).
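One possible reading of the three overlay options is sketched below in Python. The exact output of each option is not specified in this article, so everything here is an assumption: the function diff_with_overlay, the "change" column, and the choice to tag unchanged rows with None are all hypothetical.

```python
def diff_with_overlay(old_rows, new_rows, key, overlay="none"):
    """Return rows tagged with a change status, under an assumed overlay behavior.

    Hypothetical sketch; not the Diff node's implementation.
    """
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}

    def status(k):
        if k not in old_by_key:
            return "added"
        if k not in new_by_key:
            return "removed"
        return "updated" if old_by_key[k] != new_by_key[k] else None

    if overlay == "none":
        # Assumed: only changed records are returned.
        keys = list(old_by_key) + [k for k in new_by_key if k not in old_by_key]
        changed = []
        for k in keys:
            s = status(k)
            if s is not None:
                # Use the newer row when it exists, otherwise the removed older row.
                row = new_by_key[k] if k in new_by_key else old_by_key[k]
                changed.append({**row, "change": s})
        return changed
    if overlay == "old":
        base = old_by_key  # assumed: every row of the older source, tagged
    elif overlay == "new":
        base = new_by_key  # assumed: every row of the newer source, tagged
    else:
        raise ValueError("overlay must be 'none', 'old', or 'new'")
    return [{**row, "change": status(k)} for k, row in base.items()]


# Sample data (made up).
old_version = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
new_version = [{"id": 1, "qty": 6}, {"id": 3, "qty": 1}]

only_changes = diff_with_overlay(old_version, new_version, "id", "none")
on_old = diff_with_overlay(old_version, new_version, "id", "old")
on_new = diff_with_overlay(old_version, new_version, "id", "new")
```

Under this reading, added records can only appear with the none and new options, and removed records only with none and old, because a record that exists in just one source has no row to annotate in the other.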
