Workflows

Overview

A DataBlend workflow is a sequence of steps that can be scheduled to run together. The most common use case for a workflow is to fetch the latest data from a third-party system (collection), transform that data (query), and load it into a third-party system (target). Additional operations support this basic use case.

Step Types

Creating a workflow is a simple way to automate multiple Collectors, Queries, Data Targets, Tasks, and Unpivots.

Steps available include:

  • Delete Collection

  • Purge Collection Streams

  • Purge Query Execution Results

  • Run Collector

  • Run Credential Test

  • Run Data Quality Report

  • Run Data Target

  • Run Query

  • Run Task

  • Run Unpivot

Preliminary Steps

Run Credential Test

Data collection and data target jobs will fail if the credentials for the third-party systems are not valid. To avoid an authentication error in the middle of a workflow execution, credentials can be explicitly tested at the beginning of a workflow.

  1. Select a step type of Run Credential Test.

  2. Set a name (free-text) to describe what this step accomplishes.

  3. Select the credential to be tested from the list of configured credentials.

Main Steps

Delete Collection

Delete Collection is a step that allows users to delete a specific job based on a collection ID.

Collection IDs are found at the end of the collection URL.

https://ml.datablend.com/f7f12366-f8a1-4123-befb-4a3b11325236/collectors/78920b37-9111-4acb-903d-d55e09250f6b/collections/6a3015ab-54c0-1111-6696-dc1c89562322
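When an ID has to be copied out of a URL like the one above, a small helper can reduce copy-and-paste mistakes. The following is a minimal sketch, not part of DataBlend: the function name last_path_id is hypothetical, and it only assumes that the ID is the final path segment and is UUID-shaped, as in the example URL.

```python
from urllib.parse import urlparse
from uuid import UUID


def last_path_id(url: str) -> str:
    """Return the final path segment of a URL, checked to be UUID-shaped.

    Hypothetical helper for illustration; not a DataBlend function.
    """
    # Remove any stray whitespace that crept in when the URL was copied.
    cleaned = "".join(url.split())
    segment = urlparse(cleaned).path.rstrip("/").rsplit("/", 1)[-1]
    UUID(segment)  # raises ValueError if the segment is not a valid UUID
    return segment


collection_url = (
    "https://ml.datablend.com/f7f12366-f8a1-4123-befb-4a3b11325236"
    "/collectors/78920b37-9111-4acb-903d-d55e09250f6b"
    "/collections/6a3015ab-54c0-1111-6696-dc1c89562322"
)
print(last_path_id(collection_url))
```

The same helper applies to any URL whose last segment is the ID of interest, such as a query execution URL.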

Purge Collection Streams

Purge Collection Streams is a step that allows users to delete a specific data stream based on a collection ID.

Collection IDs are found at the end of the collection URL.

https://ml.datablend.com/f7f12366-f8a1-4123-befb-4a3b11325236/collectors/78920b37-9111-4acb-903d-d55e09250f6b/collections/6a3015ab-54c0-1111-6696-dc1c89562322

Purge Query Execution Results

Purge Query Execution Results is a step that allows users to delete results based on a query ID.

Query Execution IDs are found at the end of the query execution URL.

https://ml.datablend.com/f7feca77-f8a1-4123-befb-4a3b1130d8d2/queries/3ff419c6-65f8-4c42-92bc-3f0b76245429/executions/6a3015ab-54c0-1111-6696-dc1c89562322

Run Collector

Run Collector is a step allowing users to run collectors as part of a workflow. This allows for simple tracking of multiple steps.

Run Data Quality Report

Run Data Quality Report is a step that helps users stay up to date with the latest information about their data. Data Quality Reports can be set up to alert users about a variety of concerns.

Run Data Target

Run Data Target is a step used to send query results to an external system. Using Data Targets saves time and eliminates time-consuming manual data population.

Run Query

Run Query is a step that allows users to run a query as part of a workflow. This step allows for simple tracking of multiple steps.

Run Task

Run Task is a step that allows users to perform additional operations on their data within the integration process. The Task section features a Delay, Erase Data Type, or Script Type.

Run Unpivot

Run Unpivot is a step that allows users to turn table columns into table rows. This enables simple query creation and data target creation.
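To illustrate what turning columns into rows means, here is a minimal sketch in plain Python. It is not DataBlend's implementation: the function unpivot, the sample data, and the output column names attribute and value are all assumptions chosen for illustration.

```python
def unpivot(rows, id_columns, name_column="attribute", value_column="value"):
    """Turn every non-identifier column of each input row into its own output row.

    Hypothetical illustration of the unpivot idea; not DataBlend code.
    """
    result = []
    for row in rows:
        # Identifier columns are repeated on every output row.
        ids = {col: row[col] for col in id_columns}
        for col, val in row.items():
            if col in id_columns:
                continue
            result.append({**ids, name_column: col, value_column: val})
    return result


# Wide table: one row per account, one column per month (sample data).
wide = [
    {"account": "4000", "jan": 100, "feb": 120},
    {"account": "5000", "jan": 80, "feb": 95},
]

for row in unpivot(wide, id_columns=["account"]):
    print(row)
```

Each wide row with two month columns becomes two narrow rows, so a query can filter or aggregate on the month as an ordinary value rather than as a column name.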

Additional Steps

Parameters

Parameters can be applied to a Workflow or Query Result to provide easy access to variables that may change from time to time (e.g., date ranges, filters, mapping values).

To add a parameter, navigate to the Parameters tab and click Add Parameter.

Give the parameter a name and select a Type (Boolean, Date, Relative Date or String). Based on this choice, a specific Value field will be displayed:

String

A String parameter is useful for values made up of characters, text, numbers, or symbols.

Date

Date parameters give users the ability to collect data within a specific window of time. Dates are entered as specific calendar dates, for example in Month/Day/Year format.

Relative Date

Relative Date parameters give users the ability to collect data within a relative window of time. Dates are chosen from a wide variety of timeframes, such as the start of the first quarter or the end of the last quarter.
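A relative date is resolved against the day the workflow runs rather than being fixed in advance. The sketch below shows one way such values could be computed; the function names and the choice of "start of the current quarter" and "end of the previous quarter" are assumptions for illustration, and DataBlend's own resolution rules may differ.

```python
from datetime import date, timedelta


def start_of_quarter(day: date) -> date:
    """First day of the calendar quarter that contains `day`."""
    first_month = 3 * ((day.month - 1) // 3) + 1
    return date(day.year, first_month, 1)


def end_of_previous_quarter(day: date) -> date:
    """Last day of the calendar quarter before the one that contains `day`."""
    return start_of_quarter(day) - timedelta(days=1)


run_date = date(2024, 5, 15)  # sample run date, chosen for illustration
print(start_of_quarter(run_date))         # 2024-04-01
print(end_of_previous_quarter(run_date))  # 2024-03-31
```

Because the values are derived from the run date, a scheduled workflow keeps collecting the intended window without the parameter being edited each period.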

Boolean

A Boolean parameter is useful when a value needs to be True, False, or NULL.

Advanced

Field | Required/Optional | Comments

History Retention (Days) | Optional | Defaults to zero. Users may set the number of days they wish their collector data to be stored.

Timeout (seconds) | Optional | Allows users to time out collections that take longer than a set number of seconds to collect data.

Run As | Required | Allows users to select, from a drop-down list of users, the user the Workflow runs as.

Schedule | Optional | A convenient way for users to make sure collections run at the desired time. Select from the presets menu provided.

Triggers

Users may add a trigger by clicking the Add button. The Trigger section lists all current triggers associated with the Workflow. To learn more about Workflow Triggers, please visit https://datablend.atlassian.net/wiki/spaces/DS1/pages/1473609731.

Error Handling

Workflow execution is sequential. Each step must complete successfully before the subsequent step will begin. If a later step fails, earlier steps are not rolled back. To learn more about step errors, please visit https://datablend.atlassian.net/wiki/spaces/DS1/pages/1196458037/Workflows#Logs.
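The behavior described above can be pictured with a short sketch. This is a simplified model, not DataBlend's engine: the Step class, the run_workflow function, and the log format are hypothetical, and only the step names and the Complete and Error states are taken from this page.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """Hypothetical stand-in for one workflow step."""
    name: str
    action: Callable[[], None]


def run_workflow(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Steps that already completed are left as they are (no rollback),
    and steps after the failure never start.
    """
    log = []
    for step in steps:
        try:
            step.action()
        except Exception as exc:
            log.append(f"{step.name}: Error ({exc})")
            break
        log.append(f"{step.name}: Complete")
    return log


def failing_query() -> None:
    raise RuntimeError("query failed")


steps = [
    Step("Run Credential Test", lambda: None),
    Step("Run Collector", lambda: None),
    Step("Run Query", failing_query),
    Step("Run Data Target", lambda: None),
]

for line in run_workflow(steps):
    print(line)
```

In this model the collector's work remains in place after the query fails, and the data target step is never reached, which is why placing a Run Credential Test step first helps surface authentication problems before any data is collected.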

Details

The Details section records who created and last updated the workflow, and the corresponding times. This allows for easy tracking of multiple workflows.

Latest Execution

The Latest Execution section documents the state of the workflow, created time, and the status of the Workflow. States include Complete, Warning, Error, Pending, Started, Canceled or Rejected.

Executions

The Executions section documents when the Workflow was created, started, and completed, and the total amount of data scanned. The status includes information regarding the state of the Workflow. This allows for easy tracking of multiple workflows. Executions will display as Complete, Warning, Error, Pending, Started, Canceled, or Rejected. The values within the State field are clickable, which allows for drilling into the details of the Log.

Logs

Job logs are accessible via the state link in the Latest Execution section. Clicking the linked state takes the user to the Executions section, where users can view items, details, and logs related to the job that ran. Logs can be downloaded via the download log button at the top right of the log section. Logs are useful for seeing how much data was collected, the steps taken, and the time at which each occurred.

Execution Steps

The Steps section allows users to see where a step may have failed. It includes details, total runtime, and any parameters included in the step. Logs can also be downloaded for individual steps.

Creating a Favorite

Users may favorite a Credential, Collector, Data Target, Query, Data Source, or Workflow. To create a favorite, click the star icon on the upper left, next to Edit.

Saved Views

Saved views are a DataBlend feature that allows users to quickly return to filtered searches. To set a saved view, click the gear icon in the upper right corner. A drop-down will appear with options to save the current view, restore the default view, or copy a share URL. Copying a share URL allows other users with the URL to view the same saved view.