Queries

Overview

Queries prepare data for use in a Data Target. Any transformations required to make collected data conform to the target system are done through a query.

DataBlend provides a user interface - including wizards and shortcuts - to help you build queries. (For more information on the underlying query language, see the Presto SQL documentation.)

Create a Query - Manual

From the Queries section, click Add to create a new Query.

Advanced Queries

Stream Select Types

The Stream Select Type option is available if you use an Aliased Schema in your query From clause. This option determines whether your query is restricted to the latest stream (Default), a specific stream (Specific), or all streams which exist for the configured schema (All). By default, a query will always run against the latest stream (Default).

Default

By default a query runs against the most recent stream collected for a schema. You will see the unique identifier for the stream added to the generated Presto query.

If another stream is added to the schema, then the stream identifier used in the query is automatically updated so the query is always running on the most recent stream.

This is the most common use case; when new data is collected a new stream is created and this stream is used to prepare (query) the data for transmission to the target system.

All

If the query is intended to run against All streams, then no stream condition is added to the query where clause.

When another stream is added to the schema, the new stream will automatically be included in the query results.

Querying across all streams is useful for:

  • pinpointing when a particular record was collected and processed

  • aggregating data across streams, if each stream is known to include a unique data set

Specific

Streams can be uniquely identified by the schema they belong to and the date and time at which they were created. To query a specific stream, select its creation timestamp from the Stream dropdown. A query where condition will be added to limit results to this stream.

The query results will be “fixed”; that is, they will not change even if additional streams are added to the schema.

Querying against a series of specific streams may be helpful in analyzing how values change over time.

Parameters

Parameters can be applied to a query to enable ease of access to variables that may change from time to time (e.g. date ranges, filters, mapping values, etc.)

To add a parameter, navigate to the parameters tab and click Add Parameter

Give the parameter a name and select a Type (Boolean, Date, Relative Date or String). Based on this choice, a specific Value field will be displayed:

 

Within the Query Builder, select a field to apply the parameter to. The Parameter Name will need to be present between braces {{ and }}

 

String

A string is a parameter that is useful for characters, text, numbers, or symbols.

Date

Date parameters provide users the ability to collect data within a specific window of time. The dates are entered as specific dates such as Month/Day/Year.

Relative Date

Relative Date parameters provide users the ability to collect data within a relative window of time. The dates are entered as within a wide variety of timeframes such as start of the first quarter and end of the last quarter.

Boolean

A Boolean Parameter is useful for users wishing to utilize True, False, or NULL values.

Encrypted Value

Encrypted values are useful to pass encrypted information from Query to Data Target. Please note that values will be stored encrypted and not be able to be decrypted by non-admins. Parameters must be used with the {{}} notation. For example, if a header value in the http data target is an API key that a user wants encrypted they can add a parameter.

Query Result

A Query Result parameter is useful to replace a query result with a specific value. Users must select a specific Query from the drop-down menu from which to insert the intended value.

Custom Relative Date

Calendar Type

“Default” calendar type lets users define an offset of a period from a certain point in time and additionally the start or end of that period and how the result is formatted.

The “Fiscal” calendar limits the period type to only Quarters and Years and the year’s start date is based of the group settings.

Period Type

Options are Hour, Day, Week, Month, Quarter and Year

Range Type

Options are Default, Start Of, and End Of. Default does not change the base date and will not truncate any values. This is useful if the parameter should be exactly one week ago instead of the beginning of last week.

Offset

Determines how many periods should be added or subtracted from the base date.

As Of

Allows the user to change the base date, otherwise the time (UTC) the job is run is used.

Format

Useful in scripts to format the date according to dotnet standards.

SQL

The SQL tab is present and available for users to see exactly what is going on. The SQL tab also allows for in-window SQL editing.

Hide SQL

Enabling the hide SQL toggle allows users to stay organized by using the name fields. Users can give their query a name and a description. This allows for more appropriately labeled queries within the query itself. Hiding SQL makes large queries easier to view.

Query Specifications

 

Select

The select section allows users to populate their query with the desired columns via the column section. Designating columns in the select tab is necessary to tell DataBlend what data users want from specific fields.

From

The From section of the Query selects data for use in a Query. This from section allows users to select where the data will be coming from that they wish to transform. The From tab establishes what tables specifically the user would like DataBlend to access data.

Where

To further refine the data that that a user is requesting in a query, WHERE is needed. Users must select from the drop-down menu and complete the empty fields.

Adding a date range in the WHERE section is useful if date times/ timestamps are in the user’s date.

Group By

The Group By section allows users to group their data in a specific way.

Order By

Using the Order By Data Type can help users get data back in a specific order.

Limit

Using the Limit Data Type allows users to limit the amount of rows that are returned from a query.

 

Pivot

Using the Pivot Data Type allows users to pivot similar to excel. The Pivot function allows users to transfer data from a row to a column

Copy and Delete Functions

To copy or delete a query, simply select the desired query, and click copy or delete. When copying a query, you will be taken to a new query screen. This screen will include the same details from the selected query. Using this function makes creating multiple queries less time-consuming.

Query Wizard

Clicking the ellipsis (…) next to add will present you the option to use our Query wizard. The wizard makes creating queries even simpler.

To learn more about our Query wizards, click here. https://datablend.atlassian.net/wiki/spaces/DS1/pages/2732752903

 

Query Express

Query Express is a unique feature offered by DataBlend that allows users to easily map columns. Simply select a Schema from the drop-down menu and a Data Target form the drop-down menu to begin. Please note that currently only the most popular mapping types are offered in the Query Express feature.

Details

The details section documents who the query was created and updated by and the corresponding times. This allows for easy tracking of multiple queries.

Latest Executions

The latest executions section documents the state of the query, created time, and the status of the query. States include complete and error. When clicking on the linked State (i.e. Error, Complete, Pending) users can access the data stream associated with the query. Users can compare a prior stream to a current stream (i.e. the data collected yesterday vs the data collected today).

Execution Details

Job details are easily accessible via the state link in the Latest Execution section. Click the linked state and the user is taken to the Executions section. Users will see who the query was created and updated by and the corresponding times. This allows for easy tracking of multiple queries.

Logs

Job logs are easily accessible via the state link in the Latest Execution section. Click the linked state and the user is taken to the Executions section. Here users view items, details and logs related to the ran job. Logs are downloadable via the download log button indicated at the top right of the log section. Logs are useful to see how much data was collected, the steps taken, and the time in which it occurred.

 

Execution

 

The execution section documents when the query was created, started, completed and the total amount of data scanned. The status includes information regarding the state of the query. This allows for easy tracking of multiple queries.

Creating a Favorite

Creating a favorite is simple. Users may favorite a Credential, Collector, Data Target, Query, Data Source, or Workflow. To create a favorite, users navigate to the star icon on the upper left next to Edit.

Please note that users cannot favorite an Unpivot, Data Quality Report, Schema, Agent or Notification.

Saved Views

Saved views are a unique feature offered by DataBlend that allow users to quickly view filtered searches. Setting a saved view is simple. Click the gear icon in the upper right corner. A drop-down will appear with option to save the current view, restore the default view, or copy share URL. Copying a Share URL will allow other users with the URL to view the same saved view.

 

Want to see more? Visit our helpful demo page or attend an office hour. https://lp.datablend.com/demos