Local SFTP Collectors

Prerequisites

Configuration

Field

Required/ Optional

Comments

Field

Required/ Optional

Comments

Name

Required

Free Text

Data Source

Required

Select from drop-down.

Type

Required

File Collector

File Collector Type

Required

Local SFTP

Credential

Required

Select from drop-down

Path

Required

Enter an exact filename or a subdirectory and filename.

do not start with '/'.

Use Regular Expression

Optional

If FALSE, then the directory/filename are used exactly as entered in Path.

If TRUE, then the value of Path is treated as a Regular Expression:

.*\.csv - matches any file ending in .csv

.*\.xlsx - matches any file ending in .xlsx

/GC.*\.csv - matches any file starting with ‘GC’ and ending in .csv

Archive Option

Required

Select from drop-down:

  • Copy - the file will be copied to “Archive Path”. “Archive Path” is a required field.

  • Delete - any file(s) processed by the collection are deleted when the collection is completed.

  • Move - the file will be moved to “Archive Path”. “Archive Path” is a required field.

  • None - files which were processed by collection will stay in the original SFTP folder.

Bookmark

Optional

This value is updated each time the collection runs. Only files uploaded AFTER this time will be processed. (This prevents double processing of files which were not deleted by previous collections.)

To reprocess files, clear the bookmark.

File Definition

Required

Select from drop-down:

  • Delimited

  • Excel

  • Flat

  • JSON

Delimiter (Optional)

Only if File Definition is Delimited

Optional

If no delimiter is entered, comma delimited is assumed.

For tab-delimited files, enter ‘/t’ or ‘TAB’.

Other common delimiters: pipe ‘|', semi-colon ‘:’, tilde '~’

Is Array

Only if File Definition is JSON.

Optional

 

Name

Only if File Definition is Delimited or JSON.

Optional

The Schema name within the Data Source.

If this Schema already exists, it will be updated. If it does not exist it will be created.

NOTE: If the Name is changed, queries using the old schema name may no longer work.

Collecting Excel files

  1. Collecting data from all Tabs into separate schemas. Don’t “Add Schema” when setting up the Collector. Schemas will be named with the Tabs names. If Schema added and then deleted the collection will error.
    Troubleshooting: Switch to another File Definition → Save → Switch back to Excel.

  2. Collecting data from one Tab.
    a) For .xlsx file schema’s Name should be the same as the name of the Tab in the Excel file you are collecting data from

b) For .xls - either the same as above or if you don’t add schema and it will be added automatically.

c) If you want to give another name to your schema that is different to the Tab name toggle On "Use Sheet Index" and set it to 0 to collect from the first tab, 1 from the second, etc.

Bookmarks

Bookmarks can be used on certain collectors to set a dynamic value each time the collector runs. A field from the collector source will need to be selected to be associated with the Bookmark function. For example, ‘WHENMODIFIED’ is a field in the Sage Intacct GL Detail object that can be used as a Bookmark:

 

Once the Bookmark Field is set, every collection will save the value based on the latest run time. In the example above, the Bookmark is stored as a 'WHENMODIFIED' date of '05/17/2022 04:22:10'. The next time the Collector runs, it will only pull values with a different value, and then will update the Current Value field dynamically.

Other Notes about Bookmarks:

  • Most commonly set as date fields, the Bookmark Field can also be a non-date field.

  • Filters on the collector can still be applied, so even though a Bookmark may be set, if a filter is present it will still only collect data with both the Bookmark value and filters in mind.

  • Data gets collected into a new stream even when Bookmarks are present, the data does not automatically consolidate into one stream. Consolidation would need to be done in a query.

    • To accomplish this consolidation in the query, your FROM section will need to leverage ‘All Streams’. This will also carry a risk that if the bookmark’s ‘Current Value’ is cleared or the bookmark is removed, the same data can be included into a new stream and therefore overstate the dataset that is being used in the query. It is important to be mindful of which collectors and queries are relying on the bookmark functionality and avoid modifying the bookmark while in use. To reset the bookmarked data, make sure to clear the ‘Current Value’ and also purge all streams or create a new Schema so that you’re starting from an empty dataset.

Details

The details section documents who the Collector was created and updated by and the corresponding times. This allows for easy tracking of multiple Collectors.

Latest Collection

The latest collection section documents the state of the collector, created time, and the status of the query. States include complete and error.

Collections

The Collections section documents when the Collector was created, started, completed and the total amount of data scanned. The status includes information regarding the state of the Collector. This allows for easy tracking of multiple collections.