Introduction

The sources page provides an overview of all connected data sources to the Actuals platform. At the top right of the page it is possible to use the magnifying glass to search for a specific source by description and to add a new source by clicking the button. Apart from the description, the status for each source is shown as well. When data for a source is being collected, the status is Live. For some source the data for that source is populated by collecting data from another source. In that case the status is “Processed by other source”. This also means that the source cannot be changed as there is no source data to configure. Every live source has multiple action buttons (from left to right): uploading a file, editing the source and viewing the processing details. Editing the source uses mostly the same wizard/setup as adding a new source, hence we will explain both at the same time.
Image without caption

Source configuration page

When adding a new source, the first step is to choose one of the integration connection that are available on the Actuals platform. The Integrations section contains a list of the specific setup for the specific connection. When working with files, the CSV upload integration is typically the best place to start. Once a connection is chosen for a source, it cannot be changed.
Image without caption
The next step is to setup the general source configuration. While some of the options are specific to the connection, several general settings are always required
  • Source name : the name of the source, used throughout the platform
  • Data processing : definition of how the files should be processed. Incremental means that all new files/data are added to the existing transactions, Full means that only the data from the most recent file is used (and data from all other files are removed)
  • Schedule : when is the data for the source collected and processed into transactions. Apart from a schedule to collect the data daily (at a specific time in GMT+1) or weekly, manually specifies that no data is actively collected. This is typically the case for sources where data is uploaded manually
  • Header : Expected column names/properties when data is collected for a source
  • Separator : When processing files, how are the different values separated. When this is unclear and a sample data is available, we recommend opening it with a text editor and check how the values are split in the file
  • File filter : What type of files should be processed.
Typically when selecting a specific connection (that does not uses manual files), we recommend to leave these settings as is.
Image without caption

Source mapping page

The final step in adding/editing a source is the data mapping. On this page, the data from the source is mapped to the properties of Actuals transactions. All Actuals transactions have transactionid, timestamp and amount as properties as well as many many more. This page not only allows users to map the columns from the input data to these properties but also apply formulas to further transfrom and customize the values. To use the value of a specific property the value __columnXX__ should be used corresponding to the column number (XX is the number of the column) in the input data. Additionally functions can be use to further process the values. An example is the CONCAT function that concatenates the values of a list of properties. If for instance for the transactionid property the formula CONCAT(__column1__,”-“,__column2__) is used, the the transactionid property will be populated for this source by concatenating the values of the first column in the data with a - followed by the value in the second column. Below is a list of common functions and how to use them
TO_TIMESTAMP to convert input timestamps to Actuals timestamps.
CONCAT Concatenate several strings into one string.
  • REPLACE Replace a specific value in a string with another string
  • LEFT Get a fixed number of characters from a string starting from the left
  • RIGHT Get a fixed number of characters from a string starting from the right
  • SUBSTRING_INDEX Return a substring of a string before a specified number of delimiter occurs
  • UPPER Convert the text to upper-case
  • TRIM Remove leading and trailing spaces from a string
  • FROM_UNIXTIME Returns a date /datetime from a version of unix_timestamp
  • STR_TO_DATE Return a date based on a string and a format
  • CONVERT Convert an expression to an integer
  • CAST Convert a value to an int datatype
  • ABS Return the absolute value of a number
  • IF Use specific value per row based on conditions that are met
  • CASE Go through conditions and return a value when the first condition is met
  • OBJECT_CONSTRUCT Returns an OBJECT constructed from the arguments (to work with variant field types).
OBJECT_CONSTRUCT_KEEP_NULL Returns an OBJECT constructed from the arguments that retains key-values pairs with NULL values (to work with variant field types).
OBJECT_INSERT Returns an OBJECT value consisting of the input OBJECT value with a new key-value pair inserted (or an existing key updated with a new value).
OBJECT_PICK Returns a new OBJECT containing some of the key-value pairs from an existing object.
TO_JSON Converts a VARIANT value to a string containing the JSON representation of the value.
When applying these formula’s, the output should be strings / text. The only exception to that are the following parameters
  • timestamp has type datetime and therefore is typically defined using either STR_TO_DATE or FROM_UNITIME
  • amount vatamount quantity source have type int and therefore is typically defined CAST. Note that amount and vatamount are specified in cents so to store 12.50 euro, the integer value 1250 should be provided
Apart from the mapping itself, filters can be applied as well. This means that specific rows are never imported as transactions. As an example a data set can have an column that contains the order status. It makes sense to filter the data for this source to only include rows where the order state column has the value “successful”. The filters applied are combined, so data is only imported if all filter conditions are met,
As getting the right columns, using the right formulas and applying the correct filters can be challenging, the bottom of the source configuration page contains a Test mapping section. After selecting a file, the Test mapping button can be used to apply the formulas and filters on the data in that specific file. This allows users to verify that the formulas and filter are correct in terms of syntax and also that the output is as expect. Note that this option is not available when initially creating the source as no data has been collected then. We highly recommend to always test the mapping before storing the source by clicking on Add at the bottom right.

Working with JSON

Actuals contains functionality to work with JSON and XML structured data:
  1. Extract properties from a JSON or XML column
  1. Construct a JSON column from several properties
  1. Extract properties from a JSON column
The Actuals platform allows to extract data from JSON or XML data inside a column. Imagine a file is provided that contains a column with a string of JSON data, for example a csv file that looks like this:
ID
JSON_DATA
Item 1
{“orderId”:”order123”}
Item 2
{“orderId”:”order124”}
Item 3
{“orderId”:”order125”}
If we want extract the orderId from the json string we can use this formula:
JSON_EXTRACT_PATH_TEXT(<column_identifier>, '<path_name>')
In this example the JSON_DATA column is the column we want to look for. The column is located as second column. We need to define this with the column number, in this case this is the second column so you should use __column2__. The path is orderId, to extract the orderId we will create this formula:
JSON_EXTRACT_PATH_TEXT(__column2__, '"orderId"').
It is also possible to extract nested data, for a more comprehensive guide on how to use this formula see: Snowflake JSON_EXTRACT_PATH_TEXT documentation.
Image without caption
  1. Construct a JSON column from several properties
Actuals also supports the use case to construct a JSON object from source data. Imagine a file that provided that contains a column with a string of JSON data, for example a csv file that looks like this:
ID
payment_method
payment_method_variant
Order 1
Creditcard
VISA
Order 2
Creditcard
VISA
Order 3
Creditcard
Mastercard
If we want construct a json object containing all information about the payment method used for a orders we can use this formula:
OBJECT_CONSTRUCT('<column_key>', '<column_value>', '<column_key2>', '<column_value2>')
In this example we want to construct a object containing columns  payment_method and payment_method_variant The columns are located as second and third column in the source data. We need to define this with the column number, in this case this is the second and third column so you should use __column2__ and __column3__.
To construct the object we will create this formula: OBJECT_CONSTRUCT('payment_method', __column2__,'payment_method_variant',__column3__).
For a more comprehensive guide on how to construct objects see: Snowflake OBJECT_CONSTRUCT documentation.

File upload page

The upload file button creates a popup that can be used to upload a file with data for that specific source. A file can be provided by either dragging the file from explorer to the web page or by clicking the link in the popup and browsing to the file. Important is to check the delimiter set for the source as the uploaded file should have the same delimiter in order for the values to be split properly. There is a limit on the file size of files that can be upload through the web app. If the files are bigger than 20MB, then an sFTP client should be used to upload the files.
Image without caption

Processing details page

The Source processing details shows al files/data batches that have been collected for a source. The search box can be used to search on specific file names and dates. The check box can be used to select any number of files and apply any of the actions as the appear in the Action drop down at the top left corner. The list shows the following properties per item
  • Timestamp : the date / time at which the data was collected in GMT+1
  • Filename : the name of the file including the location on the fileserver
  • #rows : the number of rows that have been stored for this item
  • Status : either Complete, Deleted or Not processed. The status can be changed by applying the actions in the Action dropdown
  • Processed Timestamp : The date / time at which the rows of data from the file were processed into transactions
  • Transaction Count : The number of transactions that were processed. Typically this is equal to or lower that the number of rows as several rows are filtered out
  • Error Count : The number of rows with an error
  • Action : For the item, several specific links can be clicked. View raw data allows users to see some rows that were imported for this item. This give a good impression of the data stored and any basic processing issues. Test raw data applies the current mapping the the data in this specific item, similar to the functionality on the mapping page. View transactions shows several transaction when the rows of the item have been processed into transactions
Image without caption