A data source is a collection of data and the starting point for everything you do in Harmoni. You can import files of various types, or you can connect directly to a data store. Harmoni automatically maps source variables to Harmoni types.
Direct connections allow updated data to flow into Harmoni in real time. They are built on APIs (Application Programming Interfaces). This article discusses Google BigQuery direct connections.
If a data collection system provides an API, we can potentially develop a direct connection from that system to Harmoni.
1. Connect to BigQuery
In Harmoni, you can choose to create a new project that connects to a BigQuery data source, or you can connect from an existing project.
- To create a new Harmoni project, type in your project name and click on Create New
- Choose to Connect
- Select Google BigQuery
- Enter the service account key
BigQuery Service Account Key
To connect to BigQuery, you need to enter a Service Account Key in JSON format.
There are two types of GCP credentials: user credentials and service account credentials. The steps below use service account credentials. To generate the key:
1. Create a service account
2. Give the service account access to the dataset
3. Create a key for the service account
4. The key is saved to your downloads folder
Note that the key can be very long. You need to copy the whole key, including the opening and closing curly brackets.
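Because a partial paste is easy to miss with a long key, it can help to check the text before entering it. The following is a minimal sketch, not part of Harmoni: the function name `check_service_account_key` is hypothetical, and the field list reflects what a Google Cloud service account key file normally contains.

```python
import json

# Fields that a Google Cloud service account key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(text):
    """Return a list of problems found in a pasted key; an empty list means it looks complete."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        # A truncated paste (for example, a missing closing bracket) ends up here.
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["key must be a JSON object enclosed in { }"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    # User credentials carry a different "type" value than service account keys.
    if key.get("type") not in (None, "service_account"):
        problems.append("type is not 'service_account'")
    return problems
```

An empty result only indicates that the text is well-formed JSON with the usual fields; it does not confirm that the service account has access to the dataset.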
2. Select the Dataset
After clicking CONNECT, you are presented with the datasets available in your connection.
When naming datasets and tables in BigQuery, it is critical that neither the dataset name nor the table name includes a hyphen (-). The direct connection cannot retrieve data if either name contains a hyphen. GCP project names can include hyphens with no negative impact.
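The hyphen rule can be checked before connecting. This is a small sketch under the assumption that you already have the dataset and table names as plain strings; the function name `names_with_hyphens` and the sample names are hypothetical.

```python
def names_with_hyphens(dataset, tables):
    """Return the dataset and table names that contain a hyphen."""
    return [name for name in [dataset, *tables] if "-" in name]


# Hypothetical names: "wave-1" would need renaming, for example to "wave_1".
flagged = names_with_hyphens("survey_data", ["wave-1", "wave_2"])
```

GCP project names are deliberately left out of the check, since hyphens there have no negative impact.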
There are two options available when selecting your dataset:
- Column-based data (default setting)
- Row-based data
a. Column-based data
The default setting is Column-based.
- Click on a dataset, and you are presented with all the tables and views available in that dataset
- Click ADD to add the required table(s) to your Harmoni project
- Once added, the table will show in the source area
- Click on the source tile to define the new source, as is required with delimited sources
- Map the data type of each column
- Confirm and add the data source to the project
- Click OK to create the Harmoni project
b. Row-based data
Harmoni also supports row-based data. Use the switch to change from column-based to row-based.
Source selection for row-based data follows a three-step process:
1. Select the table/view containing the IDs
2. Select the table/view with the dictionary
3. Lastly, select the data
Once the three-step process is complete, select ADD to add the files to the project.
- Click OK to create the Harmoni project
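To illustrate how the three row-based selections relate to each other, here is a minimal sketch that pivots row-based records into one record per ID. The layout is an assumption made for illustration (one data row per ID, variable code, and value; a dictionary that maps variable codes to labels); the exact table structure Harmoni expects is not described here, and all names and sample values are hypothetical.

```python
# Hypothetical sample content for the three selections.
ids = [101, 102]                            # table/view containing the IDs
dictionary = {"q1": "Age", "q2": "Region"}  # dictionary: variable code -> label
data = [                                    # data: one row per (ID, variable code, value)
    (101, "q1", "34"),
    (101, "q2", "North"),
    (102, "q1", "51"),
]


def to_columns(ids, dictionary, data):
    """Pivot row-based records into one record per ID, keyed by variable label."""
    # Start every ID with all variables empty so missing answers stay visible.
    records = {rid: {label: None for label in dictionary.values()} for rid in ids}
    for rid, code, value in data:
        # Skip rows that refer to an unknown ID or an unknown variable code.
        if rid in records and code in dictionary:
            records[rid][dictionary[code]] = value
    return records


wide = to_columns(ids, dictionary, data)
```

In this sketch, ID 102 has no row for `q2`, so its `Region` value stays empty rather than being dropped.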
3. Harmoni Project
After connecting to your sources, you can load your Harmoni project. Direct connections allow for a real-time flow of survey data.