Azure Data Factory Tutorial
Author(s): Kyle Vessey, September 2022
Developer Prerequisites:
- Active Azure Subscription
- Internet connection
Data Schema:
Please read the Data Schema Fundamentals document before using the API. Database CRUD operations are performed via the API, so it is important to think of each HTTP REST operation (POST/GET/DELETE) as a SQL transaction (UPSERT/SELECT/DELETE).
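To make that mapping concrete, here is a minimal curl sketch against the wells endpoint used later in this tutorial (the Bearer token, payload, and the <wellId> path parameter are placeholder assumptions; mTLS flags are omitted, and authentication is covered below):

```bash
# POST ~ UPSERT: insert the record, or update it if it already exists
curl -X POST "https://data.onxecta.com/api/production/wells" \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -d '[{"wellName": "Demo Well 01"}]'

# GET ~ SELECT: read the current records
curl -X GET "https://data.onxecta.com/api/production/wells" \
  -H "Authorization: Bearer <access_token>"

# DELETE ~ DELETE: remove a record (the <wellId> path parameter is an assumption)
curl -X DELETE "https://data.onxecta.com/api/production/wells/<wellId>" \
  -H "Authorization: Bearer <access_token>"
```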
Visualizing Data:
Data inserted or modified using this tutorial can be visualized using the web portal application(s) at https://<tenant>app.onxecta.com. The portal contains a Data Quality Reporting tool where you can see what data is in the system, the validity of that data, and when it was last modified.
To authenticate with the API through Azure Data Factory, Microsoft requires the client mTLS certificate to be in .pfx format. The pfx certificate is simply a combination of the private key and the pem file that was generated as part of the mTLS & API Key Creation tutorial. If you have not already generated your CSR file and submitted it to Xecta, please do so before continuing with this tutorial.
To generate the pfx with openssl, run the following command from the local directory where your pem and private key files are located:
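(A sketch assuming your files are named private.key and certificate.pem; substitute your own file names.)

```bash
# Bundle the private key and signed pem into a pfx; you will be prompted
# for an export password - remember it for the Azure Portal step below.
openssl pkcs12 -export -inkey private.key -in certificate.pem -out certificate.pfx
```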
Remember your Password
Make note of the password which you assigned to the pfx certificate; you will need to enter that password in the Azure Portal to authenticate.
Log in to the Microsoft Azure Portal to begin
If you do not have a Key Vault created in your Azure account, follow the Microsoft Azure Key Vault tutorial to get your Key Vault set up
In your Key Vaults, select the Key Vault to which you wish to assign your certificate
Navigate to the "Certificates" page
Click the "Generate/Import" button to import your PFX certificate
Select the Import option
Enter a name for your certificate and upload your certificate file. In the password field, enter the password for the pfx file generated in the earlier openssl step.
Now that your certificate is uploaded in the Azure Key Vault, navigate back to the Data Factory to build your pipeline.
Log in to the Microsoft Azure Portal to begin
Click "Create a resource" to begin building a new Data Factory resource
Search for Data Factory in the search bar, and select the Data Factory option
Click the "Create" button
You will now see the Data Factory landing page
To create a new Pipeline, open the Data Factory Studio
Open the "Author" tab
In the Factory Resources pane, hover over Pipelines and open its Actions (ellipsis) menu.
Click "New Pipeline"
The first step in interacting with the API is to build a Web activity that returns an access_token. We will use a POST web request to retrieve this access_token and pass it on through the pipeline.
In the Activities pane, expand the "General" section and drag a Web activity onto the pipeline canvas
Give the Web activity the name of "PostRequestAccessToken"
Web Activity Name
You can use any name you would like for this Web activity; however, make sure you use the same name later in the tutorial, as it is referenced in other fields
Select the PostRequestAccessToken activity in the canvas area, then navigate to the Settings tab to configure this activity.
Set the URL to: https://data-sandbox.onxecta.com/authenticate/oauth2/token?grant_type=client_credentials
From the dropdown, set the Method to POST
Enter two double quotes (i.e. "") for the Request Body
Empty Request Body:
It is important that you pass the empty pair of double quotes in the Request Body: Azure Data Factory requires a valid body payload for all POST requests, but the Xecta API does not require any body parameters
From the Authentication dropdown, select Client Certificate
For the Pfx field, click the Azure Key Vault checkbox on the right, then select the Key Vault which contains the Pfx file from the previous Key Vault step
For the Password field, enter the password you defined for your Pfx file earlier
The following Headers must be added:
| Name | Value |
|---|---|
| Content-Type | application/x-www-form-urlencoded |
| Authorization | Basic {base64-encoded Client Id:Client Secret pair} |
Base64 Encoded Authorization Header:
A limitation of using mTLS with Azure Data Factory is that we must pass both the client Pfx certificate AND the Client Id / Client Secret API keys issued to you in the mTLS & API Key Creation tutorial
Data Factory only allows you to select either a Client Certificate OR Basic Auth, so we have to append our Basic Auth as a Header value
You can encode the Client Id / Client Secret through Data Factory using the @base64() function
Ex: @base64('2nkdkdf402dhfdsjlxecta123456:1903872fekdhfsl8457847932dhsfjdshfefj489xecta12345')
Another option is to use a Base64 utility such as https://www.base64decode.org/, then enter your ClientId:ClientSecret
Copy the encoded string at the bottom and place it into your Authorization header value
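You can also encode the pair locally; a minimal sketch using the base64 CLI (echo -n prevents a trailing newline from being included in the encoding):

```bash
# Produce the value for the "Basic ..." Authorization header
echo -n '2nkdkdf402dhfdsjlxecta123456:1903872fekdhfsl8457847932dhsfjdshfefj489xecta12345' | base64
```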
Once you configure all of the above, the PostRequestAccessToken activity is complete
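If you would like to sanity-check the same request outside Data Factory, a hedged curl equivalent looks like this (the pem/key file names are the assumed outputs of the earlier openssl step):

```bash
# Request an access_token over mTLS with the Basic Authorization header
curl -X POST \
  "https://data-sandbox.onxecta.com/authenticate/oauth2/token?grant_type=client_credentials" \
  --cert certificate.pem --key private.key \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -H "Authorization: Basic <base64-encoded ClientId:ClientSecret>"
```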
With the authentication activity created, we can now push wells to the API
Create a new Web activity and name it "AddUpsertWells", then connect it to PostRequestAccessToken via the green "On success" arrow
Set the URL to: https://data.onxecta.com/api/production/wells
Set the method to: POST
Enter a valid payload for adding a Well
Example payload for demo purposes:
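(Every field name below is an illustrative assumption, not the confirmed Xecta schema; consult the Data Schema Fundamentals document for the actual well fields.)

```json
[
  {
    "wellName": "Demo Well 01",
    "apiNumber": "42-501-20130",
    "latitude": 31.9686,
    "longitude": -102.0779
  }
]
```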
For the remaining settings, refer to the same steps above
The following Header needs to be added to this activity:
| Name | Value |
|---|---|
| Authorization | @concat('Bearer ',activity('PostRequestAccessToken').output.access_token) |
This Authorization header value takes the access_token returned by the PostRequestAccessToken Web activity and passes it as a Bearer token
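For reference, the @concat() expression above assumes the token endpoint returns a standard OAuth2 client_credentials response along these lines (a sketch; the exact fields and token lifetime may differ):

```json
{
  "access_token": "eyJhbGciOi...",
  "token_type": "Bearer",
  "expires_in": 3600
}
```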