I gave a session, ‘Azure Data Factory v2; The Data Flows’, at Data: Scotland on the 13th September.
Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows in the cloud for orchestrating and automating data movement and data transformation.
Now, with Data Flows in v2, we have a visual designer native to Microsoft Azure that provides robust and scalable data flow pipelines, rather like Integration Services offered as a constantly evolving cloud service. It has a rich graphical interface that enables no-code building of Extract, Transform and Load projects.
Concepts
As part of the session I built a quick Data Flow covering a few simple concepts:
- Flat File data source stored in Azure Blob Storage
- Azure SQL Database – Azure PaaS Database
- Simple expression
- INNER JOIN
- Output to a ‘sink’
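For readers who think in code, the concepts above can be sketched in plain Python. This is only an illustration of the shape of the flow, not Data Flow itself and not the project from the session: the sample rows, the column names (CustomerName, Postcode, PostTown, District) and the comma delimiter are all assumptions, and the real files may look different.

```python
import csv
import io

# Hypothetical sample data; the real files' columns and delimiter may differ.
ADDRESSES_TXT = """CustomerName,Postcode
Alice,EH1 1AA
Bob,G12 8QQ
Carol,ZZ9 9ZZ
"""

DISTRICTS_TXT = """PostTown,District
Edinburgh,EH1
Glasgow,G12
"""


def read_rows(text):
    """Source: parse delimited text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def derive_district(row):
    """Simple expression: add a District column holding the part before the space."""
    return {**row, "District": row["Postcode"].split(" ")[0]}


def inner_join(left, right, key):
    """Inner join: keep only left rows whose key also appears on the right.

    Assumes the key is unique on the right-hand side.
    """
    lookup = {r[key]: r for r in right}
    return [{**l, **lookup[l[key]]} for l in left if l[key] in lookup]


addresses = [derive_district(r) for r in read_rows(ADDRESSES_TXT)]
districts = read_rows(DISTRICTS_TXT)

# A plain list stands in for the sink here.
sink = inner_join(addresses, districts, "District")

for row in sink:
    print(row["CustomerName"], row["District"], row["PostTown"])
```

Because the join is an INNER JOIN, the made-up customer whose district has no match in the districts data never reaches the sink.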
Build it!
To build it yourself you’ll need the following:
- Azure Subscription – Available here: https://azure.microsoft.com/en-gb/free/
- Microsoft Azure Storage Explorer – Download from here: https://azure.microsoft.com/en-gb/features/storage-explorer/
You will also need the following files:
- Addresses – Our fictitious customers and their addresses: https://www.sqltomato.com/wp-content/uploads/2019/09/Addresses.txt
- PostTownDistricts – A file containing some EH and G Post Towns and Districts: https://www.sqltomato.com/wp-content/uploads/2019/09/PostTownDistrictsEHG-Validated-1.txt
In addition, you’ll need the simple expression I used to extract the potential Postcode District portion from a potential postcode:
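The expression itself is not reproduced above, so what follows is not the expression from the session. It is a hedged sketch, in Python rather than the Data Flow expression language, of one way to pull a candidate district out of a candidate postcode. The pattern and the function name postcode_district are assumptions made for illustration.

```python
import re

# Assumed shape of a postcode: one or two letters, a digit, an optional
# further digit or letter (the district), then a digit and two letters,
# with or without whitespace in between.
_POSTCODE = re.compile(r"^([A-Z]{1,2}[0-9][0-9A-Z]?)\s*[0-9][A-Z]{2}$")


def postcode_district(value):
    """Return the candidate district, or None if value does not look like a postcode."""
    match = _POSTCODE.match(value.strip().upper())
    return match.group(1) if match else None


for candidate in ["EH1 1AA", "g128qq", "EH11 1AA", "not a postcode"]:
    print(repr(candidate), "->", postcode_district(candidate))
```

The result is still only a potential district. In a flow like the one described here, the join against the PostTownDistricts data is what confirms that the district actually exists.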
If you attended my session, I hope you enjoyed it, and thanks for coming!
If you’d like to build the project, I hope the links, files and code are helpful. Please comment as appropriate.
Thanks for reading!