Data Flows

I gave a session, ‘Azure Data Factory V2; The Data Flows’, at Data: Scotland on 13th September.

Azure Data Factory V2; The Data Flows session graphic

Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows in the cloud for orchestrating and automating data movement and data transformation.

Now, with Data Flows in V2, we have a visual designer native to Microsoft Azure that provides robust and scalable data flow pipelines, much like a constantly evolving, cloud-hosted version of Integration Services. Its rich graphical interface enables no-code building of Extract, Transform and Load projects.

Concepts

As part of the session I built a quick Data Flow covering a few simple concepts:

  • Flat File data source stored in Azure Blob Storage
  • Azure SQL Database – Azure PaaS Database
  • Simple expression
  • INNER JOIN
  • Output to a ‘sink’
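If it helps to picture what the join and the sink do before opening the designer, here is a minimal sketch in plain Python. It is not Data Factory code, and the column names (Customer, District, PostTown) and sample rows are made up for illustration; they are not taken from the session files.

```python
def inner_join(left, right, key):
    """Return merged rows for every left/right pair that shares the same key value."""
    # Index the right-hand rows by key so each left row is matched in one lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Rows with no match on the other side are dropped: that is what makes it INNER.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical rows standing in for the flat file and the database table.
addresses = [
    {"Customer": "A. Example", "District": "EH1"},
    {"Customer": "B. Sample", "District": "G12"},
    {"Customer": "C. Other", "District": "ZZ9"},  # no matching district
]
districts = [
    {"District": "EH1", "PostTown": "Edinburgh"},
    {"District": "G12", "PostTown": "Glasgow"},
]

# The 'sink' here is just a list; in a Data Flow it would be an output dataset.
sink = inner_join(addresses, districts, "District")
for row in sink:
    print(row)
```

The customer with the unmatched district never reaches the sink, which is the behaviour to expect from the INNER JOIN step in the flow.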

Build it!

To build it yourself you’ll need the following:

Azure Subscription – Available here https://azure.microsoft.com/en-gb/free/
Microsoft Azure Storage Explorer – Download from here https://azure.microsoft.com/en-gb/features/storage-explorer/

You will also need the following files:

Addresses – Our fictitious customers and their addresses: https://www.sqltomato.com/wp-content/uploads/2019/09/Addresses.txt
PostTownDistricts – A file containing some EH and G Post Towns and Districts: https://www.sqltomato.com/wp-content/uploads/2019/09/PostTownDistrictsEHG-Validated-1.txt

In addition, you’ll need the simple expression I used to extract the potential Postcode District portion from a potential postcode:

Postcode District Expression
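As a rough illustration of the idea only (this is plain Python, not the Data Flow expression from the session), the sketch below pulls the district out of a value that might be a UK postcode. The function name postcode_district and the validation rules are my own assumptions: it treats the last three characters as the inward part and whatever precedes them as the district.

```python
import re
from typing import Optional

# Assumed shapes, for illustration: inward part is a digit plus two letters,
# outward part (the district) is one or two letters, a digit, then an optional
# digit or letter.
_INWARD = re.compile(r"\d[A-Z]{2}$")
_OUTWARD = re.compile(r"[A-Z]{1,2}\d[A-Z\d]?")


def postcode_district(value: str) -> Optional[str]:
    """Return the district (e.g. 'EH12') of a potential postcode, or None."""
    # Remove all whitespace and normalise case before inspecting the value.
    compact = re.sub(r"\s+", "", value).upper()
    # A full postcode is 5 to 7 characters once spaces are removed.
    if not 5 <= len(compact) <= 7:
        return None
    if not _INWARD.search(compact):
        return None
    outward = compact[:-3]
    if not _OUTWARD.fullmatch(outward):
        return None
    return outward


for candidate in ["EH1 1AA", "g12 8qq", "EH12 5AB", "not a postcode"]:
    print(candidate, "->", postcode_district(candidate))
```

Returning None for anything that does not look like a postcode keeps bad values out of the join, since they will never match a row in PostTownDistricts.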

If you attended my session I hope you enjoyed it and thanks for coming!

If you’d like to build the project I hope the links, files and code are helpful. Please comment as appropriate.

Thanks for reading!
