Data movement from MySQL on-premises to Azure SQL Database using Azure Data Factory code-free copy wizard

 

I’ve never blogged in the past about Azure Data Factory, one of the most powerful services available in Cortana Intelligence Suite, useful to orchestrate data movement and transformation of data that reside on-premises and in the cloud. I am using it since Microsoft launched the service and it gives its best when working with heterogeneous data (e.g. Azure Blob Storage, Hadoop HDFS, relational data, etc).

Its core is based on 3 main components :

  • Linked Servers – used to connect to a data source (e.g. a SQL Server server). If the data source is located on-premises or it’s hosted inside a VM, you’d also need a Data Management Gateway, a service that is able to act as a bridge between the local server and Azure.
  • DataSet – used to map a Linked Server with the schema of the object to which you need to get access to read/write data 
  • Pipeline – it is a collection of Activities, and each activity could be simple as a copy task, or it could be more complex if it requires to write code to solve a specific task, like a Hive query on a Hadoop cluster.

All the information regarding the components above are stored using JSON, and I usually rely on the .NET SDK or Powershell to write the logic of my workflows.

 

There have been a few welcomed additions in the last months, and one of them is a code-free copy wizard that could be very useful in many cases, like copying test data to Azure, moving datasets from a source to a destination, etc. It is still in a preview stage, and I suppose that additional features will be added in the future.

Below you could find a few screenshots that I have taken during my step-by-step test and that represent a simple copy from a MySQL Server on-premises to a Azure SQL Database

1 – I have created a new Azure Data Factory service, called adfcopy

ADF_Screenshot0_thumb1

2 – I have created two databases, one on-premises on a MySQL Server, and one on Azure using Azure SQLDB

MySqlWordBench_thumb4

image_thumb3

3 – I’ve installed the Data Management Gateway locally, and I’ve registered it within the ADF service, using a shared key taken from the portal.

DataManagementGateway_thumb7

4 – now it’s time to execute the copy wizard

CopyWizard1_thumb5

First thing to decide is if it has to be run once or based on a schedule

CopyWizard2_thumb2

Data source selection – MySQL is one of the data sources that require a gateway

 

CopyWizard3_thumb1

CopyWizard4_thumb2

tables selection

CopyWizard5_thumb1

Filter selection, useful to select data to copy. I don’t have a Datetime column in my example, but it could be useful to apply a filter on the data to upload.

CopyWizard6_thumb1

SQL Database connection info

CopyWizard7_thumb2

Columns mapping

CopyWizard8_thumb1

Deployment complete

 

5 – monitoring activity

To monitor the copy of data, the Data Factory resource explorer, recently added too, it’s the best web tool to use.

MonitorPipeline_thumb1

6 – JSON

We haven’t written a single line of code to accomplish this task, but the wizard did it for us, and below you could find the JSON generated for the SQL Database dataset

image_thumb6

 

Ciao!

Francesco

@francedit

Add comment