Automation can also be achieved with Azure Data Factory (ADF), a self-hosted integration runtime running on a VM hosted on-premises, and a Script activity. Although it is typically associated with data transformation activities, a self-hosted integration runtime can also take part in running scripts against the data stores it can reach, and invoking it from ADF makes it available to people and programs from anywhere that has cloud connectivity. A self-hosted integration runtime is a component that connects data sources on-premises or on an Azure VM with cloud services in a secure and managed way.
The JSON syntax for defining a Script activity looks something like this:
{
  "name": "<activity name>",
  "type": "Script",
  "linkedServiceName": {
    "referenceName": "<name>",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "scripts": [
      {
        "text": "<Script Block>",
        "type": "<Query> or <NonQuery>",
        "parameters": [
          {
            "name": "<name>",
            "value": "<value>",
            "type": "<type>",
            "direction": "<Input> or <Output> or <InputOutput>",
            "size": 256
          },
          ...
        ]
      },
      ...
    ],
    "scriptBlockExecutionTimeout": "<time>",
    "logSettings": {
      "logDestination": "<ActivityOutput> or <ExternalStore>",
      "logLocationSettings": {
        "linkedServiceName": {
          "referenceName": "<name>",
          "type": "LinkedServiceReference"
        },
        "path": "<folder path>"
      }
    }
  }
}
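To make the template concrete, here is a minimal Python sketch that fills it in for a single script block and prints the resulting JSON. The helper name `build_script_activity`, the activity name, the linked service name, the query text and the "02:00:00" timeout are all illustrative assumptions, not values from a real factory; the optional parameters and logSettings sections are left out.

```python
import json


def build_script_activity(name, linked_service, script_text,
                          script_type="Query", timeout="02:00:00"):
    """Return a dict shaped like the Script activity template above.

    Hypothetical helper: it covers only one script block and omits the
    optional "parameters" and "logSettings" sections.
    """
    if script_type not in ("Query", "NonQuery"):
        raise ValueError("script_type must be 'Query' or 'NonQuery'")
    return {
        "name": name,
        "type": "Script",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "scripts": [{"text": script_text, "type": script_type}],
            # Assumed timespan format hh:mm:ss; adjust to your needs.
            "scriptBlockExecutionTimeout": timeout,
        },
    }


if __name__ == "__main__":
    # All names below are made up for illustration.
    activity = build_script_activity(
        "RunInventoryQuery",
        "OnPremSqlServer",
        "SELECT COUNT(*) AS n FROM dbo.Hosts",
    )
    print(json.dumps(activity, indent=2))
```

Building the definition in code rather than by hand makes it harder to end up with unbalanced brackets when several script blocks are added.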
The output can be collected every time a script block is executed. It is limited to 5000 rows or 4 MB, but this is sufficient for most purposes.
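Because of that limit, it can help to check whether a result was cut short before relying on it. The sketch below summarizes an activity output that has already been loaded into a Python dict. The field names (`resultSetCount`, `resultSets`, `rows`, `outputTruncated`) are my recollection of the Script activity output shape and should be treated as assumptions; the sample data is invented.

```python
def summarize_script_output(output):
    """Count result sets and rows, and report whether output was truncated.

    The keys read here are assumed field names; verify them against a real
    activity run before depending on this.
    """
    result_sets = output.get("resultSets", [])
    return {
        "result_sets": output.get("resultSetCount", len(result_sets)),
        "rows": sum(len(rs.get("rows", [])) for rs in result_sets),
        "truncated": bool(output.get("outputTruncated", False)),
    }


# Invented sample shaped like the assumed output format.
sample_output = {
    "resultSetCount": 1,
    "resultSets": [
        {"rowCount": 2, "rows": [{"host": "vm-01"}, {"host": "vm-02"}]},
    ],
    "outputTruncated": False,
}

print(summarize_script_output(sample_output))
```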
A sample Python script that triggers a pipeline run through the REST API would be something like this:
#!/usr/bin/env python3
import requests

# Set your ADF details
subscription_id = '<subscription_id>'
resource_group = '<resource_group>'
factory_name = '<factory_name>'
# Set the pipeline name you want to trigger
pipeline_name = 'your_pipeline_name'
# Bearer token for https://management.azure.com/ (the call is rejected without one)
access_token = '<access_token>'

# Construct the API URL
api_url = f"https://management.azure.com/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.DataFactory/factories/{factory_name}/pipelines/{pipeline_name}/createRun?api-version=2018-06-01"

# Make the POST request
response = requests.post(api_url, headers={"Authorization": f"Bearer {access_token}"})

# Check the response status
if response.status_code == 200:
    print(f"Pipeline triggered successfully! Run ID: {response.json()['runId']}")
else:
    print(f"Error triggering pipeline. Status code: {response.status_code}")
# EOF
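Triggering a run only queues it, so a caller usually wants to wait for the outcome as well. The following is a rough sketch of polling the pipeline run status through the same management endpoint, using only the standard library. It assumes the `2018-06-01` API version, a valid bearer token, and the run ID that createRun returns in its response body; `fetch_run_status` and `wait_for_run` are hypothetical helper names, and the polling interval is arbitrary.

```python
import json
import time
import urllib.request

API_VERSION = "2018-06-01"  # assumed GA version of the ADF REST API
TERMINAL_STATES = {"Succeeded", "Failed", "Cancelled"}


def fetch_run_status(subscription_id, resource_group, factory_name,
                     run_id, access_token):
    """Return the status string of one pipeline run (makes a network call)."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}/providers/Microsoft.DataFactory"
        f"/factories/{factory_name}/pipelineruns/{run_id}"
        f"?api-version={API_VERSION}"
    )
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["status"]


def wait_for_run(get_status, poll_seconds=15, max_polls=40):
    """Call get_status() until it returns a terminal state or polls run out."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("pipeline run did not reach a terminal state")
```

A call might look like `wait_for_run(lambda: fetch_run_status(subscription_id, resource_group, factory_name, run_id, access_token))`. Passing the status lookup in as a callable keeps the waiting logic separate from the HTTP call, so it can be exercised without touching Azure.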