Sample code for reading and writing data in Azure Machine Learning workspace
notebooks can be found online, but working examples can be elusive because it
is rarely called out clearly that the azureml.core package has been superseded
by the azure.ai.ml package. The following example demonstrates how to read and
write data with the newer package.
The core objects used in this sample are Datastore and Dataset. A Datastore
describes the connection information for a storage location, and a Dataset
describes the data held at that location.
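
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AzureDataLakeGen2Datastore
from azure.ai.ml.entities._datastore.credentials import ServicePrincipalCredentials
from azure.identity import DefaultAzureCredential

# Connect to the workspace described by the local config.json.
# from_config needs a credential; DefaultAzureCredential works inside a workspace notebook.
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Describe the ADLS Gen2 account and the service principal used to reach it.
store = AzureDataLakeGen2Datastore(
    name="adls_gen2_example",
    description="Datastore pointing to an Azure Data Lake Storage Gen2.",
    account_name="mytestdatalakegen2",
    filesystem="my-gen2-container",
    credentials=ServicePrincipalCredentials(
        tenant_id="00000000-0000-0000-0000-000000000000",
        client_id="00000000-0000-0000-0000-000000000000",
        client_secret="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    ),
)

# Register (or update) the datastore in the workspace.
ml_client.create_or_update(store)
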
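The client secret in the sample is hard-coded for illustration. Here is a minimal sketch of one way to keep it out of the notebook, assuming the service principal values are supplied through environment variables. The helper name load_service_principal and the choice of AZURE_TENANT_ID, AZURE_CLIENT_ID and AZURE_CLIENT_SECRET as variable names are assumptions of this sketch, not part of the original sample.

```python
import os

# Maps the keyword names used by the credentials object in the sample
# to the environment variables this sketch assumes are set.
REQUIRED_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
}


def load_service_principal(env=None):
    """Return tenant_id, client_id and client_secret read from the environment.

    Raises KeyError listing every variable that is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {field: env[var] for field, var in REQUIRED_VARS.items()}
```

The returned dictionary uses the same keyword names as the credentials object in the sample, so it could be passed along as ServicePrincipalCredentials(**load_service_principal()).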

from azure.ai.ml import Dataset

# This could also be substituted with
# "abfss://container@storageaccount.dfs.core.windows.net/path/to/file"
dataset_path = "<fully-qualified-url-from-datastore-path>"

# Load the delimited file and preview the first five rows as a pandas DataFrame.
dataset = Dataset.from_delimited_files(dataset_path)
dataset.take(5).to_pandas_dataframe()

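The path placeholder can be filled in either with the abfss:// form mentioned in the comment or with a datastore URI. Here is a small sketch of how both strings might be assembled from their parts. The helper names abfss_url and datastore_uri are hypothetical, and the azureml:// layout is the long-form datastore URI as I understand it from the Azure Machine Learning documentation, so check it against your workspace before relying on it.

```python
def abfss_url(container: str, account: str, path: str) -> str:
    """Build abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def datastore_uri(subscription_id: str, resource_group: str,
                  workspace: str, datastore: str, path: str) -> str:
    """Build a long-form azureml:// URI that points at a path inside a datastore."""
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/datastores/{datastore}"
        f"/paths/{path.lstrip('/')}"
    )


# Example using the account and container names from the datastore sample;
# "data/people.csv" is a made-up path used only for illustration.
example_url = abfss_url("my-gen2-container", "mytestdatalakegen2", "data/people.csv")
```

Building the string in one place avoids small mistakes such as a doubled slash or a swapped container and account name, which are otherwise hard to spot in a long URL.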
The same Dataset class can be used to write a pandas DataFrame out as a CSV
file:

import pandas as pd
from azure.ai.ml import Dataset

# Build a small DataFrame to write out.
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 22],
    'Salary': [60000, 70000, 55000]
}
df = pd.DataFrame(data)
dataset = Dataset.from_pandas_dataframe(df)

# This could also be substituted with
# "abfss://container@storageaccount.dfs.core.windows.net/path/to/file"
csv_path = "<fully-qualified-url-from-datastore-path>"

# Write the data to the target location as a delimited (CSV) file.
dataset.to_delimited_files(csv_path)

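Before writing to the data lake, it can help to check locally what the CSV will contain. The sketch below uses only pandas and an in-memory buffer, so it makes no Azure call. The helper name frame_to_csv_text is hypothetical, and dropping the index column is a choice made here rather than something the original sample specifies.

```python
import io

import pandas as pd


def frame_to_csv_text(df: pd.DataFrame) -> str:
    """Serialise a DataFrame to CSV text, leaving out the index column."""
    return df.to_csv(index=False)


df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 22],
    "Salary": [60000, 70000, 55000],
})

# Serialise, then parse the text again to confirm nothing is lost on the way.
csv_text = frame_to_csv_text(df)
round_trip = pd.read_csv(io.StringIO(csv_text))
```

If the round-tripped frame has the expected columns and values, the same DataFrame should be safe to hand to the write step.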