Azure Data Engineering Cookbook
eBook - ePub

  1. 454 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Over 90 recipes to help you orchestrate modern ETL/ELT workflows and perform analytics using Azure services more easily

Key Features

  • Build highly efficient ETL pipelines using the Microsoft Azure Data services
  • Create and execute real-time processing solutions using Azure Databricks, Azure Stream Analytics, and Azure Data Explorer
  • Design and execute batch processing solutions using Azure Data Factory

Book Description

Data engineering is one of the fastest-growing job areas, as data engineers are the ones who ensure that data is extracted, provisioned, and of the highest quality for data analysis. This book uses various Azure services to implement and maintain infrastructure to extract data from multiple sources, and then transform and load it for data analysis.
It takes you through different techniques for performing big data engineering using Microsoft Azure Data services. It begins by showing you how Azure Blob storage can be used for storing large amounts of unstructured data and how to use it for orchestrating a data workflow. You'll then work with different Cosmos DB APIs and Azure SQL Database. Moving on, you'll discover how to provision an Azure Synapse database and find out how to ingest and analyze data in Azure Synapse. As you advance, you'll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You'll also design and implement batch processing solutions using Azure Databricks and then manage and secure Azure Databricks clusters and jobs.
In the concluding chapters, you'll learn how to process streaming data using Azure Stream Analytics and Data Explorer.
By the end of this Azure book, you'll have gained the knowledge you need to be able to orchestrate batch and real-time ETL workflows in Microsoft Azure.

What you will learn

  • Use Azure Blob storage for storing large amounts of unstructured data
  • Perform CRUD operations on the Cosmos Table API
  • Implement elastic pools and business continuity with Azure SQL Database
  • Ingest and analyze data using Azure Synapse Analytics
  • Develop Data Factory data flows to extract data from multiple sources
  • Manage, maintain, and secure Azure Data Factory pipelines
  • Process streaming data using Azure Stream Analytics and Data Explorer

Who this book is for

This book is for data engineers, database administrators, database developers, and extract, transform, load (ETL) developers looking to build expertise in Azure data engineering using a recipe-based approach. Technical architects and database architects with experience in designing data or ETL applications, either on-premises or on any other cloud vendor, who want to learn Azure data engineering concepts will also find this book useful. Prior knowledge of Azure fundamentals and data engineering concepts is needed.

Chapter 1: Working with Azure Blob Storage

Azure Blob storage is a highly scalable and durable object-based cloud storage solution from Microsoft. Blob storage is optimized to store large amounts of unstructured data such as log files, images, video, and audio.
It is an important data source in structuring an Azure data engineering solution. Blob storage can be used as a data source and destination. As a source, it can be used to stage unstructured data, such as application logs, images, and video and audio files. As a destination, it can be used to store the result of a data pipeline.
In this chapter, we'll learn to read, write, manage, and secure Azure Blob storage and will cover the following recipes:
  • Provisioning an Azure storage account using the Azure portal
  • Provisioning an Azure storage account using PowerShell
  • Creating containers and uploading files to Azure Blob storage using PowerShell
  • Managing blobs in Azure Storage using PowerShell
  • Managing an Azure blob snapshot in Azure Storage using PowerShell
  • Configuring blob life cycle management for blob objects using the Azure portal
  • Configuring a firewall for an Azure storage account using the Azure portal
  • Configuring virtual networks for an Azure storage account using the Azure portal
  • Configuring a firewall for an Azure storage account using PowerShell
  • Configuring virtual networks for an Azure storage account using PowerShell
  • Creating an alert to monitor an Azure storage account
  • Securing an Azure storage account with SAS using PowerShell

Technical requirements

For this chapter, the following are required:
  • An Azure subscription
  • Azure PowerShell
The code samples can be found at https://github.com/PacktPublishing/azure-data-engineering-cookbook.

Provisioning an Azure storage account using the Azure portal

In this recipe, we'll provision an Azure storage account using the Azure portal. Azure Blob storage is one of the four storage services available in Azure Storage. The other storage services are Table, Queue, and file share.

Getting ready

Before you start, open a web browser and go to the Azure portal at https://portal.azure.com.

How to do it…

The steps for this recipe are as follows:
  1. In the Azure portal, select Create a resource and choose Storage account – blob, file, table, queue (or, search for storage accounts in the search bar. Do not choose Storage accounts (classic)).
  2. A new page, Create storage account, will open. There are five tabs on the Create storage account page – Basics, Networking, Advanced, Tags, and Review + create.
  3. In the Basics tab, we need to provide the Azure Subscription, Resource group, Storage account name, Location, Performance, Account kind, Replication, and Access tier values, as shown in the following screenshot:
    Figure 1.1 – The Create storage account Basics tab
  4. In the Networking tab, we need to provide the connectivity method:
    Figure 1.2 – Create storage account – Networking
  5. In the Advanced tab, we need to select the Security, Azure Files, Data protection, and Data Lake Storage Gen2 settings:
    Figure 1.3 – Create storage account – Advanced
  6. In the Review + create tab, review the configuration settings and select Create to provision the Azure storage account:
    Figure 1.4 – Create storage account – Review + create
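
The values collected on the Basics tab can be checked before they are typed into the portal. The following is a minimal sketch in Python, not part of the recipe itself: the `BasicsTab` class, its field names, and the sample values are hypothetical and only mirror the fields listed in step 3. It assumes the Azure naming rule that a storage account name is 3 to 24 characters long and contains only lowercase letters and digits, and it treats Standard and Premium as the only performance tiers, as the recipe describes. It does not call Azure.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed naming rule: 3-24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")

# The recipe describes two performance tiers.
_PERFORMANCE_TIERS = {"Standard", "Premium"}


@dataclass(frozen=True)
class BasicsTab:
    """Hypothetical container for the values entered on the Basics tab."""

    subscription: str
    resource_group: str
    storage_account_name: str
    location: str
    performance: str
    account_kind: str
    replication: str
    access_tier: str

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty if none were found."""
        issues = []
        if not _NAME_PATTERN.fullmatch(self.storage_account_name):
            issues.append(
                "storage account name must be 3-24 lowercase letters or digits"
            )
        if self.performance not in _PERFORMANCE_TIERS:
            issues.append("performance must be Standard or Premium")
        return issues


if __name__ == "__main__":
    # Sample values are placeholders, not taken from the book.
    settings = BasicsTab(
        subscription="my-subscription",
        resource_group="my-resource-group",
        storage_account_name="mystorageaccount01",
        location="eastus",
        performance="Standard",
        account_kind="StorageV2",
        replication="LRS",
        access_tier="Hot",
    )
    for issue in settings.problems():
        print(issue)
```

Only the name and the performance tier are validated here; the other fields are carried as plain strings because their allowed values depend on the options the portal offers for the chosen subscription and region.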

How it works…

The Azure storage account is deployed in the selected subscription, resource group, and location. The Performance tier can be either Standard or Premium. A Standard performance tier is a low-cost magnetic drive-backed storage. It's sui...

Table of contents

  1. Azure Data Engineering Cookbook
  2. Contributors
  3. Preface
  4. Chapter 1: Working with Azure Blob Storage
  5. Chapter 2: Working with Relational Databases in Azure
  6. Chapter 3: Analyzing Data with Azure Synapse Analytics
  7. Chapter 4: Control Flow Activities in Azure Data Factory
  8. Chapter 5: Control Flow Transformation and the Copy Data Activity in Azure Data Factory
  9. Chapter 6: Data Flows in Azure Data Factory
  10. Chapter 7: Azure Data Factory Integration Runtime
  11. Chapter 8: Deploying Azure Data Factory Pipelines
  12. Chapter 9: Batch and Streaming Data Processing with Azure Databricks
  13. Other Books You May Enjoy

Azure Data Engineering Cookbook by Ahmad Osama is available in PDF and/or ePUB format, and is listed under Computer Science & Data Modelling & Design.