Apache Kafka Quick Start Guide
eBook - ePub

Apache Kafka Quick Start Guide

Leverage Apache Kafka 2.0 to simplify real-time data processing for distributed applications

  • 186 pages
  • English
  • ePUB (mobile friendly)
  • Available on iOS & Android

About This Book

Process large volumes of data in real time while building a high-performance and robust data stream processing pipeline using the latest Apache Kafka 2.0

Key Features

  • Solve practical large-scale data and processing challenges with Kafka
  • Tackle data processing challenges such as late events, windowing, and watermarking
  • Understand real-time stream processing using the Schema Registry, Kafka Connect, Kafka Streams, and KSQL

Book Description

Apache Kafka is a great open source platform for handling your real-time data pipeline to ensure high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing of distributed applications and will get familiar with solving everyday problems in fast data and processing pipelines.

This book focuses on programming rather than the configuration management of Kafka clusters or DevOps. It starts off with the installation and setting up the development environment, before quickly moving on to performing fundamental messaging operations such as validation and enrichment.

Here you will learn about message composition with the pure Kafka API and Kafka Streams. You will look into the transformation of messages in different formats, such as text, binary, XML, JSON, and AVRO. Next, you will learn how to expose the schemas contained in Kafka with the Schema Registry. You will then learn how to work with all relevant connectors with Kafka Connect. While working with Kafka Streams, you will perform various interesting operations on streams, such as windowing, joins, and aggregations. Finally, through KSQL, you will learn how to retrieve, insert, modify, and delete data streams, and how to manipulate watermarks and windows.

What you will learn

  • Validate data with Kafka
  • Add information to existing data flows
  • Generate new information through message composition
  • Perform data validation and versioning with the Schema Registry
  • Perform message serialization and deserialization
  • Process data streams with Kafka Streams
  • Understand the duality between tables and streams with KSQL

Who this book is for

This book is for developers who want to quickly master the practical concepts behind Apache Kafka. Readers need not have encountered Apache Kafka previously; however, familiarity with Java or another JVM language will help in understanding the code in this book.

Apache Kafka Quick Start Guide by Raúl Estrada is available in PDF and ePUB format, and is catalogued under Computer Science & Data Processing.

Information

Year: 2018
ISBN: 9781788992251
Edition: 1

Accessing and Retrieving Data

In this chapter, we will cover the following recipes:
  • Viewing and analyzing M functions in the Query Editor
  • Establishing and managing connections to data sources
  • Building source queries for DirectQuery models
  • Importing data to Power BI Desktop models
  • Applying multiple filtering conditions
  • Choosing columns and column names
  • Transforming and cleansing source data
  • Creating custom and conditional columns
  • Integrating multiple queries
  • Choosing column data types
  • Visualizing the M library

Introduction

Power BI Desktop contains a very rich set of data source connectors and transformation capabilities that support the integration and enhancement of source data. These features are all driven by a powerful functional language and query engine, M, which leverages source system resources when possible and can greatly extend the scope and robustness of the data retrieval process beyond what's possible via the standard Query Editor interface alone. As with almost all BI projects, the design and development of the data access and retrieval process has great implications for the analytical value, scalability, and sustainability of the overall Power BI solution.
In this chapter, we dive into Power BI Desktop's Get Data experience and walk through the process of establishing and managing data source connections and queries. Examples are provided of using the Query Editor interface and the M language directly to construct and refine queries to meet common data transformation and cleansing needs. In practice and as per the examples, a combination of both tools is recommended to aid the query development process.
A full explanation of the M language and its implementation in Power BI is outside the scope of this book, but additional resources and documentation are included in the There's more... and See also sections of each recipe.

Viewing and analyzing M functions

Every time you click on a button to connect to any of Power BI Desktop's supported data sources or apply any transformation to a data source object, such as changing a column's data type, one or multiple M expressions are created reflecting your choices. These M expressions are automatically written to dedicated M documents and, if saved, are stored within the Power BI Desktop file as Queries. M is a functional programming language like F#, and it's important that Power BI developers become familiar with analyzing and later writing and enhancing the M code that supports their queries.
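Each saved query is a single M document: a let expression that names a sequence of steps and returns the value of the expression after in. As an illustrative sketch (the file path and step names here are hypothetical, not taken from the recipe):

```
// A minimal M query document. Each step is a named expression;
// later steps reference earlier ones, and the query returns the
// expression after "in".
let
    Source = Csv.Document(File.Contents("C:\Data\Sales.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source)
in
    Promoted
```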

Getting ready

  1. Build a query through the user interface that connects to the AdventureWorksDW2016CTP3 SQL Server database on the ATLAS server and retrieves the DimGeography table, filtered on United States in the EnglishCountryRegionName column.
  2. Click on Get Data from the Home tab of the ribbon, select SQL Server from the list of database sources, and provide the server and database names.
    • For the Data Connectivity mode, select Import.
Figure 1: The SQL Server Get Data dialog
A navigation window will appear, with the different objects and schemas of the database. Select the DimGeography table from the Navigation window and click on Edit.
  3. In the Query Editor window, select the EnglishCountryRegionName column and then filter on United States from its dropdown.
Figure 2: Filtering for United States only in the Query Editor
At this point, a preview of the filtered table is exposed in the Query Editor and the Query Settings pane displays the previous steps.
Figure 3: The Query Settings pane in the Query Editor
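Taken together, the steps above correspond to an M query along the following lines (a sketch: the dbo schema and the step names are assumptions, since Power BI Desktop generates step names automatically):

```
let
    // Source step: connect to the database in Import mode
    Source = Sql.Database("ATLAS", "AdventureWorksDW2016CTP3"),
    // Navigation step: select the DimGeography table
    dbo_DimGeography = Source{[Schema = "dbo", Item = "DimGeography"]}[Data],
    // Filtered Rows step: keep only United States rows
    #"Filtered Rows" = Table.SelectRows(
        dbo_DimGeography,
        each [EnglishCountryRegionName] = "United States"
    )
in
    #"Filtered Rows"
```

Each step in the Applied Steps pane maps to one named expression in this document, which is why clicking a step in the Query Settings pane shows its expression in the Formula Bar.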

How to do it...

Formula Bar

  1. With the Formula Bar visible in the Query Editor, click on the Source step under Applied Steps in the Query Settings pane.
    • You should see the following formula expression:
Figure 4: The Sql.Database() function created for the Source step
  2. Click on the Navigation step to expose the following expression:
Figure 5: The metadata record created for the Navigation step
  • The navigation expression (2) references the source expression (1)
  • The Formula Bar in the...

Table of contents

  1. Title Page
  2. Copyright
  3. Credits
  4. Foreword
  5. About the Author
  6. About the Reviewers
  7. www.PacktPub.com
  8. Customer Feedback
  9. Preface
  10. Configuring Power BI Development Tools
  11. Accessing and Retrieving Data
  12. Building a Power BI Data Model
  13. Authoring Power BI Reports
  14. Creating Power BI Dashboards
  15. Getting Serious with Date Intelligence
  16. Parameterizing Power BI Solutions
  17. Implementing Dynamic User-Based Visibility in Power BI
  18. Applying Advanced Analytics and Custom Visuals
  19. Developing Solutions for System Monitoring and Administration
  20. Enhancing and Optimizing Existing Power BI Solutions
  21. Deploying and Distributing Power BI Content
  22. Integrating Power BI with Other Applications