Apache Kafka Quick Start Guide
eBook - ePub

Apache Kafka Quick Start Guide

Leverage Apache Kafka 2.0 to simplify real-time data processing for distributed applications

  • 186 pages
  • English
  • ePUB (mobile friendly)
  • Available on iOS & Android

About This Book

Process large volumes of data in real time while building a high-performance and robust data stream processing pipeline using the latest Apache Kafka 2.0

Key Features

  • Solve practical large-scale data and processing challenges with Kafka
  • Tackle data processing challenges such as late events, windowing, and watermarking
  • Understand real-time stream processing using the Schema Registry, Kafka Connect, Kafka Streams, and KSQL

Book Description

Apache Kafka is a great open source platform for handling your real-time data pipeline to ensure high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing of distributed applications and will get familiar with solving everyday problems in fast data and processing pipelines.

This book focuses on programming rather than the configuration management of Kafka clusters or DevOps. It starts off with the installation and setting up the development environment, before quickly moving on to performing fundamental messaging operations such as validation and enrichment.

Here you will learn about message composition with the pure Kafka API and Kafka Streams. You will look into the transformation of messages in different formats, such as text, binary, XML, JSON, and AVRO. Next, you will learn how to expose the schemas contained in Kafka with the Schema Registry. You will then learn how to work with all relevant connectors with Kafka Connect. While working with Kafka Streams, you will perform various interesting operations on streams, such as windowing, joins, and aggregations. Finally, through KSQL, you will learn how to retrieve, insert, modify, and delete data streams, and how to manipulate watermarks and windows.

What you will learn

  • Validate data with Kafka
  • Add information to existing data flows
  • Generate new information through message composition
  • Perform data validation and versioning with the Schema Registry
  • Perform message serialization and deserialization
  • Process data streams with Kafka Streams
  • Understand the duality between tables and streams with KSQL

Who this book is for

This book is for developers who want to quickly master the practical concepts behind Apache Kafka. Readers need not have encountered Apache Kafka previously; however, familiarity with Java or another JVM language will help in understanding the code in this book.

Apache Kafka Quick Start Guide by Raúl Estrada is available in PDF and ePUB format, and is catalogued under Computer Science & Data Processing.

Information

Year: 2018
ISBN: 9781788992251
Edition: 1

Accessing and Retrieving Data

In this chapter, we will cover the following recipes:
  • Viewing and analyzing M functions in the Query Editor
  • Establishing and managing connections to data sources
  • Building source queries for DirectQuery models
  • Importing data to Power BI Desktop models
  • Applying multiple filtering conditions
  • Choosing columns and column names
  • Transforming and cleansing source data
  • Creating custom and conditional columns
  • Integrating multiple queries
  • Choosing column data types
  • Visualizing the M library

Introduction

Power BI Desktop contains a very rich set of data source connectors and transformation capabilities that support the integration and enhancement of source data. These features are all driven by a powerful functional language and query engine, M, which leverages source system resources when possible and can greatly extend the scope and robustness of the data retrieval process beyond what's possible via the standard Query Editor interface alone. As with almost all BI projects, the design and development of the data access and retrieval process has great implications for the analytical value, scalability, and sustainability of the overall Power BI solution.
In this chapter, we dive into Power BI Desktop's Get Data experience and walk through the process of establishing and managing data source connections and queries. Examples are provided of using the Query Editor interface and the M language directly to construct and refine queries to meet common data transformation and cleansing needs. In practice and as per the examples, a combination of both tools is recommended to aid the query development process.
A full explanation of the M language and its implementation in Power BI is outside the scope of this book, but additional resources and documentation are included in the There's more... and See also sections of each recipe.

Viewing and analyzing M functions

Every time you click on a button to connect to any of Power BI Desktop's supported data sources or apply any transformation to a data source object, such as changing a column's data type, one or multiple M expressions are created reflecting your choices. These M expressions are automatically written to dedicated M documents and, if saved, are stored within the Power BI Desktop file as Queries. M is a functional programming language like F#, and it's important that Power BI developers become familiar with analyzing and later writing and enhancing the M code that supports their queries.
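Each saved query is a single M document: a let expression that names a sequence of steps and returns the value of the expression after in. As an illustrative sketch (the file path and step names here are hypothetical, not taken from the recipe):

```
// A minimal M query document. Each step is a named expression;
// later steps reference earlier ones, and the query returns the
// expression after "in".
let
    Source = Csv.Document(File.Contents("C:\Data\Sales.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source)
in
    Promoted
```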

Getting ready

  1. Build a query through the user interface that connects to the AdventureWorksDW2016CTP3 SQL Server database on the ATLAS server and retrieves the DimGeography table, filtered on United States in the EnglishCountryRegionName column.
  2. Click on Get Data from the Home tab of the ribbon, select SQL Server from the list of database sources, and provide the server and database names.
    • For the Data Connectivity mode, select Import.
Figure 1: The SQL Server Get Data dialog
A navigation window will appear, with the different objects and schemas of the database. Select the DimGeography table from the Navigation window and click on Edit.
  3. In the Query Editor window, select the EnglishCountryRegionName column and then filter on United States from its dropdown.
Figure 2: Filtering for United States only in the Query Editor
At this point, a preview of the filtered table is exposed in the Query Editor and the Query Settings pane displays the previous steps.
Figure 3: The Query Settings pane in the Query Editor
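Taken together, the steps above correspond to an M query along the following lines (a sketch: the dbo schema and the step names are assumptions, since Power BI Desktop generates step names automatically):

```
let
    // Source step: connect to the database in Import mode
    Source = Sql.Database("ATLAS", "AdventureWorksDW2016CTP3"),
    // Navigation step: select the DimGeography table
    dbo_DimGeography = Source{[Schema = "dbo", Item = "DimGeography"]}[Data],
    // Filtered Rows step: keep only United States rows
    #"Filtered Rows" = Table.SelectRows(
        dbo_DimGeography,
        each [EnglishCountryRegionName] = "United States"
    )
in
    #"Filtered Rows"
```

Each step in the Applied Steps pane maps to one named expression in this document, which is why clicking a step in the Query Settings pane shows its expression in the Formula Bar.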

How to do it...

Formula Bar

  1. With the Formula Bar visible in the Query Editor, click on the Source step under Applied Steps in the Query Settings pane.
    • You should see the following formula expression:
Figure 4: The Sql.Database() function created for the Source step
  2. Click on the Navigation step to expose the following expression:
Figure 5: The metadata record created for the Navigation step
  • The navigation expression (2) references the source expression (1)
  • The Formula Bar in the...

Table of contents

  1. Title Page
  2. Copyright
  3. Credits
  4. Foreword
  5. About the Author
  6. About the Reviewers
  7. www.PacktPub.com
  8. Customer Feedback
  9. Preface
  10. Configuring Power BI Development Tools
  11. Accessing and Retrieving Data
  12. Building a Power BI Data Model
  13. Authoring Power BI Reports
  14. Creating Power BI Dashboards
  15. Getting Serious with Date Intelligence
  16. Parameterizing Power BI Solutions
  17. Implementing Dynamic User-Based Visibility in Power BI
  18. Applying Advanced Analytics and Custom Visuals
  19. Developing Solutions for System Monitoring and Administration
  20. Enhancing and Optimizing Existing Power BI Solutions
  21. Deploying and Distributing Power BI Content
  22. Integrating Power BI with Other Applications