Securing Hadoop
eBook - ePub

  1. 116 pages
  2. English

About This Book

In Detail

Security of Big Data is one of the biggest concerns for enterprises today. How do we protect sensitive information in a Hadoop ecosystem? How can we integrate Hadoop security with existing enterprise security systems? What are the challenges in securing Hadoop and its ecosystem? These questions must be answered to manage Big Data effectively. Hadoop, together with Kerberos, provides security features that support Big Data management while keeping data secure.

This book is a practitioner's guide to securing a Hadoop-based Big Data platform. It provides a step-by-step approach to implementing end-to-end security, along with a solid foundation in the Hadoop and Kerberos security models.

This practical, hands-on guide looks at the challenges of securing sensitive data in a Hadoop-based Big Data platform and covers the Security Reference Architecture for securing Big Data. It takes you through the internals of the Hadoop and Kerberos security models and provides detailed implementation steps for securing Hadoop. You will learn how the Hadoop security model is implemented internally, how to integrate Enterprise Security Systems with Hadoop security, and how to manage and control user access to a Hadoop ecosystem seamlessly. You will also become acquainted with implementing audit logging and security incident monitoring within a Big Data platform.

Approach

This book is a step-by-step tutorial filled with practical examples, focusing mainly on the key security tools and implementation techniques of Hadoop security.

Who this book is for

This book is aimed at Hadoop practitioners (solution architects, Hadoop administrators, developers, and Hadoop project managers) who want a good grounding in Kerberos and who wish to learn how to implement end-to-end Hadoop security within an enterprise setup. It is assumed that you have a basic understanding of Hadoop and are familiar with some basic security concepts.

Securing Hadoop by Sudheesh Narayanan is available in PDF and/or ePUB format and is catalogued under Business & Business Intelligence.

Information

Year: 2013
ISBN: 9781783285259
Edition: 1

Table of Contents

Securing Hadoop
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. Hadoop Security Overview
  Why do we need to secure Hadoop?
  Challenges for securing the Hadoop ecosystem
  Key security considerations
  Reference architecture for Big Data security
  Summary
2. Hadoop Security Design
  What is Kerberos?
  Key Kerberos terminologies
  How Kerberos works?
  Kerberos advantages
  The Hadoop default security model without Kerberos
  Hadoop Kerberos security implementation
  User-level access controls
  Service-level access controls
  User and service authentication
  Delegation Token
  Job Token
  Block Access Token
  Summary
3. Setting Up a Secured Hadoop Cluster
  Prerequisites
  Setting up Kerberos
  Installing the Key Distribution Center
  Configuring the Key Distribution Center
  Establishing the KDC database
  Setting up the administrator principal for KDC
  Starting the Kerberos daemons
  Setting up the first Kerberos administrator
  Adding the user or service principals
  Configuring LDAP as the Kerberos database
  Supporting AES-256 encryption for a Kerberos ticket
  Configuring Hadoop with Kerberos authentication
  Setting up the Kerberos client on all the Hadoop nodes
  Setting up Hadoop service principals
  Creating a keytab file for the Hadoop services
  Distributing the keytab file for all the slaves
  Setting up Hadoop configuration files
  HDFS-related configurations
  MRV1-related configurations
  MRV2-related configurations
  Setting up secured DataNode
  Setting up the TaskController class
  Configuring users for Hadoop
  Automation of a secured Hadoop deployment
  Summary
4. Securing the Hadoop Ecosystem
  Configuring Kerberos for Hadoop ecosystem components
  Securing Hive
  Securing Hive using Sentry
  Securing Oozie
  Securing Flume
  Securing Flume sources
  Securing Hadoop sink
  Securing a Flume channel
  Securing HBase
  Securing Sqoop
  Securing Pig
  Best practices for securing the Hadoop ecosystem components
  Summary
5. Integrating Hadoop with Enterprise Security Systems
  Integrating Enterprise Identity Management systems
  Configuring EIM integration with Hadoop
  Integrating Active-Directory-based EIM with the Hadoop ecosystem
  Accessing a secured Hadoop cluster from an enterprise network
  HttpFS
  HUE
  Knox Gateway Server
  Summary
6. Securing Sensitive Data in Hadoop
  Securing sensitive data in Hadoop
  Approach for securing insights in Hadoop
  Securing data in motion
  Securing data at rest
  Implementing data encryption in Hadoop
  Summary
7. Security Event and Audit Logging in Hadoop
  Security Incident and Event Monitoring in a Hadoop Cluster
  The Security Incident and Event Monitoring (SIEM) system
  Setting up audit logging in a secured Hadoop cluster
  Configuring Hadoop audit logs
  Summary
A. Solutions Available for Securing Hadoop
  Hadoop distribution with enhanced security support
  Automation of a secured Hadoop cluster deployment
  Cloudera Manager
  Zettaset
  Different Hadoop data encryption options
  Dataguise for Hadoop
  Gazzang zNcrypt
  eCryptfs for Hadoop
  Securing the Hadoop ecosystem with Project Rhino
  Mapping of security technologies with the reference architecture
  Infrastructure security
  OS and filesystem security
  Application security
  Network perimeter security
  Data masking and encryption
  Authentication and authorization
  Audit logging, security policies, and procedures
  Security Incident and Event Monitoring
Index

Securing Hadoop

Copyright © 2013 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2013
Production Reference: 1181113
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78328-525-9
www.packtpub.com
Cover Image by Ravaji Babu

Credits

Author: Sudheesh Narayanan
Reviewers: Mark Kerzner, Nitin Pawar
Acquisition Editor: Antony Lowe
Commissioning Editor: Shaon Basu
Technical Editors: Amit Ramadas, Amit Shetty
Project Coordinator: Akash Poojary
Proofreader: Ameesha Green
Indexer: Rekha Nair
Graphics: Sheetal Aute, Ronak Dhruv, Valentina D'silva, Disha Haria, Abhinash Sahu
Production Coordinator: Nilesh R. Mohite
Cover Work: Nilesh R. Mohite

About the Author

Sudheesh Narayanan is a Technology Strategist and Big Data Practitioner with expertise in technology consulting and implementing Big Data solutions. With over 15 years of IT experience in Information Management, Business Intelligence, Big Data & Analytics, and Cloud & J2EE application development, he has provided his expertise in architecting, designing, and developing Big Data products, Cloud management platforms, and highly scalable platform services. His expertise in Big Data includes Hadoop and its ecosystem components, NoSQL databases (MongoDB, ...
