Apache Cassandra Essentials
Table of Contents
Apache Cassandra Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Your Cassandra Cluster Ready
Installation
Prerequisites
Compiling Cassandra from source and installing
Installation from a precompiled binary
The installation layout
The directory layout in tarball installations
The directory layout in package-based installation
Configuration files
cassandra.yaml
Running a Cassandra server
Running a Cassandra node
Setting up the cluster
Viewing the cluster status
Summary
2. An Architectural Overview
Background
Cassandra cluster overview
The Gossip protocol
Failure detection
Data distribution
Replication
SimpleStrategy
NetworkTopologyStrategy
Snitches
Virtual nodes
Adding nodes to our cluster
Create keyspace and column family
Summary
3. Creating Database and Schema
A database and schema
Keyspace
Column families
Static rows
Wide rows
A primary key
Partition keys and clustering columns
A composite partition key
Multiple clustering columns
Static columns
Modifying a table
Data types
Counters
Collections
Sets
Lists
Map
UDTs
Secondary indexes
Allowing filtering
TTL
Conditional querying
Conditions on a partition key
Conditions on a partition key and clustering columns
Sorting query results
Write operations
Lightweight transactions
Batch statements
Summary
4. Read and Write – Behind the Scenes
Write operations
CommitLog
Anatomy of Memtable
SSTable explained
SSTable Compaction strategies
Size-tiered compaction
Leveled compaction
DateTiered compaction
Read operations
Reads from row cache
Read operations for row cache miss
Key is in KeyCache
Key search misses both the key cache and the row cache
Delete operations
Data consistency
Read operation
Digest reads
Read repair
Consistency levels
Write operation
Hinted handoff
Consistency levels
Tracing Cassandra queries
Summary
5. Writing Your Cassandra Client
Connecting to a Cassandra cluster
Driver Connection policies
Load balancing policies
Retry policies
Reconnection policies
Reading and writing to the Cassandra cluster
QueryBuilder
Reading and writing asynchronously
Prepared statements
Example REST service using prepared statement
Batch statements
Mapping API
Tracing Cassandra queries using Java driver
Summary
6. Monitoring and Tuning a Cassandra Cluster
Monitoring a Cassandra cluster
Use logging for debugging
Monitoring using command-line utilities
nodetool cfstats
nodetool cfhistograms
nodetool netstats
nodetool tpstats
JConsole
Third-party tools
Tuning Cassandra nodes
Configuring Cassandra caches
Tuning Bloom filters
Configuring and tuning Java
Summary
7. Backup and Restore
Taking backup of a Cassandra cluster
Manual backup
Deleting snapshots
Incremental backup
Restoring data to Cassandra
The Cassandra bulk loader
Exporting and importing data using the Cassandra JSON utility
Loading external data into Cassandra
Removing nodes from a Cassandra cluster
Adding nodes to a Cassandra cluster
Replacing dead nodes in a cluster
Summary
Index
Apache Cassandra Essentials
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2015
Production reference: 1161115
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-910-2
www.packtpub.com
Author
Nitin Padalia
Reviewers
Ranjeet Kumar Jha
Sonal Raj
Chaoran Yu
Commissioning Editor
Akram Hussain
Acquisition Editor
Meeta Rajani
Content Development Editor
Aparna Mitra
Technical Editor
Rohan Uttam Gosavi
Copy Editor
Pranjali Chury
Project Coordinator
Mary Alex
Proofreader
Safis Editing
Indexer
Mariammal Chettiyar
Graphics
Disha Haria
Production Coordinator
Nilesh Mohite
Cover Work
Nilesh Mohite
About the Author
Nitin Padalia is a technical leader at Aricent Group, where he builds highly scalable distributed applications for the telecommunications field. He has worked in telecommunications throughout his career, on protocols and technologies such as SMPP, RTP, SIP, and VoIP, and has focused on developing applications that scale to very high loads with the highest possible performance. He has experience developing applications for bare-metal hardware, virtualized environments, and the cloud, using various languages and technologies.