Mastering PostgreSQL 9.6

  1. 416 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Master the capabilities of PostgreSQL 9.6 to efficiently manage and maintain your database.

  • Your one-stop guide to mastering the advanced concepts in PostgreSQL with ease
  • Master query optimization, replication, and high availability with PostgreSQL
  • Extend the functionalities of PostgreSQL to suit your organizational needs with minimum effort

Who This Book Is For

If you are a PostgreSQL data architect or an administrator who wants to understand how to implement advanced functionalities and master complex administrative tasks with PostgreSQL, then this book is perfect for you. Prior experience of administering a PostgreSQL database and a working knowledge of SQL are required to make the best use of this book.

What You Will Learn

  • Get to grips with the advanced features of PostgreSQL 9.6 and handle advanced SQL
  • Make use of the indexing features in PostgreSQL and fine-tune the performance of your queries
  • Work with stored procedures and manage backup and recovery
  • Master replication and failover techniques
  • Troubleshoot your PostgreSQL instance for solutions to common and not-so-common problems
  • Learn how to migrate your database from MySQL and Oracle to PostgreSQL without any hassle

In Detail

PostgreSQL is an open source database used for handling large datasets (Big Data) and as a JSON document database. It also has applications in the software and web domains. This book will enable you to build better PostgreSQL applications and administer databases more efficiently.

We begin by explaining the advanced database design concepts in PostgreSQL 9.6, along with indexing and query optimization. You will also see how to work with event triggers, concurrent transactions, and table partitioning, along with exploring SQL and server tuning. We will walk you through advanced administrative tasks such as server maintenance and monitoring, replication, recovery and high availability, and much more. You will come to understand common and not-so-common troubleshooting problems and how to overcome them.

By the end of this book, you will have an expert-level command of advanced database functionality and will be able to implement advanced administrative tasks with PostgreSQL.

Style and Approach

This book is a comprehensive guide covering all the concepts you need to master PostgreSQL. Packed with hands-on examples, tips, and tricks, even the most advanced concepts are explained in an easy-to-follow manner. Every chapter focuses not only on how each task is performed, but also on why.

Information

Year: 2017
ISBN: 9781783555369
Edition: 1

Making Use of Indexes

In the previous chapter, you learned about concurrency and locking. In this chapter, it is time to attack indexing head on. The importance of this topic cannot be stressed enough: indexing is (and will most likely remain) one of the most important topics in the life of every database engineer.
After 17 years of professional, full-time PostgreSQL consulting and PostgreSQL 24x7 support, I can say one thing for sure. Bad indexing is the main source of bad performance. Of course, it is important to adjust memory parameters and all that. However, it is all in vain if indexes are not used properly. There is simply no replacement for a missing index.
Therefore, I have dedicated an entire chapter to indexing alone to give you as many insights as possible.
In this chapter, you will learn these topics:
  • When does PostgreSQL use indexes?
  • How does the optimizer handle things?
  • What types of indexes are there and how do they work?
  • Using your own indexing strategies
By the end of this chapter, you will understand how indexes can be used beneficially in PostgreSQL.

Understanding simple queries and the cost model

In this section, we will get started with indexes. To show how things work, some test data is needed. The following code snippet shows how data can be created easily:
 test=# CREATE TABLE t_test (id serial, name text); 
CREATE TABLE
test=# INSERT INTO t_test (name) SELECT 'hans'
FROM generate_series(1, 2000000);
INSERT 0 2000000
test=# INSERT INTO t_test (name) SELECT 'paul'
FROM generate_series(1, 2000000);
INSERT 0 2000000
In the first line, a simple table is created. Two columns are used: an auto-increment column, which just keeps creating numbers, and a column that will be filled with static values.
The generate_series function will generate numbers from 1 to 2 million, so in this example, 2 million static values for hans and 2 million static values for paul are created.
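In case you have never used generate_series before, a quick call with a tiny range shows what it does; the function simply returns one row per number in the requested range:
test=# SELECT * FROM generate_series(1, 5);
 generate_series
-----------------
               1
               2
               3
               4
               5
(5 rows)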
In all, 4 million rows have been added:
 test=# SELECT name, count(*) FROM t_test GROUP BY 1; 
name | count
------+---------
hans | 2000000
paul | 2000000
(2 rows)
These 4 million rows have some nice properties. IDs are ascending and there are only two distinct names.
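These properties are also reflected in the optimizer statistics, which the planner will rely on later. As a small sketch (assuming ANALYZE has been run so that statistics exist), the pg_stats system view shows the estimated number of distinct values per column:
test=# ANALYZE t_test;
ANALYZE
test=# SELECT attname, n_distinct FROM pg_stats WHERE tablename = 't_test';
For the name column, you will typically see an n_distinct value of 2; for id, a negative value, which means PostgreSQL assumes that the number of distinct values grows with the table.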
Let's run a simple query now:
 test=# \timing 
Timing is on.
test=# SELECT * FROM t_test WHERE id = 432332;
id | name
--------+------
432332 | hans
(1 row)

Time: 119.318 ms
In this case, the \timing command will tell psql to display the runtime of a query. Note that this is not the real execution time on the server, but the time measured by psql. In the case of very short queries, network latency can account for a substantial part of the total time, so this has to be taken into account.
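If you want to know how much time was actually spent on the server, independent of the network round trip, you can ask PostgreSQL to execute the query and report its own measurements. Here is a minimal sketch (the EXPLAIN command used here is discussed in the next section):
test=# EXPLAIN (ANALYZE) SELECT * FROM t_test WHERE id = 432332;
At the bottom of the output, PostgreSQL prints the planning and execution times as measured on the server.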

Making use of EXPLAIN

In this example, reading through 4 million rows just to find a single one has taken more than 100 milliseconds. From a performance point of view, it is a total disaster. To figure out what is going wrong, PostgreSQL offers the EXPLAIN command:
 test=# \h EXPLAIN 
Command: EXPLAIN
Description: show the execution plan of a statement
Syntax:
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement

where option can be one of:

ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
TIMING [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
When you have the feeling that a query is not performing well, EXPLAIN will help you to reveal the real performance problem.
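The options listed in the syntax above can be combined. As a quick sketch, the following call would execute the statement and also report buffer usage (BUFFERS requires ANALYZE):
test=# EXPLAIN (ANALYZE true, BUFFERS true)
       SELECT * FROM t_test WHERE id = 432332;
The buffer information tells you how many blocks were found in shared buffers and how many had to be read, which is often the key to understanding why a query is slow. For now, let us start with a plain EXPLAIN.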
Here is how it works:
 test=# EXPLAIN SELECT * FROM t_test WHERE id = 432332; 
QUERY PLAN
---------------------------------------------------------------
 Gather  (cost=1000.00..43463.92 rows=1 width=9)
   Workers Planned: 2
   ->  Parallel Seq Scan on t_test  (cost=0.00..42463.82 rows=1 width=9)
         Filter: (id = 432332)
(4 rows)
What you see in this listing is a so-called execution plan. In PostgreSQL, an SQL statement is executed in four stages. The following components are at work:
  • The parser will check for syntax errors and for obvious problems
  • The rewrite system takes care of rules (views and other things); see the sketch after this list
  • The optimizer will figure out how to execute a query in the most efficient way and work out a plan
  • The plan provided by the optimizer will be used by the executor to finally create the result
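To see the rewrite system at work, here is a small sketch (the view name v_hans is made up purely for illustration). If you create a view and run EXPLAIN on a query against it, the plan references the underlying table, because the rewriter has already replaced the view with its definition before the optimizer ever sees the statement:
test=# CREATE VIEW v_hans AS
       SELECT * FROM t_test WHERE name = 'hans';
CREATE VIEW
test=# EXPLAIN SELECT * FROM v_hans WHERE id = 432332;
The resulting plan shows a scan on t_test, not on v_hans.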
The purpose of EXPLAIN is to see what the planner has come up with to run the query efficiently. In my example, PostgreSQL will use a parallel sequential scan. This means that two workers will cooperate and work on the filter condition together. The partial results are then combined by a so-called gather node, which was introduced in PostgreSQL 9.6 (it is part of the parallel query infrastructure). If you look at the plan more closely, you will see how many rows PostgreSQL expects at each stage of the plan (in this example, rows=1, that is, one row will be returned).
In PostgreSQL 9.6, the number of parallel workers will be determined by the size of the table. The larger an operation is, the more parallel workers PostgreSQL will fire up. For a very small table, parallelism is not used as it would create too much overhead.
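Both the size threshold at which a parallel scan is considered and the maximum number of workers per gather node are driven by configuration parameters. A quick sketch for inspecting the current values in your session (the defaults depend on your installation):
test=# SHOW min_parallel_relation_size;
test=# SHOW max_parallel_workers_per_gather;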
Parallelism is not a must. It is always possible to return to the pre-PostgreSQL 9.6 behavior (no parallel workers) by setting the following variable to 0:
 test=# SET max_parallel_workers_per_gather TO 0; 
SET
Note that this change has no side effects, as it only applies to your current session. Of course, you can also make the change in the postgresql.conf file, but I would not advise you to do that, as you might lose quite a lot of the performance provided by parallel queries.
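If you want to undo the change within the same session, a simple RESET restores the value configured for the server:
test=# RESET max_parallel_workers_per_gather;
RESET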

Digging into the PostgreSQL cost model

If only one CPU is used, the execution plan will look like this:
 test=# EXPLAIN SELECT * FROM t_test WHERE id = 43...

Table of contents

  1. Title Page
  2. Copyright
  3. Credits
  4. About the Author
  5. About the Reviewer
  6. www.PacktPub.com
  7. Customer Feedback
  8. Preface
  9. PostgreSQL Overview
  10. Understanding Transactions and Locking
  11. Making Use of Indexes
  12. Handling Advanced SQL
  13. Log Files and System Statistics
  14. Optimizing Queries for Good Performance
  15. Writing Stored Procedures
  16. Managing PostgreSQL Security
  17. Handling Backup and Recovery
  18. Making Sense of Backups and Replication
  19. Deciding on Useful Extensions
  20. Troubleshooting PostgreSQL
  21. Migrating to PostgreSQL