PostgreSQL 9.4 and Beyond: JSON, Analytics, and More
Uptime Technologies
Satoshi Nagayasu (@snaga)
FOSSASIA 2015
Satoshi Nagayasu
• Satoshi Nagayasu
  – Database enthusiast. DBA and Data Steward.
  – Traveling Asia: Hong Kong, Shenzhen, Beijing, Singapore
• Uptime Technologies
  – Co-founder
  – Providing consulting services around database and platform technologies.
• PostgreSQL
  – pgstatindex, pageinspect, xlogdump
  – PostgresForest, Postgres-XC (clusters)
  – Organizing the Japanese users group.
What I'm doing on PostgreSQL
• Postgres Toolkit
  – Brand new PostgreSQL DBA tool
  – Stay informed at uptime.jp/go/pt
• Postgres Add-on for Hinemos
  – One of the most popular system management tools in Japan.
  – Monitoring, Alerting, Job Management, etc.
PostgreSQL and Hinemos
[Figure: Hinemos monitoring graphs for number of sessions, database size, cache hit ratio, and number of written blocks]
Hacking Hardware
• Raspberry Pi 2 & DE0 (FPGA)
• ZigBee (wireless)
Thanks to...
• Magnus Hagander
• Michael Paquier
• Toshi Harada
• Noriyoshi Shinoda
• ... and many pg guys!
Agenda
• 9.4 Overview
• NoSQL (JSON and GIN Index)
• Analytics (Aggregation & Mat.View)
• Replication and Beyond (Logical Decoding)
• Administration (ALTER SYSTEM)
• Infrastructure (For Parallelization)
• Beyond 9.4
9.4 Overview
9.4 Overview - Status
• The first official release
  – 9.4 was released on 18 December 2014.
• The latest stable release
  – 9.4.1 was released on 5 February 2015.
9.4 Overview - Statistics
• 9.4.0 compared to 9.3.5
  – 3,750 files changed
  – 62,960 insertions (+)
  – 15,935 deletions (-)
9.4 Overview - Changes
Server
Indexes
General Performance
Monitoring
SSL
Server Settings
Replication and Recovery
Logical Decoding
Queries
Utility Commands
EXPLAIN
Views
Object Manipulation
Data Types
JSON
Functions
System Information Functions
Aggregates
Server‐Side Languages
PL/pgSQL Server‐Side Language
libpq
Client Applications
psql
Backslash Commands
pg_dump
pg_basebackup
Source Code
Additional Modules
pgbench
pg_stat_statements
9.4 Overview - Changes
Categories of Enhancements
• NoSQL (JSON and GIN Index)
• Analytics (Aggregation & Mat.View)
• Replication+ (Logical Decoding)
• Administration (ALTER SYSTEM)
• Basic Infrastructure (Parallelization)
NoSQL (JSON and GIN Index)
NoSQL - JSONB
• JSON vs. JSONB
NoSQL - JSONB
• “Binary JSON”
  – Different from JSON, which is stored as a text representation
  – Faster for searching
• With JSONB...
  – Duplicate keys are not kept; the last value wins.
  – Key order is not preserved.
  – Can take advantage of GIN indexes.
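The two JSONB rules above (last duplicate key wins, key order not preserved) can be imitated in a few lines of Python. This is only a sketch of the observable behavior, not of PostgreSQL's storage format. The function name `jsonb_like_normalize` is hypothetical, and the canonical ordering used here (shorter keys first, then by code point) is an assumption about how JSONB prints keys.

```python
import json

def jsonb_like_normalize(text):
    """Parse JSON text with JSONB-like rules: last duplicate key wins,
    and the original key order is discarded."""
    def build(pairs):
        merged = {}
        for key, value in pairs:
            merged[key] = value  # a later duplicate overwrites the earlier one
        # Assumed canonical order: shorter keys first, then by code point.
        return dict(sorted(merged.items(), key=lambda kv: (len(kv[0]), kv[0])))
    return json.loads(text, object_pairs_hook=build)

doc = '{"bb": 1, "a": 2, "a": 3}'
normalized = jsonb_like_normalize(doc)
print(json.dumps(normalized))  # prints {"a": 3, "bb": 1}
```

A plain JSON column would return the input text unchanged, duplicate key included, which is the difference the slide points at.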
NoSQL - GIN Index
• JSON+btree vs. JSONB+GIN
  – Btree indexes vs. GIN index
[Figure: Table / Index Size Comparison]
Source: http://www.slideshare.net/toshiharada/jpug-studyjsonbdatatype20141011-40103981
Analytics (Aggregation & Materialized View)
Analytics - Aggregation
• FILTER replaces CASE WHEN.
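To make the FILTER point concrete, here is a small runnable sketch. It uses SQLite through Python's `sqlite3` module purely as a stand-in so that it runs without a server; it assumes SQLite 3.30 or later, which accepts the same `FILTER (WHERE ...)` clause on aggregates. The `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("asia", 120), ("asia", 40), ("europe", 300), ("asia", 75)],
)

# Older style: conditional aggregation through CASE WHEN.
case_when_sql = """
    SELECT count(CASE WHEN amount >= 100 THEN 1 END),
           sum(CASE WHEN region = 'asia' THEN amount END)
    FROM orders
"""

# 9.4 style: the same conditions expressed with FILTER.
filter_sql = """
    SELECT count(*) FILTER (WHERE amount >= 100),
           sum(amount) FILTER (WHERE region = 'asia')
    FROM orders
"""

case_when_row = conn.execute(case_when_sql).fetchone()
filter_row = conn.execute(filter_sql).fetchone()
print(case_when_row, filter_row)  # both rows are (2, 235)
```

Both queries return the same row; the FILTER form keeps the condition next to the aggregate it restricts instead of burying it inside the argument.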
Analytics - Aggregation
• New Aggregate Functions
  – percentile_cont()
  – percentile_disc()
  – mode()
  – rank()
  – dense_rank()
  – percent_rank()
  – cume_dist()
Analytics - Aggregation
• Ordered-set aggregates
  – mode(), most common value in a subset
Analytics - Aggregation
• Ordered-set aggregates
  – rank(), rank of a value in a subset
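What these aggregates compute can be sketched in plain Python. The functions below are illustrative re-implementations, not PostgreSQL code. The tie-breaking rule in `mode` (smallest value among the most frequent) and the helper name `hypothetical_rank` are assumptions made for this sketch.

```python
import math
from collections import Counter

def percentile_cont(fraction, values):
    """Continuous percentile: interpolate linearly between neighbouring sorted values."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight

def percentile_disc(fraction, values):
    """Discrete percentile: first sorted value whose position reaches the fraction."""
    ordered = sorted(values)
    index = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[index]

def mode(values):
    """Most common value; ties go to the smallest value (an assumption)."""
    counts = Counter(values)
    best = max(counts.values())
    return min(v for v, c in counts.items() if c == best)

def hypothetical_rank(candidate, values):
    """Rank the candidate would get if it were added to the group (1-based)."""
    return 1 + sum(1 for v in values if v < candidate)

scores = [10, 20, 30, 40]
print(percentile_cont(0.5, scores))   # 25.0, interpolated between 20 and 30
print(percentile_disc(0.5, scores))   # 20, an actual input value
print(mode([10, 20, 20, 40]))         # 20
print(hypothetical_rank(25, scores))  # 3
```

The difference between the two percentile functions is the part that tends to surprise people: `percentile_cont` may return a value that is not in the input, while `percentile_disc` always returns one that is.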
Analytics – Materialized Views
• REFRESH MATERIALIZED VIEW CONCURRENTLY myview
• Refreshes a materialized view concurrently (in the background) without taking an exclusive lock, so the view stays readable during the refresh.
• Usability and availability improved.
Replication and Beyond (Logical Decoding)
Replication and Beyond - Logical Decoding
• “Logical” representation of the replication stream
  – INSERT/UPDATE/DELETE operations
  – Can be replayed on a different version/platform
• pg_recvlogical command
  – Shows how it works
• Replication can be more flexible
  – BDR (Bi-Directional Rep.), Slony, and more ...
  – Continuous backup as well
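The idea of replaying a logical change stream can be pictured with a toy model. The sketch below is purely conceptual: the change-record layout (`op`, `table`, `key`, `row`) is invented for illustration and is not the format of any PostgreSQL output plugin, and the target is just a dictionary standing in for another database.

```python
def replay(changes, target):
    """Apply logical row changes to `target`, a dict of tables keyed by primary key."""
    for change in changes:
        table = target.setdefault(change["table"], {})
        if change["op"] == "INSERT":
            table[change["key"]] = dict(change["row"])
        elif change["op"] == "UPDATE":
            table[change["key"]].update(change["row"])
        elif change["op"] == "DELETE":
            del table[change["key"]]
        else:
            raise ValueError(f"unknown operation: {change['op']}")
    return target

stream = [
    {"op": "INSERT", "table": "users", "key": 1, "row": {"name": "alice"}},
    {"op": "INSERT", "table": "users", "key": 2, "row": {"name": "bob"}},
    {"op": "UPDATE", "table": "users", "key": 1, "row": {"name": "alicia"}},
    {"op": "DELETE", "table": "users", "key": 2},
]
replica = replay(stream, {})
print(replica)  # {'users': {1: {'name': 'alicia'}}}
```

Because the stream describes row-level operations rather than physical page changes, the consumer does not need to share the producer's on-disk format, which is why a different version or platform can apply it.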
pg_recvlogical (contrib)
Administration (ALTER SYSTEM)
Administration - ALTER SYSTEM
• ALTER SYSTEM SET
  – Puts the new value in postgresql.auto.conf.
  – pg_reload_conf() reloads the settings.
  – postgresql.auto.conf takes priority over postgresql.conf.
• ALTER SYSTEM RESET
  – Removes values from postgresql.auto.conf.
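The precedence rule above can be modeled as "read postgresql.conf first, then let postgresql.auto.conf override it". The following is a minimal sketch under that assumption. The parser handles only simple `name = value` lines, and the function names and sample settings are hypothetical.

```python
def parse_conf(text):
    """Parse simple `name = value` lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip("'")
    return settings

def effective_settings(postgresql_conf, auto_conf):
    """postgresql.auto.conf is applied last, so its values win."""
    merged = parse_conf(postgresql_conf)
    merged.update(parse_conf(auto_conf))
    return merged

postgresql_conf = "work_mem = '4MB'\nlog_min_duration_statement = -1\n"

# State after something like: ALTER SYSTEM SET work_mem = '64MB'
auto_conf = "# managed by ALTER SYSTEM\nwork_mem = '64MB'\n"
print(effective_settings(postgresql_conf, auto_conf)["work_mem"])  # 64MB

# State after something like: ALTER SYSTEM RESET work_mem
print(effective_settings(postgresql_conf, "")["work_mem"])  # 4MB
```

In this model, RESET simply removes the override, so the value from postgresql.conf becomes effective again after the next reload.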
Infrastructure (For Parallelization)
Dynamic Background Workers
• In 9.3, background workers had to be started at postmaster startup.
• From 9.4, they can be launched on demand.
• From a parallelization point of view...
  – It allows launching multiple background processes to execute child queries in parallel.
Dynamic Shared Memory
• Shared memory can be allocated on demand
  – e.g., by background workers
• The main segment (e.g., shared_buffers) is still fixed at startup.
• Also supports a lightweight message queue.
• From a parallelization point of view...
  – It allows sharing data and communicating among several bgworker processes.
My Tiny Favorite (pl/pgsql stacktrace)
pl/pgsql stacktrace
http://h50146.www5.hp.com/services/ci/opensource/pdfs/PostgreSQL_9_4%20_Ver_1_0.pdf
There are many other enhancements, so please try it as soon as you can.
Beyond 9.4
BRIN Index
• Block Range INdex
  – Holds “summary” data, instead of raw data.
  – Reduces index size tremendously.
  – Also reduces creation/maintenance cost.
  – Needs an extra tuple fetch to get the exact record.
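The trade-off described above (tiny summaries, extra fetch on lookup) can be illustrated with a toy block-range index. This is a conceptual sketch, not PostgreSQL's implementation: a Python list stands in for the table, each summary is a `(start, min, max)` tuple, and the names `build_brin`, `brin_lookup`, and `rows_per_range` are hypothetical.

```python
def build_brin(values, rows_per_range=10):
    """Summarise each range of rows as (start, min, max); raw values are not stored."""
    summaries = []
    for start in range(0, len(values), rows_per_range):
        chunk = values[start:start + rows_per_range]
        summaries.append((start, min(chunk), max(chunk)))
    return summaries

def brin_lookup(values, summaries, target, rows_per_range=10):
    """Return positions holding `target`, rechecking rows only in ranges that may match."""
    hits = []
    for start, low, high in summaries:
        if low <= target <= high:  # this range may contain the value
            for offset, value in enumerate(values[start:start + rows_per_range]):
                if value == target:  # extra fetch to confirm the exact record
                    hits.append(start + offset)
    return hits

table = list(range(0, 1000, 10))  # 100 rows, naturally ordered
index = build_brin(table)
print(len(index))                     # 10 summaries for 100 rows
print(brin_lookup(table, index, 250)) # [25]
print(brin_lookup(table, index, 255)) # []
```

The index holds one small entry per range instead of one per row, which is where the size and build-time savings come from; the price is that every candidate range has to be rechecked against the actual rows. The approach pays off when values correlate with physical row order, as in this sorted example.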
[Chart: Index Creation, elapsed time (ms), Btree vs. BRIN]
[Chart: Index Size, number of blocks, Btree vs. BRIN]
[Chart: Select 1 record, elapsed time (ms), Btree vs. BRIN]
https://gist.github.com/snaga/82173bd49749ccf0fa6c
CommitFest 2015-2
CommitFest is a process to review, fix, and commit the submitted patches.
• Parallel Seq Scan
• INSERT ... ON CONFLICT {UPDATE | IGNORE}
• File level incremental backup
• and others...
Still work in progress...
commitfest.postgresql.org
Wrap-up
• One of the most developer-friendly RDBMSes in the world.
• Analytics features and performance are improving.
• Things are moving toward parallel execution.
Resources
• www.postgresql.org
• www.planetpostgresql.org
• www.pgcon.org
Any questions?
Thank you!
• E-mail: [email protected]
• Twitter, GitHub: @snaga
• WeChat: satoshinagayasu