Countdown to PostgreSQL v9.5 - Foreign Tables can be part of Inheritance Tree


PostgreSQL v9.5

• PostgreSQL v9.5 alpha 2 has been released

• There are great new features added in this release

• Foreign Table Inheritance

• Import Foreign Schema

• BRIN Indexes

• GiST index only scans

• GROUPING SETS, CUBE, ROLLUP

• UPSERT (INSERT … ON CONFLICT DO NOTHING | DO UPDATE)

• JSONB modification functions and operators

What is Table Inheritance

• In PostgreSQL a table can be made to inherit from another table

• The child table inherits all the columns, their data types and NOT NULL constraints

• Inherited tables can have check constraints

• The child tables are used to set up partitioning in PostgreSQL

• You can refer to our hangout on Partitioning in PostgreSQL
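As a minimal sketch of local table inheritance, the statements below create a parent and one child. The table and column names (orders, orders_emea, region) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical parent table; names are illustrative only
CREATE TABLE orders (
    order_id   integer NOT NULL,
    region     text    NOT NULL,
    order_date date
);

-- The child inherits the columns, data types and NOT NULL constraints,
-- and adds a CHECK constraint describing which rows it holds
CREATE TABLE orders_emea (
    CHECK (region = 'EMEA')
) INHERITS (orders);

-- Querying the parent also scans its children;
-- ONLY restricts the scan to the parent table itself
SELECT * FROM orders;
SELECT * FROM ONLY orders;
```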

What are Foreign Tables

• In PostgreSQL you can use Foreign Data Wrappers (FDW) to refer to a remote database

• The tables from remote database can be created as Foreign Tables (i.e. a reference to remote tables in remote database)

• Depending on the implementation of FDW you can do read and write into these Foreign Tables

• Some popular FDW implementations are

• postgres_fdw - comes as one of the standard extensions in PostgreSQL

• file_fdw - comes as one of the standard extensions in PostgreSQL

• mongo_fdw - Use PostgreSQL and MongoDB for a hybrid store. For a quick demo refer to one of our previous hangouts
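To make the idea of a foreign table concrete, here is a small sketch using file_fdw, which exposes a server-side file as a read-only table. The server name csv_files, the table definition and the path /tmp/employees.csv are assumptions made for this example, not part of the original slides.

```sql
-- file_fdw ships with PostgreSQL as a contrib extension
CREATE EXTENSION file_fdw;

-- file_fdw needs a server object, but it takes no connection options
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- Hypothetical table definition; the file path is an assumption and must
-- be readable by the PostgreSQL server process
CREATE FOREIGN TABLE employees_csv (
    empno    numeric(4,0),
    ename    varchar(10),
    job      varchar(9),
    hiredate timestamp
) SERVER csv_files
  OPTIONS (filename '/tmp/employees.csv', format 'csv', header 'true');

-- Reads go straight to the file; file_fdw does not support writes
SELECT * FROM employees_csv;
```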

What’s new in v9.5 - Foreign Table Inheritance

• Now foreign tables can participate in inheritance tree

• So a local table can inherit from a foreign table

• Or a local table can be inherited by foreign tables

• You can create check constraints on foreign tables

• These constraints are not enforced on the remote data, so they do little for data integrity

• But the optimizer uses them to prune partitions

• Hybrid systems can be used for hosting different partition sets

• If your foreign tables are created using a FDW extension which allows writes, you can achieve basic sharding

Demo - Multiple PostgreSQL Databases used to host various partitions

• Create foreign servers referring to all these remote servers

• Create a main table

• Create tables in remote servers

• Create foreign tables that inherit from the main table and point to the remote tables

• Create check constraints on all foreign child tables

• Create a trigger on main table to distribute data

• The process is similar to creating a partitioned table

• Partitions are on different servers

Demo - Environment

• Server IP Address - 192.168.37.149

• Two instances of PostgreSQL are running

• Instance 1 Port - 5432

• Instance 2 Port - 5431

• Master Database (on Instance-1) - postgres

• 192.168.37.149 | port - 5432 | DB - postgres | User - postgres

• Shard-1 DB (where child table-1 will reside) - v95_sh1 on Instance-1

• 192.168.37.149 | port - 5432 | DB - v95_sh1 | User - hangout_demo

• Shard-2 DB (where child table-2 will reside) - v95_sh2 on Instance-2

• 192.168.37.149 | port - 5431 | DB - v95_sh2 | User - hangout_demo

Demo - Create Extension and Foreign Servers

• Connect to Master DB

psql -h 192.168.37.149 -p 5432 -d postgres -U postgres

• Create the Foreign Data Wrapper Extension

CREATE EXTENSION postgres_fdw;

• Create the link to remote servers

CREATE SERVER server_1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '192.168.37.149', dbname 'v95_sh1', port '5432');

CREATE USER MAPPING FOR postgres SERVER server_1 OPTIONS (user 'hangout_demo');

CREATE SERVER server_2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '192.168.37.149', dbname 'v95_sh2', port '5431');

CREATE USER MAPPING FOR postgres SERVER server_2 OPTIONS (user 'hangout_demo');

• Now the tables of these servers can be imported
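One way to import remote table definitions is the new IMPORT FOREIGN SCHEMA command. The sketch below assumes the remote database already contains a table named employees in its public schema, and uses a hypothetical local schema shard1 to hold the result. The demo itself defines its foreign tables by hand instead, so this is only an alternative.

```sql
-- Hypothetical local schema to receive the imported definitions
CREATE SCHEMA shard1;

-- Import only the listed table from the remote "public" schema;
-- column names and types are read from the remote server
IMPORT FOREIGN SCHEMA public
    LIMIT TO (employees)
    FROM SERVER server_1
    INTO shard1;

-- The imported foreign table can then be queried like a local one
SELECT * FROM shard1.employees;
```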

Demo - Create Table and child Tables

• Create a table in Master DB

psql -h 192.168.37.149 -p 5432 -d postgres -U postgres -c "CREATE TABLE EMPLOYEES (empno NUMERIC(4,0), ename VARCHAR(10), job VARCHAR(9), hiredate TIMESTAMP )"

• Create a table in Shard-1 DB

psql -h 192.168.37.149 -p 5432 -d v95_sh1 -U hangout_demo -c "CREATE TABLE EMPLOYEES (empno NUMERIC(4,0), ename VARCHAR(10), job VARCHAR(9), hiredate TIMESTAMP )"

• Create a table in Shard-2 DB

psql -h 192.168.37.149 -p 5431 -d v95_sh2 -U hangout_demo -c "CREATE TABLE EMPLOYEES (empno NUMERIC(4,0), ename VARCHAR(10), job VARCHAR(9), hiredate TIMESTAMP )"

Demo - Create Foreign Tables

• Connect to the master DB

psql -h 192.168.37.149 -p 5432 -d postgres -U postgres

• Create Child Tables with check constraints

CREATE FOREIGN TABLE EMP_1 (CHECK (JOB IN ('MANAGER','PRESIDENT'))) INHERITS (EMPLOYEES) SERVER server_1 OPTIONS (TABLE_NAME 'employees');

CREATE FOREIGN TABLE EMP_2 (CHECK (JOB IN ('SALESMAN','CLERK'))) INHERITS (EMPLOYEES) SERVER server_2 OPTIONS (TABLE_NAME 'employees');

• These check constraints are not enforced

• They need not exist on the respective remote tables but are recommended to avoid any data integrity issues

Demo - Create A trigger function

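CREATE OR REPLACE FUNCTION shard_emp()
RETURNS TRIGGER AS $$
BEGIN
    -- Route each incoming row to the foreign child table matching its job
    IF (NEW.job IN ('MANAGER', 'PRESIDENT')) THEN
        INSERT INTO emp_1 SELECT NEW.*;
    ELSIF (NEW.job IN ('SALESMAN', 'CLERK')) THEN
        INSERT INTO emp_2 SELECT NEW.*;
    END IF;
    -- Returning NULL skips the insert into the main table itself;
    -- rows with any other job value are silently discarded
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

This trigger function moves rows inserted into the main table to the correct shard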

Demo - Create Trigger

CREATE TRIGGER employees_shard_trigger
BEFORE INSERT ON employees
FOR EACH ROW EXECUTE PROCEDURE shard_emp();

Demo - Insert Some Data

• Connect to the Master DB

• Insert some data in “employees” table

insert into employees values (1,'XAVI','CLERK','12-MAR-2013');

insert into employees values (2,'SAMEER','MANAGER','12-JAN-2013');

insert into employees values (0,'SULTAN', 'PRESIDENT','12-MAY-2013');

insert into employees values (0,'SAROJ', 'SALESMAN','12-MAY-2013');
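A quick way to check where the rows landed is to query the main table together with the tableoid system column. This is a sketch run from the Master DB; it assumes the trigger and foreign tables from the earlier slides are in place.

```sql
-- Show which child table each row is read from
SELECT tableoid::regclass AS stored_in, empno, ename, job
FROM employees
ORDER BY job;

-- The main table itself should hold no rows, because the trigger
-- function returns NULL after redirecting each insert
SELECT count(*) FROM ONLY employees;

-- Each foreign table can also be queried directly
SELECT * FROM emp_1;
SELECT * FROM emp_2;
```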

Demo - Query Plan
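The plan can be inspected with EXPLAIN, as sketched below. With constraint_exclusion at its default setting of partition, the planner would be expected to use the check constraints to skip the foreign table whose constraint rules out the requested job. The exact plan output depends on the installation and is not reproduced here.

```sql
-- Check the planner setting that controls constraint-based pruning
SHOW constraint_exclusion;

-- Filter on the column used in the check constraints; VERBOSE also
-- prints the remote SQL that postgres_fdw sends to each server
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM employees WHERE job = 'CLERK';
```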

Use Cases

• Create a distributed database

• Bulk upload and attach the remote database table as a partition

• The same parent table can have two child tables, each on a different kind of system

• Attach a comma separated file as a partition for an interim period and allow users to read it (though slow)

• In parallel do a bulk upload from the file

• Create Indexes on the uploaded table

• Detach the partition referring to csv file and attach the uploaded table as a partition

• Advantage: users don’t have to wait for all data to be uploaded

• Use with other popular FDWs e.g. for MongoDB, Elasticsearch, Hadoop, pg_dump etc.
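The detach and attach step from the CSV use case could look roughly like the sketch below. The names emp_csv (a file_fdw foreign table currently inheriting from employees) and emp_loaded (a local table already bulk loaded and indexed) are hypothetical.

```sql
-- Swap the interim CSV-backed partition for the loaded table in one
-- transaction, so readers never see both or neither
BEGIN;

-- Detach the foreign table that points at the csv file
ALTER FOREIGN TABLE emp_csv NO INHERIT employees;

-- Attach the local table; its columns must match the parent's
ALTER TABLE emp_loaded INHERIT employees;

COMMIT;
```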

Send us your suggestions and questions

[email protected]