If you rely heavily on the PostgreSQL COPY command to load data into PostgreSQL quickly, PostgreSQL 12 offers a feature which is most likely very beneficial to you. Bulk loading is an important operation, and every improvement in this area is certainly going to help the many people out there who want to import data into PostgreSQL as fast as possible.

COPY: Loading and unloading data as fast as possible

When taking a closer look at the syntax of the COPY command in PostgreSQL 12, you will quickly notice two things:

• \h will now point to the correct page in the documentation

• COPY now supports a WHERE condition

Here is the complete syntax overview:


db12=# \h COPY
Command:     COPY
Description: copy data between a file and a table
Syntax:
COPY table_name [ ( column_name [, ...] ) ]
    FROM { 'filename' | PROGRAM 'command' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]
    [ WHERE condition ]

COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
    TO { 'filename' | PROGRAM 'command' | STDOUT }
    [ [ WITH ] ( option [, ...] ) ]

where option can be one of:

    FORMAT format_name
    FREEZE [ boolean ]
    DELIMITER 'delimiter_character'
    NULL 'null_string'
    HEADER [ boolean ]
    QUOTE 'quote_character'
    ESCAPE 'escape_character'
    FORCE_QUOTE { ( column_name [, ...] ) | * }
    FORCE_NOT_NULL ( column_name [, ...] )
    FORCE_NULL ( column_name [, ...] )
    ENCODING 'encoding_name'

URL: https://www.postgresql.org/docs/12/sql-copy.html


While having a link to the documentation around is certainly useful, the WHERE condition added in PostgreSQL 12 is even more important. What is the purpose of this new feature? Up to now it was only possible to import a file in its entirety. In many cases this was a problem: more often than not, people only wanted to load a subset of the data and had to write a fair amount of code to filter it before the import, or to clean it up after it had already been written into the database.
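To see why this matters, here is a sketch of the classic workaround that was necessary before PostgreSQL 12: load the whole file into a staging table, copy the interesting subset over, and throw the staging table away. All table and file names here (t_staging, t_target, /tmp/file.txt) are hypothetical placeholders:

-- pre-PostgreSQL-12 workaround: filter after loading everything
CREATE TEMPORARY TABLE t_staging (x int);   -- throwaway landing zone
COPY t_staging FROM '/tmp/file.txt';        -- load the entire file
INSERT INTO t_target                        -- keep only what you need
    SELECT x FROM t_staging WHERE x < 5;
DROP TABLE t_staging;                       -- clean up

With the new WHERE clause, all of this collapses into a single COPY statement.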

COPY … WHERE: Applying filters while importing data

In PostgreSQL 12, data can easily be filtered while it is being imported. The COPY command is pretty flexible and allows a lot of trickery. To show you how the new WHERE clause works, I have compiled a simple example:


db12=# CREATE TABLE t_demo AS
SELECT * FROM generate_series(1, 1000) AS id;
SELECT 1000


First of all, 1000 rows are generated to make sure that we have some data to play with. Then we export the content of this table to a file:


db12=# COPY t_demo TO '/tmp/file.txt';
COPY 1000


Finally, we can try to import this data again:


db12=# CREATE TABLE t_import (x int);
CREATE TABLE
db12=# COPY t_import FROM '/tmp/file.txt' WHERE x < 5;
COPY 4
db12=# SELECT * FROM t_import;
 x
---
 1
 2
 3
 4
(4 rows)


As you can see, filtering data is pretty simple and very straightforward. One important thing to note here: I exported an “id” column and imported it as “x”. Keep in mind that the text file knows nothing about the data structure of your target table – you have to filter on the column names of the table you want to import into.
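The filter is not limited to simple comparisons, either: any expression over the target table's columns can be used (subqueries are, as far as I know, not allowed in the WHERE condition). For instance, a hypothetical variation of the import above could keep only the even numbers – this is just a sketch, not part of the session shown here:

-- keep only even values while importing (500 of the 1000 rows)
COPY t_import FROM '/tmp/file.txt' WHERE x % 2 = 0;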

Old gems revisited …

If you are new to PostgreSQL in general, I also want to present one of the older features, which I personally like a lot: COPY can send data to a UNIX pipe or read data from one. Here is how it works:


db12=# COPY t_demo TO PROGRAM 'gzip -c > /tmp/file.txt.gz';
COPY 1000
db12=# COPY t_import FROM PROGRAM 'gunzip -c /tmp/file.txt.gz'
WHERE x BETWEEN 100 AND 103;
COPY 4
db12=# SELECT * FROM t_import WHERE x >= 100;
  x
-----
 100
 101
 102
 103
(4 rows)


In some cases you might want to do more than just export data. In this case I decided to compress the data while exporting it. Before the data is imported again, it is uncompressed and filtered once more. As you can see, it is pretty simple to combine these features in a flexible way.
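The PROGRAM variant also combines nicely with the query form of COPY ... TO shown in the syntax overview. As a sketch (the target file name /tmp/subset.txt.gz is just an example), you could filter and compress in a single step on the export side:

-- export only a filtered subset, compressed on the fly
COPY (SELECT * FROM t_demo WHERE id <= 100)
    TO PROGRAM 'gzip -c > /tmp/subset.txt.gz';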

If you want to learn more about PostgreSQL and loading data in general, check out our …

Hans-Jürgen Schönig has been working with PostgreSQL since the 1990s. He is CEO and technical lead of CYBERTEC (www.cybertec-postgresql.com), one of the market leaders in this field, which has served countless customers around the globe since the year 2000.