Azure DB for PostgreSQL Single Server with pg_dump or a Python script

I did it with DMS, but I'd also like to know how long it takes with pg_dump.

I loaded the same dummy data into a Single Server instance with 32 vCPUs / 4096 GB of storage. Loading took about 86 minutes:

# time ./ 

real	85m49.610s
user	158m0.038s
sys	16m36.706s

Dumping it with pg_dump, on the other hand, took about 5 hours:

$ time pg_dump -Fc -h -p 5432 -d postgres -U rifujita@riosourcesingle -f users.sql

real	315m57.967s
user	113m58.987s
sys	7m7.770s

pg_dump has a '-j' option to run in parallel (it requires the directory output format, '-Fd'). But 'in parallel' here means that each dump process works on a different table. If most of your data sits in a few big tables, it doesn't help much.
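To see why table-level parallelism hits a wall, here is a small sketch (hypothetical table sizes, simple greedy scheduling) showing that the wall-clock time of a per-table parallel dump is bounded below by the largest table, no matter how many jobs you add:

```python
import heapq

def dump_makespan(table_sizes_gb, jobs):
    """Greedy per-table scheduling: each idle job takes the next-largest table.
    Returns the wall-clock time (in 'GB of work') of the busiest job."""
    workers = [0] * jobs  # accumulated work per job, kept as a min-heap
    heapq.heapify(workers)
    for size in sorted(table_sizes_gb, reverse=True):
        # assign the table to the least-loaded job
        heapq.heappush(workers, heapq.heappop(workers) + size)
    return max(workers)

# Hypothetical schema: one 900 GB table plus a few small ones.
tables = [900, 50, 30, 20]
print(dump_makespan(tables, jobs=1))   # 1000 -> everything serial
print(dump_makespan(tables, jobs=4))   # 900  -> dominated by the big table
print(dump_makespan(tables, jobs=32))  # 900  -> extra jobs don't help
```

Past four jobs, the big table alone sets the finish time, which is why splitting a single table by id range is the next step.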

So I wrote a Python script that runs parallelized '\COPY' jobs against a single big table.

# cat
#!/usr/bin/env python3

import os
import subprocess
import psutil
import time
import datetime

def do_copy(num):
    com = "psql ' port=5432 dbname=postgres user=rifujita@riosourcesingle password={} sslmode=require' -c '\\COPY (SELECT * FROM users WHERE id >= {} and id <= {}) TO dump-{:05d}.csv WITH CSV'".format(os.getenv('PGPASSWORD'), num * 1000000 + 1, (num + 1) * 1000000, num)
    subprocess.Popen(com, shell=True)

def count_psql():
    count = 0
    for p in psutil.process_iter():
        if == 'psql':
            count = count + 1
    return count

def main():
    print('{}: started.'.format(
    total = 0
    while total < 1000:
        if count_psql() < 32*8: # for 32vCPU Single Server
            do_copy(total)
            total = total + 1
        else:
            time.sleep(1)

    # wait for the remaining psql processes to drain
    while count_psql() != 0:
        time.sleep(1)

    print('{}: finished.'.format(

if __name__ == "__main__":
    main()

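The id arithmetic in do_copy is easy to get wrong by one, so here is a small helper (hypothetical, not part of the original script) that makes the chunk boundaries explicit: chunk n covers ids n*1,000,000+1 through (n+1)*1,000,000, and 1,000 chunks cover ids 1 through 1,000,000,000:

```python
CHUNK = 1_000_000  # rows per \COPY job, as in the script above

def chunk_bounds(n, chunk=CHUNK):
    """Inclusive (first_id, last_id) range for chunk number n."""
    return n * chunk + 1, (n + 1) * chunk

def chunk_filename(n):
    """CSV filename for chunk n, matching the dump-%05d.csv pattern."""
    return 'dump-{:05d}.csv'.format(n)

print(chunk_bounds(0))    # (1, 1000000)
print(chunk_bounds(999))  # (999000001, 1000000000)
print(chunk_filename(7))  # dump-00007.csv
```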
With this script, the dump took only about 18 minutes:

$ ./

2021-11-24 09:37:31.550243: started.
COPY 1000000
COPY 1000000
COPY 1000000
2021-11-24 09:55:05.465884: finished.