Pivotal Knowledge Base

Follow

HowTo - Find the physical location of a Pivotal Greenplum (GPDB) Database or Table?

Goal

How can I find the physical location of a Pivotal Greenplum (GPDB) Database or Table ?

Solution

The physical location of the Postgresql tables and databases is in the following directory

$MASTER_DATA_DIRECTORY/base

In this base folder, there are a number of sub-folders, the name of which are numbers, as in the following example

[gpadmin:/data/4.1.1.0/master/gpseg-1/base] # ls -l $MASTER_DATA_DIRECTORY/base
total 32
drwx------ 3 gpadmin gpadmin 4096 Jul 26 08:52 1
drwx------ 2 gpadmin gpadmin 4096 May  5 11:43 10890
drwx------ 3 gpadmin gpadmin 4096 Jul 26 08:52 10891
drwx------ 3 gpadmin gpadmin 4096 Jul 26 08:52 16992
drwx------ 3 gpadmin gpadmin 8192 Jul 26 08:52 17040
drwx------ 2 gpadmin gpadmin 4096 Jun 21 14:35 27081

Each of these high folders corresponds to a database. The following query will correlate the database names to the folder names

select oid, datname from pg_database;
  oid  |  datname 
-------+-----------
 16992 | gpadmin
 10891 | postgres
     1 | template1
 10890 | template0
 17040 | gpperfmon
 27081 | premdm
(6 rows)

Inside each of the database folders, there will be a number of files, as in the following example:

[/data/4.1.1.0/master/gpseg-1/base/16992] # ls
10774  10789  10804  17004  2608  2617   2653   2660  2669  2682  2691   27024  2831  2840  3308  5011  5043  6105
10776  10791  10806  2600   2609  2618   2654   2661  2670  2683  2692   2703   2832  2841  3309  5012  5094  6110
10778  10793  10808  2601   2610  2619   26540  2662  2673  2684  2693   2704   2833  2879  3316  5013  5095  pg_internal.init
10779  10794  1247   2602   2611  2620   2655   2663  2674  2685  2696   27069  2834  2895  3317  5014  6040  pgsql_tmp
10781  10796  1248   2603   2612  26494  2656   2664  2675  2686  26978  27178  2835  2898  5002  5015  6041  PG_VERSION
10783  10798  1249   2604   2613  2650   26563  2665  2678  2687  2699   27180  2836  2899  5004  5016  6052
10784  10799  1250   2605   2614  2651   2657   2666  2679  2688  2700   27181  2837  3049  5005  5017  6053
10786  10801  1255   2606   2615  26517  2658   2667  2680  2689  2701   27629  2838  3306  5007  5026  6054
10788  10803  1259   2607   2616  2652   2659   2668  2681  2690  2702   2830   2839  3307  5010  5031  6067

These files correspond to the tables in the database. To find the file for a particular table, that is "test," you can use the following query

select oid, relname from pg_class where relname='test';
  oid  | relname
-------+---------
 27024 | test
(1 row)

The result is, the file 27024 contains the data for the test table in the gpadmin database.

Note:

Note that while a table's file node often matches its OID, this is not necessarily always the case. Some operations, like TRUNCATE, REINDEX, CLUSTER and some forms of ALTER TABLE, can change the file node while preserving the OID. Avoid assuming that file node and table OID are always the same.

Each segment in the Greenplum cluster should have a folder for each database and a file for each table. If the size of these files are inconsistent across the segments, this may indicate that the data in these tables is not distributed evenly.

Comments

Powered by Zendesk