Pivotal Knowledge Base

Follow

Determine when the database relations are modified and if the relation needs an analyze

Environment

Product Version
Pivotal Greenplum (GPDB) 4.3.x
OS RHEL 6.x
Others  

Purpose

Sometimes the Database Administrators (DBA) has concerns like:

  • When was the table / index etc was last modified or updated.
  • If there is a need for analyzing the table
  • If there is any data added to the table for which that relation need a analyze.
  • Which partition table has the data added and which partition table needs an analyze

In this article, we will look at creating a view which will help us obtain the information and answer some of the queries mentioned above.

Procedure

  1. Create the function using the script below
    CREATE OR REPLACE FUNCTION fn_ident_table_last_change() RETURNS void AS
    $$
    DECLARE 
            v_function_name text := 'fn_ident_table_last_change';
            v_location int;
            v_sql text;
            v_db_oid text;
    BEGIN
            v_location := 1000;
            SELECT oid INTO v_db_oid 
            FROM pg_database 
            WHERE datname = current_database();
    
            v_location := 2000;
            v_sql := 'DROP VIEW IF EXISTS v_ident_table_last_change cascade';
    
            v_location := 2100;
            EXECUTE v_sql;
            
            v_location := 2200;
            v_sql := 'DROP EXTERNAL TABLE IF EXISTS ext_db_files_last_change';
    
            v_location := 2300;
            EXECUTE v_sql;
    
            v_location := 3000;
            v_sql := 'CREATE EXTERNAL WEB TABLE ext_db_files_last_change ' ||
            '( ' ||
                    '       segment_id INTEGER, ' ||
                    '       relfilenode TEXT, ' ||
                    '       filename TEXT, ' ||
                    '       size NUMERIC, ' ||
                    '       last_timestamp_change timestamp ' ||
                    ') ' ||
            'EXECUTE E''find $GP_SEG_DATADIR/base/' || v_db_oid ||  
            E' -type f -printf "$GP_SEGMENT_ID|%f|%h/%f|%s|%TY-%Tm-%Td %TX\n" 2> /dev/null || true'' ON ALL  ' ||
            E'FORMAT ''text'' (delimiter ''|'' null E''\N'' escape E''\\'''')';
    
            v_location := 3100;
            EXECUTE v_sql;
    
            v_location := 4000;
            v_sql := 'CREATE OR replace VIEW v_ident_table_last_change ' ||
                    'AS ' ||
                    '  WITH last_change ' ||
                    '       AS (SELECT Split_part(relfilenode, ''.'' :: text, 1) AS table_relfilenode, ' ||
                    '                  SUM(SIZE)                               AS SIZE, ' ||
                    '                  Max(last_timestamp_change)              AS ' ||
                    '                  max_last_timestamp_change ' ||
                    '           FROM   ext_db_files_last_change ' ||
                    '           WHERE  relfilenode NOT IN ( ''PG_VERSION'', ''pg_internal.init'' ) ' ||
                    '           GROUP  BY 1) ' ||
                    '  SELECT (n.nspname ||''.''|| tab.relname ) AS relation_name, ' ||
                    '         T.relpages, ' ||
                    '         T.reltuples, ' ||
                    '         T.max_last_timestamp_change, ' ||
                    '         pslo.statime AS last_analyze_timestamp,' ||
                    '         tab.relhassubclass AS flag_is_partitioned, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = FALSE THEN max_last_timestamp_change ' ||
                    '           ELSE Greatest(Max(max_last_timestamp_change) ' ||
                    '                           over ( ' ||
                    '                             PARTITION BY CASE WHEN tab.relhassubclass = TRUE ' ||
                    '                           THEN ' ||
                    '                           n.nspname ' ||
                    '                           ELSE ' ||
                    '                           part.schemaname END, CASE WHEN tab.relhassubclass = ' ||
                    '                           TRUE THEN ' ||
                    '                           tab.relname ELSE ' ||
                    '                           part.tablename END ), max_last_timestamp_change) ' ||
                    '         END                AS global_max_last_timestamp_change, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = FALSE THEN sum_size_table ' ||
                    '           ELSE SUM(sum_size_table) ' ||
                    '                  over ( ' ||
                    '                    PARTITION BY CASE WHEN tab.relhassubclass = TRUE THEN ' ||
                    '                  n.nspname ELSE ' ||
                    '                  part.schemaname END, CASE WHEN tab.relhassubclass = TRUE THEN ' ||
                    '                  tab.relname ELSE ' ||
                    '                  part.tablename END ) ' ||
                    '         END                AS global_sum_size_table, ' ||
                    '         T.sum_size_table, ' ||
                    '         tab.relkind        AS relkind, ' ||
                    '         tab.relstorage, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = TRUE THEN n.nspname ' ||
                    '           ELSE part.schemaname ' ||
                    '         END                AS part_master_schema, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = TRUE THEN tab.relname ' ||
                    '           ELSE part.tablename ' ||
                    '         END                AS part_master_table ' ||
                    '  FROM   (SELECT Coalesce(aoseg.relid, aovisi.relid, toast.oid, c.oid) AS oid, ' ||
                    '                 SUM(c.relpages)                                       AS ' ||
                    '                 relpages ' ||
                    '                 , ' ||
                    '                 SUM(c.reltuples) ' ||
                    '                 AS reltuples, ' ||
                    '                 Max(last_change.max_last_timestamp_change)            AS ' ||
                    '                        max_last_timestamp_change, ' ||
                    '                 SUM(last_change.SIZE)                                 AS ' ||
                    '                 sum_size_table ' ||
                    '          FROM   last_change ' ||
                    '                 inner join pg_class c ' ||
                    '                         ON last_change.table_relfilenode = c.relfilenode ' ||
                    '                 left outer join pg_appendonly aoseg ' ||
                    '                              ON aoseg.segrelid = c.oid ' ||
                    '                 left outer join pg_appendonly aovisi ' ||
                    '                              ON aovisi.visimaprelid = c.oid ' ||
                    '                 left outer join pg_class toast ' ||
                    '                              ON c.oid = toast.reltoastrelid ' ||
                    '          GROUP  BY 1) T ' ||
                    '         inner join pg_class tab ' ||
                    '                 ON T.oid = tab.oid ' ||
                    '         left join pg_stat_last_operation pslo' ||
                    '               ON pslo.objid = T.oid ' ||
                    '               AND pslo.staactionname = ''ANALYZE''' ||
                    '         inner join pg_namespace n ' ||
                    '                 ON tab.relnamespace = n.oid ' ||
                    '                    AND n.nspname NOT IN ( ''information_schema'', ''pg_catalog'', ' ||
                    '                                           ''pg_toast'' ' ||
                    '                                         ) ' ||
                    '         left outer join pg_partitions part ' ||
                    '                      ON n.nspname = part.partitionschemaname ' ||
                    '                         AND tab.relname = part.partitiontablename';
    
            v_location := 4100;
            EXECUTE v_sql;               
    
    EXCEPTION
            WHEN OTHERS THEN
                    RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
           
    END;
    $$
    language plpgsql;
    
  2. Execute the function to create the external table and the view.
    select fn_ident_table_last_change();
  3. Now use the view to obtain the information
    select * from v_ident_table_last_change;

Example

Running the functions mentioned above produce the columns below:

flightdata=# select * from v_ident_table_last_change order by 1;
            relation_name             | relpages | reltuples | max_last_timestamp_change  |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
-----------------------------------+----------+-----------+----------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+---------
 public.test2                      |        5 |      3000 | 2016-02-09 10:52:57.217062 | 2016-02-11 14:50:07.866757+00 | f                   | 2016-02-09 10:52:57.217062       |                163840 |         163840 | r       | h          |                    | 
 public.test3                      |        2 |      1000 | 2016-02-09 10:58:58.243907 | 2016-02-11 14:50:07.676233+00 | f                   | 2016-02-09 10:58:58.243907       |                 65536 |          65536 | r       | h          |                    | 
 public.test4                      |        4 |      2000 | 2016-02-09 11:16:45.776341 | 2016-02-11 14:50:07.597249+00 | f                   | 2016-02-09 11:16:45.776341       |                131072 |         131072 | r       | h          |                    | 
 public.test5                      |        2 |      1000 | 2016-02-09 11:17:45.380956 | 2016-02-11 14:50:07.431719+00 | f                   | 2016-02-09 11:17:45.380956       |                 65536 |          65536 | r       | h          |                    | 
(4 rows)

Below are the explanations for the columns:

Column Name Description
relation_name The name for the relation
relpages Number of OS pages used by the relation ( based on stats )
reltuples Number of records on the relation ( based on stats )
max_last_timestamp_change The maximum time from all the segments when the table's relfilenode was last touched or modified.
last_analyze_timestamp The last timestamp when the relation was last analyzed.
flag_is_partitioned True, if it has a child tables.
global_max_last_timestamp_change If there is a child table then its the max timestamp to when the child table was touched or modified (only for partition table)
global_sum_size_table If there is a child table then it is a sum of all files including child tables (only for partition table) 
sum_size_table The sum of size of its relfilenode
relkind The type of object (r = ordinary table, i = index, S = sequence, v = view, c = composite type, t = TOAST table )
relstorage The storage mode of a table (a = append-optimized, h = heap, v = virtual, x = external table)
part_master_schema Partition schema name
part_master_table Partition table name

 

 

 

 

 

 

 

 

 

 

 

  

 

So taking an example of a newly created table "test"

  • Insert the data and analyze
flightdata=# insert into public.test values (generate_series(1,100000));
INSERT 0 100000
flightdata=# 
flightdata=# analyze test;
ANALYZE
  • The view now shows the time when the timestamp of the file (from all segments ) that has the largest timestamp, which means the time when the file was last touched.
flightdata=# select * from v_ident_table_last_change where relation_name='public.test';
 relation_name | relpages | reltuples | max_last_timestamp_change |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
---------------+----------+-----------+---------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+----------------+---------+------------+--------------------+-------------------
 public.test   |      112 |     99781 | 2016-02-12 08:27:32.89579 | 2016-02-12 08:27:33.150318+00 | f                   | 2016-02-12 08:27:32.89579        |               7274496 |        7274496 | r       | h          |                    | 
(1 row)
  • Insert couple of more rows as shown below:
flightdata=# insert into public.test values (generate_series(1,100000));
INSERT 0 100000
flightdata=# insert into public.test values (generate_series(1,100000));
INSERT 0 100000
  • Now the view explains the timestamp when it was last modified and the last analyzed data, providing you have a clue that there is a data change and an analyze should be done on that table to keep the stats up to date with the changes in the table.
flightdata=# select * from v_ident_table_last_change where relation_name='public.test';
 relation_name | relpages | reltuples | max_last_timestamp_change  |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
---------------+----------+-----------+----------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+----------------+---------+------------+--------------------+-------------------
 public.test   |      112 |     99781 | 2016-02-12 08:32:40.648141 | 2016-02-12 08:27:33.150318+00 | f                   | 2016-02-12 08:32:40.648141       |              10878976 |       10878976 | r       | h          |                    | 
(1 row)

flightdata=# 
  • Perform the analyze and recheck the result as it displayed below:
flightdata=# analyze test;
ANALYZE

flightdata=# select * from v_ident_table_last_change where relation_name='public.test';
 relation_name | relpages | reltuples | max_last_timestamp_change  |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
---------------+----------+-----------+----------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+----------------+---------+------------+--------------------+-------------------
 public.test   |      332 |    301788 | 2016-02-12 08:32:40.648141 | 2016-02-12 08:33:15.441161+00 | f                   | 2016-02-12 08:32:40.648141       |              10878976 |       10878976 | r       | h          |                    | 
(1 row)

 

Comments

Powered by Zendesk