Pivotal Knowledge Base

Follow

How to Determine when Database Relations are Modified and if a Relation Needs an Analysis

Environment

  • Pivotal Greenplum Database (GPDB) 4.3.x
  • Operating System- Red Hat Enterprise Linux (RHEL) 6.x

Purpose

Sometimes the Database Administrators (DBA) has concerns like:

  • When was the table/index etc was last modified or updated?
  • If there is a need for analyzing the table
  • If there is any data added to the table for which that relation need a analyze.
  • Which partition table has the data added and which partition table needs an analyze

In this article, we will look at creating a view which will help us obtain the information and answer some of the queries mentioned above.

Procedure

  1. Create the function using the script below
    CREATE OR REPLACE FUNCTION fn_ident_table_last_change() RETURNS void AS
    $$
    DECLARE 
            v_function_name text := 'fn_ident_table_last_change';
            v_location int;
            v_sql text;
            v_db_oid text;
    BEGIN
            v_location := 1000;
            SELECT oid INTO v_db_oid 
            FROM pg_database 
            WHERE datname = current_database();
    
            v_location := 2000;
            v_sql := 'DROP VIEW IF EXISTS v_ident_table_last_change cascade';
    
            v_location := 2100;
            EXECUTE v_sql;
            
            v_location := 2200;
            v_sql := 'DROP EXTERNAL TABLE IF EXISTS ext_db_files_last_change';
    
            v_location := 2300;
            EXECUTE v_sql;
    
            v_location := 3000;
            v_sql := 'CREATE EXTERNAL WEB TABLE ext_db_files_last_change ' ||
            '( ' ||
                    '       segment_id INTEGER, ' ||
                    '       relfilenode TEXT, ' ||
                    '       filename TEXT, ' ||
                    '       size NUMERIC, ' ||
                    '       last_timestamp_change timestamp ' ||
                    ') ' ||
            'EXECUTE E''find $GP_SEG_DATADIR/base/' || v_db_oid ||  
            E' -type f -printf "$GP_SEGMENT_ID|%f|%h/%f|%s|%TY-%Tm-%Td %TX\n" 2> /dev/null || true'' ON ALL  ' ||
            E'FORMAT ''text'' (delimiter ''|'' null E''\N'' escape E''\\'''')';
    
            v_location := 3100;
            EXECUTE v_sql;
    
            v_location := 4000;
            v_sql := 'CREATE OR replace VIEW v_ident_table_last_change ' ||
                    'AS ' ||
                    '  WITH last_change ' ||
                    '       AS (SELECT Split_part(relfilenode, ''.'' :: text, 1) AS table_relfilenode, ' ||
                    '                  SUM(SIZE)                               AS SIZE, ' ||
                    '                  Max(last_timestamp_change)              AS ' ||
                    '                  max_last_timestamp_change ' ||
                    '           FROM   ext_db_files_last_change ' ||
                    '           WHERE  relfilenode NOT IN ( ''PG_VERSION'', ''pg_internal.init'' ) ' ||
                    '           GROUP  BY 1) ' ||
                    '  SELECT (n.nspname ||''.''|| tab.relname ) AS relation_name, ' ||
                    '         T.relpages, ' ||
                    '         T.reltuples, ' ||
                    '         T.max_last_timestamp_change, ' ||
                    '         pslo.statime AS last_analyze_timestamp,' ||
                    '         tab.relhassubclass AS flag_is_partitioned, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = FALSE THEN max_last_timestamp_change ' ||
                    '           ELSE Greatest(Max(max_last_timestamp_change) ' ||
                    '                           over ( ' ||
                    '                             PARTITION BY CASE WHEN tab.relhassubclass = TRUE ' ||
                    '                           THEN ' ||
                    '                           n.nspname ' ||
                    '                           ELSE ' ||
                    '                           part.schemaname END, CASE WHEN tab.relhassubclass = ' ||
                    '                           TRUE THEN ' ||
                    '                           tab.relname ELSE ' ||
                    '                           part.tablename END ), max_last_timestamp_change) ' ||
                    '         END                AS global_max_last_timestamp_change, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = FALSE THEN sum_size_table ' ||
                    '           ELSE SUM(sum_size_table) ' ||
                    '                  over ( ' ||
                    '                    PARTITION BY CASE WHEN tab.relhassubclass = TRUE THEN ' ||
                    '                  n.nspname ELSE ' ||
                    '                  part.schemaname END, CASE WHEN tab.relhassubclass = TRUE THEN ' ||
                    '                  tab.relname ELSE ' ||
                    '                  part.tablename END ) ' ||
                    '         END                AS global_sum_size_table, ' ||
                    '         T.sum_size_table, ' ||
                    '         tab.relkind        AS relkind, ' ||
                    '         tab.relstorage, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = TRUE THEN n.nspname ' ||
                    '           ELSE part.schemaname ' ||
                    '         END                AS part_master_schema, ' ||
                    '         CASE ' ||
                    '           WHEN tab.relhassubclass = TRUE THEN tab.relname ' ||
                    '           ELSE part.tablename ' ||
                    '         END                AS part_master_table ' ||
                    '  FROM   (SELECT Coalesce(aoseg.relid, aovisi.relid, toast.oid, c.oid) AS oid, ' ||
                    '                 SUM(c.relpages)                                       AS ' ||
                    '                 relpages ' ||
                    '                 , ' ||
                    '                 SUM(c.reltuples) ' ||
                    '                 AS reltuples, ' ||
                    '                 Max(last_change.max_last_timestamp_change)            AS ' ||
                    '                        max_last_timestamp_change, ' ||
                    '                 SUM(last_change.SIZE)                                 AS ' ||
                    '                 sum_size_table ' ||
                    '          FROM   last_change ' ||
                    '                 inner join pg_class c ' ||
                    '                         ON last_change.table_relfilenode = c.relfilenode ' ||
                    '                 left outer join pg_appendonly aoseg ' ||
                    '                              ON aoseg.segrelid = c.oid ' ||
                    '                 left outer join pg_appendonly aovisi ' ||
                    '                              ON aovisi.visimaprelid = c.oid ' ||
                    '                 left outer join pg_class toast ' ||
                    '                              ON c.oid = toast.reltoastrelid ' ||
                    '          GROUP  BY 1) T ' ||
                    '         inner join pg_class tab ' ||
                    '                 ON T.oid = tab.oid ' ||
                    '         left join pg_stat_last_operation pslo' ||
                    '               ON pslo.objid = T.oid ' ||
                    '               AND pslo.staactionname = ''ANALYZE''' ||
                    '         inner join pg_namespace n ' ||
                    '                 ON tab.relnamespace = n.oid ' ||
                    '                    AND n.nspname NOT IN ( ''information_schema'', ''pg_catalog'', ' ||
                    '                                           ''pg_toast'' ' ||
                    '                                         ) ' ||
                    '         left outer join pg_partitions part ' ||
                    '                      ON n.nspname = part.partitionschemaname ' ||
                    '                         AND tab.relname = part.partitiontablename';
    
            v_location := 4100;
            EXECUTE v_sql;               
    
    EXCEPTION
            WHEN OTHERS THEN
                    RAISE EXCEPTION '(%:%:%)', v_function_name, v_location, sqlerrm;
           
    END;
    $$
    language plpgsql;
    
  2. Execute the function to create the external table and the view.
    select fn_ident_table_last_change();
  3. Now use the view to obtain the information
    select * from v_ident_table_last_change;

Additional Information

Running the functions mentioned above produce the columns below:

flightdata=# select * from v_ident_table_last_change order by 1;
            relation_name             | relpages | reltuples | max_last_timestamp_change  |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
-----------------------------------+----------+-----------+----------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+---------
 public.test2                      |        5 |      3000 | 2016-02-09 10:52:57.217062 | 2016-02-11 14:50:07.866757+00 | f                   | 2016-02-09 10:52:57.217062       |                163840 |         163840 | r       | h          |                    | 
 public.test3                      |        2 |      1000 | 2016-02-09 10:58:58.243907 | 2016-02-11 14:50:07.676233+00 | f                   | 2016-02-09 10:58:58.243907       |                 65536 |          65536 | r       | h          |                    | 
 public.test4                      |        4 |      2000 | 2016-02-09 11:16:45.776341 | 2016-02-11 14:50:07.597249+00 | f                   | 2016-02-09 11:16:45.776341       |                131072 |         131072 | r       | h          |                    | 
 public.test5                      |        2 |      1000 | 2016-02-09 11:17:45.380956 | 2016-02-11 14:50:07.431719+00 | f                   | 2016-02-09 11:17:45.380956       |                 65536 |          65536 | r       | h          |                    | 
(4 rows)

Below are the explanations for the columns:

Column Name Description
relation_name The name for the relation
relpages Number of OS pages used by the relation ( based on stats )
reltuples Number of records on the relation ( based on stats )
max_last_timestamp_change The maximum time from all the segments when the table's relfilenode was last touched or modified.
last_analyze_timestamp The last timestamp when the relation was last analyzed.
flag_is_partitioned True, if it has a child tables.
global_max_last_timestamp_change If there is a child table then its the max timestamp to when the child table was touched or modified (only for partition table)
global_sum_size_table If there is a child table then it is a sum of all files including child tables (only for partition table) 
sum_size_table The sum of size of its relfilenode
relkind The type of object (r = ordinary table, i = index, S = sequence, v = view, c = composite type, t = TOAST table )
relstorage The storage mode of a table (a = append-optimized, h = heap, v = virtual, x = external table)
part_master_schema Partition schema name
part_master_table Partition table name

 

So, taking an example of a newly created table "test"

  • Insert the data and analyze
flightdata=# insert into public.test values (generate_series(1,100000));
INSERT 0 100000
flightdata=# 
flightdata=# analyze test;
ANALYZE
  • The view now shows the time when the timestamp of the file (from all segments ) that has the largest timestamp, which means the time when the file was last touched.
flightdata=# select * from v_ident_table_last_change where relation_name='public.test';
 relation_name | relpages | reltuples | max_last_timestamp_change |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
---------------+----------+-----------+---------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+----------------+---------+------------+--------------------+-------------------
 public.test   |      112 |     99781 | 2016-02-12 08:27:32.89579 | 2016-02-12 08:27:33.150318+00 | f                   | 2016-02-12 08:27:32.89579        |               7274496 |        7274496 | r       | h          |                    | 
(1 row)
  • Insert a couple of more rows as shown below:
flightdata=# insert into public.test values (generate_series(1,100000));
INSERT 0 100000
flightdata=# insert into public.test values (generate_series(1,100000));
INSERT 0 100000
  • Now the view explains the timestamp when it was last modified and the last analyzed data, providing you have a clue that there is a data change and an analyze should be done on that table to keep the stats up to date with the changes in the table.
flightdata=# select * from v_ident_table_last_change where relation_name='public.test';
 relation_name | relpages | reltuples | max_last_timestamp_change  |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
---------------+----------+-----------+----------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+----------------+---------+------------+--------------------+-------------------
 public.test   |      112 |     99781 | 2016-02-12 08:32:40.648141 | 2016-02-12 08:27:33.150318+00 | f                   | 2016-02-12 08:32:40.648141       |              10878976 |       10878976 | r       | h          |                    | 
(1 row)

flightdata=# 
  • Perform the analyze and recheck the result as it displayed below:
flightdata=# analyze test;
ANALYZE

flightdata=# select * from v_ident_table_last_change where relation_name='public.test';
 relation_name | relpages | reltuples | max_last_timestamp_change  |    last_analyze_timestamp     | flag_is_partitioned | global_max_last_timestamp_change | global_sum_size_table | sum_size_table | relkind | relstorage | part_master_schema | part_master_table 
---------------+----------+-----------+----------------------------+-------------------------------+---------------------+----------------------------------+-----------------------+----------------+---------+------------+--------------------+-------------------
 public.test   |      332 |    301788 | 2016-02-12 08:32:40.648141 | 2016-02-12 08:33:15.441161+00 | f                   | 2016-02-12 08:32:40.648141       |              10878976 |       10878976 | r       | h          |                    | 
(1 row)

 

Comments

Powered by Zendesk