Pivotal Knowledge Base

Follow

How to collect DDL and statistics information using the minirepro utility

Environment

Product Version
Pivotal Greenplum (GPDB) 4.3.7.x and forth
minirepro 1.xx

Overview

What is Minirepro ?

The minirepro utility is a smaller version of the gpsd (Greenplum Schema Dump) utility that collects the schema and statistics information (without any user data) for reproducing the customer's issue in-house. gpsd gets the complete database metadata dump along with the statistics information for all relations.

Minirepro has made this simple by extracting only the information needed for the affected query and not for the complete database. It reads the query and gets DDL and statistics information for the relations involved. Usage examples are shown below:

[gpadmin@mdw]$ /usr/local/GP-4.3.7.2/bin/minirepro --version
minirepro 1.0
[gpadmin@mdw]$ /usr/local/GP-4.3.7.2/bin/minirepro --help
Usage: minirepro  [options]

Options:
  --version             show program's version number and exit
  -?, --help            Show this help message and exit
  -h HOST, --host=HOST  Specify a remote host
  -p PORT, --port=PORT  Specify a port other than 5432
  -U USER, --user=USER  Database user to connect with.This is mandatory as of now
  -q QUERY_FILE         file name that contains the query
  -f OUTPUT_FILE        minirepro output file name. Must be absolute path 
Example: minirepro database-name -q sql-file-name -f /tmp/output-file-name -U gpadmin

Note: This utility is made available from 4.3.7.0 onwards. Also, the query file should have a single Valid Formatted Query. 

A sample run is shown below:

gpadmin@mdw]$ minirepro prod_db -q query.sql -f /data/oufile_msq -U gpadmin 
Connecting to database: host=mdw, port=5432, user=gpadmin, db=prod_db ...
Extracting metadata from query file query.sql ...
Invoking pg_dump to dump DDL ...
pg_dump -h mdw -p 5432 -U gpadmin -sxO prod_db -t 'ravedw.v_f_dtd_modular_scenario_fdsf|fdsf_scenario_map|mt_position|mt_scenario_surf_dim_msr|mt_modular_scenario' -f /tmp/20160419101152/ravedw.dp.sql
pg_dump -h mdw -p 5432 -U gpadmin -sxO prod_db -t 'dwuser.u_modular_scenarios_fdsf' -f /tmp/20160419101152/dwuser.dp.sql
Writing schema DDLs ...
Writing table & view DDLs ...
Writing table statistics ...
Writing column statistics ...
Attaching raw query text ...
--- MiniRepro completed! ---

Procedure 

Here are some possible errors that you may encounter while running this:

Possible Error 1

Error while running gp_toolkit.gp_dump_query_oids(text).
Please make sure the function is installed and the query file contains single valid query.

error 'ERROR:  Cannot parse query. Please make sure the input contains a single valid query. (gpoptutils.c:53)

Solution

Make sure that the query file used with -q flag has no format/indent errors. Manual Syntax and a Semantic Check is required. Also, as explained in the error, there must be only one query in the file. A simple extra SET command will also error out.

Possible Error 2

Error while running gp_toolkit.gp_dump_query_oids(text).
Please make sure the function is installed and the query file contains single valid query.

Solution

This is possible in case of upgrades since the tool has been introduced from 4.3.7.0. So the functions need to get installed using the script file located at :

cp $GPHOME/share/postgresql/gp_toolkit.sql gp_toolkit_2.sql

Modify the gp_toolkit_2.sql using vi: 

Remove the following statement in the file [This should be in the beginning of the file].

Begin;

Save the file. 

You can interactively run this file in the database using the following:

prod_db=# \i gp_toolkit_2.sql ( or ) 
psql -f gp_toolkit_2.sql <dbname>

When running the gp_toolkit_2.sql, you may see some errors such as :

relation "gp_pgdatabase_invalid" already exists  

or 

function "gp_skew_idle_fraction" already exists with same argument types

You can safely ignore these errors.

Once done, check the minirepro again. 

Possible Error 3

raise ValueError(errmsg("Extra data", s, end, len(s))) 
ValueError: Extra data: line 2 column 1 - line 3 column 1 (char 134 - 150)

Solution:

This issue can be seen if timing has been enabled on the system. You need to edit .psqlrc file to remove the "\timing" option and save the file. After that try minirepro again.

Internal Comments 

Usage

The minerepro utility is needed to reproduce the issue in Pivotal support lab systems, investigate the issue for root cause analysis, and fix. Pivotal support will ask client to run this on the environment and provide the output file.

Comments

Powered by Zendesk