What are Parquet tables and what benefits do they have?
Parquet is a column-oriented format for storing data in HDFS. It is an Apache project and is also available in Pivotal HDB. HDB (HAWQ) can store and read data in the Parquet format, and the format is also supported by the open source components of Pivotal HD, such as Pig and MapReduce.
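To make "column oriented" concrete, here is a minimal Python sketch. It is a toy illustration only and does not reproduce Parquet's actual on-disk encoding. The sample records and the helper name to_columns are assumptions made up for this example. It shows why a query that needs a single column can skip the others when values are grouped by column instead of by row.

```python
# Toy illustration only: this is NOT Parquet's on-disk format.
# The sample data and the helper name `to_columns` are hypothetical.

rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 25.5},
    {"id": 3, "region": "east", "amount": 7.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: summing one field still walks every whole record.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the same sum reads one list and never touches
# the "id" or "region" values.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same. The difference is in how much data has to be read to compute them, which is where a columnar format helps analytic queries that touch only a few columns of a wide table.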
Here is a six-part series on how Parquet tables work and the benefits of using them.
Please note: the video series covers an older version of PHD, but the concepts remain the same. Refer to the documentation for the updates and new features added to Parquet tables in newer versions of PHD.