Parquet File Format
Hadoop supports many file formats. These include plain text files and Hadoop-specific formats such as Sequence Files. There are also more sophisticated file formats such as Avro and Parquet. Every file format in Hadoop brings its own strengths. In this blog post we will discuss what the Parquet file format is and how it is useful to us.

The Parquet file format was created by Twitter and Cloudera to provide an efficient file format for HDFS. Parquet belongs to the class of columnar file formats. Columnar file formats are most useful when you plan to access only a few columns of the data, and they are a natural fit for columnar databases. Columnar formats have the following advantages.

1. Columnar file formats are more compression friendly, because values within a single column are more likely to repeat than values across a row.
2. While reading, only the columns that are required are read, so you en...