hadoop mapreduce hive lzo

describe extended table in Hive


I am storing the table in SequenceFile format, and I run the commands below to enable BLOCK compression:

set mapred.output.compress=true;
set mapred.output.compression.type=BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.LzoCodec;

But when I inspected the table like this:

describe extended lip_table

I got the information below, in which a field called compressed is set to false. Does that mean my data was not compressed by the three commands above?

Detailed Table Information      Table(tableName:lip_table, dbName:default, owner:uname, 
createTime:1343931235, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:
[FieldSchema(name:buyer_id, type:bigint, comment:null), FieldSchema(name:total_chkout, 
type:bigint, comment:null), FieldSchema(name:total_errpds, type:bigint, comment:null)], 
location:hdfs://ares-nn/apps/hdmi/uname/lip-data, 
inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
**compressed:false**, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:
{serialization.format=   , field.delim=

Solution

  • I found an article (linked at the end) that I think solves your problem. Rather than relying only on session settings, try specifying the compression format in the table definition itself, either when creating the table or with an ALTER statement.

    At creation time:

    CREATE EXTERNAL TABLE lip_table (
        column1 string,
        column2 string
    )
    PARTITIONED BY (date string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY "\t"
    STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
              OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
    LOCATION '/path/to/hive/tables/lip';
    

    Using ALTER (only affects partitions created subsequently):

    ALTER TABLE lip_table
    SET FILEFORMAT
        INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
        OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
    

    http://www.mrbalky.com/2011/02/24/hive-tables-partitions-and-lzo-compression/
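
    Defining the table this way only tells Hive how to read and write the files; the data still has to be written with compression switched on. One way to do that, as a rough sketch: enable output compression for the session and then populate a partition with an INSERT. The assumptions here are mine: lip_staging is a made-up source table with the same two columns, the partition value is only an example, and com.hadoop.compression.lzo.LzopCodec assumes the hadoop-lzo library is installed on the cluster.

    ```sql
    -- Tell Hive to compress the output of queries in this session.
    SET hive.exec.compress.output=true;
    -- Assumes hadoop-lzo is installed on the cluster.
    SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

    -- lip_staging is a hypothetical source table (column1, column2).
    -- The partition value is only an example; newer Hive versions may
    -- require backticks around the column name date.
    INSERT OVERWRITE TABLE lip_table PARTITION (date='2012-08-01')
    SELECT column1, column2
    FROM lip_staging;
    ```

    As far as I know, the compressed field printed by describe extended is just table metadata and does not reliably reflect whether the files on HDFS are compressed, so it is better to check the written files themselves.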