I have a list of files in a directory and I want to pack them into a single archive file. I used cpio to create it:
ls | cpio -ov -H crc > demo.cpio
and I read it using a cpio header structure like this:
struct cpio_newc_header {
    char c_magic[6];
    char c_ino[8];
    char c_mode[8];
    char c_uid[8];
    char c_gid[8];
    char c_nlink[8];
    char c_mtime[8];
    char c_filesize[8];
    char c_devmajor[8];
    char c_devminor[8];
    char c_rdevmajor[8];
    char c_rdevminor[8];
    char c_namesize[8];
    char c_check[8];
};
I am able to read the metadata, the pathname, and the file data by using c_filesize and c_namesize from the header. However, after the file data there are some extra padding bytes, i.e. between the end of the file data and the next header.
00000230: 6e63 6965 7322 3a5b 5d0d 0a7d 0d0a 0000 ncies":[]..}....
00000240: 3037 3037 3032 3030 3636 4246 3838 3030 0707020066BF8800
Here we can see that some extra bytes are padded after the '}'. I thought the data was being rounded up to a multiple of four, but I found other entries where the padding does not look like a multiple of four:
00000450: 2066 6f72 2063 7279 7074 6f20 7665 7269 for crypto veri
00000460: 6669 6361 7469 6f6e 0a00 0000 3037 3037 fication....0707
Why are these extra bytes added? Can the padding be avoided when creating the cpio archive?
From the manpage of cpio (section New ASCII Format):
The pathname is followed by NUL bytes so that the total size of the fixed header plus pathname is a multiple of four. Likewise, the file data is padded to a multiple of four bytes. Note that this format supports only 4 gigabyte files (unlike the older ASCII format, which supports 8 gigabyte files).
See also man 5 cpio
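The two padding rules quoted above can be sketched in C as follows. This is only an illustration under some assumptions: the helper names (align4, parse_hex8, data_offset, next_header_offset) are made up for this example and are not part of any cpio library, offsets are counted from the start of the archive, and c_namesize is taken to include the pathname's terminating NUL. The header size of 110 bytes follows from the struct in the question (6 + 13 * 8).

```c
#include <stdint.h>

/* Size of struct cpio_newc_header: 6-byte magic plus 13 fields of 8 hex digits. */
#define CPIO_NEWC_HEADER_SIZE 110u

/* Round an offset up to the next multiple of four. */
static uint32_t align4(uint32_t off)
{
    return (off + 3u) & ~3u;
}

/* Decode one 8-character hexadecimal header field such as c_filesize.
 * The fields are not NUL-terminated, so exactly 8 characters are read.
 * Returns 0 on success, -1 if a non-hex character is found. */
static int parse_hex8(const char field[8], uint32_t *out)
{
    uint32_t v = 0;
    for (int i = 0; i < 8; i++) {
        char c = field[i];
        uint32_t d;
        if (c >= '0' && c <= '9')
            d = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f')
            d = (uint32_t)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            d = (uint32_t)(c - 'A' + 10);
        else
            return -1;
        v = (v << 4) | d;
    }
    *out = v;
    return 0;
}

/* Header plus pathname is padded to a multiple of four,
 * so the file data starts at this offset. */
static uint32_t data_offset(uint32_t hdr_off, uint32_t namesize)
{
    return align4(hdr_off + CPIO_NEWC_HEADER_SIZE + namesize);
}

/* The file data is padded to a multiple of four as well,
 * so the next header starts at this offset. */
static uint32_t next_header_offset(uint32_t data_off, uint32_t filesize)
{
    return align4(data_off + filesize);
}
```

For instance, an entry whose header starts at offset 0 with a c_namesize of 5 would have its data at offset 116 (115 rounded up), and with a c_filesize of 14 the next header would begin at offset 132 (130 rounded up).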
In your second example, the data is also padded so that the next header is 4-byte aligned:
00000460: 6669 6361 7469 6f6e 0a00 0000 3037 3037 fication....0707
As you can see, the data ends at 0x468 and three zero bytes of padding are added, so the next chunk can start at 0x46c.
This padding is probably performed to avoid unaligned access to header fields after reading it into memory. It is part of the specification, so there is no option to avoid it.
But it is easy to calculate. If x is the offset of the next byte after the end of the file data, then the next header begins at offset
int nextheader = (x+3)&~3;
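As a quick check, the same rounding can be applied to the two hex dumps from the question. The function names below are hypothetical, chosen only for this sketch, and x is again the offset of the first byte after the file data.

```c
#include <stddef.h>

/* Offset of the next header, where x is the offset of the
 * first byte after the end of the file data. */
static size_t next_header(size_t x)
{
    return (x + 3) & ~(size_t)3;
}

/* Number of NUL padding bytes between the end of the file data
 * and the next header; always in the range 0..3. */
static size_t padding_bytes(size_t x)
{
    return next_header(x) - x;
}
```

In the first dump the data ends at 0x23d, so x = 0x23e and the next header is at 0x240 after two padding bytes. In the second dump x = 0x469, which gives 0x46c after three padding bytes. Both match the positions where the "0707" magic appears.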