.net compression base64 gzipstream

GZipStream: why do we convert to base 64 after compression?


I was just looking at a code sample for compressing a string. As far as I can tell, the GZipStream class on its own is enough to do the compression, but I don't understand why the result has to be converted to a Base64 string, as shown in the example.

using System;                 // BitConverter, Convert, Buffer
using System.IO;
using System.IO.Compression;
using System.Text;

public static string Compress(string text)
{
    // Encode the string as UTF-8 bytes; buffer.Length is the uncompressed size.
    byte[] buffer = Encoding.UTF8.GetBytes(text);
    MemoryStream ms = new MemoryStream();

    // leaveOpen: true keeps ms usable after the GZipStream is disposed (and flushed).
    using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
    {
        zip.Write(buffer, 0, buffer.Length);
    }

    // Rewind and copy the compressed bytes out of the stream.
    ms.Position = 0;

    byte[] compressed = new byte[ms.Length];
    ms.Read(compressed, 0, compressed.Length);

    // Output layout: [4 bytes: uncompressed length][compressed data].
    byte[] gzBuffer = new byte[compressed.Length + 4];
    System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
    System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);

    // Base64 turns the binary buffer into plain text.
    return Convert.ToBase64String(gzBuffer);
}

Further, I don't understand why the gzBuffer is initialized to a size of compressed.Length + 4. Actually, I don't understand why we have the last few statements either. Can someone shed some light?


Solution

  • Most likely the Base64 string is there so that the compressed data, which is arbitrary binary, can be handled as plain text, for example for printing or including in an email. Edit: Now that I see the source, they say they want to insert it into an XML file, which is why it needed to be plain text.

    The compressed.Length + 4 size is required because of the next line, the first BlockCopy. It copies the compressed data into gzBuffer starting at byte offset 4 (the 4th argument is the byte offset into the destination buffer), which leaves the first four bytes free. The second BlockCopy then writes buffer.Length, the length of the original uncompressed data as a 4-byte int, into those first four bytes. I'm not sure why it would need the length here, but there may well be a corresponding decode routine it has to line up with.

    Edit: The length is used in the decompression routine so that the program knows how long the decompressed byte buffer should be.
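For illustration, here is a minimal sketch of what a matching decompression routine could look like. The original decode routine is not shown in the question, so the class name GZipText and the method name Decompress are assumptions; the only thing taken from the question is the buffer layout (a 4-byte length prefix followed by the gzip data, all Base64-encoded).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper class; the name is not from the original sample.
public static class GZipText
{
    // Assumed counterpart to the question's Compress method.
    public static string Decompress(string compressedText)
    {
        // Undo the Base64 step to get the binary buffer back.
        byte[] gzBuffer = Convert.FromBase64String(compressedText);

        // The first four bytes hold the length of the uncompressed data.
        int originalLength = BitConverter.ToInt32(gzBuffer, 0);
        byte[] buffer = new byte[originalLength];

        // Skip the 4-byte prefix and decompress the rest.
        using (MemoryStream ms = new MemoryStream(gzBuffer, 4, gzBuffer.Length - 4))
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Read may return fewer bytes than requested, so loop until full.
            int total = 0;
            while (total < buffer.Length)
            {
                int read = zip.Read(buffer, total, buffer.Length - total);
                if (read == 0)
                {
                    throw new InvalidDataException("Compressed data ended early.");
                }
                total += read;
            }
        }

        return Encoding.UTF8.GetString(buffer);
    }
}
```

With the prefix in place, the output buffer can be allocated once at exactly the right size, instead of growing a buffer while reading. Note that BitConverter uses the byte order of the machine it runs on, so this layout assumes the compressing and decompressing sides agree on endianness.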