c++, pugixml, libzip

How to construct a zip file with libzip


I am trying to create a zip archive and add an XML file to it using two libraries (pugixml and libzip). Everything runs without error, but when I open the XML file, the beginning of the file is garbled:

Main.cpp :

#include <iostream>
#include <sstream>
#include <zip.h>
#include <pugixml.hpp>
#include <memory>

using namespace std;

int main()
{

    auto document = std::unique_ptr<pugi::xml_document>(new pugi::xml_document);
    pugi::xml_node declNode = document->prepend_child(pugi::node_declaration);
    declNode.append_attribute("version") = "1.0";
    declNode.append_attribute("encoding") = "UTF-8";
    declNode.append_attribute("standalone") = "yes";

    pugi::xml_node rootNode = document->append_child("Document");
    rootNode.append_child("Files");


    int err = 0;
    zip_t* zip = zip_open("test.zip", ZIP_CREATE, &err);
    {
        {
            std::stringstream ss;
            document->save(ss, "  ");
            std::string buffer = ss.str();
            auto src = zip_source_buffer_create(buffer.c_str(),buffer.length(),0,0);
            zip_file_add(zip,"Document.xml",src,ZIP_FL_ENC_UTF_8);
        }
    }
    zip_close(zip);

    return 0;
}

Document.xml :

~U Ä U rsion="1.0" encoding="UTF-8" standalone="yes"?>
<Document>
   <Files />
</Document>



Solution

  • The posted program has undefined behaviour due to reading already freed memory.

    In the example you posted, the zip_source_t is created with freep = 0, so you need to make sure that the provided buffer remains valid for the entire lifetime of the zip_source_t object. From the zip_source_buffer documentation:

    zip_source_t * zip_source_buffer_create(const void *data, zip_uint64_t len, int freep, zip_error_t *error);

    The functions zip_source_buffer() and zip_source_buffer_create() create a zip source from the buffer data of size len. If freep is non-zero, the buffer will be freed when it is no longer needed.
    data must remain valid for the lifetime of the created source.
    The source can be used to open a zip archive from.

    zip_file_add() (if successful) takes ownership of the source you give it, but note that it is not required to free the source immediately; it could, for example, store it within the zip_t.

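    As it is currently implemented, zip_file_add() does not free the zip_source_t. Instead it holds on to it and writes it out once you call zip_close(). So your buffer needs to remain valid for the entire remaining lifetime of the zip_t object, which in this case means until the call to zip_close(). In your code, the std::string buffer is destroyed at the end of the inner block, before zip_close() runs, so libzip reads memory that has already been freed.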

    If you rewrite your example so that the std::string buffer stays alive until after you have closed the zip_t, the resulting file should be correct:

    zip_t* zip = zip_open("test.zip", ZIP_CREATE, &err);
    std::stringstream ss;
    document->save(ss, "  ");
    std::string buffer = ss.str();
    auto src = zip_source_buffer_create(buffer.c_str(),buffer.length(),0,0);
    zip_file_add(zip,"Document.xml",src,ZIP_FL_ENC_UTF_8);
    zip_close(zip); // lifetime of zip_t ends here (which will also end the lifetime of the zip_source_t)
    
    // lifetime of buffer ends at end of scope
    

    Recommendation

    Because the zip_source_t is reference-counted, it is hard to tell from the calling code exactly how long it will stay alive, and therefore how long you need to keep your buffer alive.

    So I would recommend allocating the buffer you hand to the zip_source_t separately with malloc(), copying your data into it, and passing freep=1 to zip_source_buffer_create().

    That way libzip frees the buffer itself once it is no longer needed, so the buffer remains valid for as long as the zip_source_t is alive.