Prevent duplicate file names from being dropped by ZipOutputStream


I am using the code below to zip a bunch of files. The problem is that if two files have different content but the same original name, only one of them ends up in the zip. How can I prevent this by adding a number to the duplicate's file name before the extension, e.g. file1.txt?

    ZipOutputStream zipOut = new ZipOutputStream(response.getOutputStream());

    files.forEach(file -> {
        final ZipEntry zipEntry = new ZipEntry(Objects.requireNonNull(file.getOriginalName()));
        zipOut.putNextEntry(zipEntry);
        IOUtils.copy(file.getInputStream(), zipOut);
        file.getInputStream().close();
        zipOut.closeEntry();
    });

For example, with these duplicates:

        files.add("hello");
        files.add("hello");
        files.add("hello");
        files.add("name");
        files.add("name");
        files.add("name");

        files.add("hello22");
        files.add("name");

then the result should be

"hello", "hello1", "hello2", "name", "name1", "name2", "hello22", "name3"

Solution

  • Assuming you are creating a new zip file rather than editing an existing archive, you can iterate through your files list and, whenever a duplicate is found, record the new name in a HashMap or similar, e.g. duplicateNameMap.put(oldNameString, newNameString);. Then, inside your zip method, check the HashMap and use the updated name:

    //HashMap to store updated names
    HashMap<String, String> duplicateNameMap = new HashMap<>();

    //Compare each file with every later file to find duplicate names:
    for (int i = 0; i < files.size(); i++) {
        for (int j = i + 1; j < files.size(); j++) {
            String original = files.get(i).getOriginalName();
            //If a duplicate exists then save the new name to the HashMap:
            if (original.equals(files.get(j).getOriginalName())) {
                //Split the name at the last dot; dot is -1 when there is no extension
                int dot = original.lastIndexOf('.');
                String name = dot < 0 ? original : original.substring(0, dot);
                //Keep the dot in the extension so "file.txt" becomes "file1.txt"
                String extension = dot < 0 ? "" : original.substring(dot);
                //Use a method to count the previous files with the same name and get the duplicate number
                String duplicateNumber = fixDuplicateName(duplicateNameMap, original);

                //Store the new name in the HashMap using the old name as the key
                duplicateNameMap.put(original, name + duplicateNumber + extension);
            }
        }
    }
    
    ZipOutputStream zipOut = new ZipOutputStream(response.getOutputStream());

    //Then when saving files we can check the HashMap and update the names accordingly
    files.forEach(file -> {
        String name = file.getOriginalName();
        //Look up the name (not the file object) to see if a new name was stored for it:
        if (duplicateNameMap.containsKey(name)) {
            //Grab the new name and remove the entry so that it is not used again
            name = duplicateNameMap.remove(name);
        }
        //A lambda passed to forEach cannot throw checked exceptions, so IOException is wrapped
        try (InputStream in = file.getInputStream()) {
            zipOut.putNextEntry(new ZipEntry(name));
            IOUtils.copy(in, zipOut);
            zipOut.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    });
    
    Here is the method used to count the number of duplicates and return the correct duplicate number:
    
    public static String fixDuplicateName(HashMap<String, String> duplicateNameMap, String name) {
        //Start the count at 1 (the first duplicate should always be 1)
        int count = 1;
        //A HashMap holds each key at most once, so this adds at most 1 when the name is already stored
        for (String key : duplicateNameMap.keySet()) {
            if (key.equals(name)) {
                count++;
            }
        }
        return String.valueOf(count);
    }
    

    Note a minor side effect of doing it this way: the earlier file gets the number added and the last file with that name keeps the original name, but this is hardly an issue. Also note that a HashMap stores only one value per key, so the map can hold just one replacement name per original name. That is enough when a name occurs twice; if the same name can occur three or more times, or if the order of files in your files list is important, keep a running count per name (for example in a Map<String, Integer>) and build each new name from that count while writing the zip, so that the first file keeps its original name and later ones get 1, 2, 3 and so on.
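
One way to get the numbering asked for in the question ("hello", "hello1", "hello2", ...) is to keep a running counter per original name together with a set of the names already handed out. The following is only a minimal sketch under assumptions, not a drop-in implementation: the class UniqueZipNames and the method uniqueNames are hypothetical names, the file names are plain strings, and a ByteArrayOutputStream stands in for response.getOutputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class UniqueZipNames {

    // Hypothetical helper: returns one unique entry name per input name, in input order.
    static List<String> uniqueNames(List<String> originalNames) {
        Map<String, Integer> nextSuffix = new HashMap<>(); // next number to try for each original name
        Set<String> used = new HashSet<>();                // every entry name handed out so far
        List<String> result = new ArrayList<>();
        for (String original : originalNames) {
            // Split at the last dot; a missing or leading dot means "no extension"
            int dot = original.lastIndexOf('.');
            String base = dot <= 0 ? original : original.substring(0, dot);
            String ext = dot <= 0 ? "" : original.substring(dot);

            String candidate = original;
            int n = nextSuffix.getOrDefault(original, 1);
            // Set.add returns false when the name is already taken, so keep numbering until it is free
            while (!used.add(candidate)) {
                candidate = base + n + ext;
                n++;
            }
            nextSuffix.put(original, n);
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<String> names = Arrays.asList(
                "hello", "hello", "hello", "name", "name", "name", "hello22", "name");
        List<String> unique = uniqueNames(names);

        // In-memory stand-in for response.getOutputStream() (assumption for this demo)
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(bytes)) {
            for (int i = 0; i < unique.size(); i++) {
                zipOut.putNextEntry(new ZipEntry(unique.get(i)));
                zipOut.write(("content " + i).getBytes(StandardCharsets.UTF_8));
                zipOut.closeEntry();
            }
        }
        // prints [hello, hello1, hello2, name, name1, name2, hello22, name3]
        System.out.println(unique);
    }
}
```

In the code from the question, you would compute the unique names once from getOriginalName() and pass the i-th name to new ZipEntry(...) for the i-th file. The set of used names also covers the case where a generated name such as hello1 already exists as a real file name in the list.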