
Split array of file paths into hierarchical object in JavaScript


I'm using JSZip, which gives me a list of folders and files when I unzip an archive. For example, when I run

files.forEach((relativePath, file) => {
  console.log(relativePath);
});

I get:

three-dxf-master/
three-dxf-master/.DS_Store
three-dxf-master/.gitignore
three-dxf-master/LICENSE
three-dxf-master/README.md
three-dxf-master/bower.json
three-dxf-master/bower_components/

Some of these items are directories and some are files. I can tell which ones are directories by checking file.dir. I would like to split this into a hierarchical data structure. I want to split it up like so:

{
  "three-dxf-master": [
    ".DS_Store",
    ".gitignore",
    "LICENSE",
    "README.md",
    "bower.json",
    {
      "bower_components": [
        ".DS_Store",
        {
          "dxf-parser": [...]
        }
      ]
    }
  ]
}

This way I can send it over to Vue and format it in a nice file viewer. I looked through the docs and I don't see an easy way to create a hierarchical data structure for the files. I started looking into this by grabbing the last segment of each file path after a split.


Solution

  • Here is sample code that also handles files at the root.

    See the explanation below the snippet.

    var paths = [
        "three-dxf-master/",
        "three-dxf-master/.DS_Store",
        "three-dxf-master/.gitignore",
        "three-dxf-master/LICENSE",
        "three-dxf-master/README.md",
        "three-dxf-master/bower.json",
        "three-dxf-master/bower_components/",
        "three-dxf-master/bower_components/.DS_Store",
        "three-dxf-master/bower_components/dxf-parser/",
        "three-dxf-master/bower_components/dxf-parser/foo",
        "three-dxf-master/bower_components/dxf-parser/bar",
        "three-dxf-master/dummy_folder/",
        "three-dxf-master/dummy_folder/foo",
        "three-dxf-master/dummy_folder/hello/",
        "three-dxf-master/dummy_folder/hello/hello",
    ]
    
    // Extract a filename from a path
    function getFilename(path) {
        return path.split("/").filter(function(value) {
            return value && value.length;
        }).reverse()[0];
    }
    
    // Find sub paths
    function findSubPaths(path) {
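        // Escape regexp metacharacters (such as "." in a folder name) in the parent path
        var rePath = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
        // Match the parent path followed by at most one more segment, with an optional trailing slash
        var re = new RegExp("^" + rePath + "[^\\/]*\\/?$");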
        return paths.filter(function(i) {
            return i !== path && re.test(i);
        });
    }
    
    // Build tree recursively
    function buildTree(path) {
        path = path || "";
        var nodeList = [];
        findSubPaths(path).forEach(function(subPath) {
            var nodeName = getFilename(subPath);
            if (/\/$/.test(subPath)) {
                var node = {};
                node[nodeName] = buildTree(subPath);
                nodeList.push(node);
            } else {
                nodeList.push(nodeName);
            }
        });
        return nodeList;
    }
    
    // Build tree from root
    var tree = buildTree();
    
    // By default, tree is an array
    // If it contains only one element which is an object, 
    // return this object instead to match OP request
    if (tree.length === 1 && (typeof tree[0] === 'object')) {
        tree = tree[0];
    }
    
    // Serialize tree for debug purposes
    console.log(JSON.stringify(tree, null, 2));

    Explanation

    function getFilename(path) {
        return path.split("/").filter(function(value) {
            return value && value.length;
        }).reverse()[0];
    }
    

    To get the filename, the path is split on /.

    path/to/dir/ => ['path', 'to', 'dir', '']

    path/to/file => ['path', 'to', 'file']

    Only non-empty values are kept, which handles directory paths with a trailing slash.

    The filename is the last value of the array; to get it, we simply reverse the array and take the first element.
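
The intermediate steps can be checked in isolation. The sketch below uses a hypothetical helper name, lastSegment, and reads the last element directly instead of reversing the array; it is meant as an illustration of the same idea, not as a replacement for getFilename.

```javascript
// Hypothetical helper (not part of the answer's code): returns the last
// non-empty segment of a slash-separated path.
function lastSegment(path) {
    // filter(Boolean) drops the empty string left by a trailing slash
    var parts = path.split("/").filter(Boolean);
    return parts[parts.length - 1];
}

console.log("path/to/dir/".split("/"));   // [ 'path', 'to', 'dir', '' ]
console.log(lastSegment("path/to/dir/")); // dir
console.log(lastSegment("path/to/file")); // file
```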

    function findSubPaths(path) {
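        // Escape regexp metacharacters (such as "." in a folder name) in the parent path
        var rePath = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
        // Match the parent path followed by at most one more segment, with an optional trailing slash
        var re = new RegExp("^" + rePath + "[^\\/]*\\/?$");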
        return paths.filter(function(i) {
            return i !== path && re.test(i);
        });
    }
    

    To find the sub paths of a path, we filter the paths list.

    The filter uses a regular expression to test whether a path starts with the parent path, continues with a single segment containing no further /, and then ends either with a / (a directory path) or at the end of the string (a file path). This keeps direct children only; deeper descendants are picked up by the recursion.

    If the tested path differs from the parent path and matches the regexp, the filter accepts it. Otherwise it is rejected.
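
To see which paths the regular expression accepts, the test can be pulled out into a small predicate. This is a sketch only: escapeRegExp and isDirectChild are hypothetical names, and the escaping step is an assumption added so that characters such as "." in a folder name are matched literally.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regexp.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical predicate: true when candidate is a direct child of parent.
// parent is "" for the root, or a directory path ending with "/".
function isDirectChild(parent, candidate) {
    var re = new RegExp("^" + escapeRegExp(parent) + "[^/]*/?$");
    return candidate !== parent && re.test(candidate);
}

console.log(isDirectChild("a/", "a/b"));   // true  (file directly inside a/)
console.log(isDirectChild("a/", "a/b/"));  // true  (directory directly inside a/)
console.log(isDirectChild("a/", "a/b/c")); // false (one level too deep)
console.log(isDirectChild("a/", "a/"));    // false (the parent itself)
```

Inside a string passed to new RegExp, a forward slash needs no escaping; it is only special in a regexp literal.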

    function buildTree(path) {
        path = path || "";
        var nodeList = [];
        findSubPaths(path).forEach(function(subPath) {
            var nodeName = getFilename(subPath);
            if (/\/$/.test(subPath)) {
                var node = {};
                node[nodeName] = buildTree(subPath);
                nodeList.push(node);
            } else {
                nodeList.push(nodeName);
            }
        });
        return nodeList;
    }
    

    Now that we have functions to extract a filename from a path and to find sub paths, building the tree is straightforward. The tree is a nodeList.

    If a sub path ends with /, it is a directory: we call buildTree recursively and append the resulting node to nodeList.

    Otherwise we simply add the filename to nodeList.
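
The recursive version rescans the whole paths list for every directory. For large archives, a different technique may be preferable: a single pass that remembers each directory's child array in a Map. The sketch below is an alternative, not the code explained above; buildTreeSinglePass and ensureDir are hypothetical names, and it assumes directory paths end with / as in the sample data. It also creates parent directories on demand, so it does not depend on explicit directory entries being present in the list.

```javascript
// Alternative sketch: build the same nested shape in one pass over the paths.
function buildTreeSinglePass(paths) {
    var root = [];
    // Maps a directory path ("" for the root, otherwise ending in "/")
    // to the array holding that directory's children.
    var dirs = new Map([["", root]]);

    // Return the child array for dirPath, creating it and any missing parents.
    function ensureDir(dirPath) {
        if (dirs.has(dirPath)) {
            return dirs.get(dirPath);
        }
        var trimmed = dirPath.slice(0, -1);      // drop the trailing "/"
        var cut = trimmed.lastIndexOf("/") + 1;  // start of the last segment
        var children = [];
        var node = {};
        node[trimmed.slice(cut)] = children;
        ensureDir(trimmed.slice(0, cut)).push(node);
        dirs.set(dirPath, children);
        return children;
    }

    paths.forEach(function (p) {
        if (p.endsWith("/")) {
            ensureDir(p);                        // directory entry
        } else {
            var cut = p.lastIndexOf("/") + 1;
            ensureDir(p.slice(0, cut)).push(p.slice(cut)); // file entry
        }
    });
    return root;
}

console.log(JSON.stringify(buildTreeSinglePass(["a/", "a/x", "a/b/", "a/b/y", "z"])));
// [{"a":["x",{"b":["y"]}]},"z"]
```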

    Additional code

    if (tree.length === 1 && (typeof tree[0] === 'object')) {
        tree = tree[0];
    }
    

    By default, the returned tree is an array.

    To match the OP's request, if it contains only one element and that element is an object, we use that object instead.
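
The snippet above works on a hard-coded paths array. To connect it to the question, the list could be collected from the forEach callback shown there. This is a minimal sketch: collectPaths is a hypothetical name, and the object passed in is assumed to expose forEach(callback) with (relativePath, file) arguments and a boolean file.dir, as described in the question.

```javascript
// Hypothetical helper: gather entry paths from a JSZip-like object.
function collectPaths(zip) {
    var paths = [];
    zip.forEach(function (relativePath, file) {
        // Defensive assumption: make sure directory entries end with "/",
        // since the tree builder relies on the trailing slash.
        if (file.dir && !relativePath.endsWith("/")) {
            relativePath += "/";
        }
        paths.push(relativePath);
    });
    return paths;
}

// Stand-in object for demonstration; a real zip object would be used instead.
var fakeZip = {
    forEach: function (callback) {
        callback("three-dxf-master/", { dir: true });
        callback("three-dxf-master/LICENSE", { dir: false });
    }
};
console.log(collectPaths(fakeZip));
// [ 'three-dxf-master/', 'three-dxf-master/LICENSE' ]
```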