Tags: node.js, offset, opendir

Node.js - iterate directory contents with offset


I know how to stream directory children:

let dir = await require('fs').promises.opendir('/path/to/some/dir');
for await (let child of dir) console.log('Child:', child);

I would like to expose this functionality with db-style "offset" and "limit" params, but I am unsure how to apply the "offset" other than by skipping the first `offset` results from the opendir iterator (which seems very inefficient, especially for large offsets):

let iterateDir = async function*({ offset=0, limit=100 }={}) {
  
  let dir = await require('fs').promises.opendir('/path/to/some/dir');
  
  // Skip the first `offset` entries one at a time (inefficient for large offsets)
  for (let skip = 0; skip < offset; skip++) await dir.read();
  
  // Yield up to `limit` entries; dir.read() resolves to null when exhausted
  for (let lim = 0; lim < limit; lim++) {
    let child = await dir.read();
    if (child === null) break;
    yield child;
  }
  
};

Is there native filesystem functionality for streaming directory contents from a particular offset, and if so, how can I access this functionality from nodejs? Thanks!


Solution

  • The Node.js docs for `fsPromises.opendir` (https://nodejs.org/docs/latest-v16.x/api/fs.html#fspromisesopendirpath-options) simply point to the POSIX docs for opendir, which accepts no offset parameter: https://man7.org/linux/man-pages/man3/opendir.3.html

    Node has to work with a variety of filesystems. While this behavior may exist in some filesystems (I'm not aware of any), it's definitely not universal. Different filesystems can return entries in whatever order they please. Perhaps it's insertion order, or perhaps it's last modified. There might not even be a consistent order between calls at all.

    Limit and offset only make sense when ordering is guaranteed. Filesystems make no contract about the order in which entries are returned, so offsets aren't a safe assumption even while you ARE iterating over the contents (in fact, this is often true for DBs as well, since rows can be added or deleted between requests).
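
    If you need stable pages, one option is to impose the ordering yourself: read the whole listing, sort it, then slice. A minimal sketch (the function name and path are illustrative, and it assumes the directory is small enough to list in full):

    ```javascript
    const { readdir } = require('fs').promises;

    const listPage = async (path, { offset = 0, limit = 100 } = {}) => {
      const names = await readdir(path); // full listing, order not guaranteed
      names.sort();                      // impose a stable order ourselves
      return names.slice(offset, offset + limit);
    };
    ```

    Sorting makes pages reproducible across calls (barring concurrent adds/deletes), which is the property limit/offset actually depends on.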

    Honestly, though, this feels like a premature optimization. The time to iterate over the list of files should be extremely small compared to almost any operation on the files themselves. If you think you'll be skipping thousands and thousands in a row, you can play around with the buffer size and see if it makes a difference:

    const { opendir } = require('fs').promises;
    const dir = await opendir('./', { bufferSize: 1024 });
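
    Putting that together, here is a sketch of the skip-then-take approach with a larger read buffer, so skipped entries mostly come out of an already-filled buffer rather than one syscall each (function name, path, and buffer size are illustrative):

    ```javascript
    const { opendir } = require('fs').promises;

    const skipThenTake = async (path, { offset = 0, limit = 100 } = {}) => {
      // A larger bufferSize means fewer filesystem reads per batch of entries
      const dir = await opendir(path, { bufferSize: 1024 });
      const out = [];
      let seen = 0;
      for await (const child of dir) {
        if (seen++ < offset) continue; // cheap skip: entry was already buffered
        out.push(child.name);
        if (out.length >= limit) break; // breaking out closes the dir handle
      }
      return out;
    };
    ```

    Note that the page contents still depend on the filesystem's iteration order, per the caveats above.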