To infer the schema of a CSV file passed via /dev/stdin before converting the whole file to Parquet, I have implemented a wrapper that buffers the input and implements Seek, as required by the crate arrow2. This all works.
This wrapper is, however, not necessary in certain situations, such as when a file is redirected to stdin: my_binary /dev/stdin <test.csv. I would like to use the wrapper only when it is really necessary, as in this case: zstdcat test.csv.zst | my_binary /dev/stdin. As a result, I need to know whether the file I open errors on seek or not.
The following method I came up with seems to work. Is there a better way? Is this idiomatic for Rust?
fn main() {
    let mut file = std::fs::File::open("/dev/stdin").unwrap();
    let seekable = match std::io::Seek::rewind(&mut file) {
        Ok(_) => true,
        Err(_) => false,
    };
    println!("File is seekable: {:?}", seekable);
}
There is a similar question for C, but the solution doesn't transfer blindly to Rust: How to determine if a file descriptor is seekable? - or is this effectively what file.rewind() does under the hood?
rewind actually performs a lseek(fd, 0, SEEK_SET), so it'll have the side-effect of, well, rewinding (hence the name) the fd's cursor. I assume the reason the original uses SEEK_CUR is to avoid moving the cursor on seekable files for maximum generality.
If you want to match the original question exactly you should use seek(SeekFrom::Current(0)). If it doesn't matter then rewind is fine.
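To make the difference concrete, here is a minimal sketch of a probe that leaves the cursor untouched. The helper names is_seekable and demo and the temporary file name are made up for illustration. The sketch reads four bytes from a regular file, probes it, and reports where the cursor ended up:

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Hypothetical helper: returns true if the stream accepts a seek.
/// Seeking by 0 from the current position leaves the cursor where it was.
fn is_seekable<S: Seek>(stream: &mut S) -> bool {
    stream.seek(SeekFrom::Current(0)).is_ok()
}

/// Writes a small temporary file, reads part of it, then probes it.
/// Returns the probe result and the cursor position after the probe.
fn demo() -> io::Result<(bool, u64)> {
    // The file name is an arbitrary choice for this sketch.
    let path = std::env::temp_dir().join("seekable_probe_demo.csv");
    File::create(&path)?.write_all(b"a,b\n1,2\n")?;

    let mut file = File::open(&path)?;
    let mut first = [0u8; 4];
    file.read_exact(&mut first)?; // cursor is now at offset 4

    let seekable = is_seekable(&mut file);
    let position = file.stream_position()?; // unchanged by the probe

    std::fs::remove_file(&path)?;
    Ok((seekable, position))
}

fn main() -> io::Result<()> {
    let (seekable, position) = demo()?;
    println!("seekable: {}, cursor after probe: {}", seekable, position);
    Ok(())
}
```

With rewind in place of seek(SeekFrom::Current(0)), the probe would succeed just the same, but the cursor would be back at offset 0 afterwards.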
Additionally:
- you don't need a match, just call is_ok on the result of rewind (/ seek)
- you don't need to write std::io::Seek::rewind(&mut file); if you use std::io::Seek then you can just call the provided methods on any seekable objects, like files
So:
use std::io::{Seek, SeekFrom::Current};

fn main() {
    let mut file = std::fs::File::open("/dev/stdin").unwrap();
    let seekable = file.seek(Current(0)).is_ok();
    println!("File is seekable: {:?}", seekable);
}
matches the C answer exactly.
Though for what it's worth on my mac the device files are seekable by default.
Only way I can get it to fail is if I pipe (not redirect):
> ./test
File is seekable: true
> </dev/null ./test
File is seekable: true
> </dev/null cat | ./test
File is seekable: false
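
Tying this back to the question, the check could drive the choice between using the file directly and falling back to a buffer. The following is only a sketch under assumptions: the ReadSeek trait and open_seekable function are hypothetical names, and reading the whole input into a Cursor<Vec<u8>> is a stand-in for the asker's own buffering wrapper, which is not shown here.

```rust
use std::fs::File;
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper trait so both branches can share one return type.
trait ReadSeek: Read + Seek {}
impl<T: Read + Seek> ReadSeek for T {}

/// Opens `path` and returns something that is both readable and seekable.
/// Regular files are returned as-is; pipes are buffered in memory first.
fn open_seekable(path: &str) -> io::Result<Box<dyn ReadSeek>> {
    let mut file = File::open(path)?;
    if file.seek(SeekFrom::Current(0)).is_ok() {
        // Seekable (e.g. redirected file): no wrapper needed.
        Ok(Box::new(file))
    } else {
        // Not seekable (e.g. pipe): stand-in for a real buffering wrapper.
        let mut buf = Vec::new();
        file.read_to_end(&mut buf)?;
        Ok(Box::new(Cursor::new(buf)))
    }
}

fn main() -> io::Result<()> {
    // Path from the first argument, defaulting to /dev/stdin as in the question.
    let path = std::env::args().nth(1).unwrap_or_else(|| "/dev/stdin".to_string());
    let mut input = open_seekable(&path)?;

    // Read once (e.g. for schema inference), rewind, and read again.
    let mut first_pass = Vec::new();
    input.read_to_end(&mut first_pass)?;
    input.rewind()?;
    let mut second_pass = Vec::new();
    input.read_to_end(&mut second_pass)?;

    println!("read {} bytes twice", second_pass.len());
    Ok(())
}
```

Buffering the whole input is the simplest fallback but costs memory proportional to the input size; a wrapper that only buffers the prefix needed for schema inference would avoid that.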