I am new to Zig and am trying to perform a few simple operations on a file.
I have a function that takes a path to a file and is meant to return the path to a sorted copy of that file.
The function receives the path after the file has been cleaned of special characters, so each line holds a single word.
When I run the code below, the ArrayList always ends up full of fragments of the last word.
const std = @import("std");
const GPA = std.heap.GeneralPurposeAllocator;
const log = std.log;
const File = std.fs.File;
pub fn sortFile(file_path: []u8, buffer: []u8) ![]u8 {
    // Create a general purpose allocator
    var general_purpose_allocator = GPA(.{}){};
    const gpa = general_purpose_allocator.allocator();
    // Get the path in the encoding the OS expects
    const encoded_path_buffer = try gpa.alloc(u8, file_path.len);
    const encoded_file_sub_path = try encodePathForOs(file_path, encoded_path_buffer);
    // Create the reader
    const file_read: File = try std.fs.cwd().openFile(encoded_file_sub_path, .{});
    defer file_read.close();
    const file_reader = file_read.reader();
    gpa.free(encoded_path_buffer);
    // ArrayList for the words
    var lines = std.ArrayList([]u8).init(gpa);
    // Loop line by line (one word per line) and append each one
    while (try file_reader.readUntilDelimiterOrEof(buffer, '\n')) |line| {
        log.warn("line: {s}\n", .{line});
        lines.append(line) catch |err| {
            return err;
        };
        log.warn("lines: {s}\n", .{lines.items});
    }
    const items = lines.items;
    log.warn("items: {s}\n", .{items});
    // The rest of the function is not relevant; this is kept so the example is reproducible:
    defer lines.deinit();
    return file_path;
}
Utility function that encodes the path for the OS:
const builtin = @import("builtin");
const os_tag = builtin.os.tag;
const unicode = std.unicode;
const std = @import("std");
fn encodePathForOs(path: []u8, encoded_path_buffer: []u8) ![]u8 {
    if (os_tag == .windows) {
        var i: usize = 0;
        while (i < path.len) : (i += 1) {
            const codepoint = try unicode.utf8Decode(path[i .. i + 1]);
            _ = try unicode.wtf8Encode(codepoint, encoded_path_buffer[i..]);
        }
        return encoded_path_buffer;
    } else {
        return path;
    }
}
The input file:
hello
this
is
a
test
this
is
a
test
this
is
a
test
this
is
a
test
Logs of each line before appending, and the state of the ArrayList at each iteration:
line: hello
lines: { hello }
line: this
lines: { this
, this }
line: is
lines: { is
s
, is
s, is }
line: a
lines: { a
s
, a
s, a
, a }
line: test
lines: { test
, test, te, t, test }
line: this
lines: { this
, this, th, t, this, this }
line: is
lines: { is
s
, is
s, is, i, is
s, is
s, is }
line: a
lines: { a
s
, a
s, a
, a, a
s, a
s, a
, a }
line: test
lines: { test
, test, te, t, test, test, te, t, test }
line: this
lines: { this
, this, th, t, this, this, th, t, this, this }
line: is
lines: { is
s
, is
s, is, i, is
s, is
s, is, i, is
s, is
s, is }
line: a
lines: { a
s
, a
s, a
, a, a
s, a
s, a
, a, a
s, a
s, a
, a }
line: test
lines: { test
, test, te, t, test, test, te, t, test, test, te, t, test }
line: this
lines: { this
, this, th, t, this, this, th, t, this, this, th, t, this, this }
line: is
lines: { is
s
, is
s, is, i, is
s, is
s, is, i, is
s, is
s, is, i, is
s, is
s, is }
line: a
lines: { a
s
, a
s, a
, a, a
s, a
s, a
, a, a
s, a
s, a
, a, a
s, a
s, a
, a }
line: test
lines: { test
, test, te, t, test, test, te, t, test, test, te, t, test, test, te, t, test }
The ArrayList at the end:
items: { test , test, te, t, test, test, te, t, test, test, te, t, test, test, te, t, test }
I have tried many things.
I suspected that each word is stored as a pointer rather than as a value, so I tried to clone it.
I tried increasing the ArrayList capacity before appending.
I tried using insert with an index instead of append, and tried making each word its own slice before inserting it.
And many more hours of debugging without success.
I looked for similar questions, of course, but there is not much about Zig around.
I would be very happy to solve this, and even happier to understand what is going on here.
Thanks!
So apparently the issue is about copying memory: line is a slice into the same buffer, which gets overwritten on every iteration, so all the entries point to the same location, and its contents change as the loop runs.
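The aliasing can be reproduced without any file I/O. The following is a minimal sketch, assuming a Zig version with the two-argument @memcpy (0.11 or later); the names buffer and first are made up for the illustration. A slice taken from a buffer sees whatever is written to that buffer afterwards:

```zig
const std = @import("std");

test "a slice into a reused buffer sees later writes" {
    var buffer: [8]u8 = undefined;

    // First "read": the buffer holds "hello" and we keep a slice of it.
    @memcpy(buffer[0..5], "hello");
    const first: []u8 = buffer[0..5];

    // Second "read" into the same buffer overwrites the first four bytes.
    @memcpy(buffer[0..4], "this");

    // `first` still points at the buffer, so its content changed too.
    try std.testing.expectEqualStrings("thiso", first);
}
```

This should be runnable with zig test. A slice is only a pointer plus a length, so appending it to the ArrayList stores a view of the buffer, not the bytes themselves.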
To solve the issue, I used:
while (try file_reader.readUntilDelimiterOrEof(buffer, '\n')) |line| {
    const new_line: []u8 = try gpa.alloc(u8, line.len);
    @memcpy(new_line, line);
    try lines.append(new_line);
}
This copies each line into its own newly allocated buffer.
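A possible variant of the same fix uses the allocator's dupe function, which allocates and copies in one call, and also frees the copies afterwards. This is only a sketch under assumptions: it targets the same standard library API as the code above (roughly Zig 0.11 to 0.14, where std.ArrayList(T).init and readUntilDelimiterOrEof are available), readLinesOwned is a hypothetical helper name, and an in-memory stream stands in for the file so that the example is self-contained.

```zig
const std = @import("std");

// Hypothetical helper: reads every line from `reader` and returns owned copies.
// The caller must free each line and then deinit the list.
fn readLinesOwned(allocator: std.mem.Allocator, reader: anytype, buffer: []u8) !std.ArrayList([]u8) {
    var lines = std.ArrayList([]u8).init(allocator);
    // On failure, release everything collected so far.
    errdefer {
        for (lines.items) |l| allocator.free(l);
        lines.deinit();
    }
    while (try reader.readUntilDelimiterOrEof(buffer, '\n')) |line| {
        // `line` is a slice into `buffer`; dupe copies its bytes into fresh memory.
        const owned = try allocator.dupe(u8, line);
        errdefer allocator.free(owned);
        try lines.append(owned);
    }
    return lines;
}

test "each entry keeps its own bytes" {
    const allocator = std.testing.allocator;
    var stream = std.io.fixedBufferStream("hello\nthis\nis\n");
    var buffer: [64]u8 = undefined;

    const lines = try readLinesOwned(allocator, stream.reader(), &buffer);
    defer {
        for (lines.items) |l| allocator.free(l);
        lines.deinit();
    }

    try std.testing.expectEqual(@as(usize, 3), lines.items.len);
    try std.testing.expectEqualStrings("hello", lines.items[0]);
    try std.testing.expectEqualStrings("this", lines.items[1]);
    try std.testing.expectEqualStrings("is", lines.items[2]);
}
```

Because every entry now owns its memory, each one has to be freed before lines.deinit() is called; std.testing.allocator reports a leak if one is missed.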