I have the following line of code inside a Jupyter notebook:
!ls data/kapitel*.txt \
| while read file; do \
dirname="${file%.*}"; \
mkdir -p "$dirname"; \
awk \
-v dir="$dirname" \
'
/^Artikel$/ {
if (f) close(f);
f = sprintf("data/kapitel1/Artikel%d.txt", ++n)
}
{
print >> f
}
' \
"$file"; \
echo "Writing to file: $f"; \
echo "Creating directory: $dirname"; \
done
and I get the following prints:
Writing to file:
Creating directory: data/kapitel10
awk: cannot open "" for output (No such file or directory)
Writing to file:
Creating directory: data/kapitel11
awk: cannot open "" for output (No such file or directory)
Writing to file:
...
As you can see, I hardcoded the output file name for test purposes, but the string seems to be empty, which leaves me fairly confused. Creating the directories works. Any tips regarding the empty string would be appreciated!
Further context: multiple files, ranging from kapitel1.txt to kapitel11.txt, lie in the current directory. Inside these files are lines starting with "Artikel". I am trying to create a folder for each kapitel*.txt file with a corresponding name and then split each kapitel*.txt file at every section starting with "Artikel". Afterwards, I want to save these sections to the corresponding folder.
You're trying to produce output before Artikel appears in your input, so f isn't yet populated when you first do print >> f. Try this instead, assuming you don't want to print any input before Artikel appears:
/^Artikel$/ {
    close(f)
    f = sprintf("data/kapitel1/Artikel%d.txt", ++n)
}
f {
    print >> f
}
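To see the effect of the f guard in isolation, here is a minimal sketch run against made-up input. The temporary directory, the file contents, and the "Preamble" line are assumptions for illustration only, and the output directory is passed in with -v rather than hardcoded:

```shell
# Hypothetical sample input: one line before the first "Artikel", then two sections.
tmp=$(mktemp -d)
printf '%s\n' 'Preamble' 'Artikel' 'first body' 'Artikel' 'second body' > "$tmp/kapitel1.txt"

awk -v dir="$tmp" '
    /^Artikel$/ {
        close(f)                                    # finish the previous output file, if any
        f = sprintf("%s/Artikel%d.txt", dir, ++n)   # start the next one
    }
    f {                                             # false until the first Artikel line sets f
        print > f
    }
' "$tmp/kapitel1.txt"

ls "$tmp"
```

The "Preamble" line is dropped because f is still empty when awk reads it, so awk never tries to open "" for output.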
Other than that, this:
ls data/kapitel*.txt \
| while read file; do
is fragile, buggy, and employs at least 2 antipatterns; see https://mywiki.wooledge.org/ParsingLs and "Why is using a shell loop to process text considered bad practice?". So don't do that, do this instead:
for file in data/kapitel*.txt; do
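As a rough illustration of why the glob loop is more robust, the sketch below iterates over made-up file names, one of which contains a space. The temporary directory, the file names, and the count variable are assumptions for the demo; the [ -e ] check covers the case where nothing matches and the shell leaves the pattern as a literal string:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/data"
# Hypothetical file names; the second one contains a space on purpose.
touch "$tmp/data/kapitel1.txt" "$tmp/data/kapitel 2.txt"

count=0
for file in "$tmp"/data/kapitel*.txt; do
    [ -e "$file" ] || continue        # skip the unexpanded pattern if nothing matched
    count=$((count + 1))
    printf 'Found: %s\n' "$file"      # each name arrives whole, spaces included
done
printf 'Files seen: %d\n' "$count"
```

The shell expands the pattern itself and hands each name to the loop as a single word, so there is no output from ls to parse and no read to mangle backslashes or surrounding whitespace.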
You also seem to be confusing awk and bash - they are 2 totally different tools with their own syntax, semantics, variable scope, etc. echo "Writing to file: $f"; is a line of bash code that's trying to print the value of an awk variable, f - you can't do that any more than you could try to print the value of a C variable from a python script that happens to call a C program. Also, printf "foo" > and printf "foo" >> in awk do not mean the same thing as printf "foo" > and printf "foo" >> in bash, so I corrected your usage of >> below; see the awk man page for details on its syntax.
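Both points can be checked with a small experiment. This is only a sketch: the temporary directory and the file names awk.txt and sh.txt are made up for the demo.

```shell
tmp=$(mktemp -d)

# 1) A variable set inside awk never reaches the shell. To get the value out,
#    have awk print it and capture that output with command substitution.
f=$(awk 'BEGIN { f = "Artikel1.txt"; print f }')
printf 'Shell now sees: %s\n' "$f"

# 2) In awk, ">" truncates the file only when it is first opened; later prints
#    to the same still-open file are appended, so both lines are kept.
awk -v out="$tmp/awk.txt" 'BEGIN { print "one" > out; print "two" > out }'

#    In the shell, every ">" reopens and truncates the file, so only the
#    last line is kept.
echo one > "$tmp/sh.txt"
echo two > "$tmp/sh.txt"

cat "$tmp/awk.txt"
cat "$tmp/sh.txt"
```

So inside awk, >> is only needed when existing file contents from before the awk run must be preserved.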
I think what you might be trying to do overall is:
for file in data/kapitel*.txt; do
    dirname="${file%.*}"
    printf 'Creating directory: %s\n' "$dirname" >&2
    mkdir -p "$dirname" &&
    awk \
        -v dir="$dirname" \
        '
        /^Artikel$/ {
            close(f)
            f = sprintf("%s/Artikel%d.txt", dir, ++n)
            printf "Writing to file: %s\n", f | "cat>&2"
        }
        f {
            print > f
        }
        ' \
        "$file"
done
but without sample input and expected output that's just an untested guess.
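For what it's worth, here is one way to try that loop against throwaway input. The file contents below are invented, since the real kapitel files weren't posted, and everything lives in a temporary directory. The progress messages are left out to keep the sketch short:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/data"
# Hypothetical input: kapitel1.txt has two sections, kapitel2.txt has one.
printf '%s\n' 'Artikel' 'a' 'Artikel' 'b' > "$tmp/data/kapitel1.txt"
printf '%s\n' 'Artikel' 'c'               > "$tmp/data/kapitel2.txt"

for file in "$tmp"/data/kapitel*.txt; do
    dirname="${file%.*}"              # strip the .txt extension
    mkdir -p "$dirname" &&
    awk -v dir="$dirname" '
        /^Artikel$/ { close(f); f = sprintf("%s/Artikel%d.txt", dir, ++n) }
        f           { print > f }
    ' "$file"
done

find "$tmp/data" -type f | sort
```

Because awk is started once per input file, the counter n starts over for each kapitel, so every directory gets its own Artikel1.txt, Artikel2.txt, and so on.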