[SOLVED] Split the word using bigram, trigram

I have this text file:

worked
working
works
tested
tests
find
found

It contains a million words, one per line, with no spaces inside a word. It may contain Unicode characters.

The longest word is "working":

awk '{print length, $0}' test.txt | sort -nr | head -1
7 working
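
For a file of this size the sort can be skipped: a single awk pass can keep track of the longest line seen so far. This is a minimal sketch, assuming the same one-word-per-line input. The function name longest is made up for illustration. On ties it reports the first longest word, which may differ from what sort -nr | head -1 picks. Whether length counts characters or bytes depends on the awk implementation and the locale.

```shell
# longest: hypothetical helper; reads the named files (or stdin if none)
# and prints "<length> <word>" for the longest line.
longest() {
  awk 'length($0) > max { max = length($0); word = $0 }  # new maximum found
       END { print max, word }' "$@"
}

# Demo on a few of the sample words from the question.
printf '%s\n' worked working works | longest
```

On the real file this would be called as longest test.txt.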

I need to create the bigram, trigram, and so on for each word, that is, its leading 1, 2, 3, ... characters, padded to a maximum of 7 columns:

w,wo,wor,work,worke,worked,
w,wo,wor,work,worki,workin,working
w,wo,wor,work,works,,
t,te,tes,test,teste,tested,
t,te,tes,test,tests,,
f,fi,fin,find,,,
f,fo,fou,foun,found,,

Preferably in awk, because it's fast.

Solution

A straightforward approach would be:

awk -v n=7 -v OFS=, \
  '{s=$0; len=length(s); for (i=1;i<=len;i++) $i=substr(s,1,i); $n=$n}1' test.txt

w,wo,wor,work,worke,worked,
w,wo,wor,work,worki,workin,working
w,wo,wor,work,works,,
t,te,tes,test,teste,tested,
t,te,tes,test,tests,,
f,fi,fin,find,,,
f,fo,fou,foun,found,,

Tested on GNU Awk 5.3.0, mawk 1.3.4 20240819, and The One True Awk version 20240728.
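
To make the mechanics easier to follow, here is the same idea spelled out with comments. It is a sketch rather than a tested replacement: the function name prefixes and its optional column-count argument are made up for illustration. The padding relies on standard awk behavior: assigning to a field beyond NF extends the record, and the record is rebuilt with OFS between fields. Words longer than n are not truncated, they simply produce more columns. With multi-byte characters, whether substr cuts on characters or bytes depends on the awk implementation and the locale.

```shell
# prefixes: hypothetical helper; reads words on stdin (one per line) and
# prints the leading 1, 2, 3, ... characters of each word as comma-separated
# columns, padded to n columns (n is the first argument, default 7).
prefixes() {
  awk -v n="${1:-7}" -v OFS=, '
  {
    s = $0                      # save the word; assigning fields rewrites $0
    len = length(s)
    for (i = 1; i <= len; i++)
      $i = substr(s, 1, i)      # field i holds the first i characters
    $n = $n                     # touching field n extends the record to at least n fields
    print                       # fields are joined with OFS, empty ones included
  }'
}

# Demo on two of the sample words from the question.
printf '%s\n' worked find | prefixes
```

A different width can be passed as the first argument, for example prefixes 10 < test.txt.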