I have this file.log:
Sep 16 16:18:49 abcd 123 456
Sep 16 16:18:49 abcd 123 567
Sep 17 16:18:49 abcd 123 456
Sep 17 16:18:49 abcd 123 567
I want to split it by date, so that I get:
Sep_16.log
Sep 16 16:18:49 abcd 123 456
Sep 16 16:18:49 abcd 123 567
Sep_17.log
Sep 17 16:18:49 abcd 123 456
Sep 17 16:18:49 abcd 123 567
I searched the forum and found that csplit with the regex ^.{6} is supposed to do this, but the answers I found only use the regex as a delimiter, which is not what I intended.
Also, I want to split each date partition into chunks of 10k rows, so the filenames would be something like Sep_17_part001.log, which would presumably need something like the prefix and suffix options.
Does anybody know the full command for doing this? And if I do this as a one-time thing on one log, how can I make it run daily without csplit overwriting the previous days' files?
In the end, I decided to write a simple Python script, after searching through the csplit documentation and finding nothing that suited my needs.
Something like:
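    import os
    from datetime import datetime

    # args.logfile and date_path are set elsewhere in the script
    year = datetime.now().year  # the log lines carry no year, so use the current one
    with open(args.logfile) as f:
        for line in f:
            # the first 6 characters are the date, e.g. 'Sep 16' -> '<year>0916'
            timef = datetime.strptime(f"{year} {line[:6]}", '%Y %b %d').strftime('%Y%m%d')
            t_dest_path = os.path.join(date_path, timef + '-browse.log')
            with open(t_dest_path, "a") as fdest:  # append, so earlier days are kept
                fdest.write(line)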
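The same idea can probably be stretched to cover the 10k-rows-per-part naming (Sep_17_part001.log) as well. Below is a rough sketch, not a tested production script. The names split_log, out_dir, rows_per_part and the year argument are made up for illustration, and it assumes every line starts with a date like 'Sep 16' and that the month abbreviations match the current locale.

```python
import os
from collections import defaultdict
from datetime import datetime


def split_log(logfile, out_dir, rows_per_part=10_000, year=None):
    """Split a log into <Mon>_<DD>_partNNN.log files; return the paths written."""
    year = year or datetime.now().year  # assumption: lines carry no year
    counts = defaultdict(int)           # lines seen so far for each date
    written = []
    os.makedirs(out_dir, exist_ok=True)
    current_path, dest = None, None
    try:
        with open(logfile) as src:
            for line in src:
                # Parse the first 6 characters, e.g. 'Sep 16', as the date key.
                day = datetime.strptime(f"{year} {line[:6]}", "%Y %b %d")
                part = counts[day] // rows_per_part + 1
                counts[day] += 1
                path = os.path.join(out_dir, f"{day:%b_%d}_part{part:03d}.log")
                if path != current_path:
                    # Switch output file only when the date or part changes.
                    if dest:
                        dest.close()
                    dest = open(path, "a")
                    current_path = path
                    if path not in written:
                        written.append(path)
                dest.write(line)
    finally:
        if dest:
            dest.close()
    return written
```

Calling split_log('file.log', 'out') on the sample above should give out/Sep_16_part001.log and out/Sep_17_part001.log. Because the file names contain the date, a daily run does not touch the files of other days. Note that the files are opened in append mode and the row counter restarts on every run, so feeding the same lines in twice would duplicate them; for a daily job it is probably safest to pass in only the new lines.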