I'm quite new to Python and more used to Java. I'm trying to parse a text file produced by Praat. The file is always in the same format and looks roughly like this, with a few more features:
-- Voice report for 53. Sound T1_1001501_vowels --
Date: Tue Aug 7 12:15:41 2018
Time range of SELECTION
   From 0 to 0.696562 seconds (duration: 0.696562 seconds)
Pitch:
   Median pitch: 212.598 Hz
   Mean pitch: 211.571 Hz
   Standard deviation: 23.891 Hz
   Minimum pitch: 171.685 Hz
   Maximum pitch: 265.678 Hz
Pulses:
   Number of pulses: 126
   Number of periods: 113
   Mean period: 4.751119E-3 seconds
   Standard deviation of period: 0.539182E-3 seconds
Voicing:
   Fraction of locally unvoiced frames: 5.970% (12 / 201)
   Number of voice breaks: 1
   Degree of voice breaks: 2.692% (0.018751 seconds / 0.696562 seconds)
I would like to output something that looks like this:
0.696562,212.598,211.571,23.891,171.685,265.678,126,113,4.751119E-3,0.539182E-3,5.970,1,2.692
So essentially, for each line that has a value, I want the number that sits between the colon and the next whitespace (without any trailing % sign), and I want to print all of those numbers as one comma-separated string. I know this might be a stupid question, but I just can't figure it out in Python; any help would be much appreciated!
Thank you for the help, everyone! I actually came up with this solution:
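import csv

# Renamed from `input` so the built-in input() is not shadowed.
input_path = 't2_5.txt'
input_name = input_path[:-4]  # drop the '.txt' extension

def parse(filepath):
    data = []
    with open(filepath, 'r') as file:
        # Skip the three header lines (title, date, "Time range of SELECTION").
        file.readline()
        file.readline()
        file.readline()
        for line in file:
            # Only the indented lines carry a value.
            if line[0] == ' ':
                start = line.find(':') + 2
                end = line.find(' ', start)
                if end == -1:
                    # No space after the value: it runs to the end of the line.
                    end = len(line.rstrip())
                if line[end - 1] == '%':
                    end -= 1
                number = line[start:end]
                data.append(number)
    # Python 3: the csv module wants text mode with newline='' (on Python 2, use 'wb').
    with open(input_name + '_output.csv', 'w', newline='') as csvfile:
        wr = csv.writer(csvfile)
        wr.writerow(data)

parse(input_path)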
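
For anyone landing here later, the same extraction can also be sketched with a regular expression instead of index arithmetic. This is only a rough alternative, not a replacement for the solution above. It assumes, as that code does, that value lines start with a space and that each value is a plain number (optionally with an exponent) right after the first colon. The names `VALUE_AFTER_COLON` and `extract_values` are made up for this example.

```python
import re

# A colon, optional whitespace, then a number with an optional
# decimal part and an optional exponent (e.g. 4.751119E-3).
VALUE_AFTER_COLON = re.compile(r":\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def extract_values(lines):
    """Return the first number after a colon on each indented line, as strings."""
    values = []
    for line in lines:
        if not line.startswith(" "):
            # Section headers such as "Pitch:" are not indented and carry no value.
            continue
        match = VALUE_AFTER_COLON.search(line)
        if match:
            values.append(match.group(1))
    return values


sample = [
    "Pitch:",
    "   Median pitch: 212.598 Hz",
    "   Mean period: 4.751119E-3 seconds",
    "   Fraction of locally unvoiced frames: 5.970% (12 / 201)",
]
print(",".join(extract_values(sample)))  # 212.598,4.751119E-3,5.970
```

Because the pattern stops at the end of the number, the % sign and the units drop out without special handling. Indented lines whose value is not numeric are skipped silently, which would shift the columns, so that case may need its own handling.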