I have two NumPy arrays of the same length, call them A and B, and two scalar values named C and D. I want to store these values in a single txt file. I thought of the following structure:
It doesn't have to have this format; I just thought it's convenient and clear. I know how to write the NumPy arrays to a txt file and read them back, but I'm struggling with how to write a txt file that combines arrays and scalar values, and how to read it back from txt into NumPy.
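import numpy as np
A = np.array([1, 2, 3, 4, 5])
B = np.array([5, 4, 3, 2, 1])
C = [6]
D = [7]
# Write both arrays; each one becomes a row of the file
np.savetxt('file.txt', (A, B))
# Read the file back and split the rows again
A_B_load = np.loadtxt('file.txt')
A_load = A_B_load[0,:]
B_load = A_B_load[1,:]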
This doesn't give me the column structure I proposed, since it stores the arrays in rows, but that doesn't really matter.
I found one solution, but it's a bit unwieldy: I have to pad the scalar values with zeros so that they have the same length as the arrays A and B. There must be a smarter solution.
A = np.array([1, 2, 3, 4, 5])
B = np.array([5, 4, 3, 2, 1])
C = [6]
D = [7]
fill = np.zeros(len(A)-1)
C = np.concatenate((C,fill))
D = np.concatenate((D, fill))
np.savetxt('file.txt', (A,B,C,D))
A_B_load = np.loadtxt('file.txt')
A_load = A_B_load[0,:]
B_load = A_B_load[1,:]
C_load = A_B_load[2,0]
D_load = A_B_load[3,0]
In [123]: A = np.array([1, 2, 3, 4, 5])
     ...: B = np.array([5, 4, 3, 2, 1])
     ...: C = [6]
     ...: D = [7]
savetxt is designed to write a 2d array in a consistent csv form: a neat table with the same number of columns in each row.
In [124]: arr = np.stack((A,B), axis=1)
In [125]: arr
Out[125]:
array([[1, 5],
       [2, 4],
       [3, 3],
       [4, 2],
       [5, 1]])
Here's one possible write format:
In [126]: np.savetxt('foo.txt', arr, fmt='%d', header=f'{C} {D}', delimiter=',')
     ...:
In [127]: cat foo.txt
# [6] [7]
1,5
2,4
3,3
4,2
5,1
I put the scalars in a header line, since they don't match with the arrays.
loadtxt can recreate that arr array:
In [129]: data = np.loadtxt('foo.txt', dtype=int, skiprows=1, delimiter=',')
In [130]: data
Out[130]:
array([[1, 5],
       [2, 4],
       [3, 3],
       [4, 2],
       [5, 1]])
The header line can be read with:
In [138]: with open('foo.txt') as f:
     ...:     header = f.readline().strip()
     ...:     line = header[1:]
     ...:
In [139]: line
Out[139]: ' [6] [7]'
I should have saved it as something that's simpler to parse, like '# 6,7'
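As a rough sketch of that simpler header, the scalars can be written as '6,7' and parsed back with a plain split. This assumes C and D are plain integers rather than one-element lists, and the temporary file path is only for illustration:

```python
import os
import tempfile
import numpy as np

A = np.array([1, 2, 3, 4, 5])
B = np.array([5, 4, 3, 2, 1])
C, D = 6, 7  # assumption: plain scalars instead of [6] and [7]

# Hypothetical location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")

# savetxt prefixes the header with '# ', so the first line becomes '# 6,7'.
np.savetxt(path, np.stack((A, B), axis=1), fmt="%d",
           header=f"{C},{D}", delimiter=",")

# Read the first line, drop the leading '#', and split on the commas.
with open(path) as f:
    C_load, D_load = (int(x) for x in f.readline().lstrip("#").split(","))

# loadtxt treats lines starting with '#' as comments by default,
# so skiprows is not needed here.
data = np.loadtxt(path, dtype=int, delimiter=",")
A_load, B_load = data[:, 0], data[:, 1]
```

The arrays come back as columns of data, and the scalars come back from the comment line, so nothing has to be padded.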
Your accepted answer creates a dataframe with nan values and blanks in the csv:
In [143]: import pandas as pd
In [144]: df = pd.concat([pd.DataFrame(arr) for arr in [A,B,C,D]], axis=1)
     ...: df.to_csv("test.txt", na_rep="", sep=" ", header=False, index=False)
In [145]: df
Out[145]:
   0  0    0    0
0  1  5  6.0  7.0
1  2  4  NaN  NaN
2  3  3  NaN  NaN
3  4  2  NaN  NaN
4  5  1  NaN  NaN
In [146]: cat test.txt
1 5 6.0 7.0
2 4
3 3
4 2
5 1
Note that np.nan is a float, so some of the columns are float as a result. loadtxt can't handle those "blank" columns; np.genfromtxt is better at that, but it needs a delimiter like , to mark them.
Writing and reading the full length arrays is easy. But mixing types gets messy.
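To illustrate the genfromtxt point, here is a minimal sketch. It assumes the file was written with a comma delimiter, so the missing C and D entries show up as empty fields; the text is held in a StringIO instead of a real file:

```python
import io
import numpy as np

# Hypothetical file contents: comma-delimited, with empty fields
# where C and D have no value.
text = "1,5,6,7\n2,4,,\n3,3,,\n4,2,,\n5,1,,\n"

# With an explicit delimiter, genfromtxt fills the empty fields with nan.
data = np.genfromtxt(io.StringIO(text), delimiter=",")

# The result is a float array because of the nan values, so convert back.
A_load = data[:, 0].astype(int)
B_load = data[:, 1].astype(int)
C_load = int(data[0, 2])
D_load = int(data[0, 3])
```

The scalars sit in the first row only; the rest of those two columns is nan and can be ignored.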
Here's a format that would be easier to write and read:
In [149]: arr = np.zeros((5,4),int)
     ...: for i,var in enumerate([A,B,C,D]):
     ...:     arr[:,i] = var
     ...:
In [150]: arr
Out[150]:
array([[1, 5, 6, 7],
       [2, 4, 6, 7],
       [3, 3, 6, 7],
       [4, 2, 6, 7],
       [5, 1, 6, 7]])
In [151]: np.savetxt('foo.txt', arr, fmt='%d', delimiter=',')
In [152]: cat foo.txt
1,5,6,7
2,4,6,7
3,3,6,7
4,2,6,7
5,1,6,7
In [153]: np.loadtxt('foo.txt', delimiter=',', dtype=int)
Out[153]:
array([[1, 5, 6, 7],
       [2, 4, 6, 7],
       [3, 3, 6, 7],
       [4, 2, 6, 7],
       [5, 1, 6, 7]])
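Getting the original variables back from this table is then a matter of slicing. A small sketch, using the same file contents held in a StringIO for illustration:

```python
import io
import numpy as np

# Same contents as the foo.txt shown above.
text = "1,5,6,7\n2,4,6,7\n3,3,6,7\n4,2,6,7\n5,1,6,7\n"
data = np.loadtxt(io.StringIO(text), delimiter=",", dtype=int)

# The arrays are the first two columns.
A_load, B_load = data[:, 0], data[:, 1]

# The scalars are repeated down their columns, so any row will do.
C_load, D_load = data[0, 2], data[0, 3]
```

The cost of this layout is that each scalar is stored once per row, which is harmless for small files.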