I'm trying to calculate a 2D variable z = x + y, where x and y are 1D arrays of unequal length (say, x- and y-coordinate points on a spatial grid). I'd like to display the result row by row, with the values of x and y in the first two columns and the corresponding value of z in the third, something like the following for x = [1, 2] and y = [3, 4, 5]:
x y z
1 3 4
1 4 5
1 5 6
2 3 5
2 4 6
2 5 7
The code below works (using lists here, but will probably need numpy arrays later):
import pandas as pd
x = [1, 2]
y = [3, 4, 5]
col1 = []
col2 = []
z = []
for i in range(len(x)):
    for j in range(len(y)):
        col1.append(x[i])
        col2.append(y[j])
        z.append(x[i] + y[j])
df = pd.DataFrame(zip(col1, col2, z), columns=["x", "y", "z"])
print(df)
Is there a better way of doing this without the loop, using some combination of meshgrid, indices, flatten, v/hstack, and reshape? x and y will typically have around 100 elements each.
Here is one way:
import numpy as np
import pandas as pd
x = np.asarray([1, 2])[:, np.newaxis]
y = np.asarray([3, 4, 5])
x, y = np.broadcast_arrays(x, y)
z = x + y
df = pd.DataFrame(zip(x.ravel(), y.ravel(), z.ravel()), columns=["x", "y", "z"])
print(df)
#    x  y  z
# 0  1  3  4
# 1  1  4  5
# 2  1  5  6
# 3  2  3  5
# 4  2  4  6
# 5  2  5  7
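To see why this works, it may help to look at the shapes involved. The sketch below is only an illustration (the names xb and yb are mine, not part of the answer): adding a (2, 1) array to a (3,) array broadcasts to (2, 3), and np.broadcast_arrays just makes the expanded x and y available so they can be flattened alongside z.

```python
import numpy as np

x = np.asarray([1, 2])[:, np.newaxis]  # column vector, shape (2, 1)
y = np.asarray([3, 4, 5])              # shape (3,)

# The addition broadcasts on its own; no explicit expansion is needed for z.
z = x + y                              # shape (2, 3)

# broadcast_arrays expands x and y to the common shape (as views).
xb, yb = np.broadcast_arrays(x, y)

print(x.shape, y.shape, z.shape)       # (2, 1) (3,) (2, 3)
print(xb.shape, yb.shape)              # (2, 3) (2, 3)
print(z)
```

Row i of z holds x[i] + y[j] for every j, which is why ravel() yields the rows in the same order as the nested loop in the question.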
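But yes, you can also use meshgrid instead of orthogonal arrays plus explicit broadcasting; passing indexing='ij' keeps x as the first axis, so the flattened rows come out in the same order as above. You can also use NumPy instead of pandas for the final table.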
x = np.asarray([1, 2])
y = np.asarray([3, 4, 5])
x, y = np.meshgrid(x, y, indexing='ij')
z = x + y
print(np.stack((x.ravel(), y.ravel(), z.ravel())).T)
# [[1 3 4]
#  [1 4 5]
#  [1 5 6]
#  [2 3 5]
#  [2 4 6]
#  [2 5 7]]
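If you want the DataFrame from the question rather than a bare array, the meshgrid version can be wrapped up roughly like this. It is a minimal sketch: grid_table is a made-up helper name, and it assumes z is always x + y as in the question.

```python
import numpy as np
import pandas as pd


def grid_table(x, y):
    """Hypothetical helper: one row per (x, y) pair, with z = x + y."""
    # indexing="ij" makes x vary along the first axis, matching the loop order.
    xx, yy = np.meshgrid(np.asarray(x), np.asarray(y), indexing="ij")
    return pd.DataFrame(
        {"x": xx.ravel(), "y": yy.ravel(), "z": (xx + yy).ravel()}
    )


df = grid_table([1, 2], [3, 4, 5])
print(df)
```

Building the DataFrame from a dict of flattened arrays avoids the zip over individual elements. With 100 points in each of x and y, the same call returns a 10000-row table.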