I have two 1D arrays. I need to combine every element of the first array (a) with every element of the second array (b) to create a new 1D array containing all the pairwise concatenations.
An example to make this clearer:
a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])
# how to handle the above 1D-arrays to create the below array (c)?
c = np.array(['xa', 'xb', 'xc', 'ya', 'yb', 'yc'])
print(c)
The new array c would look like:
['xa' 'xb' 'xc' 'ya' 'yb' 'yc']
Of course, I can do it with loops, but I'm looking for a smarter approach. Thank you.
For 2 lists, a smart thing is to use a list comprehension:
In [234]: a = ['x', 'y']
...: b = ['a', 'b', 'c']
In [235]: [i+j for i in a for j in b]
Out[235]: ['xa', 'xb', 'xc', 'ya', 'yb', 'yc']
For arrays you can use np.char.add, as shown in the other answers:
In [236]: A=np.array(a); B=np.array(b)
In [237]: np.char.add(A[:,None],B)
Out[237]:
array([['xa', 'xb', 'xc'],
       ['ya', 'yb', 'yc']], dtype='<U2')
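The broadcast result above is 2D, while the question asks for a 1D array. A minimal sketch of the extra step, assuming the row-by-row order from the question is what is wanted (the names `pairs` and `c` are just for illustration):

```python
import numpy as np

a = np.array(['x', 'y'])
b = np.array(['a', 'b', 'c'])

# Broadcasting a (2, 1) column against a (3,) row gives a (2, 3) array of pairs.
pairs = np.char.add(a[:, None], b)

# ravel() flattens row by row, so all 'x' combinations come before the 'y' ones.
c = pairs.ravel()
print(c)
```

This should print the same `c` as in the question.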
Timings on such a small example have to be viewed with caution. Lists are often faster for small examples, but don't scale nearly as well. Still, I expect np.char.add to hurt the array scaling, since the np.char functions just apply standard string methods to the array elements.
In [238]: timeit np.char.add(A[:,None],B)
23.2 µs ± 57.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [239]: timeit [i+j for i in a for j in b]
1.55 µs ± 35.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Specifying object dtype when making the arrays, we can use the + operator and gain some speed:
In [240]: A=np.array(a,object); B=np.array(b,object)
In [241]: A[:,None]+B
Out[241]:
array([['xa', 'xb', 'xc'],
       ['ya', 'yb', 'yc']], dtype=object)
In [242]: timeit A[:,None]+B
7.39 µs ± 76.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
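The object-dtype result is also 2D and keeps dtype=object. If a 1D string array is needed, a sketch along these lines should work; converting back with astype(str) is optional and an extra step not timed above (the names `c_obj` and `c_str` are illustrative):

```python
import numpy as np

A = np.array(['x', 'y'], dtype=object)
B = np.array(['a', 'b', 'c'], dtype=object)

# With object dtype, + concatenates the Python strings of each broadcast pair.
c_obj = (A[:, None] + B).ravel()

# Optional: convert back to a fixed-width unicode dtype.
c_str = c_obj.astype(str)
print(c_str)
```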
For reference, adding two numeric arrays:
In [245]: %%timeit x=np.arange(2); y=np.arange(3)
...: x[:,None]+y
5.95 µs ± 8.71 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [246]: %%timeit x=np.arange(200); y=np.arange(300)
...: x[:,None]+y
100 µs ± 533 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
The second case has 10,000 times as many elements, but the time increases by less than 20x.
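To see how the string approaches scale, the same 200 by 300 sizes can be timed outside IPython with the standard timeit module. This is only a sketch: the test strings and the function names are made up for illustration, and the numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical test data: 200 and 300 short strings (sizes mirror the numeric case).
a = [f"x{i}" for i in range(200)]
b = [f"y{j}" for j in range(300)]
A_str, B_str = np.array(a), np.array(b)
A_obj, B_obj = np.array(a, dtype=object), np.array(b, dtype=object)

def with_lists():
    return [i + j for i in a for j in b]

def with_char_add():
    return np.char.add(A_str[:, None], B_str).ravel()

def with_object_add():
    return (A_obj[:, None] + B_obj).ravel()

# All three should produce the same 60,000 strings in the same order.
expected = with_lists()
assert with_char_add().tolist() == expected
assert with_object_add().tolist() == expected

for func in (with_lists, with_char_add, with_object_add):
    seconds = timeit.timeit(func, number=20) / 20
    print(f"{func.__name__}: {seconds * 1e6:.0f} us per call")
```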