python arrays numpy vectorization

Filling zeros in numpy array that are between non-zero elements with the same value


I have a 1D numpy array of integers, in which I want to replace zeros with the previous non-zero value if and only if the next non-zero value is the same.

For example, an array of:

in: x = np.array([1,0,1,1,0,0,2,0,3,0,0,0,3,1,0,1])
out: [1,0,1,1,0,0,2,0,3,0,0,0,3,1,0,1]

should become

out: [1,1,1,1,0,0,2,0,3,3,3,3,3,1,1,1]

Is there a vectorized way to do this? I found some ways to fill zeros here, but not how to do it with exceptions, i.e. without filling the zeros that lie between integers with different values.
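
The rule can be stated precisely with a plain loop. This is only a non-vectorized reference sketch, not the vectorized solution being asked for; the function name `fill_between_same` is made up for illustration, and it assumes zeros before the first or after the last non-zero value stay untouched.

```python
import numpy as np

def fill_between_same(x):
    """Loop-based reference: fill a run of zeros only when the non-zero
    values on both sides of the run are equal."""
    out = x.copy()
    nz = np.flatnonzero(x)               # positions of the non-zero entries
    for left, right in zip(nz[:-1], nz[1:]):
        if x[left] == x[right]:          # same value on both sides of the gap
            out[left + 1:right] = x[left]
    return out

x = np.array([1, 0, 1, 1, 0, 0, 2, 0, 3, 0, 0, 0, 3, 1, 0, 1])
print(fill_between_same(x))
# [1 1 1 1 0 0 2 0 3 3 3 3 3 1 1 1]
```

A loop like this is slow for large arrays, but it is handy for checking a vectorized version against.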


Solution

  • Here's a vectorized approach. The forward-filling part takes inspiration from a NumPy-based forward-filling solution and is combined with masking and slicing -

    import numpy as np

    def forward_fill_ifsame(x):
        # Build forward-fill indices: each position holds the index of the
        # most recent non-zero element at or before it (0 if there is none)
        mask = x != 0
        idx = np.where(mask, np.arange(len(x)), 0)
        np.maximum.accumulate(idx, axis=0, out=idx)
    
        # Additional requirement: fill only if the previous and next non-zero
        # values are the same. Work on a copy so the input stays unchanged.
        x1 = x.copy()
    
        # Get non-zero elements
        xm = x1[mask]
    
        # Zero out every non-zero element whose next non-zero element differs,
        # so that nothing gets filled forward from it
        xm[:-1][xm[1:] != xm[:-1]] = 0
    
        # The last non-zero element has no next non-zero element, so trailing
        # zeros must not be filled from it either
        xm[-1:] = 0
    
        # Write back to x1. Invalid fill sources are now zero.
        x1[mask] = xm
    
        # Index with idx to do the forward filling
        out = x1[idx]
    
        # Restore the original non-zero values that were zeroed out above
        out[mask] = x[mask]
        return out
    

    Sample runs -

    In [289]: x = np.array([1,0,1,1,0,0,2,0,3,0,0,0,3,1,0,1])
    
    In [290]: np.vstack((x, forward_fill_ifsame(x)))
    Out[290]: 
    array([[1, 0, 1, 1, 0, 0, 2, 0, 3, 0, 0, 0, 3, 1, 0, 1],
           [1, 1, 1, 1, 0, 0, 2, 0, 3, 3, 3, 3, 3, 1, 1, 1]])
    
    In [291]: x = np.array([1,0,1,1,0,0,2,0,3,0,0,0,1,1,0,1])
    
    In [292]: np.vstack((x, forward_fill_ifsame(x)))
    Out[292]: 
    array([[1, 0, 1, 1, 0, 0, 2, 0, 3, 0, 0, 0, 1, 1, 0, 1],
           [1, 1, 1, 1, 0, 0, 2, 0, 3, 0, 0, 0, 1, 1, 1, 1]])
    
    In [293]: x = np.array([1,0,1,1,0,0,2,0,3,0,0,0,1,1,0,2])
    
    In [294]: np.vstack((x, forward_fill_ifsame(x)))
    Out[294]: 
    array([[1, 0, 1, 1, 0, 0, 2, 0, 3, 0, 0, 0, 1, 1, 0, 2],
           [1, 1, 1, 1, 0, 0, 2, 0, 3, 0, 0, 0, 1, 1, 0, 2]])
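
The forward-filling index step that the function builds on may be easier to follow in isolation. Below is a minimal sketch on a shortened version of the sample input; it only illustrates the `np.maximum.accumulate` trick and applies no "same value" condition.

```python
import numpy as np

x = np.array([1, 0, 1, 1, 0, 0, 2, 0, 3])

# Keep the position where x is non-zero, 0 elsewhere
mask = x != 0
idx = np.where(mask, np.arange(len(x)), 0)

# A running maximum turns this into "index of the latest non-zero so far"
idx = np.maximum.accumulate(idx)
print(idx)     # [0 0 2 3 3 3 6 6 8]

# Indexing with idx forward-fills every zero unconditionally
print(x[idx])  # [1 1 1 1 1 1 2 2 3]
```

Indexing with `idx` alone fills every zero, so it is the masking step in `forward_fill_ifsame` that blocks fills between differing values.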