Monday, November 27, 2023

[FIXED] numpy replace repeating values with 0

Labels: numpy, pandas, python

Issue

I have two arrays that look like this:

arr1 = [0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 1 1 1 1 1 1 1 0 1 0 1 0 0 1
 1 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 1 0 0 0 1 1 1 1 1 0 0
 1 1 1 1 0 0 0 1 0 1 1 0 1 1 0 0 0 0 0 1 1 1 1 0 1 0]
arr2 = [0 0 0 0 1 1 1 0 1 1 1 0 1 0 0 1 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0
 0 0 0 1 1 1 0 0 0 0 0 0 1 0 1 1 0 1 1 0 0 0 1 0 0 1 0 1 1 1 0 0 1 0 0 0 1
 0 0 0 1 1 1 1 0 0 1 0 1 0 0 1 1 1 1 1 0 0 0 1 1 0 0]

What is the fastest way to do the following two things?

1. Compare both arrays and, wherever both have a "1" in the same position, figure out which array has the closest "0" looking backwards and replace the "1" in that array with "0".
2. Replace every "1" in an array that is followed by a "1" with "0".

I solved it using iteration, but I am sure there's an easy and much faster solution for this in numpy or pandas, which I am only just beginning to learn.

Here's an ugly example solution to the first problem using iteration:

    import pandas as pd

    df = pd.DataFrame({"A": arr1, "B": arr2})
    df2 = df[(df.A > 0) & (df.B > 0)]
    for idx in df2.index:
        i = 1  # restart the backwards search for every shared position
        while df.loc[idx, 'A'] == 1 and df.loc[idx, 'B'] == 1:
            try:
                if df.loc[idx - i, 'A'] > 0 or df.loc[idx - i, 'B'] > 0:
                    # copy the nearest earlier row that is not all zeros
                    df.loc[idx, 'A'] = df.loc[idx - i, 'A']
                    df.loc[idx, 'B'] = df.loc[idx - i, 'B']
                else:
                    i += 1
            except KeyError:
                # ran off the start of the frame: clear both
                df.loc[idx, 'A'] = 0
                df.loc[idx, 'B'] = 0

And here's a solution to the second one:

    df2 = df[(df.A > 0)].A
    for idx in df2.index:
        if df.loc[idx + 1, 'A'] > 0:
            df.loc[idx, 'A'] = 0
    df2 = df[(df.B > 0)].B
    for idx in df2.index:
        if df.loc[idx + 1, 'B'] > 0:
            df.loc[idx, 'B'] = 0

Now do that pandas voodoo and make it all a one-liner.

Solution

Using numpy you could do the following:

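    import numpy as np

    def clossest_zero(arr, arr_idx, n):
        # (1 - arr) * n keeps the index of every 0 in arr and maps every 1 to 0.
        # reduceat takes the maximum over each segment ending just before a
        # position in arr_idx, i.e. the index of the last 0 in that segment.
        return np.maximum.reduceat((1 - arr) * n, np.r_[0, arr_idx])[:-1]

    def compare_replace(arr1, arr2):
        A, B = np.array(arr1), np.array(arr2)  # work on copies
        n = np.arange(A.size)
        idx = np.where(A * B == 1)[0]  # positions where both arrays hold a 1
        # True where A has the more recent 0 before the shared 1
        idx2 = clossest_zero(A, idx, n) > clossest_zero(B, idx, n)
        A[idx[idx2]] = 0
        B[idx[~idx2]] = 0  # ties are resolved by zeroing B
        return A, B

    compare_replace(np.array([0,1,1,1,0,0,1]), np.array([1,0,1,1,1,1,1]))
    # (array([0, 1, 1, 1, 0, 0, 0]), array([1, 0, 0, 0, 1, 1, 1]))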
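
To see what the helper is doing, it may help to print the intermediate arrays for the small example above. This is only a sketch for inspection; the names `zeros_A`, `zeros_B`, `bounds`, `last_zero_A` and `last_zero_B` are introduced here for illustration and are not part of the answer.

```python
import numpy as np

A = np.array([0, 1, 1, 1, 0, 0, 1])
B = np.array([1, 0, 1, 1, 1, 1, 1])
n = np.arange(A.size)

# Positions where both arrays hold a 1.
idx = np.where(A * B == 1)[0]

# Each 0 is replaced by its own index, each 1 by 0.
zeros_A = (1 - A) * n
zeros_B = (1 - B) * n

# Segment starts for reduceat: the array start, then every shared position.
bounds = np.r_[0, idx]

# Maximum per segment; the final segment (from the last shared position
# to the end of the array) is not needed, hence [:-1].
last_zero_A = np.maximum.reduceat(zeros_A, bounds)[:-1]
last_zero_B = np.maximum.reduceat(zeros_B, bounds)[:-1]

print(idx)          # [2 3 6]
print(last_zero_A)  # [0 0 5]
print(last_zero_B)  # [1 0 0]
```

Two things are visible here. A zero at index 0 is encoded as 0, so it cannot be told apart from "no zero in this segment". Each lookup also only reaches back to the previous shared position, not to the start of the array. Whether either matters depends on how ties and long runs should be handled in your data.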

For the second part:

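    def replace_ones(x):
        # Zero every 1 that is immediately followed by another 1.
        # x[:-1] is a view, so the assignment modifies x in place.
        x[:-1][(x[1:] * x[:-1]) == 1] = 0
        return x

    replace_ones(np.array([1, 1, 0, 1, 0, 1, 1, 1]))
    # array([0, 1, 0, 1, 0, 0, 0, 1])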
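
Note that `replace_ones` writes into the array it is given. If the original needs to stay untouched, one possible variant is sketched below. The name `replace_ones_copy` is hypothetical, and the mask uses explicit comparisons instead of a product, which should be equivalent for 0/1 data.

```python
import numpy as np

def replace_ones_copy(x):
    """Return a copy of x in which every 1 followed by a 1 is set to 0."""
    x = np.asarray(x)
    out = x.copy()
    # Mask is computed from the untouched input; out[:-1] is a view of out,
    # so the boolean assignment writes into the copy only.
    followed_by_one = (x[:-1] == 1) & (x[1:] == 1)
    out[:-1][followed_by_one] = 0
    return out

original = np.array([1, 1, 0, 1, 0, 1, 1, 1])
result = replace_ones_copy(original)
print(result)    # [0 1 0 1 0 0 0 1]
print(original)  # [1 1 0 1 0 1 1 1]
```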

Answered By - Onyambu

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
