Issue
I have one DataFrame:
import pandas as pd
from io import StringIO

# comment="<" tells pandas to ignore the trailing "<-- row N" markers
dfs = pd.read_csv(StringIO("""
datetime ID C_1 C_2 C_3 C_4 C_5 C_6
"18/06/2023 3:51:52" 136 101 2028 61 4 3 18 <-- row 1
"18/06/2023 3:51:53" 24 101 2029 65 0 0 0 <-- row 2
"18/06/2023 3:51:54" 136 102 2045 66 2 3 4 <-- row 3
"18/06/2023 3:51:55" 0 101 2022 89 0 0 0 <-- row 4
"18/06/2023 3:51:33" 136 101 2222 77 0 0 0 <-- row 5
"18/06/2023 3:51:56" 24 102 2022 89 0 0 0 <-- row 6
"18/06/2023 3:51:49" 136 101 2024 90 0 0 0 <-- row 7
"18/06/2023 3:51:57" 24 101 2026 87 0 1 8 <-- row 8
"18/06/2023 3:51:58" 0 102 2045 44 43 42 41 <-- row 9
"18/06/2023 3:51:59" 24 102 2043 33 0 1 8 <-- row 10
"18/06/2023 3:52:88" 136 101 3333 99 0 1 87 <-- row 11
"""), sep=r"\s+", comment="<")
Is there a way to read the previous and next values of the same column using the concat function, while also renaming the columns of the output DataFrame? I am trying the code below, but I'm not sure how to rename the columns:
m = dfs['ID'].eq(0)
m1 = dfs['ID'].isin([0, 24])
m2 = dfs['ID'].isin([0, 136])
cols = ['C_1']
tmp = dfs.mask(m).fillna({'C_1': dfs['C_1']})
out = pd.concat([tmp.loc[m1].groupby(dfs['C_1']).ffill().loc[m, cols + ['datetime']],
                 tmp.loc[m2].groupby(dfs['C_1']).ffill().loc[m, cols + ['C_2', 'C_3']],
                 tmp.loc[m1].groupby(dfs['C_1']).bfill().loc[m, cols + ['datetime']],
                 tmp.loc[m2].groupby(dfs['C_1']).bfill().loc[m, cols + ['C_2', 'C_3']],
                 ]).groupby(level=0).first()
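For the renaming part specifically, one common pattern is to suffix each piece before concatenating, e.g. with DataFrame.add_suffix. An illustrative sketch on two hypothetical lookup results, not the full solution:

# hypothetical prev/next pieces sharing the same row index (m from above)
prev_part = dfs.loc[m, ['datetime', 'C_2']].add_suffix('_prev')
next_part = dfs.loc[m, ['datetime', 'C_2']].add_suffix('_next')
combined = pd.concat([prev_part, next_part], axis=1)
# columns: datetime_prev, C_2_prev, datetime_next, C_2_next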
In the above code I first check the condition ID == 0, which matches rows 4 and 9. For both matches I read the C_1 value (101 and 102 respectively), because the lookups below must stay within the same C_1:
- get the first previous datetime value (relative to rows 4 and 9, within the same C_1) where ID == 24, and the first previous C_2 and C_3 values where ID == 136;
- get the first next datetime value where ID == 24, and the first next C_2 and C_3 values where ID == 136.
Output -
C_1  datetime_prev       C_2_prev  C_3_prev  datetime_next       C_2_next  C_3_next
101  18/06/2023 3:51:53  2028      61        18/06/2023 3:51:57  2222      77
102  18/06/2023 3:51:56  2045      66        18/06/2023 3:51:59  3333      99
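Note that "previous" and "next" here are positional (row order), not chronological: row 5's timestamp (3:51:33) is earlier than row 4's (3:51:55), yet its C_2/C_3 values (2222, 77) are the expected next values for C_1 = 101. A minimal per-group sketch of that positional lookup, assuming exactly one ID == 0 anchor row per C_1 group (positional_lookup and pick are hypothetical helper names):

import numpy as np

def positional_lookup(g):
    # position (within this C_1 group) of the single ID == 0 anchor row
    pos = g["ID"].eq(0).to_numpy().argmax()
    before, after = g.iloc[:pos], g.iloc[pos + 1:]

    def pick(part, id_val, col, take_last):
        # last/first value of `col` among rows of `part` with matching ID
        s = part.loc[part["ID"].eq(id_val), col]
        if s.empty:
            return np.nan
        return s.iloc[-1] if take_last else s.iloc[0]

    return pd.Series({
        "datetime_prev": pick(before, 24, "datetime", True),
        "C_2_prev": pick(before, 136, "C_2", True),
        "C_3_prev": pick(before, 136, "C_3", True),
        "datetime_next": pick(after, 24, "datetime", False),
        "C_2_next": pick(after, 136, "C_2", False),
        "C_3_next": pick(after, 136, "C_3", False),
    })

print(dfs.groupby("C_1").apply(positional_lookup))

For C_1 = 102 this yields NaN for the next C_2/C_3, since no ID = 136 row follows row 9 within that group (row 11's 3333/99 belong to C_1 = 101); the accepted answer below agrees on that point.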
Solution
OK, I'll give it a shot:
def group_fn(dfs):
    # Masks for the anchor rows (ID == 0) and the two lookup IDs.
    mask_0 = dfs["ID"].eq(0)
    mask_24 = dfs["ID"].eq(24)
    mask_136 = dfs["ID"].eq(136)

    # datetime taken from ID == 24 rows: forward-fill gives the previous
    # value, backward-fill gives the next value (relative to row position).
    dfs["prev_datetime"] = dfs.loc[mask_24, "datetime"]
    dfs["prev_datetime"] = dfs["prev_datetime"].ffill()
    dfs["next_datetime"] = dfs.loc[mask_24, "datetime"]
    dfs["next_datetime"] = dfs["next_datetime"].bfill()

    # Same idea for C_2/C_3, taken from ID == 136 rows.
    dfs[["prev_C_2", "prev_C_3"]] = dfs.loc[mask_136, ["C_2", "C_3"]]
    dfs[["prev_C_2", "prev_C_3"]] = dfs[["prev_C_2", "prev_C_3"]].ffill()
    dfs[["next_C_2", "next_C_3"]] = dfs.loc[mask_136, ["C_2", "C_3"]]
    dfs[["next_C_2", "next_C_3"]] = dfs[["next_C_2", "next_C_3"]].bfill()

    # Keep only the anchor rows, with the gathered prev/next values.
    return dfs.loc[
        mask_0,
        [
            "C_1",
            "prev_datetime",
            "next_datetime",
            "prev_C_2",
            "prev_C_3",
            "next_C_2",
            "next_C_3",
        ],
    ]

out = dfs.groupby("C_1", group_keys=False).apply(group_fn)
print(out)
Prints:
   C_1       prev_datetime       next_datetime  prev_C_2  prev_C_3  next_C_2  next_C_3
3  101  18/06/2023 3:51:53  18/06/2023 3:51:57    2028.0      61.0    2222.0      77.0
8  102  18/06/2023 3:51:56  18/06/2023 3:51:59    2045.0      66.0       NaN       NaN
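If the column names and order from the expected output are wanted, the result can be renamed afterwards (a small follow-up sketch on the out produced above):

out = out.rename(columns={
    "prev_datetime": "datetime_prev", "next_datetime": "datetime_next",
    "prev_C_2": "C_2_prev", "prev_C_3": "C_3_prev",
    "next_C_2": "C_2_next", "next_C_3": "C_3_next",
})[["C_1", "datetime_prev", "C_2_prev", "C_3_prev",
   "datetime_next", "C_2_next", "C_3_next"]]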
Answered By - Andrej Kesely