Sunday, November 14, 2021

[FIXED] Adding a value at the end of a column in a MultiIndex column dataframe


Issue

I have a simple problem that probably has a simple solution, but I couldn't find it anywhere. I have the following DataFrame with MultiIndex columns:

import pandas as pd

mux = pd.MultiIndex.from_product([['A', 'B', 'C'], ['Datetime', 'Str', 'Ret']])
dfr = pd.DataFrame(columns=mux)

  |      A         |        B       |        C       |
  |Datetime|Str|Ret|Datetime|Str|Ret|Datetime|Str|Ret|

I need to add values one by one at the end of a specific sub-column. For example, add one value at the end of column A, sub-column Datetime, and leave the rest of the row as it is; then add another value to column B, sub-column Str, again leaving the rest of that row untouched, and so on. So my questions are: Is it possible to target individual locations in this type of DataFrame, and how? And is it possible to append not a full row but an individual value, always after the previous value, without knowing where the end is? Thank you so much for your answers.

Solution

IIUC, you can use .loc:

idx = len(dfr)  # get the index of the next row after the last one
dfr.loc[idx, ('A', 'Datetime')] = pd.to_datetime('2021-09-24')
dfr.loc[idx, ('B', 'Str')] = 'Hello'
dfr.loc[idx, ('C', 'Ret')] = 4.3

Output:

>>> dfr
                     A                  B                    C          
              Datetime  Str  Ret Datetime    Str  Ret Datetime  Str  Ret
0  2021-09-24 00:00:00  NaN  NaN      NaN  Hello  NaN      NaN  NaN  4.3

Update

I mean, for example, when different columns hold different numbers of values (say 6 values in column A-Str but only 4 in column B-Datetime), and I don't know in advance how many each one has. In that case I need to add the next value right after the last one in that particular column, so I need the index of the last non-NaN value of that column to use it in your answer. If I use len(dfr) to add a value to the column that only has 4 values, it ends up in the 7th row instead of the 5th, because another column has more values.
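
To make the mismatch concrete, here is a small sketch with made-up data matching the numbers in the comment (6 values in A-Str, 4 in B-Datetime). It only shows that len(dfr) counts rows of the whole frame, not the values in one column.

```python
import pandas as pd

# Hypothetical data: A/Str holds 6 values, B/Datetime only 4
dfr = pd.DataFrame({
    ('A', 'Str'): ['a', 'b', 'c', 'd', 'e', 'f'],
    ('B', 'Datetime'): pd.to_datetime(
        ['2021-09-24', '2021-09-25', '2021-09-26', '2021-09-27', None, None]
    ),
})

total_rows = len(dfr)                      # 6: rows in the whole frame
b_values = dfr[('B', 'Datetime')].count()  # 4: non-NaN entries in this column only
```

Writing to row len(dfr) would therefore skip rows 4 and 5 of B-Datetime.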

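You can do it easily using last_valid_index. Create a convenience function append_to_col that appends values to your dataframe in place:

def append_to_col(col, val):
    # Label of the last non-NaN row in this column (None if the column is empty)
    idx = dfr[col].last_valid_index()
    # Write to the next row; assumes the default 0, 1, 2, ... integer index
    dfr.loc[idx + 1 if idx is not None else 0, col] = val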


# Fill your dataframe
append_to_col(('A', 'Datetime'), '2021-09-24')
append_to_col(('A', 'Datetime'), '2021-09-25')
append_to_col(('B', 'Str'), 'Hello')
append_to_col(('C', 'Ret'), 4.3)
append_to_col(('C', 'Ret'), 8.2)
append_to_col(('A', 'Datetime'), '2021-09-26')

Output:

>>> dfr
            A                  B                    C          
     Datetime  Str  Ret Datetime    Str  Ret Datetime  Str  Ret
0  2021-09-24  NaN  NaN      NaN  Hello  NaN      NaN  NaN  4.3
1  2021-09-25  NaN  NaN      NaN    NaN  NaN      NaN  NaN  8.2
2  2021-09-26  NaN  NaN      NaN    NaN  NaN      NaN  NaN  NaN
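
The helper above works on the global dfr. A possible variant, sketched here under the same assumption of a default 0, 1, 2, ... integer index, takes the frame as an argument and returns the row it wrote to. The extra df parameter and the return value are additions for illustration, not part of the original answer.

```python
import pandas as pd


def append_to_col(df, col, val):
    """Write val into the row after the last non-NaN entry of col, in place."""
    last = df[col].last_valid_index()      # None when the column has no values yet
    row = 0 if last is None else last + 1  # assumes a default integer index
    df.loc[row, col] = val                 # .loc adds the row if it does not exist
    return row


mux = pd.MultiIndex.from_product([['A', 'B', 'C'], ['Datetime', 'Str', 'Ret']])
dfr = pd.DataFrame(columns=mux)

append_to_col(dfr, ('A', 'Datetime'), pd.Timestamp('2021-09-24'))
append_to_col(dfr, ('A', 'Datetime'), pd.Timestamp('2021-09-25'))
append_to_col(dfr, ('B', 'Str'), 'Hello')
last_row = append_to_col(dfr, ('C', 'Ret'), 4.3)
```

Passing the frame explicitly lets the same function fill several dataframes, and the returned row label can help when debugging where a value landed.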

Answered By - Corralien

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
