Tuesday, October 18, 2022

[FIXED] How to create new columns from the result of apply(), where one result is a table and the others are single values?

numpy, pandas, python

Issue

Given:

Table1 (fixed table)

Code:

import pandas as pd

data = [
    [10.0, 10.0, 0.5, 1.0, 7.0, 1.0, 1.0],
    [10.0, 10.0, 0.5, 1.0, 8.0, 1.0, 1.0],
    [10.0, 20.0, 0.5, 1.0, 7.0, 1.0, 1.0],
    [10.0, 20.0, 0.5, 1.0, 8.0, 1.0, 1.0],
]

table1 = pd.DataFrame(
    data=data,
    columns=["a", "b", "c", "d", "e", "f", "g"]
)

Result

      a     b    c    d    e    f    g
0  10.0  10.0  0.5  1.0  7.0  1.0  1.0
1  10.0  10.0  0.5  1.0  8.0  1.0  1.0
2  10.0  20.0  0.5  1.0  7.0  1.0  1.0
3  10.0  20.0  0.5  1.0  8.0  1.0  1.0

calculate_table() (function that accepts the values from table1)

Code:

import pandas as pd
import numpy as np

def calculate_table(
    a: float,
    b: float,
    c: float,
    d: float,
    e: int,
    f: float,
    g: float,
    pr: float = 10000,
):

    m = np.arange(e + 1)
    so = np.arange(e)

    # calculations
    p = d * (f ** m - 1) / (f - 1)
    q = np.r_[a, b * g ** so]
    r = np.cumsum(q)
    s = pr * (1 - ((d * (f ** m - 1) / (f - 1)) / 100))
    t = q / s
    u = np.cumsum(t)
    v = r / u
    w = v * (1 + (c / 100))
    x = ((w - s) / s) * 100
    y = a * (c / 100)
    z = r * (c / 100)

    data = {
        "m": m,
        "p": p,
        "s": s,
        "x": x,
        "w": w,
        "v": v,
        "q": q,
        "r": r,
        "t": t,
        "u": u,
        "y": y,
        "z": z,
    }

    table = pd.DataFrame(
        data=data
    )

    return table, p[-1], x[-1]

Result of the table calculation:

(   m          p            s         x             w             v    q     r         t         u     y      z
0  0   0.000000  10000.00000  0.500000  10050.000000  10000.000000   10    10  0.001000  0.001000  0.05  0.050
1  1   0.500000   9950.00000  0.701403  10019.789579   9969.939880   15    25  0.001508  0.002508  0.05  0.125
2  2   1.250000   9875.00000  0.936901   9967.519012   9917.929365   30    55  0.003038  0.005546  0.05  0.275
3  3   2.375000   9762.50000  1.258945   9885.404507   9836.223390   60   115  0.006146  0.011691  0.05  0.575
4  4   4.062500   9593.75000  1.727164   9759.449799   9710.895322  120   235  0.012508  0.024200  0.05  1.175
5  5   6.593750   9340.62500  2.432283   9567.815435   9520.214364  240   475  0.025694  0.049894  0.05  2.375
6  6  10.390625   8960.93750  3.524930   9276.804262   9230.651007  480   955  0.053566  0.103460  0.05  4.775
7  7  16.085938   8391.40625  5.273190   8833.901006   8789.951250  960  1915  0.114403  0.217862  0.05  9.575, 16.0859375, 5.273189534595566)

This is what I want to accomplish:

Apply the function calculate_table() on every row of table1, like:

table1["p", "x", "table"] = table1[["a", "b", "c", "d", "e", "f", "g"]].apply(calculate_table, axis=1)

The final result of the "loop" through table1 will be a new table1 with extra columns p, x, table

      a     b    c    d    e    f    g                        table      p      x 
0  10.0  10.0  0.5  1.0  7.0  1.0  1.0  result of calculate_table() 16.085 5.2731
1  10.0  10.0  0.5  1.0  8.0  1.0  1.0  
2  10.0  20.0  0.5  1.0  7.0  1.0  1.0  
3  10.0  20.0  0.5  1.0  8.0  1.0  1.0

Data in column "table" should be accessible like this (or in some other way):

table1["table"]["w"][<index>]

The function calculate_table() works on its own, but when I apply it to the rows I cannot get it to work, because the multiple arguments the function expects are not handled correctly (because they arrive as a Series?).

So am I passing the values to the function correctly? Or is the function not set up correctly to accept the values from table1 through apply()?

Solution

The data you chose for your example is problematic: column f contains only 1.0 values, so calculate_table() divides by zero (f - 1 = 0).
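
If f = 1 has to be supported, one option is to treat that case separately. The expression (f ** m - 1) / (f - 1) is the geometric sum 1 + f + ... + f ** (m - 1), which equals m when f is 1. The helper below is a minimal sketch of that idea; the name geometric_sum is hypothetical and not part of the original code.

```python
import numpy as np


def geometric_sum(f: float, m: np.ndarray) -> np.ndarray:
    """Return 1 + f + ... + f**(m - 1) for every entry of m.

    Hypothetical helper (not in the original post). For f != 1 this is
    the closed form (f**m - 1) / (f - 1); for f == 1 the sum has m terms
    that are all 1, so the result is m itself.
    """
    m = np.asarray(m, dtype=float)
    if f == 1:
        # Limit case: avoids the division by zero in the closed form.
        return m.copy()
    return (f ** m - 1) / (f - 1)


m = np.arange(4)
print(geometric_sum(2.0, m))  # closed form
print(geometric_sum(1.0, m))  # limit case, no division by zero
```

Inside calculate_table(), p could then be written as d * geometric_sum(f, m), and s could reuse p instead of repeating the same expression.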

With this alternate data (2.0 instead of 1.0 in f column):

data = [
    [10.0, 10.0, 0.5, 1.0, 7.0, 2.0, 1.0],
    [10.0, 10.0, 0.5, 1.0, 8.0, 2.0, 1.0],
    [10.0, 20.0, 0.5, 1.0, 7.0, 2.0, 1.0],
    [10.0, 20.0, 0.5, 1.0, 8.0, 2.0, 1.0],
]

Here is how to do it (rest of your code unchanged):

table1[["table", "p", "x"]] = pd.DataFrame(
    table1.apply(
        lambda x: calculate_table(
            x["a"], x["b"], x["c"], x["d"], x["e"], x["f"], x["g"], 10_000
        ),
        axis=1,
    ).tolist()
)
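
Why this works: with axis=1, apply() passes each row to the function as a single Series, so calling apply(calculate_table, axis=1) directly hands the function one argument instead of seven. The lambda unpacks the row, apply() collects the returned tuples in a Series, and tolist() plus pd.DataFrame() spread each tuple over three columns. The sketch below shows those steps on a toy function; summarize and the column names total and product are made up for illustration and stand in for calculate_table() and its outputs.

```python
import pandas as pd


def summarize(a: float, b: float):
    """Toy stand-in for calculate_table: one table plus two scalars."""
    table = pd.DataFrame({"value": [a, b, a + b]})
    return table, a + b, a * b


df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

# With axis=1, apply passes each row to the function as ONE Series,
# so the lambda unpacks it into separate arguments.
tuples = df.apply(lambda row: summarize(row["a"], row["b"]), axis=1)

# tuples is a Series whose cells are 3-tuples; tolist() turns it into a
# list of tuples, and pd.DataFrame() spreads each tuple over 3 columns.
expanded = pd.DataFrame(tuples.tolist(), index=df.index)

# Assign the three expanded columns to three new column names at once.
df[["table", "total", "product"]] = expanded
print(df[["a", "b", "total", "product"]])
```

Passing index=df.index keeps the expanded frame aligned with the original rows, which matters if the original frame does not use the default 0..n-1 index.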

And then:

print(table1)

Output:

Here is how to access the data in column table. For instance, to get column m of the first DataFrame in that column:

print(table1.loc[0, "table"]["m"])

Output:

0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
5    5.0
6    6.0
7    7.0
Name: m, dtype: float64
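
Note that the access pattern from the question, table1["table"]["w"][<index>], does not work as written: table1["table"] is a Series of DataFrames, so the outer row has to be selected before indexing into the nested table. The sketch below illustrates this with two small stand-in tables (inner0, inner1 and outer are hypothetical names with made-up values), and also shows one possible way to stack all nested tables into a single frame.

```python
import pandas as pd

# Two small stand-in tables stored in an object column, mimicking the
# "table" column built in the answer (values are made up for illustration).
inner0 = pd.DataFrame({"m": [0, 1, 2], "w": [10.0, 11.0, 12.0]})
inner1 = pd.DataFrame({"m": [0, 1], "w": [20.0, 21.0]})
outer = pd.DataFrame({"a": [10.0, 20.0]})
outer["table"] = [inner0, inner1]

# outer["table"] is a Series of DataFrames, so select the outer row first,
# then index into the nested table.
first = outer.loc[0, "table"]   # nested DataFrame of outer row 0
print(first["w"])               # whole column w of that table
print(first.loc[2, "w"])        # single value: inner row 2, column w

# Stack all nested tables into one frame keyed by the outer row index.
stacked = pd.concat(outer["table"].tolist(), keys=outer.index)
print(stacked.loc[(1, 0), "w"])  # outer row 1, inner row 0
```

The stacked frame has a two-level index (outer row, inner row), which can be more convenient than nested tables when the results need to be filtered or aggregated across rows.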

Answered By - Laurent

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
