Issue
I have the two functions below:
import pandas as pd
import random
import string

def random_alphanumeric(length, hyphen_interval=4):
    characters = string.ascii_letters + string.digits
    random_value = ''.join(random.choice(characters) for _ in range(length))
    return '-'.join(random_value[i:i + hyphen_interval] for i in range(0, len(random_value), hyphen_interval))

def process_excel(xl_input_file, xl_output_file):
    df = pd.read_excel(xl_input_file)
    df['Value'] = pd.to_numeric(df['Value'], errors='coerce')
    updated_rows = []
    for index, row in df.iterrows():
        updated_rows.append(pd.DataFrame([row], columns=df.columns))
        if row['Value'] != 0:
            new_row = row.copy()
            new_row['Value'] = -row['Value']
            updated_rows.append(pd.DataFrame([new_row], columns=df.columns))
            new_row['ID'] = random_alphanumeric(16, hyphen_interval=4)
            new_row['gla'] = '2100-abc'
    updated_df = pd.concat(updated_rows, ignore_index=True)
    updated_df.to_excel(xl_output_file, index=False)

# Example usage
xl_input_file = 'input.xlsx'
xl_output_file = 'updated_file.xlsx'
process_excel(xl_input_file, xl_output_file)
Functionality of functions
1- I have an Excel file with the columns `ID`, `gla`, `Value`, and 4 more columns.
2- The `Value` column holds numeric data that can be positive or negative, e.g. a row could contain 23245 or -7989.
3- The `process_excel()` function is supposed to convert each positive value to negative and vice versa: for the example given in point 2, 23245 becomes -23245 and -7989 becomes 7989. The function does this conversion correctly.
My challenge is that I also want to write a randomly generated value to the `ID` column and a hard-coded value to the `gla` column. I have written a reusable function called `random_alphanumeric()` that generates an alphanumeric value, and I call it within `process_excel()`. But it doesn't seem to have any effect on the new Excel file that is generated. Please suggest what the issue in my code could be.
Note: the remaining 4 columns will be empty.
Solution
Can you explain why you need to loop through the rows?
In your example above, `new_row` is a copy, and it is appended to `updated_rows` before `ID` and `gla` are assigned, so those assignments never make it into `updated_df`.
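If you do want to keep the loop, a minimal sketch of the fix is to assign `ID` and `gla` before appending the copied row. The function name `mirror_rows_with_loop` is made up for illustration, and it works on an in-memory DataFrame instead of an Excel file so it can be run as-is; this is one possible arrangement, not necessarily the only one.

```python
import random
import string

import pandas as pd


def random_alphanumeric(length, hyphen_interval=4):
    characters = string.ascii_letters + string.digits
    random_value = "".join(random.choice(characters) for _ in range(length))
    return "-".join(
        random_value[i : i + hyphen_interval]
        for i in range(0, len(random_value), hyphen_interval)
    )


def mirror_rows_with_loop(df):
    """Hypothetical helper: keep every row and add a sign-flipped copy of non-zero rows."""
    df = df.copy()
    df["Value"] = pd.to_numeric(df["Value"], errors="coerce")
    updated_rows = []
    for _, row in df.iterrows():
        updated_rows.append(row)
        if row["Value"] != 0:
            new_row = row.copy()
            new_row["Value"] = -row["Value"]
            # Assign ID and gla BEFORE appending, so the stored row carries them.
            new_row["ID"] = random_alphanumeric(16, hyphen_interval=4)
            new_row["gla"] = "2100-abc"
            updated_rows.append(new_row)
    return pd.DataFrame(updated_rows, columns=df.columns).reset_index(drop=True)


sample = pd.DataFrame({"ID": ["a", "b"], "gla": ["g1", "g2"], "Value": [7, 0]})
print(mirror_rows_with_loop(sample))
```

The only change that matters compared with the question's loop is the order of the statements inside the `if` block.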
You can do this without looping, I think:
import pandas as pd
import random
import string

def random_alphanumeric(length, hyphen_interval=4):
    characters = string.ascii_letters + string.digits
    random_value = "".join(random.choice(characters) for _ in range(length))
    return "-".join(
        random_value[i : i + hyphen_interval]
        for i in range(0, len(random_value), hyphen_interval)
    )

df = pd.DataFrame(
    {
        "Value": [10, 0, 0, 22, -5],
    }
)
df["Value"] = -pd.to_numeric(df["Value"], errors="coerce")
df["ID"] = df.apply(lambda x: random_alphanumeric(16, hyphen_interval=4), axis=1)
df["gla"] = "2100-abc"
print(df)
Output:
   Value                   ID       gla
0    -10  Q1T2-0CxA-er3b-Zi5U  2100-abc
1      0  I7LL-bj56-uInj-aRpl  2100-abc
2      0  53qZ-CSAE-Cwsq-Fkbk  2100-abc
3    -22  3g6V-GYPM-VwAG-qKwd  2100-abc
4      5  Vbn1-EINt-A8c8-k89w  2100-abc
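Note that the snippet above flips the sign in place. If the goal is to keep each original row and add a sign-flipped row after it (as the loop in the question does), a loop-free sketch could look like the following. The function name `add_mirrored_rows` is an assumption for illustration, and it uses a small in-memory DataFrame instead of the Excel file.

```python
import random
import string

import pandas as pd


def random_alphanumeric(length, hyphen_interval=4):
    characters = string.ascii_letters + string.digits
    random_value = "".join(random.choice(characters) for _ in range(length))
    return "-".join(
        random_value[i : i + hyphen_interval]
        for i in range(0, len(random_value), hyphen_interval)
    )


def add_mirrored_rows(df):
    """Hypothetical helper: original rows plus a sign-flipped copy of each non-zero row."""
    df = df.copy()
    df["Value"] = pd.to_numeric(df["Value"], errors="coerce")

    # Rows whose Value is not 0 get a mirrored copy (same condition as the question's loop).
    mirrored = df[df["Value"].ne(0)].copy()
    mirrored["Value"] = -mirrored["Value"]
    mirrored["ID"] = [random_alphanumeric(16) for _ in range(len(mirrored))]
    mirrored["gla"] = "2100-abc"

    # Both frames still share the original index labels, so a stable sort on
    # the index places each mirrored row directly after its source row.
    combined = pd.concat([df, mirrored]).sort_index(kind="mergesort")
    return combined.reset_index(drop=True)


df_in = pd.DataFrame(
    {"ID": ["a", "b", "c"], "gla": ["g1", "g2", "g3"], "Value": [10, 0, -5]}
)
print(add_mirrored_rows(df_in))
```

The result can then be written out with `to_excel(xl_output_file, index=False)` as in the question.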
Answered By - Nathan