Issue
I have a dataset containing values such as Fe2O3, Au, LiO2, Au-Fe3O4 Cu@CuFe 2O3 and so on. Complex formulas contain separators: -, / and @. I need to split these values strictly on those separators and then join the parts, so that, for example, Fe-CuO2 becomes FeCuO2. Here is how I tried to do it:
for formula in df['Core']:
    if formula.isalnum() == False:
        line = re.split("-,@,/", formula)  # split using regular expressions
        comp1 = ''
        count = 0
        while count <= len(line):
            for i in line:
                count += 1
                comp1 += i
        df['Core'] = comp1
However, as a result all values in the column became empty; when I check the column, the array is empty.
Does assigning values one by one in a loop go wrong in pandas? Is looping not possible in pandas? How do I correctly check every value in a column for the separators (@, /, -) and replace the values that contain them with versions without separators? Please let me know the correct way to do this; I don't understand how to assign a new value to each row of a column.
I tried using a loop and regular expressions, but something went wrong: all the values of one row turned into a single digit, and another became empty. How do you change the values in a pandas column row by row?
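One likely reason the attempt above never splits anything: the pattern "-,@,/" is read as a single literal five-character string, not as a list of three separators. In addition, assigning a plain string to df['Core'] writes that one string into every row of the column. The following is a minimal sketch using only the standard `re` module; the sample formula and the variable names are illustrative, not taken from the original dataset.

```python
import re

formula = "Au-Fe3O4"

# The original pattern is the literal string "-,@,/", which never
# occurs in the formula, so re.split returns the input unchanged.
literal_parts = re.split("-,@,/", formula)

# A character class matches any single one of the three separators.
class_parts = re.split(r"[-/@]", formula)

# Joining the pieces gives the separator-free formula.
joined = "".join(class_parts)

print(literal_parts)  # ['Au-Fe3O4']
print(class_parts)    # ['Au', 'Fe3O4']
print(joined)         # AuFe3O4
```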
Solution
IIUC, you can try:
df["Cleaned"] = df["Core"].str.replace(
    r"[a-zA-Z0-9-/@]+",
    lambda g: g.group(0).replace("-", "").replace("/", "").replace("@", ""),
    regex=True,
)
print(df)
Prints:
Core | Cleaned |
---|---|
Fe2O3 | Fe2O3 |
Au | Au |
LiO2 | LiO2 |
Au-Fe3O4 | AuFe3O4 |
Cu@CuFe | CuCuFe |
2O3 | 2O3 |
Fe2O3, Au, LiO2, Au-Fe3O4 Cu@CuFe 2O3 | Fe2O3, Au, LiO2, AuFe3O4 CuCuFe 2O3 |
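If only the three separators need to go, a shorter variant may be enough. The sketch below is an alternative, not the answer's own method: it uses a character class with `Series.str.replace` to drop the separators, and `Series.str.split` to keep the individual parts. The sample DataFrame and the `Parts` column name are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data modelled on the formulas in the question.
df = pd.DataFrame(
    {"Core": ["Fe2O3", "Au", "LiO2", "Au-Fe3O4", "Cu@CuFe2O3", "Fe/CuO2"]}
)

# Split each formula on any one of the separators -, / or @.
# A pattern longer than one character is treated as a regular expression.
df["Parts"] = df["Core"].str.split(r"[-/@]")

# Remove the separators to get the joined form, e.g. Fe/CuO2 -> FeCuO2.
df["Cleaned"] = df["Core"].str.replace(r"[-/@]", "", regex=True)

print(df)
```

Both calls operate on the whole column at once, so no explicit loop or per-row assignment is needed.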
Answered By - Andrej Kesely