Issue
I am using AraBERT (a pre-trained BERT for the Arabic language) for binary classification with the labels "true" and "false". I am trying to change the labels from "true" and "false" to 0 and 1, and I used this code:
import pandas as pd
Data=pd.read_csv("/content/500-instances.csv")
DATA_COLUMN = 'sent'
LABEL_COLUMN = 'label'
Data.columns = [DATA_COLUMN, LABEL_COLUMN]
label_map = {
    'fake': 0,
    'true': 1
}
Data[DATA_COLUMN] = Data[DATA_COLUMN].apply(lambda x: preprocess(x, do_farasa_tokenization=False, use_farasapy = False))
Data[LABEL_COLUMN] = Data[LABEL_COLUMN].apply(lambda x: label_map[x])
I get an error on the last line: KeyError: 'true '
Do you have any solution? Thanks in advance.
Solution
There is a trailing space in 'true ', which is why there is no match in label_map. Try:
Data[LABEL_COLUMN] = Data[LABEL_COLUMN].apply(lambda x: label_map[x.strip()])
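To see why the lookup fails, it can help to print the raw label values with repr(), which makes stray whitespace visible. The snippet below is only a minimal sketch: the small DataFrame is a made-up stand-in for the CSV, and the lowercase names (data, raw_labels, encoded) are illustrative. It also shows a vectorized alternative to the apply call, using the pandas string accessor followed by map.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; note the trailing space in 'true '.
data = pd.DataFrame({
    "sent": ["first sentence", "second sentence", "third sentence"],
    "label": ["fake", "true ", "true"],
})
label_map = {"fake": 0, "true": 1}

# repr() wraps each value in quotes, so hidden whitespace shows up.
raw_labels = [repr(value) for value in data["label"].unique()]
print(raw_labels)

# Strip surrounding whitespace on the whole column, then map to integers.
data["label"] = data["label"].str.strip().map(label_map)
encoded = data["label"].tolist()
print(encoded)
```

Note that map leaves NaN for any value that is still missing from label_map instead of raising KeyError, so it is worth checking the column for missing values afterwards.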
EDIT
If you are not sure what Data[LABEL_COLUMN] contains, I would suggest catching unknown values with a default output value using label_map.get(x.strip(), <DEFAULT_VALUE>).
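As a rough sketch of the default-value approach: the sentinel -1, the helper name encode_label, and the sample rows below are assumptions chosen for illustration, not anything from the original data. Using a sentinel makes it easy to list the rows that did not match so they can be inspected rather than silently mislabeled.

```python
import pandas as pd

label_map = {"fake": 0, "true": 1}
DEFAULT_VALUE = -1  # assumed sentinel for labels that are not in label_map


def encode_label(value, default=DEFAULT_VALUE):
    """Strip surrounding whitespace, then look the label up with a fallback."""
    return label_map.get(str(value).strip(), default)


# Hypothetical sample with stray whitespace and one unexpected label.
data = pd.DataFrame({"label": ["fake", "true ", " true", "unknown"]})
data["label_id"] = data["label"].apply(encode_label)

label_ids = data["label_id"].tolist()
# Collect the raw labels that fell back to the default for inspection.
unmatched = data.loc[data["label_id"] == DEFAULT_VALUE, "label"].tolist()
print(label_ids)
print(unmatched)
```

Rows carrying the sentinel should probably be dropped or fixed before training, since the classifier expects only 0 and 1.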
Answered By - rjg