Issue
I have this dataframe:
+------+--------------+------------+
| ID | Education | Score |
+------+--------------+------------+
| 1 | High School | 7.884 |
| 2 | Bachelors | 6.952 |
| 3 | High School | 8.185 |
| 4 | High School | 6.556 |
| 5 | Bachelors | 6.347 |
| 6 | Master | 6.794 |
+------+--------------+------------+
I want to create a new column which is categorizing the score column. I want to label it as: 'bad', 'good', 'very good'.
Which maybe would look like this:
+------+--------------+------------+------------+
| ID | Education | Score | Labels |
+------+--------------+------------+------------+
| 1 | High School | 7.884 | Good |
| 2 | Bachelors | 6.952 | Bad |
| 3 | High School | 8.185 | Very good |
| 4 | High School | 6.556 | Bad |
| 5 | Bachelors | 6.347 | Bad |
| 6 | Master | 6.794 | Bad |
+------+--------------+------------+------------+
How can I do that?
Thanks in advance
Solution
import pandas as pd
# initialize list of lists
data = [[1,'High School',7.884], [2,'Bachelors',6.952], [3,'High School',8.185], [4,'High School',6.556],[5,'Bachelors',6.347],[6,'Master',6.794]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['ID', 'Education', 'Score'])
df['Labels'] = ['Bad' if x<7.000 else 'Good' if 7.000<=x<8.000 else 'Very Good' for x in df['Score']]
df
ID Education Score Labels
0 1 High School 7.884 Good
1 2 Bachelors 6.952 Bad
2 3 High School 8.185 Very Good
3 4 High School 6.556 Bad
4 5 Bachelors 6.347 Bad
5 6 Master 6.794 Bad
Answered By - Prathik Kini
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.