Issue
I have a dataframe called _df1 that looks like this. Please note that this is only part of the dataframe, not the whole thing.
_df1:
frame id x1 y1 x2 y2
1 1 1363 569 103 241
2 1 1362 568 103 241
3 1 1362 568 103 241
4 1 1362 568 103 241
964 5 925 932 80 255
965 5 925 932 79 255
966 5 925 932 79 255
967 5 924 932 80 255
968 5 924 932 79 255
16 6 631 761 100 251
17 6 631 761 100 251
18 6 631 761 100 251
19 6 631 761 100 251
20 6 631 761 100 251
21 6 631 761 100 251
88 7 623 901 144 123
89 7 623 901 144 123
90 7 623 901 144 123
91 7 623 901 144 123
92 7 623 901 144 123
93 7 623 901 144 123
94 7 623 901 144 123
The full dataframe has 108003 rows and 141 unique IDs. An ID represents a specific object, and the ID is repeated for every frame in which that object appears. In other words, my data has 141 different objects and 108003 frames. I wrote code to identify cases where the same object has been labelled with a different ID. The result is saved in another dataframe called _df2, which looks like this. Again, this is only part of the dataframe, not the whole thing.
_df2:
indexID matchID
4 5
6 7
8 9
12 13
18 19
20 21
.
.
.
The second dataframe shows which IDs have been wrongly classified as a different object. That is, the ID in 'matchID' is actually the same object as the ID in 'indexID'. The 'indexID' values in _df2 correspond to 'id' values in _df1.
Taking the first row of _df2 as an example, it says that IDs 4 and 5 are the same object. Therefore, in _df1, I need to change the 'id' of every frame with 'id' 5 to 4. Below is an example of what the final table should look like, since 5 has to be reclassified as 4 and 7 has to be reclassified as 6.
Output:
frame id x1 y1 x2 y2
1 1 1363 569 103 241
2 1 1362 568 103 241
3 1 1362 568 103 241
4 1 1362 568 103 241
964 4 925 932 80 255
965 4 925 932 79 255
966 4 925 932 79 255
967 4 924 932 80 255
968 4 924 932 79 255
16 6 631 761 100 251
17 6 631 761 100 251
18 6 631 761 100 251
19 6 631 761 100 251
20 6 631 761 100 251
21 6 631 761 100 251
88 6 623 901 144 123
89 6 623 901 144 123
90 6 623 901 144 123
91 6 623 901 144 123
92 6 623 901 144 123
93 6 623 901 144 123
94 6 623 901 144 123
Solution
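Use replace with a dictionary that maps each matchID to its indexID (so 5 becomes 4, 7 becomes 6, and so on):
_df1['id'] = _df1['id'].replace(dict(zip(_df2['matchID'], _df2['indexID'])))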
Answered By - BENY