Issue
My task is as follows:
Each row of labels represents the IDs of students who received awards in a certain competition. Each competition has a different weighting factor, given by weights. The scores of the students are calculated as the weighted number of awards they received. What I have so far is:
import numpy as np
labels = np.array([
[0,1,5],
[0,1,3],
[2,4,5]])
weights = np.array([
[1],
[2],
[4]])
c_labels = np.concatenate(labels)
c_weights = np.concatenate(np.broadcast_to(weights,(3,3)))
uni, inv = np.unique(c_labels,return_inverse=True)
scores = np.zeros(len(uni))
for i in range(len(uni)):
    # inv holds positions within uni, so index by position rather than by label value
    scores[i] = np.sum(c_weights[inv == i])
print(scores)
In the above, I have simplified my task to a student scoring problem to explain it clearly. This is part of a bigger task, and I need to generate the unique array of the labels and the corresponding scores array to proceed.
The actual data is huge and I need to repeat this many times, so I can't use for loops because they take too long. I would like to switch to a more efficient approach (i.e. replace the final for loop with an array operation), but I don't know how. Any suggestions would be appreciated.
Solution
Use np.bincount with its weights argument, which sums c_weights over each distinct value of inv:
scores = np.bincount(inv, weights = c_weights)
Answered By - Daniel F
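For reference, here is a minimal end-to-end sketch that combines the question's setup with the np.bincount call and checks it against a loop-based result. The use of ravel() in place of np.concatenate, and the names expected, sparse_labels, uni2, inv2 and scores2, are illustrative choices and not part of the original post.

```python
import numpy as np

labels = np.array([[0, 1, 5],
                   [0, 1, 3],
                   [2, 4, 5]])
weights = np.array([[1], [2], [4]])

# Flatten the labels and repeat each row's weight once per label in that row.
c_labels = labels.ravel()
c_weights = np.broadcast_to(weights, labels.shape).ravel()

# inv[k] is the position of c_labels[k] within uni.
uni, inv = np.unique(c_labels, return_inverse=True)

# bincount sums c_weights over each distinct value of inv, with no Python loop.
scores = np.bincount(inv, weights=c_weights)

# Loop-based reference result, indexed by position in uni.
expected = np.array([c_weights[inv == i].sum() for i in range(len(uni))])
assert np.allclose(scores, expected)

print(uni)     # [0 1 2 3 4 5]
print(scores)  # [3. 3. 4. 2. 4. 5.]

# Hypothetical non-contiguous IDs: because bincount runs on inv rather than on
# the raw labels, the result stays aligned with uni and has no empty bins.
sparse_labels = labels * 10
uni2, inv2 = np.unique(sparse_labels.ravel(), return_inverse=True)
scores2 = np.bincount(inv2, weights=c_weights)
```

Note that np.bincount returns a float array when weights is given, so scores holds floats even though the weights here are integers.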