Issue
Here is what I have:
long_list = a very long list of integer values (6M+ entries)
wanted_list = a list of integer values that are of interest (70K entries)
What I need:
mask_list = a list of booleans of the same length as long_list, describing whether each element of long_list is present in wanted_list (i.e. [is long_list[0] in wanted_list?, is long_list[1] in wanted_list?, ...]). The number of True entries in this list should be the same as len(wanted_list).
I have working code that uses a for loop, but as expected it is far too slow for lists of this length (it takes several minutes to run):
masklist = []
for element in long_list:
    if element in wanted_list:
        masklist.append(True)
    else:
        masklist.append(False)
Is there a more elegant and faster way to achieve this? I looked into the numpy.ma module, but could not think of an elegant way to apply it to this problem.
Solution
You can use numpy.isin for this:
import numpy as np
masklist = np.isin(long_list, wanted_list)
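As a rough illustration of how this behaves, here is a minimal sketch on a tiny stand-in for the real data. The sample values and the names mask, wanted_set and mask_py are made up for the example. It also shows a pure-Python alternative that is not part of the original answer: converting wanted_list to a set first, so that each membership test is a hash lookup instead of a scan through 70K entries, which is the likely reason the original loop is slow.

```python
import numpy as np

# Tiny stand-ins for the real data (hypothetical values for illustration)
long_list = [5, 3, 8, 3, 1, 9]
wanted_list = [3, 9, 4, 7]

# Vectorized membership test; the result is a boolean NumPy array
mask = np.isin(long_list, wanted_list)

# Pure-Python alternative: set lookups take constant time on average,
# whereas `element in wanted_list` scans the list on every iteration
wanted_set = set(wanted_list)
mask_py = [element in wanted_set for element in long_list]

print(mask.tolist())  # [False, True, False, True, False, True]
print(mask_py)        # [False, True, False, True, False, True]
```

np.isin returns a NumPy array rather than a list; call .tolist() on it if a plain list of booleans is needed. Note also that the number of True entries (3 here) only equals len(wanted_list) (4 here) when every wanted value occurs exactly once in long_list.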
Answered By - Jab