Issue
I have an array of strings, arr, in which I want to search for an element and get its index. NumPy has a where method that searches for elements and returns their indices as a tuple.
arr = numpy.array(["string1","string2","string3"])
print(numpy.where(arr == "string1"))
It prints:
(array([0], dtype=int64),)
But I only want the index number, 0.
I tried this:
i = numpy.where(arr == "string1")
print("idx = {}".format(i[0]))
which has output:
idx = [0]
Is there any way to get the index number without using a replace or slicing method?
Solution
TL;DR
Use:
try:
    i = numpy.where(arr == "string1")[0][0]
except IndexError:
    # handle the case where "string1" was not found in arr
    i = None
or
indices = list(numpy.where(arr == "string1")[0])
Details
Finding elements in NumPy arrays is not intuitive the first time you try to do it.
Let's decompose the operation:
>>> arr = numpy.array(["string1","string2","string3"])
>>> arr == "string1"
array([ True, False, False])
Notice how just doing arr == "string1" is already doing the search: it returns an array of booleans of the same shape as arr, telling us where the condition is true.
Then, you're using numpy.where which, when used with only one parameter (the condition), returns the indices where its input is non-zero. With booleans, that means not False.
>>> numpy.where(numpy.array([ True, False, False]))
(array([0], dtype=int64),)
>>> numpy.where(arr == "string1")
(array([0], dtype=int64),)
It's not quite clear to me why this gives you a tuple of arrays for a 1-D input, but when you use this syntax with a 2-D input, it makes more sense: the tuple holds one index array per dimension.
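To see why a tuple comes back, here is a small sketch with a 2-D input. The array grid and its contents are made up for illustration and are not from the question.

```python
import numpy

# A small 2-D array of strings (illustrative data, not from the question).
grid = numpy.array([["a", "b"],
                    ["b", "a"]])

# For a 2-D input, numpy.where returns one index array per dimension.
rows, cols = numpy.where(grid == "b")

# Pair them up to get the (row, column) coordinates of each match.
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)  # [(0, 1), (1, 0)]
```

With a 1-D input there is only one dimension, so the tuple has a single element, which is why the extra [0] is needed.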
In any case, what you're getting here is a tuple containing a list of indices where the condition matches. Notice it has to be a list, because you might have multiple matches.
For your code, you want numpy.where(arr == "string1")[0][0], because you know "string1" occurs in the list, but the inner list may also contain zero or more than one value, depending on how many times the string is found.
>>> arr2 = numpy.array(["string1","string2","string3","string1", "string3"])
>>> numpy.where(arr2 == "foo")
(array([], dtype=int64),)
>>> numpy.where(arr2 == "string3")
(array([2, 4], dtype=int64),)
So when you want to use these indices, you should simply treat numpy.where(arr == "string1")[0] as a list (it's really a 1-D array, though) and continue from there.
Now, just using numpy.where(arr == "some string")[0][0] is risky, because it will throw an IndexError exception if the string is not found in arr. If you really want to do that, do it in a try/except block.
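One way to package that try/except is a small helper. This is only a sketch: first_index is a hypothetical name, not a NumPy function, and returning None for a missing value is an assumption about what "handling" should mean in your code.

```python
import numpy

def first_index(arr, value):
    """Return the index of the first match of value in arr, or None if absent.

    first_index is a hypothetical helper name, not part of NumPy.
    """
    try:
        # [0] picks the index array for the only dimension, [0] its first entry.
        return int(numpy.where(arr == value)[0][0])
    except IndexError:
        # No match: the index array was empty.
        return None

arr2 = numpy.array(["string1", "string2", "string3", "string1", "string3"])
print(first_index(arr2, "string3"))  # 2
print(first_index(arr2, "foo"))      # None
```

The int(...) call converts the NumPy integer to a plain Python int, so the result prints as 2 rather than as a one-element array.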
If you need the list of indices as a Python list, you can do this:
indices = list(numpy.where(arr == "string1")[0])
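A related detail, shown as a sketch: list(...) keeps the elements as NumPy integer scalars, while the array's tolist() method converts them to plain Python ints. The variable names below are only for illustration.

```python
import numpy

arr2 = numpy.array(["string1", "string2", "string3", "string1", "string3"])
matches = numpy.where(arr2 == "string3")[0]

as_list = list(matches)     # Python list whose elements are NumPy integers
as_ints = matches.tolist()  # Python list whose elements are plain Python ints

print(as_ints)  # [2, 4]
print(isinstance(as_list[0], numpy.integer), isinstance(as_ints[0], int))  # True True
```

Either form works for indexing; tolist() can be handier if the indices are later serialized or compared by type.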
Answered By - joanis