Issue
I have a numpy array that has a shape of (500, 151296). Below is the array format:
array:
array([[-0.18510018, 0.13180602, 0.32903048, ..., 0.39744213,
-0.01461623, 0.06420607],
[-0.14988784, 0.12030973, 0.34801325, ..., 0.36962894,
0.04133283, 0.04434045],
[-0.3080041 , 0.18728344, 0.36068922, ..., 0.09335024,
-0.11459247, 0.10187756],
...,
[-0.17399777, -0.02492459, -0.07236133, ..., 0.08901921,
-0.17250113, 0.22222663],
[-0.17399777, -0.02492459, -0.07236133, ..., 0.08901921,
-0.17250113, 0.22222663],
[-0.17399777, -0.02492459, -0.07236133, ..., 0.08901921,
-0.17250113, 0.22222663]], dtype=float32)
array[0]:
array([-0.18510018, 0.13180602, 0.32903048, ..., 0.39744213,
-0.01461623, 0.06420607], dtype=float32)
I have another list of stopwords with one entry per row of the numpy array:
stopwords = ['no', 'not', 'in' .........]
I want to append each stopword to the corresponding row of the numpy array, which has 500 rows. Below is the code that I am using:
for i in range(len(stopwords)):
    array = np.append(array[i], str(stopwords[i]))
I am getting the following error:
IndexError Traceback (most recent call last)
<ipython-input-45-361e2cf6519b> in <module>
1 for i in range(len(stopwords)):
----> 2 array = np.append(array[i], str(stopwords[i]))
IndexError: index 2 is out of bounds for axis 0 with size 2
Desired output:
array[0]:
array([-0.18510018, 0.13180602, 0.32903048, ..., 0.39744213,
-0.01461623, 0.06420607, 'no'], dtype=float32)
Can anyone tell me what I am doing wrong?
Solution
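The immediate problem is that you overwrite the variable array inside the for loop:
for i in range(len(stopwords)):
    array = np.append(array[i], str(stopwords[i]))
#   ^^^^^             ^^^^^
After the first iteration, array is no longer the (500, 151296) matrix but a 1-D array holding row 0 plus one stopword. In the second iteration array[1] is a single element, so the new array has only 2 elements. In the third iteration array[2] no longer exists, hence "index 2 is out of bounds for axis 0 with size 2".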
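To see the overwrite happen, here is a small reproduction that records the shape of array after every iteration. The (3, 4) array of zeros and the three stopwords are stand-ins I made up for the real (500, 151296) data; only the loop body is taken from the question.

```python
import numpy as np

# Stand-in data (assumption): 3 rows instead of 500, 4 columns instead of 151296.
array = np.zeros((3, 4), dtype=np.float32)
stopwords = ["no", "not", "in"]

shapes = [array.shape]  # shape before the loop starts
error = None
try:
    for i in range(len(stopwords)):
        # Same statement as in the question: rebinds `array` on every pass.
        array = np.append(array[i], str(stopwords[i]))
        shapes.append(array.shape)
except IndexError as exc:
    error = exc

print(shapes)  # [(3, 4), (5,), (2,)]
print(error)
```

The shapes shrink from the 2-D matrix to a 1-D row plus one word, then to two elements, which is why the third iteration fails.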
A second problem is calling np.append inside a for loop, which is almost always a bad idea because every call copies the data into a new array.
You could instead do something like:
from string import ascii_letters
from random import choices
import numpy as np
N, M = 50, 7
arr = np.random.randn(N, M)
stopwords = np.array(["".join(choices(ascii_letters, k=10)) for _ in range(N)])
result = np.concatenate([arr, stopwords[:, None]], axis=-1)
assert result.shape == (N, M+1)
print(result[0]) # ['0.1' '-1.2' '-0.1' '1.6' '-1.4' '-0.2' '1.7' 'ybWyFlqhcS']
But this is also questionable: it mixes data types for no apparent reason, and every float in result ends up stored as a string.
IMHO, you are better off just keeping the two arrays separate.
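A minimal sketch of what the mixing costs, assuming a tiny made-up pair of arrays (vectors and words are illustrative names, not from the question): once concatenated, the whole array has a string dtype, and the numbers have to be converted back before any arithmetic.

```python
import numpy as np

# Illustrative data (assumption): two short float32 vectors and two words.
vectors = np.array([[0.5, -1.25], [2.0, 3.0]], dtype=np.float32)
words = np.array(["no", "not"])

# Same pattern as above: add the words as one extra column.
mixed = np.concatenate([vectors, words[:, None]], axis=-1)

print(mixed.dtype.kind)  # 'U' : every element is now a unicode string

# To compute with the numbers again they must be parsed back from text.
recovered = mixed[:, :-1].astype(np.float32)
```

Keeping vectors and words as two arrays avoids this round trip through strings entirely.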
Depending on what you are doing, you can iterate over both of them together as follows:
for vector, stopword in zip(arr, stopwords):
    print(f"{stopword = }")
    print(f"{vector = }")
# stopword = 'RgfTVGzPOl'
# vector = array([-0.9, 1.1, 0.7 , -0.3 , -0.7 , -0.7, -0.6])
#
# stopword = 'XlJqKdsvCC'
# vector = array([-0.5, 0.1, -0.7 , -0.6, -1.1, -0.6, -0.6])
#
#...
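If you need to look up a vector by its stopword rather than iterate, one possible option is a plain dict built from the same zip. This is only a sketch: it assumes the stopwords are unique (duplicates would silently overwrite each other), and the small random array and the name vector_by_word are made up for illustration.

```python
import numpy as np

# Illustrative data (assumption): 3 rows of 4 float32 values.
rng = np.random.default_rng(0)
arr = rng.standard_normal((3, 4)).astype(np.float32)
stopwords = ["no", "not", "in"]

# Map each stopword to its row; the rows stay numeric float32 arrays.
vector_by_word = dict(zip(stopwords, arr))

print(vector_by_word["not"].dtype)  # float32
```

The values are views of the rows of arr, so no data is copied and the dtype is untouched.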
Answered By - paime