Issue
Consider the following MWE:
import numpy as np

def create_weight_matrix(nrows, ncols):
    """Create a weight matrix with normally distributed random elements."""
    return np.random.default_rng().normal(loc=0, scale=1/(nrows*ncols), size=(nrows, ncols))

def create_bias_vector(length):
    """Create a bias vector with normally distributed random elements."""
    return create_weight_matrix(length, 1)

if __name__ == "__main__":
    num_samples = 100
    num_features = 5
    W = create_weight_matrix(4, num_features)
    b = create_bias_vector(4)
    x = np.random.rand(num_samples, num_features)
    y = W.dot(x[1])
    print(y.shape)
    t = y + b
    print(t.shape)
The intermediate variable y is computed correctly as the dot product of W and x[1], and the output is a vector of the form [ 0.06678158 0.02322523 0.09542323 -0.05746891]. I then try to add the vector b, and I would expect this to be performed element-wise, but for some reason the code adds every element of b to every element of y, creating a 4x4 matrix instead of a 4x1 vector. What am I missing here?
Solution
b has shape (4, 1) (a column vector) and y has shape (4,) (a 1-D array, which broadcasting treats as a row of shape (1, 4)). When you add them together, they are broadcast to an array of shape (4, 4). You can avoid this by explicitly reshaping one of the vectors to the same shape as the other.
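To see why the shapes combine this way, here is a minimal sketch of the broadcasting step. The values are made up for illustration (simple ranges rather than the random W and x from the question), and np.broadcast_shapes assumes NumPy 1.20 or newer.

```python
import numpy as np

# Illustrative stand-ins for the question's y and b (values are arbitrary).
y = np.arange(4.0)                # shape (4,): a 1-D array
b = np.arange(4.0).reshape(4, 1)  # shape (4, 1): a column vector

# Broadcasting aligns shapes from the right, so (4,) is treated as (1, 4).
# Each size-1 axis is then stretched to match the other operand.
t = y + b
print(t.shape)                                # (4, 4)
print(np.broadcast_shapes(y.shape, b.shape))  # (4, 4)

# Every entry is one element of b plus one element of y.
assert t[2, 3] == b[2, 0] + y[3]
```

So t[i, j] holds b[i] + y[j], which is the "all elements of b added to all elements of y" behaviour described in the question.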
For a column vector of shape (4, 1), use either of the following lines:
t = b + y.reshape(b.shape)
t = b + y[:, None]
Here y[:, None] adds a new axis to the 1-D array.
Or, for a 1-D result of shape (4,):
t = b.reshape(y.shape) + y
t = b.squeeze() + y
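As a quick sanity check, the sketch below runs all four variants on stand-in arrays and confirms that they give the expected shapes and the same element-wise sums. The fixed seed and the randomly drawn y and b are assumptions for illustration, not the values from the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
y = rng.normal(size=4)          # shape (4,), like W.dot(x[1])
b = rng.normal(size=(4, 1))     # shape (4, 1), like create_bias_vector(4)

# Column-vector results, shape (4, 1)
col_a = b + y.reshape(b.shape)
col_b = b + y[:, None]

# 1-D results, shape (4,)
flat_a = b.reshape(y.shape) + y
flat_b = b.squeeze() + y

print(col_a.shape, col_b.shape)    # (4, 1) (4, 1)
print(flat_a.shape, flat_b.shape)  # (4,) (4,)

# All four hold the same element-wise sums, just laid out differently.
assert np.allclose(col_a, col_b)
assert np.allclose(flat_a, flat_b)
assert np.allclose(col_a.ravel(), flat_a)
```

Which form to pick depends on what the rest of the code expects: keep (4, 1) if later steps rely on column vectors, or (4,) if they work with flat arrays.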
Answered By - pho