Issue
I was trying to calculate a covariance matrix of two arrays a and b via a dot product in Python with the formula a @ b.T.
Both input arrays have shape (2,) when the shape is printed. Unsurprisingly, the result of this calculation is a single scalar, not the covariance matrix I need.
As it turns out, I needed to reshape the arrays (e.g. a.reshape(2,1)) to get the desired (2,2) covariance matrix.
Now my problem is that I don't understand what effect reshaping the arrays has on the data, and more specifically why I am allowed to do it and still get the result I want.
How come I can just reshape the array to get the covariance matrix? What is the thought process behind it?
I do understand how the dot product works, but as I said, I don't get why I am allowed to just reshape the array to get the desired output dimensions.
I hope I made my problem understandable; I have a bit of trouble formulating it.
Appreciate any help!
Solution
While reshape doesn't change the values of an array, it does affect how the array interacts with other arrays. Operators like * and @ have their own rules for handling arrays of various shapes.
In [221]: a, b = np.array([1,2]), np.array([3,4])
In [222]: a,b
Out[222]: (array([1, 2]), array([3, 4]))
The transpose of a 1d array changes nothing. It does not 'add' a dimension; it just reverses the existing dimensions, and with only one dimension there is nothing to reverse:
In [223]: b.T
Out[223]: array([3, 4])
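To make the point about transpose and reshape concrete, here is a minimal sketch that compares shapes before and after reshaping. The variable names (col, shape_1d, and so on) are illustrative and not part of the original answer.

```python
import numpy as np

a = np.array([1, 2])

# A 1d array has a single axis, so reversing the axes is a no-op.
shape_1d = a.shape          # (2,)
shape_1d_T = a.T.shape      # (2,)

# Reshaping to a column gives two axes; now the transpose swaps them.
col = a.reshape(2, 1)
shape_col = col.shape       # (2, 1)
shape_col_T = col.T.shape   # (1, 2)

# The values are untouched: flattening the column gives back the original.
same_values = np.array_equal(col.ravel(), a)

# For a contiguous array like this one, reshape returns a view on the
# same buffer, so only the shape metadata differs.
shares_buffer = np.shares_memory(col, a)

print(shape_1d, shape_1d_T, shape_col, shape_col_T)
print(same_values, shares_buffer)
```

In other words, reshaping is "allowed" because it only changes how the same two numbers are laid out along axes, which in turn decides which rule @ applies.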
For 1d arrays, matmul and dot do the 'scalar', or 'dot', product. That's clearly documented:
In [224]: a@(b.T)
Out[224]: 11
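As a quick check of what the 1d case computes, the sketch below spells out the sum of elementwise products by hand and compares it with @ and np.dot. The names by_hand, via_matmul and via_dot are made up for this illustration.

```python
import numpy as np

a, b = np.array([1, 2]), np.array([3, 4])

# Sum of elementwise products, written out explicitly: 1*3 + 2*4.
by_hand = sum(x * y for x, y in zip(a, b))

# Both operators reduce two 1d arrays to a single scalar.
via_matmul = a @ b
via_dot = np.dot(a, b)

print(by_hand, via_matmul, via_dot)  # 11 11 11
```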
Now reshape the arrays to (2,1); a[:,None] is equivalent to a.reshape(2,1). With two dimensions, transpose does make a difference:
In [225]: a[:,None]
Out[225]:
array([[1],
       [2]])
In [226]: b[:,None].T
Out[226]: array([[3, 4]])
Here we have a (2,1) array times a (1,2) array, which does the matrix product, summing over the shared size 1 dimension. The result is (2,2). Again, this is clearly documented:
In [227]: a[:,None]@b[:,None].T
Out[227]:
array([[3, 4],
       [6, 8]])
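The (2,1) @ (1,2) product can also be written as explicit loops, which shows that the summed dimension has length 1, so each output entry is just a[i] * b[j]. This is only a sketch for illustration; col, row and result are names introduced here, not part of the original answer.

```python
import numpy as np

a, b = np.array([1, 2]), np.array([3, 4])
col = a[:, None]      # shape (2, 1)
row = b[:, None].T    # shape (1, 2)

# Textbook matrix product: result[i, j] = sum over k of col[i, k] * row[k, j].
result = np.zeros((col.shape[0], row.shape[1]), dtype=int)
for i in range(col.shape[0]):
    for j in range(row.shape[1]):
        for k in range(col.shape[1]):   # k only ever takes the value 0
            result[i, j] += col[i, k] * row[k, j]

print(result)
print(np.array_equal(result, col @ row))
```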
Other ways of getting that product:
In [228]: np.outer(a,b)
Out[228]:
array([[3, 4],
       [6, 8]])
In [229]: a[:,None]*b
Out[229]:
array([[3, 4],
       [6, 8]])
The last case does elementwise multiplication of a (2,1) with a (2,). The rules of broadcasting apply: the (2,) is treated as (1,2), and both operands are then stretched to (2,2) before multiplying. Read up on that!
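To see what broadcasting does in that last case, the sketch below uses np.broadcast_arrays to materialize the two stretched operands before multiplying them. The names col, left, right and product are illustrative only.

```python
import numpy as np

a, b = np.array([1, 2]), np.array([3, 4])
col = a[:, None]   # shape (2, 1)

# Broadcasting treats b, shape (2,), as (1, 2) and stretches both
# operands to the common shape (2, 2).
left, right = np.broadcast_arrays(col, b)
# left  is [[1, 1], [2, 2]]  (column repeated across)
# right is [[3, 4], [3, 4]]  (row repeated down)

# Elementwise multiplication of the stretched arrays.
product = left * right

print(left.shape, right.shape)
print(product)
```

NumPy does not actually copy the data when it broadcasts; the stretched arrays here are views used only to show which values get paired up.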
Answered By - hpaulj