Issue
Could anyone explain why the code below uses dim=1 in the scatter_
method? The code performs one-hot encoding. After reading the example in the PyTorch documentation, I expected dim=0
to give the desired result. However, the output shows that dim=1
is the correct choice.
>>> target = torch.tensor([3, 5, 0, 2, 7, 5])
>>> target
tensor([3, 5, 0, 2, 7, 5])
>>> onehot = torch.zeros(target.shape[0], 8)
>>> onehot.scatter_(1, target.unsqueeze(1), 1.0)
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1., 0., 0.]])
Solution
You are applying scatter_ to the zero tensor onehot, shaped (len(target), 8), along dim=1, with target.unsqueeze(1) as the index and 1. as the value. This has the following effect on onehot (target here stands for the unsqueezed index, of shape (len(target), 1)):
onehot[i][target[i][j]] = 1.
Each row of the index holds a single value, so j is always 0, and that value is used to index the 2nd axis of onehot. In other words, for every row, the value from target positions the 1. among the columns of onehot.
A step-by-step illustration:
>>> for i in range(len(target)):
...     k = target[i]     # column index comes from target, i.e. it is applied along dim=1
...     onehot[i, k] = 1  # row i gets its 1 in column k
...     print(onehot)
tensor([[0., 0., 0., 1., 0., 0., 0., 0.], # i=0; k=3
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.], # i=1; k=5
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.], # i=2; k=0
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.], # i=3; k=2
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.], # i=4; k=7
[0., 0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1., 0., 0.]]) # i=5; k=5
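The same rule can be written out for both values of dim. The following is a minimal pure-Python sketch for the 2-D, scalar-value case only. scatter_2d is a hypothetical helper written for illustration, not a PyTorch function, and it works on nested lists rather than tensors.

```python
def scatter_2d(out, dim, index, value):
    """Sketch of the 2-D rule behind Tensor.scatter_ with a scalar value.

    dim == 0: out[index[i][j]][j] = value  (index value picks the row)
    dim == 1: out[i][index[i][j]] = value  (index value picks the column)
    """
    for i, row in enumerate(index):
        for j, k in enumerate(row):
            if dim == 0:
                out[k][j] = value
            else:
                out[i][k] = value
    return out


target = [3, 5, 0, 2, 7, 5]
index = [[t] for t in target]  # same shape as target.unsqueeze(1): (6, 1)

# dim=1 on a (6, 8) buffer: one row per sample, one column per class.
onehot = [[0.0] * 8 for _ in target]
scatter_2d(onehot, 1, index, 1.0)
print(onehot[0])

# dim=0 on an (8, 6) buffer: j is always 0, so only column 0 is written.
flipped = [[0.0] * len(target) for _ in range(8)]
scatter_2d(flipped, 0, index, 1.0)
print([row[0] for row in flipped])
```

With an index of shape (6, 1), the inner loop runs once per row, which is why only the value of dim decides whether target selects a row or a column.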
Note that onehot.scatter_(0, target.unsqueeze(1), 1.0) would instead perform:
onehot[target[i][j]][j] = 1.
Here the values of target index the first axis, so this is a valid operation only if you initialize onehot the other way around, with shape (8, len(target)):
>>> onehot = torch.zeros(8, len(target))
>>> onehot.scatter_(0, target.unsqueeze(1), 1.)
tensor([[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.]])
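Note that this is not the transpose of the other matrix: the index still has shape (6, 1), so j is always 0 and every 1. lands in the first column (the repeated 5 is simply written twice). An index of shape (1, 6), i.e. target.unsqueeze(0), would give the transpose.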
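As a quick check of the shape argument, here is a minimal sketch, assuming an index of shape (1, 6) built with target.unsqueeze(0) so that j runs over all six columns. The names rowwise, colwise and num_classes are illustrative and not part of the original answer. The comparison with torch.nn.functional.one_hot is added as an alternative that avoids choosing dim by hand.

```python
import torch
import torch.nn.functional as F

target = torch.tensor([3, 5, 0, 2, 7, 5])
num_classes = 8

# dim=1 with a (6, 1) index: rowwise[i][target[i]] = 1
rowwise = torch.zeros(len(target), num_classes).scatter_(1, target.unsqueeze(1), 1.0)

# dim=0 with a (1, 6) index: colwise[target[j]][j] = 1
colwise = torch.zeros(num_classes, len(target)).scatter_(0, target.unsqueeze(0), 1.0)

# The (8, 6) result is the transpose of the (6, 8) one.
print(torch.equal(colwise, rowwise.T))

# F.one_hot returns an integer tensor, so cast before comparing.
print(torch.equal(F.one_hot(target, num_classes).float(), rowwise))
```

Both comparisons should print True. F.one_hot is usually the simpler choice when the class dimension should be the last one.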
Answered By - Ivan