Issue
I am writing a jury-rigged PyTorch version of `scipy.linalg.toeplitz`, which currently has the following form:
```python
def toeplitz_torch(c, r=None):
    c = torch.tensor(c).ravel()
    if r is None:
        r = torch.conj(c)
    else:
        r = torch.tensor(r).ravel()
    # Flip c left to right.
    idx = [i for i in range(c.size(0)-1, -1, -1)]
    idx = torch.LongTensor(idx)
    c = c.index_select(0, idx)
    vals = torch.cat((c, r[1:]))
    out_shp = len(c), len(r)
    n = vals.stride(0)
    return torch.as_strided(vals[len(c)-1:], size=out_shp, stride=(-n, n)).copy()
```
But `torch.as_strided` currently does not support negative strides. My function therefore throws the error:

```
RuntimeError: as_strided: Negative strides are not supported at the moment, got strides: [-1, 1]
```
My (perhaps incorrect) understanding of `as_strided` is that it fills a new array, whose size is given by the second argument, by linearly indexing the values of the first argument and placing them at the subscript strides given by the final argument.
Both the NumPy and PyTorch documentation concerning `as_strided` have scary warnings about using the function with "extreme care", and I don't understand this function fully, so I'd like to ask:
- Is my understanding of `as_strided` correct?
- Is there a simple way to rewrite this so negative strides work?
- Will I be able to pass a gradient w.r.t. `c` (or `r`) through `toeplitz_torch`?
Solution
> 1. Is my understanding of `as_strided` correct?
The stride is an interface for your tensor to access the underlying contiguous data buffer. It does not insert values, and no copies of the values are made by `torch.as_strided`; the strides simply define the artificial layout of what we refer to as a multi-dimensional array (in NumPy) or a tensor (in PyTorch).
As Andreas K. puts it in another answer:

> Strides are the number of bytes to jump over in the memory in order to get from one item to the next item along each direction/dimension of the array. In other words, it's the byte-separation between consecutive items for each dimension.
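For a concrete illustration of this definition, here is a small check (note that NumPy reports strides in bytes, while PyTorch's `Tensor.stride()` reports them in number of elements):

```python
>>> import numpy as np, torch
>>> np.ones((3, 4)).strides    # float64: 8 bytes per item, 4 items per row
(32, 8)
>>> torch.ones(3, 4).stride()  # counted in elements, not bytes
(4, 1)
```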
Please feel free to read the answers over there if you have some trouble with strides. Here we will take your example and look at how it is implemented with `as_strided`.
The example given by SciPy for `linalg.toeplitz` is the following:
```python
>>> toeplitz([1,2,3], [1,4,5,6])
array([[1, 4, 5, 6],
       [2, 1, 4, 5],
       [3, 2, 1, 4]])
```
To do so, they first construct the list of values (what we can refer to as the underlying values, not the actual underlying data): `vals`, which is constructed as `[3 2 1 4 5 6]`, i.e. the flipped Toeplitz column followed by the row without its first element.
Now notice the arguments passed to `np.lib.stride_tricks.as_strided`:

- `values`: `vals[len(c)-1:]`. Notice the slice: the tensor shows up smaller, yet the underlying values remain, and they correspond to those of `vals`. Go ahead and compare the two with `storage_offset`: it's just an offset of `2`, the values are still there! How this works is that it essentially shifts the indices such that `index=0` will refer to the value `1`, `index=1` to `4`, etc...
- `shape`: given by the column/row inputs, here `(3, 4)`. This is the shape of the resulting object.
- `strides`: this is the most important piece: `(-n, n)`, in this case `(-1, 1)`.
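Since NumPy does support negative strides, the whole construction can be reproduced there directly. Below is a minimal sketch following the question's own code (not SciPy's exact source):

```python
import numpy as np

c = np.array([1, 2, 3])
r = np.array([1, 4, 5, 6])
vals = np.concatenate((c[::-1], r[1:]))  # [3 2 1 4 5 6]
n = vals.strides[0]                      # bytes between consecutive items

# Start reading at vals[len(c)-1] (the value 1), step -n along rows, +n along columns.
M = np.lib.stride_tricks.as_strided(
    vals[len(c)-1:], shape=(len(c), len(r)), strides=(-n, n))
print(M)
# [[1 4 5 6]
#  [2 1 4 5]
#  [3 2 1 4]]
```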
The most intuitive thing to do with strides is to describe the mapping between the multi-dimensional space `(i, j) ∈ [0,3[ x [0,4[` and the flattened 1D space `k ∈ [0, 3*4[`. Since the strides are equal to `(-n, n) = (-1, 1)`, the mapping is `-n*i + n*j = -1*i + 1*j = j-i`. Mathematically, you can describe your matrix as `M[i, j] = F[j-i]`, where `F` is the vector of values read from the offset, i.e. `vals[len(c)-1:]`: thanks to the offset, `F[0] = 1`, while the leading values `[3, 2]` of the buffer remain reachable at negative indices.
For instance, let's try with `i=1` and `j=2`. If you look at the Toeplitz matrix above, `M[1, 2] = 4`. Indeed, `F[k] = F[j-i] = F[1] = 4`.
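You can verify this with plain tensor indexing, which covers the non-negative values of `k` (this sketch assumes `vals` as built above):

```python
>>> vals = torch.tensor([3, 2, 1, 4, 5, 6])
>>> F = vals[2:]  # the offset view: F[0] == 1
>>> F[2 - 1]      # i=1, j=2  ->  k = j - i = 1
tensor(4)
```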
If you look closely, you will see the trick behind negative strides: they allow you to reference negative indices. For instance, if you take `j=0` and `i=2`, then you see `k=-2`. Remember how `vals` was given an offset of `2` by slicing `vals[len(c)-1:]`: if you look at its own underlying data storage, it's still `[3 2 1 4 5 6]`, but it has an offset. The mapping for `vals[len(c)-1:]` (in this case `i: 1D -> k: 1D`) would be `M'[i] = F'[k] = F'[i+2]` because of the offset. This means `M'[-2] = F'[0] = 3`.
In the above, I defined `M'` as `vals[len(c)-1:]`, which is basically equivalent to the following tensor:

```python
>>> torch.as_strided(vals, size=(len(vals)-2,), stride=(1,), storage_offset=2)
tensor([1, 4, 5, 6])
```
Similarly, I defined `F'` as the flattened vector of underlying values: `[3 2 1 4 5 6]`.
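You can check that the slice is merely an offset view over the same storage:

```python
>>> vals = torch.tensor([3, 2, 1, 4, 5, 6])
>>> v = vals[2:]
>>> v.storage_offset()
2
>>> v.data_ptr() == vals.data_ptr() + 2 * vals.element_size()
True
```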
The usage of strides is indeed a very clever way to define a Toeplitz matrix!
> 2. Is there a simple way to rewrite this so negative strides work?
The issue is that negative strides are not implemented in PyTorch... I don't believe there is a way around it with `torch.as_strided`; otherwise it would be rather easy to extend the current implementation and provide support for that feature. There are, however, alternative ways to solve the problem: it is entirely possible to construct a Toeplitz matrix in PyTorch, just not with `torch.as_strided`.
We will do the mapping ourselves: for each element of `M` indexed by `(i, j)`, we will find the corresponding index `k`, which is simply `j-i`. This can be done with ease, first by gathering all `(i, j)` pairs from `M`:
```python
>>> i, j = torch.ones(3, 4).nonzero().T
>>> i, j
(tensor([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]),
 tensor([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]))
```
Now we essentially have `k`:
```python
>>> j - i
tensor([ 0,  1,  2,  3, -1,  0,  1,  2, -2, -1,  0,  1])
```
We just need to construct a flattened tensor of all possible values from the row `r` and column `c` inputs. Negatively indexed values (the content of `c`) are put last and flipped:
```python
>>> values = torch.cat((r, c[1:].flip(0)))
>>> values
tensor([1, 4, 5, 6, 3, 2])
```
Finally, index `values` with `k` and reshape:
```python
>>> values[j - i].reshape(3, 4)
tensor([[1, 4, 5, 6],
        [2, 1, 4, 5],
        [3, 2, 1, 4]])
```
To sum it up, my proposed implementation would be:

```python
def toeplitz(c, r):
    vals = torch.cat((r, c[1:].flip(0)))
    shape = len(c), len(r)
    i, j = torch.ones(*shape).nonzero().T
    return vals[j - i].reshape(*shape)
```
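Running it on the SciPy example from above reproduces the expected matrix:

```python
>>> toeplitz(torch.tensor([1, 2, 3]), torch.tensor([1, 4, 5, 6]))
tensor([[1, 4, 5, 6],
        [2, 1, 4, 5],
        [3, 2, 1, 4]])
```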
> 3. Will I be able to pass a gradient w.r.t. `c` (or `r`) through `toeplitz_torch`?
That's an interesting question, because `torch.as_strided` doesn't have a backward function implemented. This means you wouldn't have been able to backpropagate to `c` and `r`! With the above method, however, which uses only builtins that support backward, the backward pass comes free of charge.
Notice the `grad_fn` on the output:

```python
>>> toeplitz(torch.tensor([1., 2., 3.], requires_grad=True),
...          torch.tensor([1., 4., 5., 6.], requires_grad=True))
tensor([[1., 4., 5., 6.],
        [2., 1., 4., 5.],
        [3., 2., 1., 4.]], grad_fn=<ViewBackward>)
```
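As a quick sanity check of the backward pass, each entry of the gradient should count how many times the corresponding input value appears in the matrix (note that `c[0]` is never used, since `r[0]` provides the diagonal):

```python
>>> c = torch.tensor([1., 2., 3.], requires_grad=True)
>>> r = torch.tensor([1., 4., 5., 6.], requires_grad=True)
>>> toeplitz(c, r).sum().backward()
>>> c.grad  # c[0] unused; c[1] appears twice, c[2] once
tensor([0., 2., 1.])
>>> r.grad  # r[0] appears 3 times (the diagonal), then 3, 2, 1
tensor([3., 3., 2., 1.])
```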
This was a quick draft (that did take a little while to write down); I will make some edits. If you have questions or remarks, don't hesitate to comment! I would be interested in seeing other answers, as I am not an expert with strides; this is just my take on the problem.
Answered By - Ivan