Issue
I have a PyTorch sparse tensor that I need to slice row- and column-wise using the slice [idx][:,idx], where idx is a list of indices. This slice yields my desired result on an ordinary float tensor. Is it possible to apply the same slicing to a sparse tensor? Example here:
# constructing sparse matrix
import numpy as np
import torch
from torch import autograd
i = np.array([[0,1,2,2],[0,1,2,1]])
v = np.ones(4)
i = torch.from_numpy(i.astype("int64"))
v = torch.from_numpy(v.astype("float32"))
test1 = torch.sparse.FloatTensor(i, v)
#constructing float tensor
test2 = np.array([[1,0,0],[0,1,0],[0,1,1]])
test2 = autograd.Variable(torch.cuda.FloatTensor(test2), requires_grad=False)
#slicing
idx = [1,2]
print(test2[idx][:,idx])
output:
Variable containing:
1 0
1 1
[torch.cuda.FloatTensor of size 2x2 (GPU 0)]
I am holding a 250,000 x 250,000 adjacency matrix, from which I need to slice n rows and n columns by simply sampling n random indices. Since the dataset is so large, it is not realistic to convert it to a more convenient data type.
Can I achieve the same slicing result on test1? Is it even possible? If not, are there any workarounds?
Right now I am running my model with the following "hack" of a solution:
import random

# sample 9000 random row/column indices
idx = sorted(random.sample(range(0, np.shape(test1)[0]), 9000))
test1 = test1AsCsr[idx][:,idx].todense().astype("int32")
test1 = autograd.Variable(torch.cuda.FloatTensor(test1), requires_grad=False)
Where test1AsCsr is my test1 converted to a SciPy CSR matrix. This solution works, but it is very slow and keeps my GPU utilization very low, since it constantly needs to read from and write to CPU memory.
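(For reference, a sketch of how that conversion could look for the toy example above, assuming SciPy's csr_matrix; the exact conversion in my code may differ:)
from scipy.sparse import csr_matrix
# build the CSR matrix once, up front, from the COO indices and values
test1AsCsr = csr_matrix((v.numpy(), (i[0].numpy(), i[1].numpy())), shape=(3, 3))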
Edit: It's fine if the result is a non-sparse (dense) tensor.
Solution
Well it's been a couple of years since there was activity on this question, but better late than never.
This is the function I use for slicing sparse tensors (the helper functions it calls are below).
def slice_torch_sparse_coo_tensor(t, slices):
    """
    params:
    -------
    t: tensor to slice
    slices: slice (list or 1-D LongTensor of indices) for each dimension

    returns:
    --------
    a sparse tensor of the same shape as t that keeps only the entries
    whose index along every dimension is in the corresponding slice
    (indices are not remapped)
    """
    t = t.coalesce()
    assert len(slices) == len(t.size())
    # make sure every slice is a LongTensor
    for i in range(len(slices)):
        if type(slices[i]) is not torch.Tensor:
            slices[i] = torch.tensor(slices[i], dtype=torch.long)
    indices = t.indices()
    values = t.values()
    for dim, sl in enumerate(slices):
        invert = False
        # if the slice covers more than ~60% of this dimension, it is
        # cheaper to match against its complement and invert the mask
        if t.size(dim) * 0.6 < len(sl):
            invert = True
            all_nodes = torch.arange(t.size(dim))
            # elements occurring exactly once in the concatenation are
            # exactly those not in sl, i.e. the complement of the slice
            unique, counts = torch.cat([all_nodes, sl]).unique(return_counts=True)
            sl = unique[counts == 1]
        if sl.size(0) > 400:
            mask = ainb_wrapper(indices[dim], sl)
        else:
            mask = ainb(indices[dim], sl)
        if invert:
            mask = ~mask
        indices = indices[:, mask]
        values = values[mask]
    return torch.sparse_coo_tensor(indices, values, t.size()).coalesce()
Usage (took 2.4s on my machine):
indices = torch.randint(low=0, high=200000, size=(2, 1000000))
values = torch.rand(size=(1000000,))
t = torch.sparse_coo_tensor(indices, values, size=(200000, 200000))
idx = torch.arange(1000)
slice_torch_sparse_coo_tensor(t, [idx, idx])
out:
tensor(indices=tensor([[ 13, 62, 66, 78, 134, 226, 233, 266, 299, 344, 349,
349, 369, 396, 421, 531, 614, 619, 658, 687, 769, 792,
810, 840, 926, 979],
[255, 479, 305, 687, 672, 867, 444, 559, 772, 96, 788,
980, 423, 699, 911, 156, 267, 721, 381, 781, 97, 271,
840, 292, 487, 185]]),
values=tensor([0.4260, 0.4816, 0.8001, 0.8815, 0.3971, 0.4914, 0.7068,
0.2329, 0.4038, 0.1757, 0.7758, 0.3210, 0.2593, 0.8290,
0.1320, 0.4322, 0.7529, 0.8341, 0.8128, 0.4457, 0.4100,
0.1618, 0.4097, 0.3088, 0.6942, 0.5620]),
size=(200000, 200000), nnz=26, layout=torch.sparse_coo)
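The invert branch in slice_torch_sparse_coo_tensor relies on a small set-complement trick: concatenating the full index range with the (duplicate-free) slice and keeping the values that occur exactly once yields precisely the indices not in the slice. A minimal illustration:
all_nodes = torch.arange(6)
sl = torch.tensor([1, 3, 4])
unique, counts = torch.cat([all_nodes, sl]).unique(return_counts=True)
print(unique[counts == 1])  # tensor([0, 2, 5]), the complement of sl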
Timings for slice_torch_sparse_coo_tensor:
%timeit slice_torch_sparse_coo_tensor(t, [torch.randperm(200000)[:500], torch.arange(200000)])
output:
1.08 s ± 447 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
For comparison, the built-in torch.index_select:
%timeit t.index_select(0, torch.arange(100))
output:
56.7 s ± 4.87 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
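For reference, the exact t[idx][:,idx] result from the question, with indices remapped to a len(idx) x len(idx) tensor, can also be obtained by chaining the built-in index_select over both dimensions; a minimal sketch:
# returns a (1000, 1000) sparse tensor with indices remapped, unlike
# the function above, but is much slower on large tensors (see the
# timing above)
idx = torch.arange(1000)
sliced = t.index_select(0, idx).index_select(1, idx)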
These are the helper functions I use for this purpose; the function ainb finds the elements of a that are in b. I found this function on the internet a while ago, but I can't find the post to link it.
import torch

def ainb(a, b):
    """Gets a boolean mask over a that is True where the element of a is in b."""
    size = (b.size(0), a.size(0))
    if size[0] == 0:  # b is empty; prevents an error in torch.Tensor.max(dim=0)
        return torch.tensor([False] * a.size(0), dtype=torch.bool)
    # broadcast a against b to compare every pair, then reduce over b's axis
    a = a.expand((size[0], size[1]))
    b = b.expand((size[1], size[0])).T
    mask = a.eq(b).max(dim=0).values
    return mask
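For example (values chosen purely for illustration):
a = torch.tensor([1, 3, 5, 7])
b = torch.tensor([3, 7, 8])
print(ainb(a, b))  # tensor([False,  True, False,  True])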
import math

def ainb_wrapper(a, b, splits=.72):
    # ainb builds a len(b) x len(a) comparison matrix, so memory and time
    # scale quadratically; chunk a and concatenate the partial masks
    inds = int(len(a) ** splits)
    n_chunks = math.ceil(len(a) / inds)
    tmp = [ainb(a[i * inds:(i + 1) * inds], b) for i in range(n_chunks)]
    return torch.cat(tmp)
Since ainb scales quadratically with the number of elements, I added a wrapper that splits the input into chunks and then concatenates the output. It is more efficient using only the CPU, but I am not sure whether this holds on a GPU; I would appreciate it if someone could test it :)
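A quick sanity check (with arbitrary illustrative sizes) that the chunked wrapper matches the direct computation:
a = torch.randint(0, 1000, (5000,))
b = torch.randint(0, 1000, (300,))
assert torch.equal(ainb_wrapper(a, b), ainb(a, b))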
It's my first time posting, so feedback on the quality of the post is also appreciated.
Answered By - Prezt