Issue
I've put my data into a PyTorch tensor and now I want to split it into batches of size 64. I have the following code:
batch = 0
BATCH_SIZE = 64
X_train = x_scaled.to(device)
y_train = y_scaled.to(device)
for batch in range(0, len(X_train[0]), BATCH_SIZE):
    ### Training
    model.train()  # train mode is on by default after construction
    # 1. Forward pass
    y_pred = model(X_train[0:2][batch:batch+BATCH_SIZE])
The shape of the tensor is torch.Size([2, 11938]), and I want to slice it into chunks of shape [2, 64]. However, it does not slice properly and gives an error: mat1 and mat2 shapes cannot be multiplied (2x11938 and 2x64)
What I want:
tensor([[0.0000, 0.0002, 0.0004, 0.0005, 0.0007, 0.0009, 0.0011, 0.0013, 0.0014,
0.0016, 0.0018, 0.0018, 0.0020, 0.0022, 0.0023, 0.0025, 0.0027, 0.0029,
0.0029, 0.0031, 0.0032, 0.0034, 0.0036, 0.0038, 0.0040, 0.0041, 0.0043,
0.0045, 0.0047, 0.0049, 0.0051, 0.0052, 0.0054, 0.0056, 0.0058, 0.0060,
0.0061, 0.0061, 0.0063, 0.0065, 0.0067, 0.0069, 0.0070, 0.0072, 0.0074,
0.0076, 0.0078, 0.0079, 0.0081, 0.0083, 0.0083, 0.0085, 0.0087, 0.0088,
0.0090, 0.0092, 0.0094, 0.0094, 0.0096, 0.0097, 0.0099, 0.0101, 0.0103,
0.0105],[0.0684, 0.0684, 0.0684, 0.0684, 0.0684, 0.0684, 0.0684, 0.0684, 0.0684,
0.0703, 0.0703, 0.0703, 0.0684, 0.0684, 0.0703, 0.0703, 0.0703, 0.0703,
0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703,
0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703,
0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703,
0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703, 0.0703,
0.0703, 0.0703, 0.0712, 0.0712, 0.0712, 0.0712, 0.0712, 0.0712, 0.0712,
0.0712]], device='cuda:0', dtype=torch.float64)
What I get:
tensor([[0.0000e+00, 1.8038e-04, 3.6076e-04, ..., 9.9964e-01, 9.9982e-01,
1.0000e+00],
[6.8395e-02, 6.8395e-02, 6.8395e-02, ..., 5.7695e-01, 5.7695e-01,
5.7695e-01]], device='cuda:0', dtype=torch.float64)
How can I slice the torch tensor to the required shape?
Solution
You should read how indexing works in PyTorch/NumPy and other similar libraries.
You have a tensor of shape (2, 11938). When you index with X_train[0:2], you get a tensor of size (2, 11938); you don't need to index at all if you want all the rows.
When you index with X_train[0:2][batch:batch+BATCH_SIZE], you're indexing the result of X_train[0:2] with [batch:batch+BATCH_SIZE]. This means you are still indexing the first axis.
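To see this concretely, here is a minimal sketch using a dummy all-zeros tensor with the same shape as in the question (the real x_scaled data and model are not needed to demonstrate the indexing behaviour):

```python
import torch

# Dummy stand-in for X_train with the shape from the question.
X_train = torch.zeros(2, 11938)
BATCH_SIZE = 64
batch = 0

# Chained indexing slices the first axis twice:
step1 = X_train[0:2]                     # shape (2, 11938): all rows
step2 = step1[batch:batch + BATCH_SIZE]  # still slices axis 0, not axis 1
print(step2.shape)  # torch.Size([2, 11938]): only 2 rows exist, so [0:64] keeps both
```

Because the tensor only has 2 rows, slicing axis 0 with [0:64] just returns both rows, and the columns are never touched.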
If you want a chunk of size BATCH_SIZE from the second axis, you need to index the second axis, i.e. X_train[:, batch:batch+BATCH_SIZE].
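Putting it together, a minimal sketch of the corrected batching loop, again using a dummy tensor in place of the real data (the model call from the question is omitted):

```python
import torch

# Dummy stand-in for X_train with the shape from the question.
X_train = torch.zeros(2, 11938)
BATCH_SIZE = 64

# Slice the second axis: keep all rows, take BATCH_SIZE columns at a time.
for batch in range(0, X_train.shape[1], BATCH_SIZE):
    X_batch = X_train[:, batch:batch + BATCH_SIZE]
    # Each chunk has shape (2, 64), except the last one,
    # which holds the 11938 % 64 = 34 leftover columns.

print(X_train[:, 0:BATCH_SIZE].shape)  # torch.Size([2, 64])
```

Note that the last chunk is smaller than BATCH_SIZE when the axis length is not a multiple of 64; most models handle a variable batch dimension, but it is worth keeping in mind.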
Answered By - Karl