Issue
I have tried many improvements, such as increasing the number of epochs, using better loss functions and optimizers, deepening the network, and shuffling the dataset, but to no avail. This problem has been bothering me for a long time; thanks for your help. Below is my code.
Load and process the dataset
import numpy as np
import torch
from torch import nn

def Iris_Reader(dataset):
    np.random.shuffle(dataset['data'])
    np.random.shuffle(dataset['target'])
    data = torch.FloatTensor(dataset['data'])
    label = torch.LongTensor(dataset['target'])
    # convert to one-hot
    label = nn.functional.one_hot(label, num_classes=3).float()
    # divide train/test dataset
    train_data = data[:120]
    train_label = label[:120]
    test_data = data[120:]
    test_label = label[120:]
    return train_data, train_label, test_data, test_label
Define the classifier
import matplotlib.pyplot as plt

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        # 4*3*3 network
        self.model = nn.Sequential(
            nn.Linear(4, 3),
            nn.Sigmoid(),
            nn.Linear(3, 3),
            nn.Sigmoid()
        )
        # SGD
        self.optimiser = torch.optim.SGD(self.parameters(), lr=0.1)
        # MSE loss function
        self.loss_fn = nn.MSELoss()
        self.counter = 0
        self.progress = []

    def forward(self, input):
        return self.model(input)

    def train(self, input, target):
        output = self.forward(input)
        loss = self.loss_fn(output, target)
        self.counter += 1
        self.progress.append(loss.item())
        self.optimiser.zero_grad()
        loss.backward()
        self.optimiser.step()

    # plot loss
    def plot_loss(self):
        plt.figure(dpi=100)
        plt.ylim([0, 1.0])
        plt.yticks([0, 0.25, 0.5, 1.0])
        plt.scatter(x=[i for i in range(len(self.progress))], y=self.progress, marker='.', alpha=0.2)
        plt.grid('on')
        plt.show()
TRAIN
from sklearn import datasets

C = Classifier()
epochs = 5
for epoch in range(epochs):
    dataset = datasets.load_iris()
    train_data, train_label, _, _ = Iris_Reader(dataset)
    # loop = tqdm(zip(train_data, train_label), len( train_label))
    for i, j in zip(train_data, train_label):
        C.train(i, j)
Solution
There are several problems here:
- First, you shuffle the data and the labels independently, so each sample ends up paired with an arbitrary label, which renders the dataset useless.
- Second, you recreate the dataset inside the loop on every epoch, which wastes CPU time.
Overall, the dataset creation can be shortened to something like this:
from sklearn.model_selection import train_test_split

def Iris_Reader(dataset):
    train_data, test_data, train_label, test_label = train_test_split(dataset.data, dataset.target, test_size=0.2)
    return torch.FloatTensor(train_data), torch.LongTensor(train_label), torch.FloatTensor(test_data), torch.LongTensor(test_label)
and should be taken outside the loop.
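If you would rather keep a manual shuffle than use train_test_split, the rows and the labels have to be reordered with the same permutation. Here is a minimal sketch, assuming NumPy arrays as input; the helper name shuffle_together and the toy data are hypothetical and only serve to show that the pairing survives:

```python
import numpy as np

def shuffle_together(data, target, seed=None):
    """Reorder data rows and target entries with one shared permutation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(data))
    return data[perm], target[perm]

# Toy data: the label of each row equals the first feature of that row,
# so it is easy to check that rows and labels still match after shuffling.
data = np.arange(10, dtype=np.float32).reshape(5, 2)
target = data[:, 0].astype(np.int64)

shuffled_data, shuffled_target = shuffle_together(data, target, seed=0)
```

Calling np.random.shuffle twice, as in the question, draws two unrelated permutations, which is why the labels stop matching the samples.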
Next, MSELoss() is suited for regression. For classification, CrossEntropyLoss() is the default choice.
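One thing to keep in mind when switching: nn.CrossEntropyLoss applies log-softmax internally, so it is normally fed raw logits and integer class indices. In the question's code that would mean dropping the final Sigmoid and the one-hot step. A small sketch with made-up logits and labels (the values are illustrative, not taken from the Iris data):

```python
import torch
from torch import nn

torch.manual_seed(0)
loss_fn = nn.CrossEntropyLoss()

logits = torch.randn(4, 3)           # raw scores: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 2])  # class indices (int64), not one-hot

loss = loss_fn(logits, labels)

# The same quantity by hand: mean negative log-softmax of the true class
manual = -torch.log_softmax(logits, dim=1)[torch.arange(4), labels].mean()
```

Both loss and manual hold the same scalar, which shows that no extra activation is needed on the output layer.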
Using sigmoid as the activation in an intermediate layer is not the best choice, especially with a small number of epochs. ReLU should converge much better.
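As a rough sketch of what that could look like, here is the network with a ReLU hidden layer and a plain linear output. The hidden width of 16 is an arbitrary assumption, not something the original code prescribes, and the random batch only stands in for real Iris features:

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(4, 16),  # hidden width 16 is an arbitrary assumption
    nn.ReLU(),
    nn.Linear(16, 3),  # raw logits, one per class, no final activation
)

batch = torch.randn(8, 4)  # 8 fake samples with 4 features each
logits = model(batch)
```

The output stays unactivated so that it can be passed directly to CrossEntropyLoss().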
Last but not least, your loss chart would look much cleaner if the values were averaged per epoch.
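One way to do the averaging, assuming the losses are stored in a flat list like self.progress in the question and that every epoch has the same number of steps; the helper name average_per_epoch and the sample numbers are hypothetical:

```python
def average_per_epoch(step_losses, steps_per_epoch):
    """Collapse a flat list of per-step losses into one mean per epoch."""
    means = []
    for start in range(0, len(step_losses), steps_per_epoch):
        chunk = step_losses[start:start + steps_per_epoch]
        means.append(sum(chunk) / len(chunk))
    return means

# Two fake epochs of three steps each
step_losses = [0.9, 0.7, 0.5, 0.3, 0.2, 0.1]
epoch_losses = average_per_epoch(step_losses, steps_per_epoch=3)
```

Here epoch_losses contains two values, approximately 0.7 and 0.2. With the question's loop, steps_per_epoch would be 120, and plotting the result as a line gives one point per epoch instead of a cloud of per-sample values.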
Answered By - dx2-66