Issue
I want to treat errors from overestimates and underestimates differently (like The Price is Right) during model training. I don't want to rewrite the entire MLP, regression, DecisionTree, etc. algorithms in sklearn just to implement my custom cost function and relevant derivative. Is there a way for me to define a function any classifier can use to override the default? This is an example of what I'm looking for:
def myCustomError(y_preds, y_actuals):
    # Calculate The Price is Right style error
    return  # not MSE

from sklearn import #Classifier
c = #Classifier(loss=myCustomError)
If I can't do this in sklearn but I have to use tensorflow or some other libraries, please let me know.
Solution
In cases where you need to define a custom loss function, a neural-net framework is typically used rather than sklearn. You generally can't supply a custom loss function to sklearn algorithms. If you wanted to stick with sklearn, some algorithms let you configure per-sample weights (sample_weight) or class balancing (class_weight), but from your question that doesn't seem to be the solution you're after.
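For completeness, this is roughly what the per-sample weighting hook looks like. Note that it weights whole training rows, chosen before fitting, so it can't distinguish overestimates from underestimates the way a custom loss can (the weights below are purely illustrative):

# Sketch: per-sample weighting in sklearn
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = rng.normal(size=100)

weights = np.ones(len(y))
weights[y < 0] = 2.0  # hypothetical: up-weight some subset of samples

model = DecisionTreeRegressor(max_depth=3)
model.fit(X, y, sample_weight=weights)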
I don't want to rewrite the entire MLP, regression, DecisionTree, etc. algorithms in sklearn just to implement my custom cost function and relevant derivative.
Not sure about decision trees, but MLPs and regression models are straightforward to implement in PyTorch, and when you define a custom loss function there, autograd takes care of the derivative for you. Here's a simple regression model using a custom loss function that penalises overestimates more strongly than underestimates.
Some mock data for this example (a 2D feature space and a scalar target):
# Mock data: two clusters in a 2D feature space with different target ranges
import numpy as np

np.random.seed(0)
X0 = np.random.randn(128, 2) + 5   # cluster centred at (5, 5)
X1 = np.random.randn(128, 2)       # cluster centred at the origin
X = np.concatenate([X0, X1], axis=0)
y = np.concatenate([np.linspace(0, 3, 128), np.linspace(-10, -5, 128)]).reshape(-1, 1)
Simple regression net with a custom loss function in PyTorch:
import torch
from torch import nn, optim

# Example of a custom loss function that treats
# overestimates differently to underestimates
def custom_loss(predictions, target):
    errors = predictions - target
    overestimates = errors[errors > 0]
    underestimates = errors[errors < 0]
    # Penalise the squared error of the overestimates more heavily
    # (an empty selection sums to zero, so either term may be absent)
    loss = (overestimates ** 2).sum() + (0.5 * underestimates ** 2).sum()
    return loss / len(target)
# Define a simple regression neural net
torch.manual_seed(0)
model = nn.Sequential(
nn.Linear(2, 4),
nn.ReLU(),
nn.Linear(4, 4),
nn.ReLU(),
nn.Linear(4, 1)
)
# Data to tensors
X_tensor = torch.tensor(X).to(torch.float32)
y_tensor = torch.tensor(y).to(torch.float32)

# Choose an optimiser and start training
optimiser = optim.RMSprop(model.parameters())

n_epochs = 5
model.train()
for epoch in range(n_epochs):
    predictions = model(X_tensor)
    loss = custom_loss(predictions, y_tensor)
    print('epoch', epoch, 'loss', loss.item())

    # Backpropagation and optimisation step
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
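For reference, the same asymmetric loss can also be written fully vectorised with torch.where, which picks the penalty per element instead of using boolean indexing (an equivalent sketch, producing the same value as custom_loss above for this data):

# Equivalent asymmetric loss without boolean indexing
def custom_loss_vectorised(predictions, target):
    errors = predictions - target
    # Full squared error for overestimates, half for underestimates
    return torch.where(errors > 0, errors ** 2, 0.5 * errors ** 2).mean()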
For brevity this example leaves out details like scaling and batching the data (and keeping a validation set).
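If you do want mini-batching, a minimal sketch using torch.utils.data, reusing the model, loss, and optimiser defined above (the batch size of 32 is an arbitrary choice):

from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(X_tensor, y_tensor)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

model.train()
for epoch in range(n_epochs):
    for X_batch, y_batch in loader:
        loss = custom_loss(model(X_batch), y_batch)
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()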
Answered By - some3128