Monday, March 28, 2022

[FIXED] How to search a set of normally distributed parameters using optuna?

Issue

I'm trying to optimize a custom model (no fancy ML whatsoever) that has 13 parameters, 12 of which I know to be normally distributed. I've gotten decent results using the hyperopt library:

space = {
    'B1': hp.normal('B1', B1['mean'], B1['std']),
    'B2': hp.normal('B2', B2['mean'], B2['std']),
    'C1': hp.normal('C1', C1['mean'], C1['std']),
    'C2': hp.normal('C2', C2['mean'], C2['std']),
    'D1': hp.normal('D1', D1['mean'], D1['std']),
    'D2': hp.normal('D2', D2['mean'], D2['std']),
    'E1': hp.normal('E1', E1['mean'], E1['std']),
    'E2': hp.normal('E2', E2['mean'], E2['std']),
    'F1': hp.normal('F1', F1['mean'], F1['std']),
    'F2': hp.normal('F2', F2['mean'], F2['std'])
}

where I can specify the shape of the search space per parameter to be normally distributed.

I've got 32 cores, and the default Trials() object only uses one of them. Hyperopt suggests two ways to parallelize the search process, neither of which I could get to work on my Windows machine for the life of me, so I've given up and want to try a different framework.

Bayesian hyperparameter optimization is, as far as I know, based on the idea that values follow a distribution, and the normal distribution is so prevalent that it is literally called normal. Even so, I cannot find a way to tell Optuna that my parameters have a mean and a standard deviation.

How do I tell Optuna the mean and standard deviation of my parameters?

The only distributions I can find in the documentation are: suggest_uniform(), suggest_loguniform() and suggest_discrete_uniform().

Please tell me if I am somehow misunderstanding the loguniform distribution (it looks somewhat similar, but I can't specify a standard deviation?) or the pruning process.

As you might be able to tell from my text, I've spent a frustrating amount of time trying to figure this out and gotten exactly nowhere. Any help will be highly appreciated!

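Special thanks to dankal444 for this elegant solution (I will replace the mean and std with my own values):

from math import sqrt
from scipy.special import erfinv
space = {
    # sqrt(2) * erfinv(u) is standard normal for u uniform on (-1, 1);
    # erfinv is infinite at exactly -1 and 1, so those endpoints should be avoided.
    'B1': mean + std * sqrt(2) * erfinv(trial.suggest_float('B1', -1, 1)),
    'B2': ...
}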

Solution

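You can work around this in Optuna by sampling from a uniform distribution and transforming the samples into a normal distribution. One way to do that is the inverse error function, which is implemented in SciPy.

The function takes values distributed uniformly on the range (-1, 1) and, once the result is multiplied by sqrt(2), converts them to a standard normal distribution. Scaling by the standard deviation and adding the mean then gives the target distribution.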
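To make the scaling concrete: if u is uniform on (-1, 1), then sqrt(2) * erfinv(u) is the standard normal quantile of the probability (u + 1) / 2. The sketch below checks this numerically using only the Python standard library. It swaps SciPy's erfinv for the equivalent statistics.NormalDist.inv_cdf, and the function name uniform_to_normal, the seed and the sample size are illustrative assumptions, not part of the original answer.

```python
import random
from statistics import NormalDist, fmean, stdev


def uniform_to_normal(u, mean, std):
    """Map u from the open interval (-1, 1) to a quantile of N(mean, std).

    Mathematically the same as mean + std * sqrt(2) * erfinv(u).
    (Hypothetical helper name, chosen for this illustration.)
    """
    p = (u + 1.0) / 2.0  # rescale (-1, 1) to a probability in (0, 1)
    return NormalDist(mean, std).inv_cdf(p)


rng = random.Random(0)  # fixed seed so the check is repeatable
eps = 1e-9              # keep u strictly inside (-1, 1)
samples = [
    uniform_to_normal(rng.uniform(-1 + eps, 1 - eps), mean=2.0, std=3.0)
    for _ in range(50_000)
]

sample_mean = fmean(samples)
sample_std = stdev(samples)
print(f"sample mean: {sample_mean:.3f}, sample std: {sample_std:.3f}")
```

The printed sample mean and standard deviation should land close to the requested 2 and 3, which is a quick way to confirm that the transform is parameterized correctly before plugging it into an optimizer.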

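import matplotlib.pyplot as plt
import numpy as np
from scipy import special


# Plot the inverse error function on (-1, 1); it diverges at both endpoints.
x = np.linspace(-1, 1)
plt.plot(x, special.erfinv(x))
plt.xlabel('$x$')
plt.ylabel(r'$\mathrm{erf}^{-1}(x)$')

mean = 2
std = 3
# Stay slightly inside (-1, 1) so erfinv never returns +/-inf.
random_uniform_data = np.random.uniform(-1 + 0.00001, 1 - 0.00001, 1000)
# sqrt(2) * erfinv(u) is standard normal; scale by std, then shift by mean.
random_gaussianized_data = mean + std * np.sqrt(2) * special.erfinv(random_uniform_data)
# Compare histograms of the uniform input and the transformed output.
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].hist(random_uniform_data, 30)
axes[1].hist(random_gaussianized_data, 30)
axes[0].set_title('uniform distribution samples')
axes[1].set_title('transformed (normal) samples')
plt.show()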

The first figure produced by the code above shows what the inverse error function looks like on (-1, 1).

The second figure shows uniform samples next to the same samples transformed into a normal distribution with a custom mean and standard deviation.
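Applied to Optuna, the same idea could look like the following sketch. It is not taken from the original answer: suggest_normal is a hypothetical helper, the means, standard deviations and toy loss are placeholders, and the inverse CDF from the standard library stands in for the erfinv expression. The only Optuna call it relies on is trial.suggest_float(name, low, high).

```python
from statistics import NormalDist


def suggest_normal(trial, name, mean, std, eps=1e-6):
    """Hypothetical helper: sample a probability uniformly and map it to N(mean, std).

    `trial` only needs a suggest_float(name, low, high) method, as optuna.Trial has.
    `eps` keeps the probability away from 0 and 1, where the quantile is infinite.
    """
    p = trial.suggest_float(name, eps, 1.0 - eps)
    return NormalDist(mean, std).inv_cdf(p)


def objective(trial):
    # Placeholder priors and a toy quadratic loss; substitute the real model here.
    b1 = suggest_normal(trial, "B1", mean=2.0, std=3.0)
    b2 = suggest_normal(trial, "B2", mean=-1.0, std=0.5)
    return (b1 - 2.5) ** 2 + (b2 + 1.2) ** 2
```

With Optuna installed, this objective could be passed to optuna.create_study(direction="minimize") followed by study.optimize(objective, n_trials=200, n_jobs=4), where the trial and job counts are arbitrary. Note that study.best_params would then hold the sampled probabilities rather than the transformed values, so the same mapping has to be applied again to recover the actual parameters.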

Answered By - dankal444

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
