Issue
I am working on a speech denoising problem using a DNN. I am computing the SNR with the function below.
def calculate_snr(clean_signal, recovered_signal):
    clean_power = tf.reduce_sum(tf.square(clean_signal))
    noise_power = tf.reduce_sum(tf.square(clean_signal - recovered_signal))
    # SNR in dB: 10 * log10(signal power / noise power)
    snr_db = 10 * tf.math.log(clean_power / noise_power) / tf.math.log(10.0)
    return snr_db
I am using the Keras API to create a model like this:
model.compile(
    loss='mean_squared_error',
    optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
    metrics=[calculate_snr]
)
sound_denoising_history = model.fit(
    x=X_abs.T, y=S_abs.T,
    epochs=200, batch_size=100,
    validation_data=(X_test_01_abs.T, S_test_01_abs.T)
)
calculate_snr(X_test_01_abs.T, model.predict(X_test_01_abs.T)): 10.9
While model.fit reports: -4.4 to -3
When I train it, I see that my SNR metric for validation is around -7 and oscillates in that range, whereas if I predict on the validation input and then pass the result to the function above, it gives me 8.2. It is the same, identical function, and I have checked the dimensions multiple times. I am not sure what is happening.
Edit: I know I am missing a processing step in the SNR calculation for a signal, but even if the metric were used standalone, it should land in roughly the same ballpark at the end of training as inference followed by a manual calculation.
Solution
When you use calculate_snr as a metric in model.compile, it gets applied batch-wise during training, and those batch-wise values are then averaged to produce the reported metric. Because SNR is the logarithm of a ratio, the average of per-batch SNRs is not the SNR of the whole dataset, so the value Keras reports can differ substantially from the one you compute manually on the entire dataset after predicting.
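To see the effect concretely, here is a small NumPy sketch on synthetic data (the signal and noise here are made up for illustration, not taken from the question): when the noise level varies across the data, the mean of per-batch SNRs diverges from the SNR computed over everything at once.

```python
import numpy as np

def snr_db(clean, recovered):
    # 10 * log10(signal power / noise power), same formula as calculate_snr
    clean_power = np.sum(clean ** 2)
    noise_power = np.sum((clean - recovered) ** 2)
    return 10 * np.log10(clean_power / noise_power)

rng = np.random.default_rng(0)
clean = rng.normal(size=1000)
# noise whose magnitude grows across the signal, so batches differ
noise = rng.normal(size=1000) * np.linspace(0.01, 1.0, 1000)
recovered = clean + noise

# SNR over the whole dataset in one shot
full_snr = snr_db(clean, recovered)

# mean of batch-wise SNRs: what Keras reports for a stateless metric function
batch_snrs = [snr_db(clean[i:i + 100], recovered[i:i + 100])
              for i in range(0, 1000, 100)]
mean_batch_snr = np.mean(batch_snrs)

print(full_snr, mean_batch_snr)  # the two values differ noticeably
```

The early, low-noise batches contribute very high per-batch SNRs, pulling the batch average well away from the whole-dataset value.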
You can overcome this limitation by defining the SNR metric as a stateful class that accumulates the signal and noise powers across batches and only takes the logarithm at the end.
class SNRMetric(keras.metrics.Metric):
    def __init__(self, **kwargs):
        super(SNRMetric, self).__init__(**kwargs)
        # running totals, accumulated over all batches of an epoch
        self.clean_power = self.add_weight(name="clean_power", initializer="zeros")
        self.noise_power = self.add_weight(name="noise_power", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        self.clean_power.assign_add(tf.reduce_sum(tf.square(y_true)))
        self.noise_power.assign_add(tf.reduce_sum(tf.square(y_true - y_pred)))

    def result(self):
        # take the logarithm once, over the accumulated totals
        snr_db = 10 * tf.math.log(self.clean_power / self.noise_power) / tf.math.log(10.0)
        return snr_db
Then you can modify your code for training and testing as follows:
# TODO define your model
model.compile(
    loss='mean_squared_error',
    optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
    metrics=[SNRMetric()]  # the crucial point: a stateful metric
)
# Train
sound_denoising_history = model.fit(
    x=X_abs.T, y=S_abs.T,
    epochs=200, batch_size=100,
    validation_data=(X_test_01_abs.T, S_test_01_abs.T)
)
# Calculate SNR using the custom metric after training
snr_metric = SNRMetric()
snr_metric.update_state(S_test_01_abs.T, model.predict(X_test_01_abs.T))
snr_value = snr_metric.result()
print(f"SNR after training: {snr_value.numpy()}")
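As a sanity check on why the stateful version fixes the discrepancy, here is a pure-NumPy sketch (synthetic data again) of the accumulation scheme SNRMetric uses: summing powers batch by batch and taking the logarithm once at the end reproduces the whole-dataset SNR exactly, unlike averaging per-batch SNRs.

```python
import numpy as np

rng = np.random.default_rng(1)
clean = rng.normal(size=1000)
recovered = clean + rng.normal(size=1000) * np.linspace(0.01, 1.0, 1000)

# accumulate powers batch by batch, as the stateful metric does
clean_power = 0.0
noise_power = 0.0
for i in range(0, 1000, 100):
    c, r = clean[i:i + 100], recovered[i:i + 100]
    clean_power += np.sum(c ** 2)
    noise_power += np.sum((c - r) ** 2)
accumulated_snr = 10 * np.log10(clean_power / noise_power)

# whole-dataset SNR computed in one shot
full_snr = 10 * np.log10(np.sum(clean ** 2) / np.sum((clean - recovered) ** 2))

print(accumulated_snr, full_snr)  # identical up to floating point
```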
Answered By - Marco Parola