Issue
The following code works: it converges, and the neural net approximates the exponential on the interval from 0 to 1:
# code works
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# fit an exponential
n = 101
x = np.linspace(start=0, stop=1, num=n)
y_e = np.exp(x)

# any odd neural net with sufficient degrees of freedom
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=50, activation="softmax"),
    tf.keras.layers.Dense(units=1)
])

# loss function
def loss(y_true, y_pred):
    L_ode = tf.reduce_mean(tf.square(y_pred - y_true), axis=-1)
    return L_ode

model.compile('adam', loss)
model.fit(x, y_e, epochs=100, batch_size=1)

y_NN = model.predict(x).flatten()
plt.plot(x, y_NN, color='blue')
plt.plot(x, y_e, color='red')
plt.title('NN (blue) and exp (red)')
Now I redefine the loss function and replace y_true by tf.zeros(n) in the model.fit call. But this code, which should do the same thing, runs nicely yet converges to a constant:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# fit an exponential
n = 101
x = np.linspace(start=0, stop=1, num=n)
y_e = np.exp(x)

# any odd neural net with sufficient degrees of freedom
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=50, activation="softmax"),
    tf.keras.layers.Dense(units=1)
])

# loss function
def loss(y_true, y_pred):
    L_ode = tf.reduce_mean(tf.square((y_pred - y_e) - y_true), axis=-1)
    return L_ode

model.compile('adam', loss)
model.fit(x, tf.zeros(n), epochs=100, batch_size=1)

y_NN = model.predict(x).flatten()
plt.plot(x, y_NN, color='blue')
plt.plot(x, y_e, color='red')
plt.title('NN (blue) and exp (red)')
Where is the mistake?
Background: I would like to approximate solutions of ODEs with neural nets and therefore need to define my own loss function. The above is a very reduced example for the "zero-degree ODE" y(x) = exp(x).
Solution
As discussed above, the issue is the sampling of batches: the loss function sees random samples from the x that you provide to the fit function, so you cannot use a global y variable in the loss, because you do not know which of its values correspond to the current batch.
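A minimal way to see this (the debug_loss name below is purely illustrative and not part of the original answer): print each batch the loss receives. With the default shuffle=True in model.fit, the printed y_true values arrive in random order, so they cannot be matched by position to a global array:

# debug loss: print what the loss actually receives per batch
def debug_loss(y_true, y_pred):
    tf.print("y_true batch:", y_true)
    return tf.reduce_mean(tf.square(y_pred - y_true), axis=-1)

model.compile('adam', debug_loss)
model.fit(x, y_e, epochs=1, batch_size=1)

This also explains why the broken version converges to a constant: y_pred of shape (1, 1) is broadcast against the full global y_e of shape (101,), so every prediction is penalized toward all exponential values at once, and the optimum of that loss is a constant near the mean of y_e. Even shuffle=False would not help, since the loss still receives only the batch with no index information.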
But luckily, y_true in the loss function can be multidimensional; see the Keras docs: y_true should have shape (batch_size, d0, .. dN). So we can just pack multiple values into the inner dimensions. In your case we just need the shape (n, 2) to pack your y_e (the exponential values) and your y_true (the np.zeros) in there:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# fit an exponential
n = 101
x = np.linspace(start=0, stop=1, num=n)
y_e = np.exp(x)
y_true = np.zeros(n)

# pack y_e and y_true in a matrix
y_gt = np.empty((n, 2))
y_gt[:, 0] = y_e
y_gt[:, 1] = y_true

# any odd neural net with sufficient degrees of freedom
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=50, activation="softmax"),
    tf.keras.layers.Dense(units=1)
])

# loss function: unpack the two columns again; slicing with ranges keeps
# the trailing axis, so each part has shape (batch_size, 1) and broadcasts
# correctly against y_pred even for batch sizes larger than 1
def loss(y_gt, y_pred):
    y_true = y_gt[:, 1:2]
    y_e = y_gt[:, 0:1]
    L_ode = tf.reduce_mean(tf.square((y_pred - y_e) - y_true), axis=-1)
    return L_ode

model.compile('adam', loss)
model.fit(x, y_gt, epochs=100, batch_size=1)

y_NN = model.predict(x).flatten()
plt.plot(x, y_NN, color='blue')
plt.plot(x, y_e, color='red')
plt.title('NN (blue) and exp (red)')
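For the actual goal stated in the question, approximating solutions of ODEs, packing values into y_true is often not enough, because the ODE residual needs the derivative of the network output with respect to x. Below is a minimal sketch, not part of the original answer, for the ODE y' = y with y(0) = 1 (whose solution is exp(x)); the names ode_model and train_step are illustrative, and a custom training loop with tf.GradientTape is used so the derivative is available:

import tensorflow as tf
import numpy as np

# collocation points on [0, 1]
x_t = tf.reshape(tf.constant(np.linspace(0, 1, 101), dtype=tf.float32), (-1, 1))

# a small network mapping x to y(x)
ode_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=50, activation="tanh", input_shape=[1]),
    tf.keras.layers.Dense(units=1)
])
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step():
    with tf.GradientTape() as tape:
        # inner tape gives dy/dx at the collocation points
        with tf.GradientTape() as tape_x:
            tape_x.watch(x_t)
            y = ode_model(x_t)
        dy_dx = tape_x.gradient(y, x_t)
        # residual of y' - y = 0 plus the initial condition y(0) = 1
        L_ode = tf.reduce_mean(tf.square(dy_dx - y))
        L_ic = tf.square(ode_model(tf.constant([[0.0]]))[0, 0] - 1.0)
        total = L_ode + L_ic
    grads = tape.gradient(total, ode_model.trainable_variables)
    optimizer.apply_gradients(zip(grads, ode_model.trainable_variables))
    return total

for epoch in range(2000):
    train_step()

Since this loop controls exactly which x values enter each step, the batch-matching problem from the question does not arise here at all.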
Answered By - FlyingTeller