Issue
I would like to profile my GPU usage when training agents from tensorflow/agents, but I cannot figure out how. Specifically, I am trying to profile my GPU when running this example.
It seems that the TensorBoard profiler requires TensorBoard callbacks to be used like so:
from datetime import datetime
import tensorflow as tf

# Create a TensorBoard callback that profiles batches 500 through 520
logs = "logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logs,
                                                 histogram_freq=1,
                                                 profile_batch='500,520')
model.fit(ds_train,
          epochs=2,
          validation_data=ds_test,
          callbacks=[tboard_callback])
However, no fit method is called when training a TF Agent. Agents are trained using a train method that accepts no callbacks argument, which can be seen here.
Is there another way to get the TensorBoard profiler to work when training an agent from the TensorFlow Agents library?
Solution
You can use the tf.profiler module directly, which is what the TensorBoard callback uses under the hood.
The agent example you linked uses a custom training loop, so one option is to start and stop the profiler manually inside that loop:
# Adjust these to the range of steps you want to profile
start_profiling_step = 50
stop_profiling_step = 100
profiling_log_dir = "./profile_logs"
while global_step_val < num_iterations:
    # rest of the training code
    # ...
    # Start the profiler once the chosen step is reached
    if global_step_val == start_profiling_step:
        tf.profiler.experimental.start(logdir=profiling_log_dir)
    # Stop the profiler and write the trace to profiling_log_dir
    if global_step_val == stop_profiling_step:
        tf.profiler.experimental.stop(save=True)
You can look at the documentation of the tf.profiler module for more information.
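One thing to watch with the start/stop-by-step approach: if training ends (or the step counter skips past the stop step) while the profiler is running, the stop call is never reached and no trace is saved. The following is a minimal sketch of one way to guard against that, not part of the original answer. The class name StepWindowProfiler and its methods are hypothetical, and the start and stop functions are passed in as callables so the window logic can be run here without TensorFlow.

```python
from typing import Callable, List


class StepWindowProfiler:
    """Starts a profiler at one step and stops it at a later step.

    Hypothetical helper: the start/stop callables are injected, so in a
    real run they would wrap tf.profiler.experimental.start / stop.
    """

    def __init__(self, start_step: int, stop_step: int,
                 start_fn: Callable[[], None], stop_fn: Callable[[], None]):
        if stop_step <= start_step:
            raise ValueError("stop_step must be greater than start_step")
        self.start_step = start_step
        self.stop_step = stop_step
        self._start_fn = start_fn
        self._stop_fn = stop_fn
        self.active = False

    def on_step(self, step: int) -> None:
        """Call once per training iteration with the current step number."""
        if not self.active and step == self.start_step:
            self._start_fn()
            self.active = True
        elif self.active and step >= self.stop_step:
            # ">=" also covers a step counter that jumps past stop_step
            self._stop_fn()
            self.active = False

    def close(self) -> None:
        """Stop the profiler if training ended inside the window."""
        if self.active:
            self._stop_fn()
            self.active = False


# Demo with stand-in callables that only record what was called.
events: List[str] = []
profiler = StepWindowProfiler(
    start_step=2, stop_step=4,
    start_fn=lambda: events.append("start"),
    stop_fn=lambda: events.append("stop"),
)
for step in range(6):
    profiler.on_step(step)
profiler.close()
print(events)
```

In the agent training loop, you would pass start_fn=lambda: tf.profiler.experimental.start(logdir=profiling_log_dir) and stop_fn=tf.profiler.experimental.stop, call on_step(global_step_val) once per iteration, and call close() after the loop exits.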
Answered By - Lescurel