Issue
I am having difficulty getting a TensorFlow callback object working.
After much experimentation I now believe the problem is in how I create my model. The tutorial I followed, https://www.youtube.com/watch?v=ViO56ASqeks, uses tflearn, which is where my code differs from other people's examples.
I believe (maybe) the problem comes down to having two log directories:
(a more fundamental base logs folder for all your TensorBoard logs)
model = tflearn.DNN(convnet, tensorboard_dir=actual_dir)
(and the specific callback's location)
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
A simplified example of the entire problem is below:
import numpy as np
import os
import random
import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tensorflow as tf
import datetime
print(tf.__version__)
raw_data_dir = "C:\\Users\\tgmjack\\Desktop\\ml area\\c v d\\PetImages\\raw"
MODEL_NAME = 'dogsvscats-{}-{}.model'.format(1e-3, '2conv-basic') # just so we remember which saved model is which, sizes must match
actual_dir = "C:/Users/tgmjack/Desktop"
def make_model():
    tf.compat.v1.reset_default_graph()
    convnet = input_data(shape=[None, 1, 1, 1], name='input')
    convnet = conv_2d(convnet, 32, 5, activation='relu')
    convnet = max_pool_2d(convnet, 5)
    convnet = fully_connected(convnet, 1024, activation='relu')
    convnet = dropout(convnet, 0.8)
    convnet = fully_connected(convnet, 2, activation='softmax')
    convnet = regression(convnet, optimizer='adam', learning_rate=1e-3, loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(convnet, tensorboard_dir=actual_dir)
    model.save(actual_dir+"/"+MODEL_NAME)
    return model
X = [0,1,2,3,4,5,6]
Y = [0,1,2,3,4,5,6]
test_x= [0,1]
test_y= [0,1]
model = make_model()
logdir = os.path.join("logs\cvd", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
model.fit({'input': X}, {'targets': Y}, n_epoch=3, validation_set=({'input': test_x}, {'targets': test_y}),
          snapshot_step=500, show_metric=True, run_id=MODEL_NAME, callbacks=[tensorboard_callback])
The entire output is below:
2.9.1
INFO:tensorflow:C:/Users/tgmjack/Desktop/dogsvscats-0.001-2conv-basic.model is not in all_model_checkpoint_paths. Manually adding it.
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model
INFO:tensorflow:0
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model.data-00000-of-00001
INFO:tensorflow:400
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model.index
INFO:tensorflow:400
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model.meta
INFO:tensorflow:500
---------------------------------
Run id: dogsvscats-0.001-2conv-basic.model
Log directory: C:/Users/tgmjack/Desktop/
INFO:tensorflow:Summary name Accuracy/ (raw) is illegal; using Accuracy/__raw_ instead.
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Input In [7], in <cell line: 37>()
35 logdir = os.path.join("logs\cvd", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
36 tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
---> 37 model.fit({'input': X}, {'targets': Y}, n_epoch=3, validation_set=({'input': test_x}, {'targets': test_y}) ,
38 snapshot_step=500, show_metric=True, run_id=MODEL_NAME , callbacks =[tensorboard_callback] )
File ~\anaconda3\lib\site-packages\tflearn\models\dnn.py:196, in DNN.fit(self, X_inputs, Y_targets, n_epoch, validation_set, show_metric, batch_size, shuffle, snapshot_epoch, snapshot_step, excl_trainops, validation_batch_size, run_id, callbacks)
194 # Retrieve data preprocesing and augmentation
195 daug_dict, dprep_dict = self.retrieve_data_preprocessing_and_augmentation()
--> 196 self.trainer.fit(feed_dicts, val_feed_dicts=val_feed_dicts,
197 n_epoch=n_epoch,
198 show_metric=show_metric,
199 snapshot_step=snapshot_step,
200 snapshot_epoch=snapshot_epoch,
201 shuffle_all=shuffle,
202 dprep_dict=dprep_dict,
203 daug_dict=daug_dict,
204 excl_trainops=excl_trainops,
205 run_id=run_id,
206 callbacks=callbacks)
File ~\anaconda3\lib\site-packages\tflearn\helpers\trainer.py:314, in Trainer.fit(self, feed_dicts, n_epoch, val_feed_dicts, show_metric, snapshot_step, snapshot_epoch, shuffle_all, dprep_dict, daug_dict, excl_trainops, run_id, callbacks)
311 callbacks = to_list(callbacks)
313 if callbacks:
--> 314 [caller.add(cb) for cb in callbacks]
316 caller.on_train_begin(self.training_state)
317 train_ops_count = len(self.train_ops)
File ~\anaconda3\lib\site-packages\tflearn\helpers\trainer.py:314, in <listcomp>(.0)
311 callbacks = to_list(callbacks)
313 if callbacks:
--> 314 [caller.add(cb) for cb in callbacks]
316 caller.on_train_begin(self.training_state)
317 train_ops_count = len(self.train_ops)
File ~\anaconda3\lib\site-packages\tflearn\callbacks.py:88, in ChainCallback.add(self, callback)
86 def add(self, callback):
87 if not isinstance(callback, Callback):
---> 88 raise Exception(str(callback) + " is an invalid Callback object")
90 self.callbacks.append(callback)
Exception: <keras.callbacks_v1.TensorBoard object at 0x000002477036C580> is an invalid Callback object
Please, if anyone could show me this working... I really have tried every imaginable combination of directories written in different formats (I've been stuck on this one little thing for about three weeks). I've also tried clearing out all my logs, changing my working directory, and switching between an Anaconda notebook and IDLE, etc.
Yet another update
Below is an example of the files that get created when I run without callbacks. It doesn't crash now :) but TensorBoard cannot see these files; it shows no data. I have tried this with much more complex data too, so it's not the empty, arbitrary data that's causing it.
(Screenshots: left, the files created; right, TensorBoard.)
PS: I only have one folder on this whole computer called cvd2, so both the files and my TensorBoard seem to be in the right place... so how could it miss?
Solution
Update:
To manage TensorBoard with a tflearn.DNN model, you first have to set a tensorboard_dir where the logs will be saved. However, this does not tell the model to save the logs, only where they can be saved if needed.
Looking at the tflearn documentation, in order to enable saving the logs for TensorBoard you have to set the parameter tensorboard_verbose on tflearn.DNN.
For the most detailed visualization it is suggested to set tensorboard_verbose=3. At this point you're all set up, no need for callbacks.
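As an aside, the Exception in the question is raised because tflearn only accepts callbacks that subclass its own tflearn.callbacks.Callback class, not Keras callbacks (that is what the isinstance check in tflearn/callbacks.py enforces). If you ever do want a custom hook, a minimal sketch could look like the one below; it assumes the on_epoch_end(self, training_state) hook and the training_state attributes shown, and it is purely illustrative, not required for TensorBoard logging:

class PrintMetricsCallback(tflearn.callbacks.Callback):
    # Hypothetical example: print the running metrics tflearn tracks after each epoch.
    def on_epoch_end(self, training_state):
        print("epoch:", training_state.epoch,
              "loss:", training_state.global_loss,
              "acc:", training_state.global_acc)

# It would then be passed as model.fit(..., callbacks=[PrintMetricsCallback()]).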
The code:
import numpy as np
import os
import random
import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tensorflow as tf
import datetime
print(tf.__version__)
MODEL_NAME = 'dogsvscats-{}-{}.model'.format(1e-3, '2conv-basic') # just so we remember which saved model is which, sizes must match
actual_dir = "/my/path/"
def make_model():
    tf.compat.v1.reset_default_graph()
    convnet = input_data(shape=[None, 1, 1, 1], name='input')
    convnet = conv_2d(convnet, 32, 5, activation='relu')
    convnet = max_pool_2d(convnet, 5)
    convnet = fully_connected(convnet, 1024, activation='relu')
    convnet = dropout(convnet, 0.8)
    convnet = fully_connected(convnet, 2, activation='softmax')
    convnet = regression(convnet, optimizer='adam', learning_rate=1e-3, loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(convnet, tensorboard_dir=actual_dir, tensorboard_verbose=3)
    model.save(actual_dir+"/"+MODEL_NAME)
    return model
X = [[[[0]]]]
Y = [[0, 0]]
model = make_model()
model.fit({'input': X}, {'targets': Y}, n_epoch=3,
          snapshot_step=500, show_metric=True, run_id=MODEL_NAME)
To view TensorBoard and your logs, just run the instruction below as usual:
%tensorboard --logdir='/my/path'
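If the %tensorboard magic is not recognized, you may need to load the notebook extension first; alternatively, TensorBoard can be started from a terminal (same placeholder path as above):

%load_ext tensorboard

# or, from a shell:
# tensorboard --logdir /my/path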
Answered By - claudia