Issue
I am having difficulty getting a TensorFlow callback object working.
After much experimentation I now believe the problem is in how I create my model. The tutorial I followed, https://www.youtube.com/watch?v=ViO56ASqeks, uses tflearn, which is where my code differs from other people's examples.
I believe (maybe) the problem comes down to having two log directories:
(a more fundamental base logs folder for all your TensorBoard logs)
model = tflearn.DNN(convnet, tensorboard_dir=actual_dir)
(and the specific callback's location)
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
A simplified example of the entire problem is below:
import numpy as np
import os
import random
import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tensorflow as tf
import datetime
print(tf.__version__)
raw_data_dir = "C:\\Users\\tgmjack\\Desktop\\ml area\\c v d\\PetImages\\raw"
MODEL_NAME = 'dogsvscats-{}-{}.model'.format(1e-3, '2conv-basic') # just so we remember which saved model is which, sizes must match
actual_dir = "C:/Users/tgmjack/Desktop"
def make_model():
    tf.compat.v1.reset_default_graph()
    convnet = input_data(shape=[None, 1, 1, 1], name='input')
    convnet = conv_2d(convnet, 32, 5, activation='relu')
    convnet = max_pool_2d(convnet, 5)
    convnet = fully_connected(convnet, 1024, activation='relu')
    convnet = dropout(convnet, 0.8)
    convnet = fully_connected(convnet, 2, activation='softmax')
    convnet = regression(convnet, optimizer='adam', learning_rate=1e-3, loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(convnet, tensorboard_dir=actual_dir)
    model.save(actual_dir+"/"+MODEL_NAME)
    return model
X = [0,1,2,3,4,5,6]
Y = [0,1,2,3,4,5,6]
test_x= [0,1]
test_y= [0,1]
model = make_model()
logdir = os.path.join("logs\cvd", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
model.fit({'input': X}, {'targets': Y}, n_epoch=3, validation_set=({'input': test_x}, {'targets': test_y}),
          snapshot_step=500, show_metric=True, run_id=MODEL_NAME, callbacks=[tensorboard_callback])
The entire output is below:
2.9.1
INFO:tensorflow:C:/Users/tgmjack/Desktop/dogsvscats-0.001-2conv-basic.model is not in all_model_checkpoint_paths. Manually adding it.
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model
INFO:tensorflow:0
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model.data-00000-of-00001
INFO:tensorflow:400
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model.index
INFO:tensorflow:400
INFO:tensorflow:C:/Users/tgmjack/Desktop\dogsvscats-0.001-2conv-basic.model.meta
INFO:tensorflow:500
---------------------------------
Run id: dogsvscats-0.001-2conv-basic.model
Log directory: C:/Users/tgmjack/Desktop/
INFO:tensorflow:Summary name Accuracy/ (raw) is illegal; using Accuracy/__raw_ instead.
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Input In [7], in <cell line: 37>()
35 logdir = os.path.join("logs\cvd", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
36 tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
---> 37 model.fit({'input': X}, {'targets': Y}, n_epoch=3, validation_set=({'input': test_x}, {'targets': test_y}) ,
38 snapshot_step=500, show_metric=True, run_id=MODEL_NAME , callbacks =[tensorboard_callback] )
File ~\anaconda3\lib\site-packages\tflearn\models\dnn.py:196, in DNN.fit(self, X_inputs, Y_targets, n_epoch, validation_set, show_metric, batch_size, shuffle, snapshot_epoch, snapshot_step, excl_trainops, validation_batch_size, run_id, callbacks)
194 # Retrieve data preprocesing and augmentation
195 daug_dict, dprep_dict = self.retrieve_data_preprocessing_and_augmentation()
--> 196 self.trainer.fit(feed_dicts, val_feed_dicts=val_feed_dicts,
197 n_epoch=n_epoch,
198 show_metric=show_metric,
199 snapshot_step=snapshot_step,
200 snapshot_epoch=snapshot_epoch,
201 shuffle_all=shuffle,
202 dprep_dict=dprep_dict,
203 daug_dict=daug_dict,
204 excl_trainops=excl_trainops,
205 run_id=run_id,
206 callbacks=callbacks)
File ~\anaconda3\lib\site-packages\tflearn\helpers\trainer.py:314, in Trainer.fit(self, feed_dicts, n_epoch, val_feed_dicts, show_metric, snapshot_step, snapshot_epoch, shuffle_all, dprep_dict, daug_dict, excl_trainops, run_id, callbacks)
311 callbacks = to_list(callbacks)
313 if callbacks:
--> 314 [caller.add(cb) for cb in callbacks]
316 caller.on_train_begin(self.training_state)
317 train_ops_count = len(self.train_ops)
File ~\anaconda3\lib\site-packages\tflearn\helpers\trainer.py:314, in <listcomp>(.0)
311 callbacks = to_list(callbacks)
313 if callbacks:
--> 314 [caller.add(cb) for cb in callbacks]
316 caller.on_train_begin(self.training_state)
317 train_ops_count = len(self.train_ops)
File ~\anaconda3\lib\site-packages\tflearn\callbacks.py:88, in ChainCallback.add(self, callback)
86 def add(self, callback):
87 if not isinstance(callback, Callback):
---> 88 raise Exception(str(callback) + " is an invalid Callback object")
90 self.callbacks.append(callback)
Exception: <keras.callbacks_v1.TensorBoard object at 0x000002477036C580> is an invalid Callback object
Please, if anyone could show me this working... I really have tried every imaginable combination of directories written in different formats (I've been stuck on this one little thing for about three weeks). I've also tried clearing out all my logs, changing my working directory, and switching between an Anaconda notebook and IDLE, etc.
Yet another update
Below is an example of the files that get created when I run without callbacks. It doesn't crash now :) but TensorBoard cannot see these files; it shows no data. I have tried this with much more complex data too, so it's not the empty, arbitrary data that's causing it.
(Screenshots: left, the files created; right, TensorBoard.)
PS: I only have one folder on this whole computer called cvd2, so both the files and my TensorBoard seem to be in the right place... so how could it miss?
Solution
Update:
To manage TensorBoard with a tflearn.DNN model, you first have to set a tensorboard_dir where the logs will be saved. However, this does not tell the model to save the logs, only where they can be saved if needed.
Looking at the tflearn documentation, in order to enable saving the logs for TensorBoard you have to set the parameter tensorboard_verbose on tflearn.DNN.
For the most detailed visualization it is suggested to set tensorboard_verbose=3. At this point you're all set up, no need for callbacks.
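As an aside, the Exception in the question is raised because tflearn only accepts callbacks that subclass its own tflearn.callbacks.Callback class, not Keras callbacks (that is what the isinstance check in tflearn/callbacks.py enforces). If you ever do want a custom hook, a minimal sketch could look like the one below; it assumes the on_epoch_end(self, training_state) hook and the training_state attributes shown, and it is purely illustrative, not required for TensorBoard logging:

class PrintMetricsCallback(tflearn.callbacks.Callback):
    # Hypothetical example: print the running metrics tflearn tracks after each epoch.
    def on_epoch_end(self, training_state):
        print("epoch:", training_state.epoch,
              "loss:", training_state.global_loss,
              "acc:", training_state.global_acc)

# It would then be passed as model.fit(..., callbacks=[PrintMetricsCallback()]).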
The code:
import numpy as np
import os
import random
import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
import tensorflow as tf
import datetime
print(tf.__version__)
MODEL_NAME = 'dogsvscats-{}-{}.model'.format(1e-3, '2conv-basic') # just so we remember which saved model is which, sizes must match
actual_dir = "/my/path/"
def make_model():
    tf.compat.v1.reset_default_graph()
    convnet = input_data(shape=[None, 1, 1, 1], name='input')
    convnet = conv_2d(convnet, 32, 5, activation='relu')
    convnet = max_pool_2d(convnet, 5)
    convnet = fully_connected(convnet, 1024, activation='relu')
    convnet = dropout(convnet, 0.8)
    convnet = fully_connected(convnet, 2, activation='softmax')
    convnet = regression(convnet, optimizer='adam', learning_rate=1e-3, loss='categorical_crossentropy', name='targets')
    model = tflearn.DNN(convnet, tensorboard_dir=actual_dir, tensorboard_verbose=3)
    model.save(actual_dir+"/"+MODEL_NAME)
    return model
X = [[[[0]]]]
Y = [[0, 0]]
model = make_model()
model.fit({'input': X}, {'targets': Y}, n_epoch=3,
          snapshot_step=500, show_metric=True, run_id=MODEL_NAME)
To view TensorBoard and your logs, just run the instruction below as usual:
%tensorboard --logdir='/my/path'
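If the %tensorboard magic is not recognized, you may need to load the notebook extension first; alternatively, TensorBoard can be started from a terminal (same placeholder path as above):

%load_ext tensorboard

# or, from a shell:
# tensorboard --logdir /my/path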
Answered By - claudia