Issue
I'm trying to use a Tensor Processing Unit (TPU) in Google Colab to train a neural network. TensorFlow has just shipped a major release, 2.0, so I am trying to do this in TensorFlow 2.0. I have followed three guides, but all were written for TensorFlow 1.14 or earlier and fail with TensorFlow 2.0:
1) Following the guide TPUs in Colab, I get the error:
AttributeError: module 'tensorflow' has no attribute 'Session'
(from the reference: with tf.Session(tpu_address) as session:)
2) Following the guide Simple Classification Model using Keras on Colab TPU, I get the same error
3) Following the guide cloud_tpu_custom_training, I get the error:
AttributeError: module 'tensorflow' has no attribute 'contrib'
(from the reference: resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu=TPU_WORKER))
Does anyone have an example of using a TPU to train a neural network in TensorFlow 2.0?
Edit: This issue also appears to have been raised on GitHub: InvalidArgumentError: Unable to find a context_id matching the specified one #1
Solution
Support for TPUs has finally been added in TensorFlow 2.1.0 (as of Jan 8, 2020). From the release notes at https://github.com/tensorflow/tensorflow/releases/tag/v2.1.0:
Experimental support for Keras .compile, .fit, .evaluate, and .predict is available for Cloud TPUs, Cloud TPU, for all types of Keras models (sequential, functional and subclassing models).
The tutorial is available here: https://www.tensorflow.org/guide/tpu
For completeness, I'll add the walkthrough here:
- Go to Google Colab and create a new Python 3 Notebook here: https://colab.research.google.com/
- In the toolbar, click Runtime / Change runtime type, then choose "TPU" under Hardware accelerator.
- Copy and paste the below code into the notebook and click run cell (the play button).
from __future__ import absolute_import, division, print_function, unicode_literals

import os

import tensorflow as tf
import tensorflow_datasets as tfds

# Distribution strategies
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

# MNIST model
def create_model():
    return tf.keras.Sequential(
        [tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(128, activation='relu'),
         tf.keras.layers.Dense(10)])

# Input datasets
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255.0
        return image, label

    train_dataset = mnist_train.map(scale).shuffle(10000).batch(batch_size)
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset

# Create and train a model
strategy = tf.distribute.experimental.TPUStrategy(resolver)
with strategy.scope():
    model = create_model()
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['sparse_categorical_accuracy'])

train_dataset, test_dataset = get_dataset()

model.fit(train_dataset,
          epochs=5,
          validation_data=test_dataset,
          steps_per_epoch=50)
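The resolver line in the code above reads os.environ['COLAB_TPU_ADDR'] directly, so if the notebook is not on a TPU runtime it will most likely fail with a bare KeyError. A minimal sketch of a friendlier check follows; the helper name colab_tpu_address is hypothetical (it is not part of TensorFlow or Colab), and it assumes that Colab sets COLAB_TPU_ADDR only when "TPU" is the selected hardware accelerator.

```python
import os


def colab_tpu_address(env=None):
    """Return the gRPC address of the Colab TPU, or raise a readable error.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to try outside Colab.
    """
    env = os.environ if env is None else env
    addr = env.get('COLAB_TPU_ADDR')
    if not addr:
        # Assumption: the variable is missing because no TPU runtime is attached.
        raise RuntimeError(
            'COLAB_TPU_ADDR is not set; choose "TPU" under '
            'Runtime / Change runtime type and rerun the cell.')
    return 'grpc://' + addr


# Example with a made-up address, so it also runs outside Colab.
print(colab_tpu_address({'COLAB_TPU_ADDR': '10.0.0.2:8470'}))
```

In the notebook, the result could be passed as the tpu= argument of TPUClusterResolver in place of the inline string concatenation.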
Note that when I run the code from the TensorFlow tutorial as-is, I get the error below. I fixed it by passing the steps_per_epoch argument to model.fit():
ValueError: Number of steps could not be inferred from the data, please pass the steps_per_epoch argument.
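The value 50 is somewhat arbitrary. For reference, here is a small sketch of how a full-epoch step count could be derived from the dataset size and batch size. The helper name steps_for_full_epoch is hypothetical, and the 60,000 figure is the size of the standard MNIST training split; with tensorflow_datasets it can also be read from info.splits['train'].num_examples.

```python
def steps_for_full_epoch(num_examples, batch_size, drop_remainder=False):
    """Number of batches needed to see every example once."""
    if drop_remainder:
        # A final partial batch is discarded.
        return num_examples // batch_size
    # Ceiling division: a final partial batch counts as one more step.
    return -(-num_examples // batch_size)


MNIST_TRAIN_EXAMPLES = 60000  # standard MNIST training split
BATCH_SIZE = 200              # default used by get_dataset()

full_epoch = steps_for_full_epoch(MNIST_TRAIN_EXAMPLES, BATCH_SIZE)
print(full_epoch)  # 300

# The call above uses 50 steps x 5 epochs = 250 steps in total,
# which is fewer than the 300 batches in one pass over the data.
print(50 * 5 <= full_epoch)  # True
```

If a full-epoch value is passed together with epochs greater than 1, the training dataset may also need .repeat() so that it does not run out of batches; treat that as something to verify on your TensorFlow version rather than a guarantee.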
Answered By - maurera