Issue
I am trying to build pose detection using cv2 and TensorFlow in Google Colab, but I am encountering an error.
Code:
import tensorflow as tf
import tensorflow_hub as hub
import cv2
from matplotlib import pyplot as plt
import numpy as np
from google.colab.patches import cv2_imshow
model = hub.load('https://tfhub.dev/google/movenet/multipose/lightning/1')
movenet = model.signatures['serving_default']
img_original = cv2.imread('/content/brandon-atchison-eexdeq3NleQ-unsplash.jpeg',1)
img_copy = img_original.copy()
input_img = tf.cast(img_original,dtype=tf.int32)
img_copy.shape
tensor = tf.convert_to_tensor(img_original,dtype=tf.int32)
tensor
results = movenet(tensor)
I created the variable img_copy because I need to perform some operations on the image and want to keep the original image as it is. I am not sure what error I am facing when trying to get results from the movenet model.
Solution
Try:
results = movenet(tensor[None, ...])
since you are missing the batch dimension, which is needed to feed data to your model. You could also use tf.expand_dims:
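tensor = tf.expand_dims(tensor, axis=0)
# Resize so that height and width are multiples of 32, then cast back to int32
# (tf.image.resize returns float32 by default, but the model expects int32)
tensor = tf.cast(tf.image.resize(tensor, [32 * 186, 32 * 125]), dtype=tf.int32)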
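To see why both forms add the missing batch dimension, here is a small sketch that uses a NumPy array as a stand-in for the decoded image (the 480x640 size is an arbitrary assumption, not the size of the image in the question). Indexing with None and calling expand_dims with axis=0 both prepend an axis of length 1, turning an HxWx3 image into the 1xHxWx3 shape the model expects. TensorFlow tensors follow the same indexing convention.

```python
import numpy as np

# Stand-in for an image decoded by cv2.imread: height x width x channels.
# The 480x640 size is an assumption chosen only for illustration.
image = np.zeros((480, 640, 3), dtype=np.int32)

# Indexing with None prepends a new axis of length 1.
batched_by_index = image[None, ...]

# expand_dims with axis=0 does the same thing explicitly.
batched_by_expand = np.expand_dims(image, axis=0)

print(image.shape)
print(batched_by_index.shape)
print(batched_by_expand.shape)
```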
Here is a working example:
import tensorflow as tf
import tensorflow_hub as hub
model = hub.load('https://tfhub.dev/google/movenet/multipose/lightning/1')
movenet = model.signatures['serving_default']
tensor = tf.random.uniform((1, 160, 256, 3), minval=0, maxval=255, dtype=tf.int32)
movenet(tensor)
Check the model description and make sure you have the correct shape:
A frame of video or an image, represented as an int32 tensor of dynamic shape: 1xHxWx3, where H and W need to be a multiple of 32 and the larger dimension is recommended to be 256. To prepare the input image tensor, one should resize (and pad if needed) the image such that the above conditions are hold. Please see the Usage section for more detailed explanation. Note that the size of the input image controls the tradeoff between speed vs. accuracy so choose the value that best suits your application. The channel order is RGB with values in [0, 255].
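The size constraints in that description can be worked out with a little integer arithmetic. The following is a minimal sketch, not part of the original answer: movenet_input_size is a hypothetical helper that scales the longer side to 256 and rounds the shorter side up to the next multiple of 32, so the aspect ratio is roughly preserved.

```python
def movenet_input_size(height, width, long_side=256, multiple=32):
    """Return (new_height, new_width) for a MoveNet multipose input.

    Hypothetical helper: the longer side becomes `long_side`, and the
    shorter side is scaled proportionally and rounded up to a multiple
    of `multiple`.
    """
    if long_side % multiple != 0:
        raise ValueError("long_side must itself be a multiple of `multiple`")

    longer, shorter = max(height, width), min(height, width)

    # Ceiling division in pure integer arithmetic avoids float rounding:
    # number of `multiple`-sized units needed for the scaled shorter side.
    units = -(-(shorter * long_side) // (longer * multiple))
    short_side = max(multiple, units * multiple)

    if height >= width:
        return long_side, short_side
    return short_side, long_side


# A 720x1280 (height x width) frame maps to the 160x256 shape
# used in the working example above.
print(movenet_input_size(720, 1280))
# A 5952x4000 portrait photo.
print(movenet_input_size(5952, 4000))
```

The resulting pair could then be passed to tf.image.resize_with_pad(tensor, new_height, new_width), which keeps the aspect ratio and pads the remainder. The result would still need tf.cast(..., dtype=tf.int32), since the resize functions return float32 by default.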
Answered By - AloneTogether