Issue
I have a model that I trained for action detection, saved as a .h5/.hdf5 file. I want to see the results in real time from a webcam, i.e. I want the model to predict the action being performed in a live webcam feed, and I also need to use putText to display the predicted "action" on the frame. I couldn't find a proper description of how to go about this; I found plenty of articles on object detection, but that's not what I need. Any article, video, or website links on this would be highly appreciated.
Solution
As I was saying in the comments, you could do something like this:
# 1) load the model
from tensorflow import keras
model = keras.models.load_model('path/to/location')

import cv2

# Open the device at ID 0
# Use the camera ID based on /dev/videoID if needed
cap = cv2.VideoCapture(0)

# Check if the camera was opened correctly
if not cap.isOpened():
    print("Could not open video device")

# 2) fetch one frame at a time from your camera
while True:
    # frame is a numpy array that you can predict on
    ret, frame = cap.read()

    # 3) obtain the prediction
    # depending on your model, you may have to reshape frame first
    prediction = model(frame, training=False)
    # you may then need to post-process prediction to obtain a label,
    # depending on your model; probably you'll have to apply an argmax
    # to prediction to get a label

    # 4) add the label to your frame
    __draw_label(frame, 'Label: {}'.format(prediction), (20, 20), (255, 0, 0))

    # 5) display the resulting frame
    cv2.imshow("preview", frame)

    # wait for a user input to quit the application
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
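For step 3), if your model was trained on fixed-size images, the raw webcam frame usually has to be resized, normalized, and given a batch dimension before calling the model, and the output then mapped back to a class name with an argmax. Below is a minimal sketch of that idea; the 224x224 input size, the [0, 1] normalization, and the class_names list are assumptions you should replace with whatever your model was actually trained on:

import numpy as np

# Hypothetical list of actions, in the same order used during training
class_names = ['walking', 'running', 'jumping']

def predict_action(model, frame, input_size=(224, 224)):
    # OpenCV gives BGR frames; convert to RGB (assumption: model expects RGB)
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # resize to the model's expected input size (assumption: 224x224)
    img = cv2.resize(img, input_size)
    # normalize to [0, 1] and add a batch dimension -> shape (1, H, W, 3)
    img = np.expand_dims(img.astype('float32') / 255.0, axis=0)
    probs = model(img, training=False).numpy()[0]
    # argmax over the class axis gives the predicted label index
    return class_names[int(np.argmax(probs))]

You would then pass the returned string to __draw_label instead of the raw prediction tensor.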
The function that adds the overlay comes from here:
def __draw_label(img, text, pos, bg_color):
    font_face = cv2.FONT_HERSHEY_SIMPLEX
    scale = 0.4
    color = (0, 0, 0)
    thickness = cv2.FILLED
    margin = 2

    txt_size = cv2.getTextSize(text, font_face, scale, thickness)

    end_x = pos[0] + txt_size[0][0] + margin
    end_y = pos[1] - txt_size[0][1] - margin

    cv2.rectangle(img, pos, (end_x, end_y), bg_color, thickness)
    cv2.putText(img, text, pos, font_face, scale, color, 1, cv2.LINE_AA)
References:
- OpenCV tutorial on how to fetch frames from a camera
- How to load and save a Keras model
- Full details on how to add an overlay to your frame
Hope this helps get you on the right track.
On a side note, if you process only one frame at a time, things could become pretty slow. If that is the case, you may think about a producer-consumer scheme: use queues to temporarily store the frames coming from your camera, process the queued frames in parallel, and then re-order the frames before showing them on screen in the right order. This could speed things up.
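As a rough illustration of that idea, here is a sketch using Python's threading and queue modules together with the hypothetical predict_action helper from above; with a single worker the frames stay in order by themselves, while with several workers you would tag each frame with an index and re-order before display:

import queue
import threading

import cv2

frames = queue.Queue(maxsize=32)   # raw frames coming from the camera
results = queue.Queue(maxsize=32)  # (frame, label) pairs ready to show

def producer(cap):
    # grab frames as fast as the camera delivers them
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frames.put(frame)

def consumer():
    # run the (slow) model on queued frames in a separate thread
    while True:
        frame = frames.get()
        label = predict_action(model, frame)  # hypothetical helper sketched above
        results.put((frame, label))

cap = cv2.VideoCapture(0)
threading.Thread(target=producer, args=(cap,), daemon=True).start()
threading.Thread(target=consumer, daemon=True).start()

# main thread only draws and displays, so the UI stays responsive
while True:
    frame, label = results.get()
    __draw_label(frame, 'Label: {}'.format(label), (20, 20), (255, 0, 0))
    cv2.imshow("preview", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()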
Answered By - ClaudiaR