Issue
I have an image of a car and a corresponding bounding box. For example:
(xmin, ymin, xmax, ymax)
(504.8863220214844, 410.2454833984375,
937.6451416015625, 723.9139404296875)
That's how I draw boxes:
def plot_results(pil_img, prob, boxes):
    plt.figure(figsize=(16,10))
    plt.imshow(pil_img)
    ax = plt.gca()
    for p, (xmin, ymin, xmax, ymax), c in zip(prob, boxes.tolist(), COLORS * 100):
        ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                                   fill=False, color=c, linewidth=3))
        cl = p.argmax()
        text = f'{CLASSES[cl]}: {p[cl]:0.2f}'
        ax.text(xmin, ymin, text, fontsize=15,
                bbox=dict(facecolor='yellow', alpha=0.5))
    plt.axis('off')
    plt.show()
I want to measure the distance from the car to the camera. If the car is nearby, the distance value should be around 0.2-0.4. If the car is far from the camera, the distance value should be around 0.6-0.8.
I also found a possible solution to my problem: https://pythonprogramming.net/detecting-distances-self-driving-car/ However, the author there uses an old model that doesn't work well.
Solution
In the comments you requested code that works similarly to the link you provided. I want to make it clear that your source example isn't measuring distance; it only measures the width of the bounding boxes around the vehicles. The logic is based on the idea that wider boxes are closer to the camera and narrower boxes are further away. This approach has many flaws due to optical illusions and the lack of size and scale context. At any rate:
def plot_results(pil_img, prob, boxes):
    granularity = 3  # fiddle with this to scale
    img_width_inches = 16
    img_height_inches = 10
    fig = plt.figure(figsize=(img_width_inches, img_height_inches))
    img_width_pixels = img_width_inches * fig.dpi
    img_height_pixels = img_height_inches * fig.dpi
    plt.imshow(pil_img)
    ax = plt.gca()
    for p, (xmin, ymin, xmax, ymax), c in zip(prob, boxes.tolist(), COLORS * 100):
        ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                                   fill=False, color=c, linewidth=3))
        cl = p.argmax()
        text = f'{CLASSES[cl]}: {p[cl]:0.2f}'
        ax.text(xmin, ymin, text, fontsize=15, bbox=dict(facecolor='yellow', alpha=0.5))
        # get width of bounding box
        box_width_pixels = xmax - xmin
        # normalize the box width with image width
        normalized_width = box_width_pixels / img_width_pixels
        # invert with 1 - apply power of granularity and round to 1 place
        apx_distance = round(((1 - (normalized_width))**granularity), 1)
        # get middle of box in pixels
        mid_x = (xmin + xmax) / 2
        mid_y = (ymin + ymax) / 2
        # draw value
        ax.text(mid_x, mid_y, apx_distance, fontsize=15, color="white")
        # normalize the middle x position with image width
        mid_x_normalized = mid_x / img_width_pixels
        # create arbitrary ranges and logic to consider actionable
        if apx_distance <= 0.5:
            if mid_x_normalized > 0.3 and mid_x_normalized < 0.7:
                ax.text(50, 50, "WARNING!!!", fontsize=26, color="red")
    plt.axis('off')
    plt.show()
The main difference between this code and the example you provided is that the bounding box values you've given (504.8863220214844, 410.2454833984375, 937.6451416015625, 723.9139404296875) represent pixels, whereas the code in the example uses bounding box values that are already normalized between 0 and 1 relative to the image size. This is why I verbosely defined the image width and height in inches and pixels (and also to keep the code self-explanatory). They are needed to normalize the pixel-based widths and positions to the 0 to 1 range, matching the logic in your example as you requested. These values can also be helpful when trying to actually measure sizes and distances.
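To make the normalization step easier to check on its own, here is a minimal sketch that pulls it out of the plotting function. The helper name `approx_distance` is hypothetical, and the width of 1600 assumes a 16 inch figure at 100 dpi; if your boxes are in the image's own pixel coordinates, passing the image width (for example `pil_img.width`) may be the more appropriate choice. The result is still a unitless size proxy, not a measured distance.

```python
def approx_distance(xmin, xmax, img_width_pixels, granularity=3):
    """Return a unitless proxy in [0, 1]: small for wide boxes, large for narrow ones.

    Hypothetical helper; it mirrors the width-based logic above and does not
    measure a physical distance.
    """
    if img_width_pixels <= 0:
        raise ValueError("img_width_pixels must be positive")
    # normalize the box width by the reference width, clamped to [0, 1]
    normalized_width = (xmax - xmin) / img_width_pixels
    normalized_width = min(max(normalized_width, 0.0), 1.0)
    # invert, apply the granularity exponent, and round to 1 decimal place
    return round((1 - normalized_width) ** granularity, 1)


# bounding box from the question: (xmin, ymin, xmax, ymax) in pixels
box = (504.8863220214844, 410.2454833984375,
       937.6451416015625, 723.9139404296875)

# 1600 is an assumed reference width (16 inches * 100 dpi)
value = approx_distance(box[0], box[2], img_width_pixels=1600)
print(value)
```

Keeping this as a pure function makes it easy to try different `granularity` values without redrawing the figure.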
If you are interested in taking this further, I recommend reading about the laws of perspective. Here is an interesting place to start: https://www.handprint.com/HP/WCL/perspect2.html#distance
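As one possible direction, the similar-triangles relation of a pinhole camera gives a rough physical estimate: distance ≈ focal length (in pixels) × real object width / box width (in pixels). The sketch below is illustrative only. The function name, the 1.8 m car width, and the 1000 px focal length are all assumptions, not values from your setup; a usable focal length would have to come from calibrating your camera.

```python
def estimate_distance_m(box_width_pixels, real_width_m, focal_length_pixels):
    """Estimate distance in meters from apparent width using the pinhole model.

    Hypothetical helper: assumes the object faces the camera squarely and that
    its real width and the camera's focal length (in pixels) are known.
    """
    if box_width_pixels <= 0:
        raise ValueError("box_width_pixels must be positive")
    return focal_length_pixels * real_width_m / box_width_pixels


# box width taken from the bounding box in the question
box_width = 937.6451416015625 - 504.8863220214844

# assumed values for illustration only
distance = estimate_distance_m(box_width, real_width_m=1.8,
                               focal_length_pixels=1000.0)
print(f"{distance:.2f} m")
```

Unlike the normalized width value, this estimate has units, but it is only as good as the assumed object width and focal length.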
Answered By - DSander