position of detected elements and bounding box

#24

by ashesvats - opened 15 days ago


I’m building a PDF‑extraction app that needs to detect images inside PDFs and extract their bounding‑box coordinates. I have tried adding a prompt such as:
Detect and recognize text in the image, and output the text coordinates in a formatted manner.

The response contains bounding box coordinates in HTML tags in array format.

However, when I extract the numbers, they don’t seem to match the actual positions of the detected images in the PDF.

Any guidance on the correct interpretation of the bbox values or a sample snippet for extraction would be greatly appreciated.

Thank you!

ashesvats changed discussion title from Bounding Box Co-ordinates to position of detected elements and bounding box 15 days ago

anty0m

14 days ago

    x1 = int(x01 / 1000 * imwidth)
    y1 = int(y01 / 1000 * imheight)
    x2 = int(x02 / 1000 * imwidth)
    y2 = int(y02 / 1000 * imheight)
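The lines above appear to treat the model's bbox values as normalized to a 0 to 1000 grid over the input image, not as pixel or PDF coordinates. Below is a minimal, self-contained sketch under that assumption. The function names `scale_bbox` and `pixels_to_pdf_points`, and the `dpi` and `page_height_pt` parameters, are hypothetical and not part of any library. The second helper additionally assumes the page was rendered to an image at a known DPI, that the page is unrotated with no crop offset, and that you want PDF user-space points (72 per inch, origin at the bottom-left).

```python
def scale_bbox(bbox, im_width, im_height, norm=1000):
    """Map a box from the assumed 0..norm grid to pixel coordinates.

    bbox is (x01, y01, x02, y02) as returned by the model; im_width and
    im_height are the size of the image that was sent to the model.
    """
    x01, y01, x02, y02 = bbox
    x1 = int(x01 / norm * im_width)
    y1 = int(y01 / norm * im_height)
    x2 = int(x02 / norm * im_width)
    y2 = int(y02 / norm * im_height)
    return x1, y1, x2, y2


def pixels_to_pdf_points(box_px, page_height_pt, dpi):
    """Convert a top-left-origin pixel box to bottom-left-origin PDF points.

    Assumes the page image was rendered at `dpi` and the page is unrotated.
    """
    scale = 72.0 / dpi  # PDF points per rendered pixel
    x1, y1, x2, y2 = box_px
    # Flip the y axis: image y grows downward, PDF y grows upward.
    return (x1 * scale, page_height_pt - y2 * scale,
            x2 * scale, page_height_pt - y1 * scale)


# Example: a 1600x800 px render of a page, box reported on the 0..1000 grid.
box_px = scale_bbox((125, 250, 500, 750), im_width=1600, im_height=800)
print(box_px)  # (200, 200, 800, 600)

# Example: the same box on a US Letter page (792 pt tall) rendered at 144 DPI.
print(pixels_to_pdf_points(box_px, page_height_pt=792, dpi=144))
# (100.0, 492.0, 400.0, 692.0)
```

If the numbers still look off after scaling, it may be worth checking that `im_width` and `im_height` are the dimensions of the exact image passed to the model (after any resizing), since the normalization is relative to that image.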
