HELP: Improve LLM image processing of the digits “1” and “7”
Hello! I’m having an issue with the GPT-4o model. It confuses the digits 7 and 1 in a task that involves reading an order written on a sheet of paper.
I’m doing this through the OpenAI playground with a custom assistant, and I provide this prompt:
Instructions for Image Analysis

## Objective
Perform a detailed analysis of the information contained in the image, ensuring the accurate interpretation of alphanumeric characters.

## Specific Criteria
1. Analyze each character carefully and precisely.
2. If you identify the number “7,” check if it has a horizontal line in the middle:
   - With a line: Identify it as “7.”
   - Without a line: Interpret it as “1.”

## Accuracy
Prioritize precision in data extraction and interpretation, ensuring the visual information is reflected as faithfully as possible.
The main objective is to identify a digit as 7 only when it has a horizontal line through the middle, and as 1 otherwise.
In the reference image, you can see that the original image (right side) contains 1s, but the model is recognizing them as 7s.
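To make the rule I'm after concrete, here is a minimal Python sketch of it as a deterministic step. It assumes that something upstream (the model or another detector) can report, for each character, whether a horizontal stroke crosses the stem. The names `Glyph`, `has_crossbar`, `resolve_digit`, and `resolve_sequence` are hypothetical and only for illustration; this is not what the assistant does internally.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Glyph:
    """One character as reported by an upstream reader (hypothetical structure)."""
    raw: str            # character as first read, e.g. "7"
    has_crossbar: bool  # True if a horizontal line crosses the middle of the stem


def resolve_digit(glyph: Glyph) -> str:
    """Apply the rule from the prompt: a "7" counts as 7 only with a crossbar."""
    if glyph.raw == "7":
        return "7" if glyph.has_crossbar else "1"
    # Every other character is passed through unchanged.
    return glyph.raw


def resolve_sequence(glyphs: Iterable[Glyph]) -> str:
    """Resolve a whole run of characters and join them into one string."""
    return "".join(resolve_digit(g) for g in glyphs)


if __name__ == "__main__":
    reading = [Glyph("7", False), Glyph("2", False), Glyph("7", True)]
    print(resolve_sequence(reading))  # prints 127
```

Written this way, the model would only need to answer the narrower question "is there a crossbar?", and the 7-versus-1 decision would be made in code rather than left to the prompt.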
If anyone could suggest how I can improve the model's results, explain how I could use RAG for this, or share their experience and approach with similar projects, I would be very grateful.