I have a question about Google's Conversational Agents. I am trying to get the agent to recognize images from my RAG documents (using OCR and the Layout Parser), but it doesn't work. I am using JSONL, with records like this:
{"id": "hanpo_espacio_hanpo_manual_sd_azucar_carga_de_precios_vpu_d8952e_page_1", "content": {"mimeType": "text/plain", "rawBytes": "textinBase64"}, "structData": {"doc_id": "hanpo_espacio_hanpo_manual_sd_azucar_carga_de_precios_vpu_d8952e", "page": 1, "images": ["gs://rag_gptress/rag_images/hanpo_espacio_hanpo_manual_sd_azucar_carga_de_precios_vpu_d8952e/images/image-000.png"]}}
In my first attempt, it worked, but when I tried with more information, it stopped working.
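One thing that might be worth ruling out is a malformed record somewhere in the larger file. Below is a minimal Python sketch of how records in the shape shown above could be built and checked line by line. The helper names `build_record` and `check_line` are hypothetical, and the checks (one JSON object per line, a non-empty `id`, base64-decodable `rawBytes`, `gs://` image URIs) are only assumptions about what a well-formed record looks like, not an official validation. It also says nothing about whether the agent actually reads the `images` field.

```python
import base64
import json


def build_record(doc_id: str, page: int, text: str, image_uris: list[str]) -> str:
    """Serialize one page as a single JSONL line in the shape shown above."""
    record = {
        "id": f"{doc_id}_page_{page}",
        "content": {
            "mimeType": "text/plain",
            # rawBytes carries the page text, base64-encoded.
            "rawBytes": base64.b64encode(text.encode("utf-8")).decode("ascii"),
        },
        "structData": {"doc_id": doc_id, "page": page, "images": image_uris},
    }
    # json.dumps escapes newlines, so the record stays on one physical line.
    return json.dumps(record, ensure_ascii=False)


def check_line(line: str) -> list[str]:
    """Return the problems found in one JSONL line (empty list if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    if not record.get("id"):
        problems.append("missing id")

    content = record.get("content")
    raw = content.get("rawBytes") if isinstance(content, dict) else None
    if not isinstance(raw, str) or not raw:
        problems.append("missing rawBytes")
    else:
        try:
            base64.b64decode(raw, validate=True)
        except ValueError:  # binascii.Error is a subclass of ValueError
            problems.append("rawBytes is not valid base64")

    struct = record.get("structData")
    images = struct.get("images", []) if isinstance(struct, dict) else []
    for uri in images:
        # Assumption: every image reference should be a Cloud Storage URI.
        if not str(uri).startswith("gs://"):
            problems.append(f"image is not a gs:// URI: {uri}")
    return problems
```

Running `check_line` over every line of the file that stopped working, and printing the line numbers that return a non-empty list, would at least show whether the problem is in the data or in the agent configuration.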
Is there another way to solve this problem? Any help would be appreciated.