Recognizing images in Conversational Agents in Google Cloud

17:21 16 Feb 2026

I have a question about Conversational Agents in Google Cloud. I am trying to get the agent to recognize images in my RAG documents (using OCR and the Layout Parser), but it doesn't work. My data is in JSONL format, like this:

{"id": "hanpo_espacio_hanpo_manual_sd_azucar_carga_de_precios_vpu_d8952e_page_1", "content": {"mimeType": "text/plain", "rawBytes": "textinBase64"}, "structData": {"doc_id": "hanpo_espacio_hanpo_manual_sd_azucar_carga_de_precios_vpu_d8952e", "page": 1, "images": ["gs://rag_gptress/rag_images/hanpo_espacio_hanpo_manual_sd_azucar_carga_de_precios_vpu_d8952e/images/image-000.png"]}}

My first attempt worked, but once I added more data, it stopped working.
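
Since the failure only appears with a larger import, one thing worth ruling out is a malformed record somewhere in the bigger JSONL file. Below is a minimal sketch of such a check in Python. The field names (`id`, `content.mimeType`, `content.rawBytes`, `structData.images`) are taken from the record above; the function name `validate_record` and the specific checks are assumptions for illustration, not an official schema validation.

```python
import base64
import json


def validate_record(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none).

    Hypothetical helper: the checks mirror the record shape shown in the
    question, not an official schema.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []

    # Each record needs a non-empty string id.
    record_id = record.get("id")
    if not isinstance(record_id, str) or not record_id:
        problems.append("missing or empty 'id'")

    # The content block should carry a MIME type and valid base64 data.
    content = record.get("content")
    if not isinstance(content, dict):
        problems.append("missing 'content' object")
    else:
        if not content.get("mimeType"):
            problems.append("missing 'content.mimeType'")
        try:
            base64.b64decode(content.get("rawBytes"), validate=True)
        except (TypeError, ValueError):
            problems.append("'content.rawBytes' is not valid base64")

    # Image references are assumed to be Cloud Storage URIs.
    struct_data = record.get("structData")
    images = struct_data.get("images", []) if isinstance(struct_data, dict) else []
    if not isinstance(images, list):
        problems.append("'structData.images' is not a list")
    else:
        for uri in images:
            if not isinstance(uri, str) or not uri.startswith("gs://"):
                problems.append(f"image is not a gs:// URI: {uri!r}")

    return problems
```

Running this over every line of the file and printing the line number together with any reported problems would at least show whether the larger import contains records that differ in shape from the one that worked. It does not check that the referenced images actually exist in the bucket.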

If there is another way to solve this problem, I would appreciate any suggestions.

google-cloud-platform google-cloud-storage google-cloud-datastore google-cloud-vertex-ai google-cloud-console
