I’m working on a small embedded vision project and was wondering if it’s feasible to run a local image recognition model on an ESP32-S3 WROOM (16 MB flash).
The goal is to recognize playing cards from still images captured by a camera module. It doesn’t need to run in real time. Ideally it would identify the exact card (rank and suit), not just detect that a card is present.
Has anyone tried something like this on this hardware, or does anyone have any tips?