I'm building a Natural Language to SQL system that:
- Connects to any PostgreSQL database and auto-introspects the schema using information_schema
- Builds a graph of table relationships using NetworkX
- Uses sentence-transformers embeddings to find relevant tables from a plain-English query
- Scores join paths using semantic similarity + keyword boosting + coverage scoring
- Sends the best path to the Groq LLM (Llama 3.3 70B) to generate SQL
- Has a self-healing retry loop that sends failed SQL back to the LLM for fixing
Tech Stack: Python, PostgreSQL, NetworkX, sentence-transformers, Groq API, psycopg2, FastAPI (planned)
Current Problems:
Problem 1 — What is the best way to handle reserved word table names when auto-generating SQL with an LLM? Should I rename the table or handle it in the prompt?
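Don't rename the table; quote identifiers instead, and show the quoted form in the schema you put in the prompt so the model copies it. A minimal sketch of the idea (`RESERVED`, `quote_ident`, and `render_table` are hypothetical helpers of mine, not psycopg2 APIs):

```python
# Assumption: we quote reserved-word or mixed-case table names ourselves
# before the schema text ever reaches the LLM, so the model sees "order"
# already double-quoted and tends to reproduce the quoting in its SQL.

RESERVED = {"order", "user", "group", "select", "table", "check"}  # partial list

def quote_ident(name: str) -> str:
    # PostgreSQL quoting rule: wrap in double quotes, double any embedded quotes
    return '"' + name.replace('"', '""') + '"'

def render_table(name: str) -> str:
    # Quote only when necessary so the prompt stays readable
    if name.lower() in RESERVED or not name.islower():
        return quote_ident(name)
    return name

print(render_table("order"))    # -> "order"
print(render_table("patient"))  # -> patient
```

Quoting is also safer than renaming because it requires no migration and works on databases you don't own, which matters for a "connects to any PostgreSQL database" tool.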
Problem 2 — LLM Self-Healing Loop Not Working
```python
sql = generate_sql(question)           # initial LLM attempt
for attempt in range(3):
    try:
        cursor.execute(sql)
        break
    except Exception as e:
        sql = llm_fix(sql, str(e))     # send error to LLM, get "fixed" SQL, retry
```

The LLM returns identical broken SQL on all 3 attempts.
Question: What prompt engineering techniques work best for getting an LLM to actually fix broken SQL when given an error message?
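The usual fix is to make the prompt stateful: accumulate every failed attempt plus its verbatim PostgreSQL error, and explicitly require the next answer to differ from all of them. A sketch, with hypothetical names (`build_fix_prompt`, `attempts`) of my own:

```python
# Assumption: `attempts` is a list of (sql, error) pairs collected across the
# retry loop. Including the full history stops the model from re-emitting a
# query it has already been told is broken.

def build_fix_prompt(question: str, schema_ddl: str, attempts: list) -> str:
    history = "\n\n".join(
        f"Attempt {i + 1} (FAILED):\n{sql}\nPostgreSQL error:\n{err}"
        for i, (sql, err) in enumerate(attempts)
    )
    return (
        f"Schema:\n{schema_ddl}\n\n"
        f"User question: {question}\n\n"
        f"{history}\n\n"
        "Every query above is wrong. First explain in one sentence what caused "
        "the last error, then output a corrected query that is textually "
        "different from every failed attempt. Return only SQL after the "
        "explanation."
    )

prompt = build_fix_prompt(
    "total prescriptions per patient",
    "CREATE TABLE patient (...); CREATE TABLE prescription (...);",
    [("SELECT * FROM pateint;", 'relation "pateint" does not exist')],
)
```

Two other techniques that tend to help: ask for a one-line diagnosis before the SQL (forces the model to read the error instead of pattern-matching the question), and bump temperature slightly on retries so a deterministic model doesn't reproduce its first sample.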
Problem 3 — Join Path Ambiguity
Multiple valid paths exist between the same tables:
- patient → prescription → prescription_detail → drug ✅ correct
- patient → controlled_drug_log → drug ✅ also valid, but wrong context
Currently scoring with:
- Semantic similarity (weight 0.8)
- Path length (weight 0.1)
- Nullable FK penalty (weight 0.1)
- Keyword boost (+0.15 per matching keyword)
- Relevant table coverage boost (×0.8)
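For reference, here's how I'd sketch that scoring as one function. All names are mine, and I'm reading the ×0.8 coverage factor as a multiplicative damping applied when a path misses a relevant table; adjust if your implementation treats it differently:

```python
# Hypothetical sketch of the weighted path score described above.
def score_path(similarity, length, n_nullable_fk, keyword_hits, covers_all_relevant):
    score = 0.8 * similarity            # semantic similarity (weight 0.8)
    score += 0.1 * (1.0 / length)       # shorter paths score higher (weight 0.1)
    score -= 0.1 * n_nullable_fk        # nullable-FK penalty (weight 0.1)
    score += 0.15 * keyword_hits        # +0.15 per matching keyword
    if not covers_all_relevant:
        score *= 0.8                    # damp paths missing relevant tables
    return score
```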
Question: What is the most reliable approach to disambiguate multiple valid join paths in a graph when the correct path depends on business context?
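One approach that has worked for me on similar problems: encode the business context on the edges themselves, then score paths by how well their edge tags match the query intent. The tags below are invented for illustration; only the table names come from your example:

```python
import networkx as nx

# Assumption: each FK edge is hand-tagged (or LLM-tagged once, offline) with
# the business relationship it encodes.
G = nx.DiGraph()
G.add_edge("patient", "prescription", context={"prescribing", "medication"})
G.add_edge("prescription", "prescription_detail", context={"prescribing"})
G.add_edge("prescription_detail", "drug", context={"prescribing"})
G.add_edge("patient", "controlled_drug_log", context={"audit", "compliance"})
G.add_edge("controlled_drug_log", "drug", context={"audit"})

def context_score(path, intent_tags):
    # Fraction of edges on the path whose tags overlap the query's intent
    edges = list(zip(path, path[1:]))
    hits = sum(1 for u, v in edges if G[u][v]["context"] & intent_tags)
    return hits / len(edges)

paths = list(nx.all_simple_paths(G, "patient", "drug", cutoff=4))
best = max(paths, key=lambda p: context_score(p, {"prescribing"}))
print(best)  # the prescription route wins for a "what was prescribed" query
```

This moves disambiguation from tuning scalar weights to a small, auditable piece of domain knowledge, and the audit path still wins when the query intent maps to {"audit"}.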
Problem 4 — Schema Linking Noise
With 23 tables, embedding-based table selection sometimes picks irrelevant tables, which pollutes the path finding. Currently using top_k=5 with forced entity detection.
Question: What is the recommended approach for schema linking in NL-to-SQL systems with 20+ tables?
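A common recipe at this scale is two-stage linking: retrieve a small, high-precision seed set by embedding similarity, then expand one hop along FK edges so bridge tables (like prescription_detail) aren't missed even when they embed poorly. A sketch, where `table_vecs` and `fk_graph` stand in for structures your pipeline already has:

```python
import numpy as np

# Assumption: table_vecs maps table name -> embedding vector, fk_graph maps
# table name -> list of FK-adjacent tables. Both names are placeholders.
def link_schema(query_vec, table_vecs, fk_graph, top_k=3, max_tables=8):
    sims = {t: float(np.dot(query_vec, v)) for t, v in table_vecs.items()}
    seeds = sorted(sims, key=sims.get, reverse=True)[:top_k]
    selected = list(seeds)
    for t in seeds:  # pull in one-hop FK neighbours as bridge candidates
        for nbr in fk_graph.get(t, []):
            if nbr not in selected and len(selected) < max_tables:
                selected.append(nbr)
    return selected
```

Keeping top_k small (2-3) and letting the FK expansion supply connectivity usually beats a large top_k, because the expansion adds structurally necessary tables rather than semantically noisy ones.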
### What I've Tried:
- Full schema in prompt → LLM gets confused with 23 tables
- Graph-based path finding with cutoff=4 → reduced paths from 1437 to 6
- Embedding similarity alone → wrong path selected due to length bias
- Keyword boosting → improved accuracy significantly
- Irrelevant table penalty → reduced noise in path selection