We are trying to create a Text-2-SQL tool for internal use.
This is what we have done:
1. Users can pass instructions in the UI.
2. We have defined column names, table names, joins, and a SELECT DISTINCT TOP 100 sample of values for each column in each table.
3. Example SQLs.
We then embed these into a vector DB and build RAG on top of it.
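To make the setup concrete, here is a minimal sketch of how the three pieces could be combined into one prompt. It is not our actual code: the table names, the toy bag-of-words `embed` function (standing in for a real embedding model and vector DB), and the helper names `retrieve` and `build_prompt` are all assumptions for illustration. In this sketch the UI instructions are placed in the prompt verbatim rather than retrieved, which is one possible wiring, not necessarily ours.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for the real schema notes and example SQLs.
SCHEMA_DOCS = [
    "table customers: columns id, first_name, last_name, city",
    "table orders: columns id, customer_id, total, created_at",
    "join: orders.customer_id = customers.id",
]
EXAMPLE_SQLS = [
    "-- total spent per customer\n"
    "SELECT c.first_name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.first_name",
]


def embed(text):
    """Toy bag-of-words vector; a real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, docs, k):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(question, ui_instructions, k=2):
    """Assemble the prompt: fixed UI instructions plus retrieved context."""
    schema = retrieve(question, SCHEMA_DOCS, k)
    examples = retrieve(question, EXAMPLE_SQLS, 1)
    return "\n\n".join([
        "Instructions (always apply):\n" + "\n".join(ui_instructions),
        "Relevant schema:\n" + "\n".join(schema),
        "Example SQL:\n" + "\n".join(examples),
        "Question: " + question,
    ])
```

Calling `build_prompt("total orders per customer", ["Always sort on first name."])` returns a single string with the instructions first, then the retrieved schema lines and example SQL, then the question.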
Steps 2 and 3 were introduced this week; until now we relied on step 1 alone, which had 50% accuracy.
After steps 2 and 3 were introduced, accuracy dropped to 20-30%. Even instructions passed through the UI, like "always sort on first name", no longer work.
Is this the correct architecture? What are we missing? We are aiming for 80% accuracy.
Any help is appreciated, and we are happy to talk over DMs.