I’m building a chatbot inside a SaaS platform where users can ask natural-language questions about their data. The backend is a complex relational database with many interrelated tables and strict business rules around permissions, state transitions, and computed fields. The chatbot must answer only from real data, always follow those business rules, and never invent information.
I’m trying to understand the correct production-grade approach for this. How do you safely map natural language questions to backend data access without letting the model generate arbitrary SQL? What role should the LLM play versus the backend services? Is this usually done via intent or capability routing with predefined queries, or some other pattern?
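To make the "capability routing with predefined queries" idea concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical and only for illustration: the `invoices` table, the `tenant_id` column, the `count_invoices_by_status` capability, and the JSON shape the model is assumed to return. The model only picks a capability name and fills in parameters; the backend validates them and runs a fixed, parameterized query scoped to the authenticated tenant.

```python
import json
import sqlite3

# Hypothetical demo schema; a real system would sit behind existing services.
def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT, status TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO invoices (tenant_id, status, amount) VALUES (?, ?, ?)",
        [("acme", "open", 100.0), ("acme", "paid", 50.0), ("other", "open", 999.0)],
    )
    return conn

ALLOWED_STATUSES = {"open", "paid", "void"}

def count_invoices_by_status(conn: sqlite3.Connection, tenant_id: str, params: dict) -> dict:
    """Predefined query: the SQL text is fixed, only bound values vary."""
    status = params.get("status")
    if status not in ALLOWED_STATUSES:
        raise ValueError("invalid_status")
    row = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM invoices "
        "WHERE tenant_id = ? AND status = ?",
        (tenant_id, status),  # tenant_id comes from the session, never from the model
    ).fetchone()
    return {"count": row[0], "total": row[1]}

# Allowlist of capabilities the model may select. Anything else is refused.
CAPABILITIES = {"count_invoices_by_status": count_invoices_by_status}

def handle_model_output(conn: sqlite3.Connection, tenant_id: str, model_output: str) -> dict:
    """Validate the model's (assumed) JSON output and dispatch to a capability."""
    try:
        request = json.loads(model_output)
    except json.JSONDecodeError:
        return {"error": "unparseable_request"}
    if not isinstance(request, dict):
        return {"error": "unparseable_request"}

    name = request.get("capability")
    handler = CAPABILITIES.get(name) if isinstance(name, str) else None
    if handler is None:
        return {"error": "unsupported_question"}

    params = request.get("params")
    if not isinstance(params, dict):
        params = {}
    try:
        return {"data": handler(conn, tenant_id, params)}
    except ValueError as exc:
        return {"error": str(exc)}
```

In this sketch the model never sees or writes SQL, and a request for anything outside the allowlist comes back as `unsupported_question` instead of a guess. The returned `data` would then be passed back to the model purely for phrasing the answer. Is this roughly the shape people use in production, or does it break down as the number of capabilities grows?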
I’d also like to know what tech stack works best in practice. Do models need to be fine-tuned, or is this typically handled with prompting and orchestration? Are open-source libraries or frameworks commonly used for this, or is custom logic unavoidable?
If you’ve implemented something like this in a real SaaS product, I’d really appreciate hearing what architecture you used, what worked, and what to avoid.