We're building a RAG chatbot for customer support and need <1 ms vector search latency. Most managed solutions (Pinecone, Weaviate) add network round-trip overhead on every query.
I found VectorRAG.Net, which publishes 15 μs search benchmarks. Is that realistic in production? Looking for:
- Real performance numbers with 1536-dim embeddings
- Memory usage patterns (we have 1M+ vectors)
- Scalability to 100+ QPS
- Comparison with Azure AI Search/Elasticsearch semantic search
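For context on the memory and latency questions above, here's a back-of-envelope sketch (Python, but the arithmetic is language-independent). The vector count, dimension, and float32 storage are taken from the numbers in this post; the 50 GFLOP/s single-core throughput figure is my rough assumption, not a measured value:

```python
# Back-of-envelope sizing for 1M x 1536-dim float32 embeddings.
num_vectors = 1_000_000
dim = 1536
bytes_per_float = 4  # float32

raw_bytes = num_vectors * dim * bytes_per_float
print(f"Raw vector storage: {raw_bytes / 2**30:.2f} GiB")  # ~5.72 GiB, before index overhead

# Exhaustive dot-product scan: one multiply + one add per element.
flops_per_query = 2 * num_vectors * dim  # ~3.07 GFLOP per query
est_latency_s = flops_per_query / 50e9   # assuming ~50 GFLOP/s sustained w/ SIMD (rough guess)
print(f"Naive full scan per query: ~{est_latency_s * 1e3:.0f} ms")
```

The takeaway: a brute-force scan over 1M vectors is on the order of tens of milliseconds, so any 15 μs figure almost certainly comes from an approximate index (e.g. HNSW) measured on a hot in-memory structure, likely excluding embedding generation. Worth asking the vendor what the benchmark actually measures.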