Building AI Agents: 5% AI and 100% Software Engineering
Building AI agents is more a software engineering problem than a modeling one. Data management, controls, and observability determine whether an agent succeeds in production. This article walks through the essential components of a doc-to-chat pipeline and how to integrate AI agents into an existing software stack.
Understanding the Doc-to-Chat Pipeline
A doc-to-chat pipeline processes enterprise documents by ingesting, standardizing, enforcing governance, indexing embeddings, and serving retrieval and generation through authenticated APIs. This architecture is vital for applications like agentic Q&A, copilots, and workflow automation, ensuring that responses comply with permissions and are audit-ready.
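The stages above can be sketched end to end in a few lines of Python. Everything here is illustrative rather than any specific library's API: the `Document` record, a keyword posting list standing in for embeddings, and an ACL check standing in for real governance. The point is the shape: governance is enforced at serving time, before anything is returned.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str
    readers: frozenset  # principals allowed to see this document

def ingest(raw: str, doc_id: str, readers) -> Document:
    # Standardize: collapse whitespace into a canonical form.
    return Document(doc_id=doc_id, text=" ".join(raw.split()),
                    readers=frozenset(readers))

def index(docs):
    # Stand-in for embedding + vector indexing: keyword posting lists.
    idx = {}
    for doc in docs:
        for token in set(doc.text.lower().split()):
            idx.setdefault(token, []).append(doc)
    return idx

def retrieve(idx, query: str, principal: str):
    # Governance at serving time: filter by ACL before returning anything.
    hits = idx.get(query.lower(), [])
    return [d.doc_id for d in hits if principal in d.readers]

docs = [
    ingest("Quarterly revenue grew 12%", "fin-1", {"alice"}),
    ingest("Onboarding guide for new hires", "hr-1", {"alice", "bob"}),
]
idx = index(docs)
print(retrieve(idx, "revenue", "bob"))    # [] -- bob cannot read fin-1
print(retrieve(idx, "revenue", "alice"))  # ['fin-1']
```

A real pipeline would swap the posting list for an embedding index and the `readers` set for row-level policies, but the permission filter stays in the serving path either way.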
Integration with Existing Stacks
To integrate AI agents cleanly, expose them behind standard service boundaries such as REST/JSON or gRPC over a trusted storage layer. For tables, Iceberg offers ACID transactions, schema evolution, and snapshots, which make retrieval reproducible. For vector data, pgvector keeps embeddings alongside relational data in Postgres, while dedicated engines like Milvus handle high-queries-per-second (QPS) approximate nearest neighbor (ANN) search.
Key Properties of Data Management
- Iceberg Tables: Provide ACID compliance, hidden partitioning, and snapshot isolation.
- pgvector: Combines SQL and vector similarity in a single query plan.
- Milvus: Features a scalable architecture for large-scale similarity searches.
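To make the pgvector bullet concrete: the SQL comment below shows the typical `ORDER BY embedding <-> query` form, and the plain-Python function is a brute-force stand-in for what that operator ranks by. The table name, ids, and vectors are hypothetical.

```python
import math

# Typical pgvector query (for reference; `<->` is L2 distance):
#   SELECT id FROM items
#   ORDER BY embedding <-> '[0.9, 0.1]'
#   LIMIT 2;

def l2(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(rows, query, k):
    # rows: list of (id, embedding); rank by L2 distance, ascending.
    return [rid for rid, emb in sorted(rows, key=lambda r: l2(r[1], query))[:k]]

rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.85, 0.15])]
print(top_k(rows, [0.9, 0.1], 2))  # ['c', 'a']
```

pgvector executes the same ranking inside the Postgres query planner, which is what lets you combine a vector `ORDER BY` with ordinary SQL filters and joins in one statement.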
Coordinating Agents, Humans, and Workflows
Effective production agents require clear coordination points for human intervention. Tools like AWS A2I offer managed human-in-the-loop (HITL) processes, ensuring that low-confidence outputs are reviewed. Frameworks such as LangGraph can model these checkpoints within agent workflows, making approvals an explicit step in the graph rather than an afterthought.
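The routing pattern itself is plain control flow. This sketch shows a confidence-gated review queue; the threshold, record shapes, and function names are illustrative, not the A2I or LangGraph APIs.

```python
REVIEW_THRESHOLD = 0.8  # illustrative cutoff; tune per workload

def route(answer: str, confidence: float, review_queue: list) -> dict:
    """Auto-approve confident outputs; escalate the rest to humans."""
    if confidence >= REVIEW_THRESHOLD:
        return {"status": "auto_approved", "answer": answer}
    # Below threshold: park the draft for human review instead of serving it.
    review_queue.append({"answer": answer, "confidence": confidence})
    return {"status": "pending_review", "answer": None}

queue = []
print(route("Refund policy is 30 days.", 0.93, queue)["status"])    # auto_approved
print(route("Contract clause 7.2 applies.", 0.41, queue)["status"])  # pending_review
print(len(queue))  # 1
```

In a framework like LangGraph this gate would be a node whose low-confidence branch pauses the graph until a reviewer resumes it; in A2I, the queue is a managed review workflow.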
Ensuring Reliability Before Model Deployment
Reliability in AI systems should be approached as a layered defense strategy:
- Language and Content Guardrails: Pre-validate inputs and outputs for safety.
- PII Detection and Redaction: Utilize tools like Microsoft Presidio to identify and mask personally identifiable information.
- Access Control and Lineage: Implement row- and column-level access controls to maintain compliance.
- Retrieval Quality Gates: Assess retrieval-augmented generation (RAG) using metrics like faithfulness and context precision.
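As a minimal stand-in for a detector like Presidio, the sketch below shows the detect-and-mask shape with two regex recognizers. Real systems combine many entity types, NER models, and context scoring; the patterns and placeholder format here are illustrative only.

```python
import re

# Two toy recognizers; a real detector supports dozens of entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    # Replace each detected span with a typed placeholder.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(redact("Reach Jane at jane.doe@example.com or 555-123-4567."))
# Reach Jane at <EMAIL> or <PHONE>.
```

Typed placeholders (rather than plain deletion) keep redacted text usable downstream, since the model still sees that an email or phone number was present.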
Scaling Indexing and Retrieval
To handle real traffic, focus on two dimensions: ingest throughput and query concurrency. Normalize data at the lakehouse edge and write to Iceberg so every index build runs against a versioned snapshot. For vector serving, leverage a dedicated engine such as Milvus, whose architecture supports horizontal scaling and independent failure domains.
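Ingest throughput is largely a batching-and-parallelism problem. This sketch batches documents and fans the batches out across worker threads; the embedding function is a placeholder that just records text length, standing in for a real model call whose per-request overhead batching amortizes.

```python
from concurrent.futures import ThreadPoolExecutor

def embed_batch(batch):
    # Placeholder for a real embedding call; returns (doc_id, vector) pairs.
    return [(doc_id, [float(len(text))]) for doc_id, text in batch]

def ingest_all(docs, batch_size=2, workers=4):
    # Split into fixed-size batches, then embed batches concurrently.
    batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(embed_batch, batches)  # preserves batch order
    return [row for batch in results for row in batch]

docs = [("d1", "alpha"), ("d2", "beta"), ("d3", "gamma"), ("d4", "delta")]
vectors = ingest_all(docs)
print(len(vectors))  # 4
```

Query concurrency is the mirror problem: keep the serving path stateless so replicas can be added behind a load balancer, which is the property Milvus's separated components are designed to give you.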
Monitoring Beyond Logs
Effective monitoring requires a mix of traces, metrics, and evaluations:
- Distributed Tracing: Use OpenTelemetry for comprehensive visibility.
- LLM Observability Platforms: Compare options like LangSmith and Arize Phoenix.
- Continuous Evaluation: Regularly evaluate canary sets to track performance over time.
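A continuous-evaluation job can be as simple as re-scoring a fixed canary set on every deploy and alerting when the average drifts. This sketch computes a context-precision-style score over two hypothetical canaries; the chunk ids and relevance labels are made up for illustration.

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved chunks that are actually relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for c in retrieved if c in relevant) / len(retrieved)

# Hypothetical canary set: fixed queries with known-relevant chunk ids.
canaries = [
    {"retrieved": ["c1", "c2", "c7"], "relevant": {"c1", "c2"}},
    {"retrieved": ["c3", "c9"], "relevant": {"c3"}},
]

scores = [context_precision(c["retrieved"], c["relevant"]) for c in canaries]
avg = sum(scores) / len(scores)
print(round(avg, 3))  # 0.583
```

Because the canary set is fixed, a drop in the average isolates retrieval decay (stale index, changed chunking, permission drift) from changes in live traffic.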
Conclusion: The Importance of Software Engineering in AI
The notion that building AI agents is 5% AI and 100% software engineering underscores the reality that most failures in agent systems arise from issues related to data quality, permissioning, and retrieval decay rather than model performance. By prioritizing strong data management and observability practices, organizations can ensure their AI systems are both reliable and effective.
FAQs
- What is a doc-to-chat pipeline? A doc-to-chat pipeline processes documents for applications like Q&A and workflow automation, ensuring compliance and audit readiness.
- Why is software engineering more important than AI models in building agents? Most failures stem from data management issues rather than the AI models themselves, making software engineering critical.
- How can I ensure data quality in AI systems? Implementing strict data management practices and monitoring can help maintain data quality.
- What tools can assist in human-in-the-loop processes? Tools like AWS A2I can help manage HITL processes effectively.
- How do I scale indexing and retrieval for AI systems? Focus on optimizing ingest throughput and query concurrency, and consider using architectures like Milvus for vector serving.