
Building an AI-Powered Customer Support Assistant

TL;DR

An enterprise client needed to enhance their existing web application with intelligent customer support capabilities. The solution: a modern AI-powered chat assistant built with React and TypeScript on the frontend, backed by a FastAPI service running on Kubernetes. By leveraging AWS Bedrock for LLM capabilities and Knowledge Base for RAG-powered search across internal documentation, combined with custom tools for database queries and rich UI widgets for data visualization, the assistant delivers accurate, contextual responses while surfacing relevant customer data directly in the chat interface.

Introduction

Customer support at scale presents a challenging balance: users expect instant, accurate answers, while support teams struggle with repetitive queries and the need to search across multiple internal systems. For this enterprise client, the goal was clear: an AI assistant embedded directly in their existing web application, able not only to answer questions from documentation but also to pull real-time data from internal systems and present it in a user-friendly way.

The project required tight integration with the client’s existing infrastructure, including their Confluence knowledge base, internal databases, and authentication systems. The assistant needed to feel native to the application while providing capabilities far beyond a simple FAQ bot.

Technical Architecture

Frontend: Modern Chat Widget

The chat interface was built as a React component library using TypeScript, designed to integrate seamlessly with the client’s existing React-based web application.

Component          Technology              Purpose
UI Framework       React + TypeScript      Type-safe, component-based architecture
State Management   React Context + Hooks   Conversation state, message history
Styling            CSS Modules             Scoped styling, theming support
Rich Widgets       Custom components       Data cards, charts, interactive elements

The widget supports both text responses and rich interactive components, rendered dynamically based on the assistant’s response type. When the backend returns structured data (like customer information or statistics), the frontend renders appropriate visualization widgets rather than plain text.
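One way to make this contract concrete is a small response schema on the backend. The following Pydantic sketch is illustrative only; the model and field names are assumptions, not the client's actual API:

from typing import Any
from pydantic import BaseModel, Field

class Widget(BaseModel):
    """A structured payload the frontend turns into a rich component."""
    type: str                                    # e.g. "customer_card", "transaction_table"
    data: dict[str, Any] = Field(default_factory=dict)

class ChatResponse(BaseModel):
    """What the assistant returns for a single turn."""
    text: str                                    # plain-text answer, always present as a fallback
    widgets: list[Widget] = Field(default_factory=list)   # optional rich components
    sources: list[str] = Field(default_factory=list)      # knowledge base citations, if any

On the frontend, the widget type string acts as a discriminator: each known type maps to a React component, and anything unrecognized falls back to the plain text.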

Backend: FastAPI on Kubernetes

The backend service was built with FastAPI, chosen for its async support, automatic OpenAPI documentation, and excellent performance characteristics for AI workloads.

ai-assistant-service/
├── app/
│   ├── api/           # API routes and endpoints
│   ├── core/          # Configuration, security
│   ├── llm/           # Bedrock integration, prompts
│   ├── rag/           # Knowledge Base retrieval
│   ├── tools/         # Custom tool implementations
│   └── widgets/       # Response widget definitions
├── k8s/               # Kubernetes manifests
└── tests/             # Test suite
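As a minimal sketch of the request flow through app/api/, the chat endpoint might look roughly like this; the route, model names, and helper function are assumptions for illustration, not the actual service code:

# app/api/chat.py -- illustrative sketch, not the client's production code
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/api/v1")

class ChatRequest(BaseModel):
    conversation_id: str
    message: str

class ChatReply(BaseModel):
    text: str
    widgets: list[dict] = []

@router.post("/chat", response_model=ChatReply)
async def chat(request: ChatRequest) -> ChatReply:
    # 1. Retrieve relevant documentation chunks from the Bedrock Knowledge Base.
    # 2. Call the LLM, which may invoke database tools when it needs live data.
    # 3. Return the text answer plus any structured widgets for the frontend.
    text = await answer_question(request.conversation_id, request.message)
    return ChatReply(text=text)

async def answer_question(conversation_id: str, message: str) -> str:
    # Placeholder for the RAG and tool-calling pipeline described below.
    return f"(answer to: {message})"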

Deployed on Kubernetes, the service benefits from:

  • Horizontal pod autoscaling based on request load
  • Health checks and automatic restarts
  • Secrets management for AWS credentials
  • Resource limits ensuring predictable performance

AI/ML: AWS Bedrock Stack

AWS Bedrock provides the core AI capabilities:

Service                  Purpose
Bedrock LLM              Natural language understanding and generation
Bedrock Knowledge Base   RAG indexing and semantic search
S3                       Document storage for knowledge base sources
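For orientation, model invocation goes through boto3's bedrock-runtime client and the Converse API. The region, model ID, and prompt below are placeholders rather than the project's actual configuration:

import boto3

# Bedrock runtime client for model invocation (region and model ID are placeholders)
bedrock = boto3.client("bedrock-runtime", region_name="eu-central-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    system=[{"text": "You are a customer support assistant."}],
    messages=[{"role": "user", "content": [{"text": "How do I reset my password?"}]}],
)

answer = response["output"]["message"]["content"][0]["text"]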

System Architecture Overview

(PlantUML diagram of the overall system architecture.)

Key Technical Challenges

Challenge 1: Custom RAG with Multiple Data Sources

Problem: The client’s documentation was spread across multiple systems: Confluence for internal wikis, S3 for PDF manuals, and various internal tools. Traditional keyword search wasn’t sufficient for natural language queries.

Solution: AWS Bedrock Knowledge Base was configured with multiple data source connectors. The Confluence adapter syncs documentation automatically, while custom ingestion pipelines handle PDF documents and structured data exports from other systems.

The Knowledge Base handles:

  • Automatic chunking and embedding of documents
  • Semantic similarity search across all sources
  • Source attribution for answer traceability

When a user asks a question, the system retrieves relevant context from across all connected sources, providing the LLM with comprehensive information to formulate accurate responses.
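A rough sketch of that retrieval step, using boto3's bedrock-agent-runtime client; the Knowledge Base ID, region, and result count are placeholder values:

import boto3

# Knowledge Base retrieval via the Bedrock agent runtime (IDs are placeholders)
kb_client = boto3.client("bedrock-agent-runtime", region_name="eu-central-1")

def retrieve_context(question: str, kb_id: str = "KBEXAMPLE123", top_k: int = 5) -> list[dict]:
    """Return the most relevant documentation chunks with source attribution."""
    response = kb_client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": top_k}},
    )
    return [
        {
            "text": result["content"]["text"],
            "source": result.get("location", {}),   # S3 URI, Confluence URL, etc.
            "score": result.get("score"),
        }
        for result in response["retrievalResults"]
    ]

The returned chunks are then passed to the model as context, along the lines of the converse call sketched earlier.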

Challenge 2: Custom Database Tools

Problem: Users frequently needed real-time data that doesn’t exist in documentation: their account status, recent transactions, support ticket history, and other dynamic information.

Solution: Custom tools were implemented following a function-calling pattern. The LLM can invoke these tools when it determines that real-time data is needed to answer a query.

Tools implemented include:

  • Customer lookup: Retrieve customer profile and account details
  • Transaction search: Query recent transactions with filters
  • Ticket history: Fetch support ticket status and history
  • Usage statistics: Calculate and return usage metrics

Each tool is defined with a clear schema describing its inputs and outputs, allowing the LLM to understand when and how to use it. The tools execute with the authenticated user’s permissions, ensuring data access is properly scoped.
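To illustrate the function-calling pattern, here is roughly how one such tool could be declared for the Converse API's tool support; the tool name, fields, and description are hypothetical rather than the client's actual schema:

# Hypothetical tool specification in the shape the Converse API expects
customer_lookup_tool = {
    "toolSpec": {
        "name": "customer_lookup",
        "description": "Retrieve profile and account details for a customer.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "customer_id": {
                        "type": "string",
                        "description": "Internal ID of the customer to look up.",
                    }
                },
                "required": ["customer_id"],
            }
        },
    }
}

# Passed to bedrock.converse(..., toolConfig=tool_config); when the model answers
# with a toolUse block, the backend runs the matching function under the
# authenticated user's permissions and feeds the result back as a toolResult.
tool_config = {"tools": [customer_lookup_tool]}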

Challenge 3: Rich Response Widgets

Problem: Text-only responses are insufficient when users need to understand complex data like account summaries, transaction histories, or system statistics.

Solution: A widget system was developed that allows the backend to return structured response types alongside text. The frontend recognizes these widget types and renders appropriate visualizations.

Widget types include:

  • Customer card: Profile photo, name, account status, key metrics
  • Transaction table: Sortable, filterable list of recent transactions
  • Stats dashboard: Key metrics with trend indicators
  • Action buttons: Quick actions the user can take directly from the chat

The widget system is extensible: new widget types can be added by defining a schema on the backend and a corresponding React component on the frontend.
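A small sketch of what that extension point could look like on the backend; the widget type, payload fields, and registry are assumptions for illustration:

from pydantic import BaseModel

# Hypothetical payload schema for a new "usage_chart" widget type
class UsageChartPayload(BaseModel):
    title: str
    labels: list[str]     # e.g. month names
    values: list[float]   # one data point per label

# Backend registry mapping widget type strings to payload schemas; the frontend
# keeps a parallel map from the same strings to React components.
WIDGET_SCHEMAS: dict[str, type[BaseModel]] = {
    "usage_chart": UsageChartPayload,
}

def build_widget(widget_type: str, payload: dict) -> dict:
    """Validate a widget payload against its schema and emit the wire format."""
    schema = WIDGET_SCHEMAS[widget_type]
    return {"type": widget_type, "data": schema(**payload).model_dump()}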

Core Features Delivered

Conversational AI

  • Natural language understanding for customer queries
  • Context-aware responses using conversation history
  • Graceful handling of ambiguous or unclear questions

Knowledge Retrieval

  • Semantic search across Confluence, PDFs, and other sources
  • Source citations for transparency and trust
  • Automatic knowledge base synchronization

Real-Time Data Access

  • Customer profile and account information
  • Transaction history and details
  • Support ticket status

Rich Visualizations

  • Customer data cards with key information
  • Interactive data tables
  • Statistics and metrics displays
  • Quick action buttons

Enterprise Integration

  • SSO authentication pass-through
  • Role-based data access
  • Audit logging for compliance

Results

Improved Support Experience

The AI assistant transformed how users interact with customer support:

  • Instant answers: Users get immediate responses to common questions without waiting for human agents
  • Self-service data access: Customers can check their own account information, transaction history, and ticket status without contacting support
  • Contextual help: The assistant understands the user’s current context and provides relevant suggestions

Reduced Support Burden

By handling routine queries and data lookups, the assistant allowed the support team to focus on complex issues that require human judgment. Common query types that are now fully automated include:

  • Account status inquiries
  • Documentation lookups
  • Transaction clarifications
  • Feature explanations

Scalable Architecture

The Kubernetes deployment ensures the assistant can handle traffic spikes without degradation:

  • Auto-scaling responds to demand
  • Stateless design enables horizontal scaling
  • AWS Bedrock handles AI compute scaling automatically

Lessons Learned

  1. RAG quality depends on source quality. The assistant is only as good as the documentation it can access. Investing time in cleaning up and organizing Confluence content paid dividends in response accuracy.

  2. Tool design requires careful scoping. Each tool should do one thing well. Overly complex tools are harder for the LLM to use correctly and harder to maintain.

  3. Widget responses need graceful degradation. Not every client can render rich widgets. The system always includes a text fallback alongside structured widget data.

  4. Conversation context is critical. Users expect the assistant to remember what was discussed earlier in the conversation. Implementing proper conversation history management significantly improved user satisfaction.

  5. Latency budgets matter. Users expect near-instant responses. Careful optimization of the RAG retrieval and LLM calls was necessary to keep response times acceptable.

Conclusion

Building an AI-powered customer support assistant requires more than just connecting an LLM to a chat interface. The real value comes from deep integration with existing systems: knowledge bases that contain institutional knowledge, databases that hold customer data, and UI components that can present complex information clearly.

By combining AWS Bedrock’s powerful AI capabilities with custom tools and rich UI widgets, this project delivered an assistant that genuinely helps users rather than just providing generic responses. The result is a support experience that’s faster, more accurate, and more satisfying for both users and support teams.
