DOE ChatBot

A collaborative project by the Cal Poly DxHub, based on AWS’s Francis GenAI RAG ChatBot reference architecture. This solution is designed as an internal training and support assistant for educators adopting a newly implemented platform, using RAG and AWS Bedrock to deliver contextual answers from organizational documentation, including written resources and recorded training materials.




Collaboration

Thanks for your interest in our solution. Having specific examples of replication and cloning allows us to continue to grow and scale our work. If you clone or download this repository, kindly shoot us a quick email to let us know you are interested in this work!

wwps-cic@amazon.com


Disclaimers

Customers are responsible for making their own independent assessment of the information in this document.

This document:

(a) is for informational purposes only,

(b) references AWS product offerings and practices, which are subject to change without notice,

(c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided "as is" without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers, and

(d) is not to be considered a recommendation or viewpoint of AWS.

Additionally, you are solely responsible for testing, securing, and optimizing all code and assets in this GitHub repo, and all such code and assets should be considered:

(a) as-is and without warranties or representations of any kind,

(b) not suitable for production environments, or on production or other critical data, and

(c) to include shortcuts in order to support rapid prototyping such as, but not limited to, relaxed authentication and authorization and a lack of strict adherence to security best practices.

All work produced is open source. More information can be found in the GitHub repo.

Authors and Acknowledgements

Modified by:

Based on:


Francis GenAI RAG ChatBot on AWS

Francis is a GenAI RAG ChatBot reference architecture provided by AWS, designed to help developers quickly prototype, deploy, and launch Generative AI-powered products and services using Retrieval-Augmented Generation (RAG). By integrating advanced information retrieval with large language models, this architecture delivers accurate, contextually relevant natural language responses to user queries.

You can use this README file to find out how to build, deploy, use and test the code. You can also contribute to this project in various ways such as reporting bugs, submitting feature requests or additional documentation. For more information, refer to the Contributing topic.

License

Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at

http://www.apache.org/licenses/

or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.

Key Features

  1. Flexible Document Ingestion Pipeline: The chatbot offers several ingestion paths:

    • Default Pipeline: A customizable pipeline that processes various document formats (CSV, plain text) using Lambda functions and stores embeddings in Aurora PostgreSQL.
    • Amazon Bedrock Knowledge Base: A managed ingestion path that leverages Amazon Bedrock's built-in capabilities for document processing and storage in OpenSearch Serverless.
    • Long-video ingestion (ECS): Handles long video assets using an Amazon ECS task that extracts audio, generates transcripts, and produces chunkable text artifacts for downstream indexing.
    • S3-driven KB management: The knowledge base is treated as a mirror of the “input assets” S3 bucket—adding/removing files in that bucket adds/removes them from the KB on the next ingestion run.
  2. Dual Vector Store Support:

    • Postgres pgvector: Efficient vector similarity search using Aurora PostgreSQL with the pgvector extension, ideal for smaller to medium-sized datasets and cost-sensitive deployments.
    • OpenSearch Serverless: Managed vector search solution that seamlessly integrates with Amazon Bedrock Knowledge Base, offering better scalability for larger datasets.
  3. AWS Bedrock Integration:

    • Direct access to state-of-the-art foundation models through AWS Bedrock
    • Seamless integration with Bedrock Knowledge Base for enhanced RAG capabilities
    • Support for various embedding models, text generation and reranking models
    • Built-in document processing and chunking capabilities when using Bedrock Knowledge Base
    • Support for Bedrock Guardrails to filter harmful content and redact sensitive information
  4. Interactive Chatbot Interface: User-friendly interface supporting:

    • Natural language conversations
    • Context-aware responses
    • Real-time document querying
    • Follow-up questions and clarifications
  5. Enterprise-Ready Features:

    • High availability options with OpenSearch Serverless standby replicas
    • Scalable architecture supporting both serverless and provisioned resources
    • Comprehensive security controls and encryption
    • Flexible deployment options to match your requirements

Architecture overview

Architecture reference diagram

The following diagram represents the solution's architecture design.

Diagram

Solution components

The solution deploys the following components:

  • Web Application Firewall: AWS WAF is utilized to safeguard web frontend and API endpoints from prevalent web vulnerabilities and automated bots that could potentially impact availability, compromise security, or overutilize resources.

  • Amazon CloudFront Distribution: Amazon CloudFront distribution is used to serve the ChatBot Web UI. CloudFront delivers low latency, high performance, and secure static web hosting. An Amazon Simple Storage Service (Amazon S3) web UI bucket hosts the static web application artifacts.

  • Amazon Cognito: An Amazon Cognito user pool to provide customers a quick and convenient authentication mechanism to explore the solution's functionalities without extensive configuration.

  • Amazon API Gateway: It exposes a set of RESTful APIs and routes incoming requests to the backend Lambda functions.

  • Chat Lambda Function: This Lambda function stores and retrieves chat messages for users' chat sessions in a DynamoDB table, enabling the maintenance of conversational context.

  • Inference Lambda Function: The Inference Lambda Function handles user queries and provides natural language responses. It interacts with either the similarity search function or Bedrock Knowledge Base to retrieve relevant context information based on the user's query, and fetches the user's chat session messages from the Chat Lambda Function. By combining context retrieval, chat session awareness, and large language models, the Inference Lambda Function ensures accurate and contextually relevant answers to user queries.

  • Vector Store: The solution supports two vector store options:

    • Amazon Aurora PostgreSQL: A serverless cluster with the PGVector extension to store document chunks and embeddings when using the default ingestion pipeline.
    • Amazon OpenSearch Serverless: A managed vector search service that integrates with Bedrock Knowledge Base, offering enhanced scalability and built-in high availability through standby replicas.
  • Document Ingestion: The solution provides two ingestion paths:

    • Default Pipeline: An AWS Step Function that orchestrates document processing, including:
      • Document chunking and preprocessing
      • Embedding generation
      • Vector store ingestion (Aurora PostgreSQL)
    • Amazon Bedrock Knowledge Base: A managed document ingestion service that provides:
      • Built-in document processing and chunking
      • Automatic embedding generation
      • Direct integration with OpenSearch Serverless
      • Simplified management through Bedrock console
  • Chat History Data Store: A DynamoDB table which stores the user's chat session messages.

  • Amazon Bedrock: Provides access to:

    • Foundation Models for text generation
    • Embedding Models for vector generation
    • Knowledge Base for document ingestion and retrieval
    • Built-in RAG capabilities when using Knowledge Base

The solution architecture adapts based on the chosen ingestion path and vector store configuration:

  1. Default Pipeline with Aurora PostgreSQL:

    • Uses Step Functions for document processing
    • Stores vectors in Aurora PostgreSQL
    • Provides full control over the ingestion process
  2. Bedrock Knowledge Base with OpenSearch Serverless:

    • Leverages managed document processing
    • Stores vectors in OpenSearch Serverless
    • Offers simplified management and scalability
    • Enables built-in Bedrock RAG capabilities

Both configurations maintain the same high-level architecture while offering different trade-offs in terms of management overhead, scalability, and control.


Prerequisites

Build environment specifications

  • To build and deploy this solution, we recommend an ARM-based Ubuntu instance with at least 4 CPU cores and 16 GB of RAM.
  • The computer used to build the solution must be able to access the internet.

AWS account

  • A CDK bootstrapped AWS account: You must bootstrap your AWS CDK environment in the target region you want to deploy, using the AWS CDK toolkit's cdk bootstrap command. From the command line, authenticate into your AWS account, and run cdk bootstrap aws://<YOUR ACCOUNT NUMBER>/<REGION>. For more information, refer to the AWS CDK's How to bootstrap page.

  • Access to Amazon Bedrock foundation models: Access to Amazon Bedrock foundation models isn't granted by default. In order to gain access to a foundation model, an IAM user with sufficient permissions needs to request access to it through the console. Once access is provided to a model, it is available for all users in the account. To manage model access, sign into the Amazon Bedrock console. Then select Model access at the bottom of the left navigation pane.

  • **AWS services enabled**: The solution uses the following services, so please make sure they are enabled in your account:

-   Amazon API Gateway
-   Amazon Bedrock
-   AWS CDK
-   Amazon CloudFront
-   Amazon Cognito
-   Amazon DynamoDB
-   Amazon EC2
-   AWS IAM
-   AWS Lambda
-   Amazon CloudWatch Logs
-   Amazon OpenSearch Serverless
-   Amazon S3
-   AWS WAF
-   AWS CloudFormation (Custom resources)
-   AWS Systems Manager (implied by Custom resources)

Please check the file named scp-allow-required-services.json.

To apply this SCP in Control Tower:

- Go to AWS Organizations console
- Navigate to Policies → Service control policies
- Create new policy using the JSON content
- Attach to the appropriate OU or account
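As a rough sketch of what such an allow-list SCP looks like (the actual policy ships in scp-allow-required-services.json; the service prefixes below are illustrative, mapped from the service list above):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowRequiredServices",
      "Effect": "Allow",
      "Action": [
        "apigateway:*",
        "bedrock:*",
        "cloudfront:*",
        "cognito-idp:*",
        "dynamodb:*",
        "ec2:*",
        "iam:*",
        "lambda:*",
        "logs:*",
        "aoss:*",
        "s3:*",
        "wafv2:*",
        "cloudformation:*",
        "ssm:*"
      ],
      "Resource": "*"
    }
  ]
}
```

Note that `aoss` is the IAM service prefix for Amazon OpenSearch Serverless and `cognito-idp` for Cognito user pools; always prefer the shipped policy file over this sketch.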

Tools

  • The latest version of the AWS CLI, installed and configured.
  • The latest version of the AWS CDK.
  • Nodejs version 18 or newer.
  • Docker
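A quick way to confirm the tooling is in place once installed (version numbers will vary):

```shell
aws --version
cdk --version
node --version
docker --version
```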

1- Development machine Setup.

You need a machine to build and deploy the program. We are going to set up an EC2 instance as a development and deployment server now. Skip this step if you are using your own desktop or laptop.

  1. Navigate to EC2 Dashboard

    • Sign in to AWS Management Console
    • Search for "EC2" in the services search bar
    • Click on "EC2" to open the EC2 Dashboard
  2. Launch Instance

    • Click "Launch Instance" button
    • Enter instance name (e.g., "Dev-Server")
  3. Choose AMI

    • Select "Amazon Linux" or "Ubuntu" from Quick Start
    • Ensure the AMI shows "arm64" architecture
  4. Select Instance Type

    • Click on instance type dropdown
    • Filter by "ARM-based processors"
    • Select "m6g.xlarge" (4 vCPUs, 16 GiB Memory) or higher
    • Alternative options: c6g.xlarge, r6g.xlarge
  5. Configure Key Pair

    • Select an existing key pair or create a new one (e.g., name it [CLIENT]-dev-server-kp)
    • Download .pem file if creating new key pair
  6. Network Settings

    • Keep default VPC settings
    • Ensure "Auto-assign public IP" is enabled
    • Configure security group:
      • Allow SSH (port 22) from your IP
  7. Storage Configuration

    • The default is 8 GiB, but allocate 20 GiB for this build
    • Increase if needed for your workload
  8. Launch Instance

    • Review configuration
    • Click "Launch Instance"
    • Wait for instance to reach "Running" state
  9. Connect to Instance

    chmod 400 "your-key.pem"
    ssh -i "your-key.pem" ec2-user@<public-ip-address>

2- Setup Node & Npm.

Execute the following commands in the terminal. For more information, see the nvm installation guide:

 curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
 source ~/.bashrc
 nvm install --lts
 node -e "console.log('Running Node.js ' + process.version)"
 node --version

3- Setup git

Install git

 sudo yum install git -y
 git --version

Setup your name & email for git

git config --global user.name "Your Name"
git config --global user.email "Your Email"

Verify

git config --global user.name
git config --global user.email

If you are accessing this code as a private repository, you also need to set up SSH keys. For information about checking for existing keys or creating new ones, see GitHub's SSH documentation.

Generate SSH keys on your dev machine to authenticate via SSH (if you have not done so already):

ssh-keygen -t ed25519 -C "youremailid@company.com"

Now we need to copy the public key and set it up in GitHub. View the content of the file:

vim ~/.ssh/id_ed25519.pub

Copy the content of the file by selecting it and pressing Ctrl+C.

Now go to your GitHub account:

  • Click on your profile
  • Click Settings
  • Click SSH and GPG keys
  • Click New SSH key
  • Give it a title and paste the content into the Key area
  • Click Add SSH key
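You can verify that GitHub accepts the new key with the standard connection test (documented by GitHub):

```shell
ssh -T git@github.com
```

A successful response greets you by username and notes that shell access is not provided.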

4- Setup AWS CDK

Execute the following command in the terminal. For more information, see the AWS CDK documentation:

npm install -g aws-cdk

5- Setup Docker

Execute the following commands in the terminal. For more information, see the Docker documentation:

sudo yum update -y
sudo yum install -y docker
sudo service docker start

Add the ec2-user to the docker group so Docker commands can run without sudo:

sudo usermod -a -G docker ec2-user

Close the connection and reconnect to the EC2 instance:

exit

Reconnect to the EC2 instance; the new group membership will now be in effect.

 ssh -i "your-key.pem" ec2-user@<public-ip-address>

List all running Docker containers (you should not see any yet):

docker ps

6- Create a development IAM user

  1. Navigate to IAM Console

    # Go to: https://console.aws.amazon.com/iam/
  2. Create New CLI User

    # Click "Users" in left navigation
    # Click "Create user" button
    # Enter username (e.g. "deployment-user")
    # Click "Next" and attach administrator permissions

  3. Create Access Keys

    # Create access keys and note them down. We will use these keys later to connect to the AWS account for deployment from the dev server

7- Bedrock model permissions

### Enable Bedrock Model Access

1. **Navigate to Bedrock Console**

   - Sign in to AWS Management Console
   - Go to Amazon Bedrock console: https://console.aws.amazon.com/bedrock/
   - Select "Model access" from the left navigation pane

2. **Request Model Access**

   - Click on "Edit" button in top right
   - Select the required models (check the bin/config.yaml file):
     - us.meta.llama3-3-70b-instruct-v1:0
     - us.amazon.nova-pro-v1:0
     - us.anthropic.claude-3-5-sonnet-20241022-v2:0
     - us.anthropic.claude-3-7-sonnet-20250219-v1:0
     - cohere.rerank-v3-5:0
     - amazon.titan-embed-text-v2:0
   - Click "Save changes"
   - Wait for access approval (usually immediate)

3. **Verify Model Access**

   - Return to "Model access" page
   - Confirm selected models show "Access granted" status
   - Models should now be available for use in the solution

4. **Region Considerations**

   - Ensure model access is enabled in the same region where you plan to deploy
   - If deploying to a different region, repeat the process to enable models there
   - Some models may not be available in all regions

5. **Required Permissions**
   - IAM user must have sufficient permissions to request model access
   - Recommended to use admin permissions during initial setup
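Once access shows as granted, you can also list the foundation models visible to your account from the CLI (assumes the AWS CLI is configured; the region is illustrative):

```shell
aws bedrock list-foundation-models --region us-east-1 \
  --query "modelSummaries[].modelId" --output text
```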

How to build and deploy the solution

Before you deploy the solution, review the architecture and prerequisites sections in this guide. Follow the step-by-step instructions in this section to configure and deploy the solution into your account.

Time to deploy: approximately 20 minutes

Authenticate for AWS deployment

aws configure

Provide the access key and secret access key. For the region, select us-east-1 (or a region of your choice).

Get the code on your dev machine

git clone git@github.com:cal-poly-dxhub/doe-chatbot.git

Change the directory to code directory

 cd doe-chatbot/

Bootstrap the AWS environment

Go to bin directory

cd bin/

Issue the cdk bootstrap command

cdk bootstrap aws://<YOUR ACCOUNT NUMBER>/<REGION>

Build the code

Install the dependencies

npm install

Build the code:

npm run build

Deploy the solution

Provide the admin email address so the admin can receive the login credentials via email and set up more users.

npm run cdk deploy -- --parameters adminUserEmail=<ADMIN_EMAIL_ADDRESS>
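After the deployment completes, the stack outputs (web UI URL, bucket names, etc.) can be retrieved at any time; `<STACK_NAME>` below is a placeholder for the stack name printed by the deploy:

```shell
aws cloudformation describe-stacks --stack-name <STACK_NAME> \
  --query "Stacks[0].Outputs" --output table
```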

Additional Configuration

Use the bin/config.yaml file to configure the solution.

Data retention policy configuration (optional)

By default, all solution data (S3 buckets, Aurora DB instances, Aurora DB snapshots etc.) will be kept when you uninstall the solution. To remove this data, in the configuration file, set the retainData flag to false. You are liable for the service charges when solution data is retained in the default configuration.

retainData: false

Application name (optional)

A unique identifier, composed of ASCII characters, is used to support multiple deployments within the same account. The application name will be appended to the CloudFormation stack name, ensuring each CloudFormation stack remains unique.

applicationName: <string>

LLM configuration

Specify settings for the large language models, including streaming, conversation history length, corpus document limits, similarity thresholds, and prompt configurations for standalone question rephrasing and question-answering chains.

  • streaming (optional): Whether to enable streaming responses from the language model. Default is false.

    streaming: <true|false>
  • maxConversationHistory (optional): The maximum number of chat messages to include in the conversation history for rephrasing a follow-up question into a standalone question. Default is 5.

    maxConversationHistory: <integer>
  • maxCorpusDocuments (optional): The maximum number of documents to include in the context for a question-answering prompt. Default is 5.

    maxCorpusDocuments: <integer>
  • corpusSimilarityThreshold (optional): The minimum similarity score required for a document to be considered relevant to the question. Default is 0.25.

    corpusSimilarityThreshold: <float>
  • standaloneChainConfig (optional): Configuration for the standalone question rephrasing chain. If this chain is not configured, the original user questions will be used directly for answering without any rephrasing.

    • modelConfig: configuration for the language model used in this chain

      modelConfig:
        provider: <the provider of the language model (e.g., bedrock, sagemaker).>
        modelId: <the ID of the language model or inference profile (e.g., anthropic.claude-3-haiku-20240307-v1:0, us.anthropic.claude-3-haiku-20240307-v1:0)>
        modelEndpointName: <the name of the SageMaker endpoint if the model provider is set to sagemaker. Leave it empty if the provider is bedrock.>
        modelKwargs: <Additional keyword arguments for the language model, such as topP, temperature etc.>

      Example:

      modelConfig:
        provider: bedrock
        modelId: anthropic.claude-3-haiku-20240307-v1:0
        modelKwargs:
          maxTokens: 1024
          temperature: 0.1
          topP: 0.99
          stopSequences:
            - "Assistant:"

      To find more information about modelKwargs, please refer to the inference parameters.

    • promptTemplate: The prompt template used for rephrasing questions.

      promptTemplate: <string>

      Example:

      promptTemplate: |
        Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language. 
        If there is no chat history, just rephrase the question to be a standalone question.
      
        Chat History:
        ${chat_history}
        Follow Up Input: ${question}
    • promptVariables: The list of variables used in the prompt template.

      promptVariables:
        - <variable1>
        - <variable2>

      Example:

      promptVariables:
        - chat_history
        - question
    • kwargs: Additional keyword arguments used in this chain.

      kwargs:
        <key>: <value>

      Example:

      kwargs:
        system_prompt: |
          You are an AI assistant. Your primary role is to assist users with a wide range of questions and tasks. However, do not provide advice or information related to buying or selling stocks, investments, or financial trading. If asked about these topics, politely decline and suggest the user consult a financial professional.
        input_label: inputs
        model_kwargs_label: parameters
        output_label: generated_text
  • qaChainConfig: Configuration for the question-answering chain.

    • modelConfig: Configuration for the language model used in this chain (similar to standaloneChainConfig.modelConfig).

      modelConfig:
        provider: <the provider of the language model (e.g., bedrock, sagemaker).>
        modelId: <the ID of the language model or inference profile (e.g., anthropic.claude-3-haiku-20240307-v1:0, us.anthropic.claude-3-haiku-20240307-v1:0)>
        modelEndpointName: <The name of the SageMaker endpoint for the language model (required for SageMaker models).>
        modelKwargs: <Additional keyword arguments for the language model, such as topP, temperature etc.>
    • promptTemplate: The prompt template used for answering questions.

      promptTemplate: <string>
    • promptVariables: The list of variables used in the prompt template.

      promptVariables:
        - <variable1>
        - <variable2>
    • kwargs: Additional keyword arguments used in this chain.

      kwargs:
        <key>: <value>

      To enable promotion image handling, first upload the document to the input bucket, and then specify the promotion image URL using the promotion_image_url parameter in the kwargs.

      kwargs:
        promotion_image_url: <s3>
  • rerankingConfig (optional): Configuration for reranking retrieved documents to improve relevance and accuracy of responses. Reranking helps refine the initial similarity search results by applying a more sophisticated model to assess document relevance.

    rerankingConfig:
      modelConfig:
        provider: <the provider of the reranking model (currently supports 'bedrock')>
        modelId: <the ID of the reranking model>
      kwargs:
        numberOfResults: <the number of top results to return after reranking>
        additionalModelRequestFields: <model-specific parameters for reranking requests>
          <key>: <value>

    Example:

    rerankingConfig:
      modelConfig:
        provider: bedrock
        modelId: cohere.rerank-v3-5:0
      kwargs:
        numberOfResults: 10
        additionalModelRequestFields:
          max_tokens_per_doc: 4000

    When enabled, reranking is applied after the initial vector similarity search and before sending context to the LLM. This can significantly improve the quality of retrieved documents, especially for complex queries.

    Note: Reranking may increase latency and costs as it involves an additional model inference step.

  • guardrailConfig (optional): Configuration for content moderation and PII protection. Guardrails help ensure safe and compliant interactions by filtering inappropriate content and handling sensitive information.

    guardrailConfig:
      contentFilters:
        - type: <content filter type (HATE, VIOLENCE, SEXUAL)>
          inputStrength: <filter strength for input (LOW, MEDIUM, HIGH)>
          outputStrength: <filter strength for output (LOW, MEDIUM, HIGH)>
      piiFilters:
        - type: <PII filter type (EMAIL, PHONE, NAME, etc.)>
          action: <action to take on PII (ANONYMIZE, BLOCK)>
      blockedMessages:
        input: <custom message for blocked input>
        output: <custom message for blocked output>

    Example:

    guardrailConfig:
      contentFilters:
        - type: HATE
          inputStrength: HIGH
          outputStrength: HIGH
        - type: VIOLENCE
          inputStrength: HIGH
          outputStrength: HIGH
        - type: SEXUAL
          inputStrength: MEDIUM
          outputStrength: MEDIUM
      piiFilters:
        - type: EMAIL
          action: ANONYMIZE
        - type: PHONE
          action: ANONYMIZE
        - type: NAME
          action: ANONYMIZE
      blockedMessages:
        input: "I apologize, but I cannot process your request as it may contain inappropriate content. Please rephrase your question."
        output: "I apologize, but I cannot provide that type of response. Please try asking a different question."

    When enabled, guardrails are applied to both user inputs and AI responses. Content filters help prevent harmful or inappropriate content, while PII filters protect sensitive personal information.
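Pulling the options above together, a minimal LLM configuration might look like the sketch below. The top-level llmConfig key and the ${context} variable name are assumptions based on this section's layout; all values are illustrative, not recommendations:

```yaml
llmConfig:                     # assumed top-level key
  streaming: false
  maxConversationHistory: 5
  maxCorpusDocuments: 5
  corpusSimilarityThreshold: 0.25
  qaChainConfig:
    modelConfig:
      provider: bedrock
      modelId: anthropic.claude-3-haiku-20240307-v1:0
      modelKwargs:
        maxTokens: 1024
        temperature: 0.1
    promptTemplate: |
      Answer the question using only the provided context.

      Context:
      ${context}
      Question: ${question}
    promptVariables:
      - context                # assumed variable name
      - question
```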

RAG configuration

  • vectorStoreConfig: Configuration for the vector store. This solution supports two types of vector stores: Amazon Aurora PostgreSQL and Amazon OpenSearch Serverless.

    vectorStoreConfig:
      vectorStoreType: <pgvector | opensearch>
      vectorStoreProperties:
        # For pgvector (Aurora PostgreSQL)
        minCapacity: <The minimum capacity (in Aurora Capacity Units) for the vector store.>
        maxCapacity: <The maximum capacity (in Aurora Capacity Units) for the vector store.>
        useRDSProxy: <Boolean flag indicating if RDS proxy is used for database connections.>
    
        # For OpenSearch Serverless
        standbyReplicas: <'ENABLED' | 'DISABLED', Indicates whether to use standby replicas for the collection. Default is ENABLED>
        allowFromPublic: <Boolean flag determining whether the collection is accessible over the internet from public networks. Default is false>

    Example for pgvector:

    vectorStoreConfig:
      vectorStoreType: pgvector
      vectorStoreProperties:
        minCapacity: 2
        maxCapacity: 8
        useRDSProxy: true

    Example for OpenSearch Serverless:

    vectorStoreConfig:
      vectorStoreType: opensearch
      vectorStoreProperties:
        standbyReplicas: ENABLED
        allowFromPublic: false
  • embeddingsModels: A list of embeddings models used for generating document embeddings.

    embeddingsModels:
      - provider: <The provider of the embeddings model (e.g., bedrock, sagemaker).>
        modelId: <The ID of the embeddings model.>
        modelRefKey: <A reference key for the embeddings model.>
        dimensions: <The dimensionality of the embeddings produced by the model.>
        modelEndpointName: <The name of the SageMaker endpoint for the embeddings model (required for SageMaker models).>

    If multiple embedding models are configured, the first model in the list will be chosen by default unless modelRefKey is specified.
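    For example, using the Titan embeddings model enabled earlier (the modelRefKey value is an arbitrary label; 1024 is Titan Text Embeddings V2's default dimensionality):

    ```yaml
    embeddingsModels:
      - provider: bedrock
        modelId: amazon.titan-embed-text-v2:0
        modelRefKey: titan-embed-v2
        dimensions: 1024
    ```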

  • corpusConfig (optional): Configuration for the document corpus and ingestion settings. The solution provides two ingestion paths:

    1. Default Pipeline: Uses Aurora PostgreSQL as the vector store
    • Automatically provisions an ingestion pipeline
    • Processes documents through Lambda functions
    • Stores embeddings in Aurora PostgreSQL
    2. Amazon Bedrock Knowledge Base: Uses OpenSearch Serverless as the vector store
    • Leverages Amazon Bedrock's built-in ingestion capabilities
    • Requires OpenSearch Serverless as the vector store
    • Provides managed chunking and processing

    Important: To use Amazon Bedrock Knowledge Base, you must configure OpenSearch Serverless as your vector store in the vectorStoreConfig section. To use the default ingestion pipeline, you must configure Aurora PostgreSQL as the vector store in the vectorStoreConfig.

    corpusConfig:
      corpusType: <'default' | 'knowledgebase'>
      corpusProperties:
        # chunking configuration for default
        chunkingConfiguration:
          chunkSize: <Number of characters per chunk, default is 1000>
          chunkOverlap: <Number of characters overlapping between chunks, default is 200>
        # chunk configuration for knowledgebase
        chunkingConfiguration:
          chunkingStrategy: <'FIXED_SIZE' | 'SEMANTIC', default is 'FIXED_SIZE'>
            # For FIXED_SIZE strategy
            fixedSizeChunkingConfiguration:
              maxTokens: <Maximum tokens per chunk (1-1000), default is 512>
              overlapPercentage: <Overlap between chunks (0-100), default is 20>
            # For SEMANTIC strategy
            semanticChunkingConfiguration:
              maxTokens: <Maximum tokens per chunk (1-1000)>
              overlapPercentage: <Overlap between chunks (0-100)>
              boundaryType: <'SENTENCE' | 'PARAGRAPH'>

Chat history configuration (optional)

By default, this solution uses DynamoDB to store chat history. Alternatively, it supports storing chat history in the same PostgreSQL database as the vector store.

chatHistoryConfig:
  storeType: <dynamodb | aurora_postgres>

Handoff mechanism configuration (optional)

This solution supports a handoff mechanism to transfer the conversation to a human agent after a certain number of requests from the user.

Under classificationChainConfig -> promptTemplate, the model should be configured to return another classification type "handoff_request". If handoff is not enabled, this type should not be present.

handoffConfig:
  model:
    provider: <bedrock>
    modelId: <the Bedrock ID of the handoff model>
    supportsSystemPrompt: <true | false - whether the model supports system prompts via Converse API>
    modelKwArgs: # Optional; uses Bedrock defaults if not set
      maxTokens: 1024
      temperature: 0.1
      topP: 0.99
      stopSequences: ["..."]
  handoffThreshold: <the (integer) number of requests after which the handoff mechanism is triggered>
  details: <optional list of details for the summarizer LLM to focus on>
  handoffPrompts: # Each field is individually optional and handoffPrompts is optional
    handoffRequested: <optional prompt for the model when the user requests a handoff and one has not been triggered>
    handoffJustTriggered: <optional prompt for the model when the most recent request triggered handoff>
    handoffCompleting: <optional prompt for the model when the handoff has been triggered and the user asks for a human again>

AWS WAF configuration (optional)

This solution provisions an AWS WAF Web ACL for API Gateway resources by default. For the CloudFront distribution, the solution allows users to associate their existing AWS WAF Web ACL for CloudFront with the distribution created by the solution. Refer to the configuration options below for configuring your AWS WAF Web ACL.

wafConfig:
  enableApiGatewayWaf: <true|false>
  cloudfrontWebAclArn: <The ARN of an existing WAF Web ACL to associate with CloudFront. It must be created in us-east-1.>
  allowedExternalIpRanges: <A list of IP prefixes. e.g. 192.168.0.0/24, 10.0.0.0/8>

Example WAF Configuration:

wafConfig:
  enableApiGatewayWaf: true
  allowedExternalIpRanges:
    - 192.168.0.0/24
    - 10.0.0.0/8
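To illustrate what the allowedExternalIpRanges allowlist implies, the CIDR matching that WAF performs can be sketched in Python using the standard ipaddress module. This is not code from the solution; it only demonstrates the prefix-matching semantics:

```python
import ipaddress

# CIDR prefixes mirroring the example wafConfig above
ALLOWED_RANGES = [ipaddress.ip_network(cidr) for cidr in ("192.168.0.0/24", "10.0.0.0/8")]

def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowed CIDR prefix."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in ALLOWED_RANGES)

print(is_allowed("192.168.0.42"))  # True: inside 192.168.0.0/24
print(is_allowed("172.16.0.1"))    # False: in neither range
```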

RAG Processing Flows

The GenAI RAG chatbot supports three primary data flows: Classification Flow, Condensing Flow, and Question/Answer Flow. In the Classification Flow, the chatbot analyzes the incoming user query to determine its type and the appropriate processing path. The Condensing Flow is designed to enhance context understanding by rephrasing follow-up questions and the chat history into a standalone, contextually complete question, ensuring accurate downstream processing. Finally, the Question/Answer Flow retrieves relevant information from the knowledge corpus and generates precise, context-aware responses to user queries, leveraging retrieved documents and advanced generative capabilities. Together, these flows enable seamless and intelligent interactions.
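The hand-off between these three flows can be sketched as follows. The function names and classification labels are hypothetical, and each step is stubbed with trivial logic; in the actual solution each step is driven by an LLM chain:

```python
def classify(query: str) -> str:
    """Classification Flow: decide the processing path (stubbed with a heuristic)."""
    return "question" if query.rstrip().endswith("?") else "chitchat"

def condense(history: list[str], query: str) -> str:
    """Condensing Flow: fold chat history into a standalone question (stubbed)."""
    return query if not history else f"{query} (in the context of: {history[-1]})"

def answer(standalone_question: str) -> str:
    """Question/Answer Flow: retrieve documents and generate a response (stubbed)."""
    return f"Answer based on retrieved context for: {standalone_question}"

def handle(history: list[str], query: str) -> str:
    # Route the query through the three flows in order.
    if classify(query) != "question":
        return "General conversational reply."
    return answer(condense(history, query))

print(handle(["How do I upload documents?"], "And how do I monitor it?"))
```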


How to ingest documents into the vector store

The solution provides two methods for document ingestion. Here's a guide to help you choose the right method for your use case:

Method 1: Default Pipeline (Aurora PostgreSQL)

Best For:

  • Custom document preprocessing requirements
  • Fine-grained control over the ingestion process
  • Integration with existing PostgreSQL workflows
  • Cost-sensitive deployments
  • Smaller to medium-sized document collections

This section provides instructions on how to ingest documents into the vector store using our AWS Step Function-based ingestion pipeline.

  1. Find the input bucket name in the deployment output starting with InputBucket, then upload the documents from a local directory to the input bucket.
aws s3 cp <local_dir> s3://<input_bucket>/<input_prefix>/ --recursive
  2. Find the state machine ARN in the deployment output starting with StateMachineArn, then execute the ingestion pipeline. Capture the ARN of the execution.
aws stepfunctions start-execution --state-machine-arn <state-machine-arn>
  3. Monitor the execution status of the Step Function through the AWS Management Console or with the AWS CLI.
aws stepfunctions describe-execution --execution-arn <execution-arn>
  4. Review the logs generated by the Lambda functions for any errors or issues during the ingestion process. Logs can be accessed through Amazon CloudWatch.

Method 2: Amazon Bedrock Knowledge Base (OpenSearch Serverless)

Best For:

  • Simplified operations and management
  • Large-scale document collections
  • Quick setup and deployment
  • Integration with other Bedrock features
  • Production workloads requiring high availability

This section provides instructions on how to ingest documents into Amazon Bedrock Knowledge Base using the AWS CLI.

  1. Find the input bucket name in the deployment output starting with InputBucket, then upload the documents from a local directory to the input bucket.
aws s3 cp <local_dir> s3://<input_bucket>/<input_prefix>/ --recursive
  2. Find the knowledge base ID and data source ID in the deployment output starting with KnowledgeBase, then start the ingestion job with the AWS CLI. Capture the ID of the ingestion job.
aws bedrock-agent start-ingestion-job --knowledge-base-id <knowledge-base-id> --data-source-id <data-source-id>
  3. Monitor the ingestion job status with the AWS CLI.
aws bedrock-agent get-ingestion-job --knowledge-base-id <knowledge-base-id> --data-source-id <data-source-id> --ingestion-job-id <job-id>

Access the solution web UI

After the solution stack has been deployed and launched, you can sign in to the web interface.

  1. Find the website URL in the deployment output starting with CloudFrontDomain and open it in your browser (Chrome is recommended). You will be redirected to a sign-in page that requires a username and password.
  2. Sign in with the email address specified during deployment (adminEmail) and the temporary password received via email from no-reply@verificationemail.com after deployment.
  3. When signing in for the first time, you are required to set a new password.
  4. After signing in, you can view the solution's web UI.

File structure

After cloning the repository into your local development environment, but before running the initialization script, you will see the following file structure in your editor.

|- lib/                       # Infrastructure and backend code
   |- infra/                  # CDK Infrastructure
   |- backend/                # Backend code
|- docs/                      # Documentation files
   |- images/                 # Documentation images and diagrams
|- frontend/                  # React ChatBot UI application
   |- src/                    # Source code files
   |- public/                 # Static assets
|- quickstart/                # Quick start configuration examples
   |- bedrock/                # Sample configuration files
|- reports/                   # Security and dependency reports
|- .gitignore                 # Git ignore file
|- LICENSE.txt                # Apache 2.0 license
|- README.md                  # Project documentation

Uninstall the solution

You can uninstall the solution by deleting its stacks from the AWS CloudFormation console:

  • Go to the AWS CloudFormation console, then find and delete all stacks with the prefix FrancisChatbotStack.

Alternatively, you can uninstall the solution by running npm run cdk destroy from the source folder.


Support

For any queries or issues, please contact:

Copyright 2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at

http://www.apache.org/licenses/

or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and limitations under the License.

About

A GenAI RAG ChatBot reference architecture provided by AWS, designed to help developers quickly prototype, deploy, and launch Generative AI-powered products and services using Retrieval-Augmented Generation (RAG). We customized this framework to ingest helpdesk-style documents (.docx, .pdf), long-form webinars (.mp4), and similar materials for helpdesk use cases.
