AWS Bedrock — Exploring Agents, Knowledge-Base & RAG

Kapil Raina
7 min read · May 3, 2024

Like many other AWS services, AWS Bedrock is designed to address multiple use-cases, usage patterns and solutions. Primarily, Bedrock exposes GenAI models to consumers, including AWS's own LLMs (the Titan model family) as well as models from other vendors such as Anthropic, Meta, Mistral, AI21 and Stability AI. Apart from being a model provider, AWS Bedrock also offers managed solutions for many GenAI patterns like RAG and Agentic GenAI, built from AWS native services, which makes implementing these patterns a somewhat different experience from writing them into a bespoke application using a framework.

This post aims to tie together these typical GenAI solution building blocks, explore the usage scenarios, and show what the developer experience is like when using Bedrock. Bedrock provides the flexibility to use just part of Bedrock to build a solution, or to integrate more than one Bedrock service. There are some niche features, such as model comparison via the chat playgrounds, watermark detection and model evaluation, that Bedrock also provides; I am tackling the more common use-cases here from a GenAI consumer standpoint.

A unified AWS Bedrock view looks like this:

AWS Bedrock — Putting it all together

The main components in this view are described below. The vector DB, embeddings model, LLM, S3, KMS and IAM role are well understood.

Agent : Bedrock provides a tool-orchestrating agent. The Agent service can be bound to multiple Action Groups, which are essentially OpenAPI-defined actions that a bespoke Lambda function brokers for the Agent. The Agent invokes the appropriate 'Tool' via the API definition and delegates the processing to a handler Lambda. In addition, a Bedrock Agent can also use a secondary knowledge source (known as a Knowledge Base) as one of its tools, delegating the "RAG" calls to the Knowledge Base. An Agent can thus orchestrate over multiple Action Groups and multiple Knowledge Bases, using a Bedrock LLM as the reasoning engine to run the tool orchestration.

Defining Agent
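
Agents can be defined in the console (as above) or programmatically. As a rough illustration, a minimal sketch of defining an agent with the boto3 bedrock-agent client might look like the following; the agent name, instruction and role ARN here are hypothetical placeholders, not values from the repo:

import boto3

# Control-plane client for defining Bedrock Agents (as opposed to
# bedrock-agent-runtime, which is used later to invoke them).
bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

# Hypothetical values for illustration; the role must allow Bedrock
# to invoke the chosen foundation model on your behalf.
response = bedrock_agent.create_agent(
    agentName="ecomm-assistant",
    foundationModel="anthropic.claude-v2",
    instruction="You are an e-commerce assistant. Answer customer and order "
                "queries by calling the available actions and knowledge base.",
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",
)
print(response["agent"]["agentId"])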

Knowledge Base : A Knowledge Base is an abstraction over one or more "Data Sources" and (if a vector DB is used) an embeddings model. The Knowledge Base defines the knowledge-source pipeline part of the RAG pattern. It is designed to "pull" the data from data sources (such as S3 objects), define a chunking strategy, and define a metadata strategy for structured data files such as JSON or CSV. As will be seen subsequently, a Knowledge Base can be used in multiple patterns as a retriever, independent of the Agents.

Knowledge Base

Data Source : A Data Source is an abstraction over an S3 source and a document chunking strategy. The Data Source just defines a data ingestion and chunking pipeline and has to be used inside a Knowledge Base. The Knowledge Base completes the ingestion phase of knowledge data for a RAG/search use case.

Data Source
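
Once a data source is attached to a knowledge base, the ingestion (sync) can be triggered from the console or via the SDK. A minimal sketch with the boto3 bedrock-agent client, reusing the knowledge base ID from the later examples; the dataSourceId here is a hypothetical placeholder:

import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

# Syncing pulls the S3 objects, chunks them per the data source's
# chunking strategy, and writes embeddings into the vector store.
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId="BCVTY5BVYX",
    dataSourceId="DS12345678",  # hypothetical ID
)
print(job["ingestionJob"]["status"])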

With these solution blocks, let's explore the patterns and use-cases that can be used to implement a comprehensive GenAI flow, or part of one.

The code presented here, along with the OpenAPI spec, some SAM templates and designs, is pushed to this git repo. The examples need a local .aws credentials profile. I haven't provided the profile name and have assumed that the default profile contains all the IAM permissions needed to run the code, should you want to.

Using Bedrock Knowledge Base as Semantic Search Only

Bedrock Knowledge Base provides a full-fledged knowledge-data ingestion pipeline, with Amazon OpenSearch Serverless, a fully managed vector database, as one of its data stores. A Knowledge Base can therefore be used purely as a retriever for semantic search use-cases such as lookups, recommendations, enterprise search etc. The Bedrock SDK can provide retrieval services, but the more popular GenAI frameworks such as LangChain/LlamaIndex make creating retrievers out of a Knowledge Base easier. The code below uses the LangChain community retriever for the AWS Knowledge Base.

from langchain_community.retrievers import AmazonKnowledgeBasesRetriever

bedrock_kb_retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id="BCVTY5BVYX",
    retrieval_config={"vectorSearchConfiguration": {"numberOfResults": 5}},
    region_name="us-east-1",
    credentials_profile_name="default",
)

'''
Use AWS Bedrock knowledge base purely as a semantic search function, with no LLM dependency on retrieval.
'''
def raw_knowledgebase():
    while True:
        q_ = input("(q to quit): ")
        if q_ == 'q':
            break
        documents = bedrock_kb_retriever.invoke(q_)
        print_documents(documents)  # helper from the repo that pretty-prints retrieved chunks
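
For comparison, the same lookup without LangChain is a single call to the bedrock-agent-runtime retrieve API. A minimal sketch, where the query string is just an example:

import boto3

runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Returns the matching chunks with their relevance scores and S3 source locations.
res = runtime.retrieve(
    knowledgeBaseId="BCVTY5BVYX",
    retrievalQuery={"text": "What is the Reactive Manifesto?"},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
)
for r in res["retrievalResults"]:
    print(r["content"]["text"][:100], r.get("score"))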

Use KnowledgeBase with an external RAG implementation

The Knowledge Base provides the ingestion part of a RAG implementation. With retrievers, it can be plugged into a bespoke RAG pipeline with an external LLM. During execution, the retriever returns the matching chunks from the OpenSearch Serverless vector DB, along with their metadata, which are then assembled into a context that grounds the LLM into making precise responses. The code here uses the LangChain Q&A chain along with Anthropic Claude on Bedrock (it could have been anything else too). Using the Knowledge Base like this provides a quick solution for "chat with documents" kinds of scenarios.

from langchain_aws import ChatBedrock
from langchain.chains import RetrievalQA

model_kwargs_claude = {"temperature": 0, "top_k": 10}

chat_claude = ChatBedrock(
    model_id="anthropic.claude-v2",
    region_name="us-east-1",
    credentials_profile_name="default",
    verbose=False,
    model_kwargs=model_kwargs_claude,
)

'''
Use AWS Bedrock knowledge base as a retriever to augment a RAG use-case. Here RAG is implemented as a LangChain QA chain.
The knowledge base datasource setup creates the pipeline for knowledge documents to be synced for embedding in OpenSearch.
'''

def knowledgebase_RAG():
    chain = RetrievalQA.from_chain_type(llm=chat_claude, retriever=bedrock_kb_retriever, return_source_documents=True)
    while True:
        q_ = input("(q to quit): ")
        if q_ == 'q':
            break
        res = chain.invoke({"query": q_})
        print(res['result'])
        print("\n SOURCES:")
        for index, doc in enumerate(res['source_documents'], start=1):
            print("*" * 100)
            print(f"{doc.page_content[:100]} ...")
            print(doc.metadata)

Use KnowledgeBase as a ‘Tool’ for Agentic Framework

The KnowledgeBase retriever can be converted into a Tool definition, which allows an agent to delegate one or multiple processing steps to the Knowledge Base. The code below uses LangChain to create a ReAct agent with the KnowledgeBase tool and a Google Search tool, enabling multiple actions to be orchestrated: a user query that can be solved via knowledge lookup is delegated to the KnowledgeBase, while any external search is delegated to the Google Serper API.

from langchain.agents.agent_toolkits.conversational_retrieval.tool import create_retriever_tool
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain.agents import AgentExecutor, Tool, create_react_agent
from langchain import hub
from langchain.memory import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

'''
Use the AWS Bedrock knowledge base retriever as one of the tools in an agentic framework (LangChain ReAct agent here).
Includes an external websearch tool and the knowledge base retriever tool for the agent executor to orchestrate.
Needs SERPER_API_KEY.
'''
def knowledgebase_retriever_Agentic():
    retriever_tool = create_retriever_tool(
        bedrock_kb_retriever,
        "World History, revolutions, religions, religious history, Reactive Architecture, DDD, Microservices, Domain Driven Design, CQRS, Reactive Manifesto",
        "World History, revolutions, religions, religious history, Reactive Architecture, DDD, Microservices, Domain Driven Design, CQRS, Reactive Manifesto")
    # Google Search API
    search = GoogleSerperAPIWrapper(k=2)
    search_tool = Tool(name="GoogleSearch", func=search.run, description="Use to search on google for a maximum of 2 times")
    alltools = [retriever_tool, search_tool]

    prompt = hub.pull("hwchase17/react")
    agent = create_react_agent(chat_claude, alltools, prompt)
    memory = ChatMessageHistory(session_id="chat-history")
    agent_executor = AgentExecutor(agent=agent, tools=alltools, handle_parsing_errors=True, verbose=True)
    agent_with_chat_history = RunnableWithMessageHistory(
        agent_executor,
        lambda session_id: memory,
        input_messages_key="input",
        history_messages_key="chat_history",
    )
    while True:
        q_ = input("(q to quit): ")
        if q_ == 'q':
            break
        res = agent_with_chat_history.invoke(
            {"input": q_},
            config={"configurable": {"session_id": "JHJMNBNMB67686"}})
        print(res['output'])
Agent Execution

Use Bedrock Agent as an Agentic solution

In the earlier example, there are many things that a conversational agent needs to take care of, viz. maintaining history, allocating memory, purging history to stay within the LLM context window etc. By using the AWS Bedrock managed agent, those concerns are taken care of by the Agent service. The Bedrock Agent, however, is a much different agentic experience than what you can achieve via frameworks such as LangChain, CrewAI, AutoGen etc. Those frameworks offer a great deal of control over the type of agent you want to create (search, conversational, ReAct etc.) and the kinds of tools you can natively bind to the agent. The Bedrock Agent provides only an OpenAPI-based tool definition, which needs to be handled by a custom Lambda function for its implementation (check out src/tools/ecommapiadapter.py in the repo). Although the Bedrock Agent provides convenience and native tooling, the agentic experience lies in the flexibility of the kinds of tools that can be bound to an agent and in experimenting with the complexity of reasoning that a specific agent type needs; these two aspects are somewhat restricted in the Bedrock Agent. It is still a good choice for simple, typical use-cases that can benefit from delegation to API-defined tools.

Also, there isn't major support yet for wrapping Bedrock Agents in the major GenAI frameworks. Thus the code below uses the boto3 SDK to initiate a session with an existing Bedrock Agent.

import boto3
from uuid import uuid4

'''
Use a Bedrock Agent via the AWS SDK. The Agent is configured to handle the user session and maintains state and memory, so the client doesn't have to.
'''
def bedrock_Agent():
    client = boto3.client('bedrock-agent-runtime')
    session_id = str(uuid4())
    while True:
        q_ = input("(q to quit): ")
        if q_ == 'q':
            break
        res = client.invoke_agent(agentAliasId="TMYXHNPEKI", agentId="MWNIEHJPHT", sessionId=session_id, inputText=q_)
        # The completion is an event stream; each chunk event carries response bytes.
        eventstream = res['completion']
        for event in eventstream:
            if 'chunk' in event:
                print(event['chunk']['bytes'].decode('utf-8'))

The Bedrock Agent in this example has been set up with typical ecomm APIs (Customer, Order etc.) and the same KnowledgeBase as used in the earlier examples. The API implementations are mocked in a Lambda function (src/tools/ecommapiadapter.py).
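
For context, an action-group Lambda receives the matched OpenAPI operation and its parameters from the agent and must reply in the agent's expected response envelope. A minimal sketch of such a handler; the paths and payloads here are hypothetical and not the repo's actual mock:

import json

def lambda_handler(event, context):
    # Bedrock passes the matched OpenAPI operation in the event.
    api_path = event["apiPath"]          # e.g. "/customers/{customerId}"
    http_method = event["httpMethod"]

    # Hypothetical mock dispatch on the API path.
    if api_path == "/customers/{customerId}":
        body = {"customerId": "C-1", "name": "Jane Doe"}
    else:
        body = {"message": f"no mock for {api_path}"}

    # Response envelope that the agent expects back from an action-group Lambda.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "apiPath": api_path,
            "httpMethod": http_method,
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }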

Bedrock Agent Execution

Using Bedrock Models — General Purpose

With community contributions, Bedrock models are now mainstreamed into the major GenAI frameworks. Here is an example of using the Titan embeddings model and the Anthropic Claude model with LangChain. Adhering to the LangChain abstractions opens up all the usual usages for the Bedrock-provided models.


from langchain_aws import BedrockLLM, ChatBedrock, BedrockEmbeddings

# As a simple LLM
llm_titan = BedrockLLM(
    credentials_profile_name="default",
    model_id="amazon.titan-text-express-v1",
    region_name="us-east-1")

# As a chat model for conversations
model_kwargs_claude = {"temperature": 0, "top_k": 10}
chat_claude = ChatBedrock(
    model_id="anthropic.claude-v2",
    region_name="us-east-1",
    credentials_profile_name="default",
    verbose=False,
    model_kwargs=model_kwargs_claude)

# As an embeddings model (defaults to a Titan embeddings model)
embeddings_bedrock = BedrockEmbeddings(
    credentials_profile_name="default",
    region_name="us-east-1")
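
Once wrapped in these abstractions, the models behave like any other LangChain LLM, chat model, or embeddings object; a quick sanity check might look like this:

from langchain_core.messages import HumanMessage

# Plain text completion with Titan.
print(llm_titan.invoke("Name three AWS regions."))

# Chat-style call to Claude; the response is a message with .content.
print(chat_claude.invoke([HumanMessage(content="What is RAG in one sentence?")]).content)

# Embedding a query string; returns a vector of floats.
vector = embeddings_bedrock.embed_query("reactive architecture")
print(len(vector))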

Thank you.
