2023/11/28 - Agents for Amazon Bedrock Runtime - 3 new api methods
Changes: This release adds support for customization types, model life cycle status, and minor versions/aliases for model identifiers.
Retrieve from knowledge base.
See also: AWS API Documentation
Request Syntax
client.retrieve( knowledgeBaseId='string', retrievalQuery={ 'text': 'string' }, retrievalConfiguration={ 'vectorSearchConfiguration': { 'numberOfResults': 123 } }, nextToken='string' )
knowledgeBaseId (string) -- [REQUIRED]
Identifier of the KnowledgeBase
retrievalQuery (dict) -- [REQUIRED]
Knowledge base input query.
text (string) -- [REQUIRED]
Knowledge base input query in text
retrievalConfiguration (dict) --
Search parameters for retrieving from knowledge base.
vectorSearchConfiguration (dict) -- [REQUIRED]
Knowledge base vector search configuration
numberOfResults (integer) -- [REQUIRED]
Top-K results to retrieve from knowledge base.
nextToken (string) --
Opaque continuation token of previous paginated response.
Return type: dict
Response Syntax
{ 'retrievalResults': [ { 'content': { 'text': 'string' }, 'location': { 'type': 'S3', 's3Location': { 'uri': 'string' } }, 'score': 123.0 }, ], 'nextToken': 'string' }
Response Structure
(dict) --
retrievalResults (list) --
List of knowledge base retrieval results
(dict) --
Result item returned from a knowledge base retrieval.
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
score (float) --
The relevance score of a result.
nextToken (string) --
Opaque continuation token of previous paginated response.
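The request and response shapes above suggest a simple pagination loop: pass the `nextToken` from one response into the next request until no token is returned. The following is a minimal sketch under that assumption, not an official helper. The function name `iter_retrieval_results` and its arguments are hypothetical; `client` is assumed to be any object exposing a `retrieve` method with the signature shown above, such as a boto3 `bedrock-agent-runtime` client.

```python
def iter_retrieval_results(client, knowledge_base_id, query, top_k=5):
    """Yield every retrieval result, following nextToken across pages.

    `client` is assumed to expose retrieve(**kwargs) as documented above.
    """
    kwargs = {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }
    while True:
        page = client.retrieve(**kwargs)
        # Each item carries 'content', 'location', and 'score' keys.
        yield from page.get("retrievalResults", [])
        token = page.get("nextToken")
        if not token:
            break
        # Ask for the next page with the opaque continuation token.
        kwargs["nextToken"] = token


# Usage against the real service (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   for result in iter_retrieval_results(client, "YOUR_KB_ID", "refund policy"):
#       print(result["score"], result["location"]["s3Location"]["uri"])
```

Writing the helper as a generator keeps memory use flat and lets the caller stop early without fetching further pages.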
Invokes the specified Bedrock model to run inference using the input provided in the request body.
See also: AWS API Documentation
Request Syntax
client.invoke_agent( sessionState={ 'sessionAttributes': { 'string': 'string' }, 'promptSessionAttributes': { 'string': 'string' } }, agentId='string', agentAliasId='string', sessionId='string', endSession=True|False, enableTrace=True|False, inputText='string' )
sessionState (dict) --
Session state passed by the customer. Base64-encoded JSON string representation of SessionState.
sessionAttributes (dict) --
Session Attributes
(string) --
(string) --
promptSessionAttributes (dict) --
Prompt Session Attributes
(string) --
(string) --
agentId (string) -- [REQUIRED]
Identifier for Agent
agentAliasId (string) -- [REQUIRED]
Identifier for Agent Alias
sessionId (string) -- [REQUIRED]
Identifier used for the current session
endSession (boolean) --
End current session
enableTrace (boolean) --
Enable agent trace events for improved debugging
inputText (string) -- [REQUIRED]
Input data in the format specified in the Content-Type request header.
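To make the required and optional parameters concrete, the sketch below assembles keyword arguments for `invoke_agent`. It is illustrative only: `build_invoke_agent_request` is a hypothetical helper, and generating the session identifier with `uuid.uuid4()` is an assumption about what the service accepts, not something the reference above states. The session state is passed as a plain dict, following the request syntax shown above.

```python
import uuid


def build_invoke_agent_request(agent_id, agent_alias_id, input_text,
                               session_id=None, enable_trace=False,
                               session_attributes=None):
    """Build keyword arguments for client.invoke_agent(**request)."""
    request = {
        "agentId": agent_id,
        "agentAliasId": agent_alias_id,
        # Assumption: a random UUID string is acceptable as a session id.
        "sessionId": session_id or str(uuid.uuid4()),
        "inputText": input_text,
        "enableTrace": enable_trace,
    }
    # sessionState is optional, so include it only when attributes are given.
    if session_attributes:
        request["sessionState"] = {"sessionAttributes": dict(session_attributes)}
    return request


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.invoke_agent(
#       **build_invoke_agent_request("AGENT_ID", "ALIAS_ID", "Hello"))
```

Reusing the same `sessionId` across calls is how a caller would continue a conversation; the helper creates a fresh one only when none is supplied.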
Return type: dict
The response of this operation contains an :class:`.EventStream` member. When iterated the :class:`.EventStream` will yield events based on the structure below, where only one of the top level keys will be present for any given event.
Response Syntax
{ 'completion': EventStream({ 'chunk': { 'bytes': b'bytes', 'attribution': { 'citations': [ { 'generatedResponsePart': { 'textResponsePart': { 'text': 'string', 'span': { 'start': 123, 'end': 123 } } }, 'retrievedReferences': [ { 'content': { 'text': 'string' }, 'location': { 'type': 'S3', 's3Location': { 'uri': 'string' } } }, ] }, ] } }, 'trace': { 'agentId': 'string', 'agentAliasId': 'string', 'sessionId': 'string', 'trace': { 'preProcessingTrace': { 'modelInvocationInput': { 'traceId': 'string', 'text': 'string', 'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING', 'inferenceConfiguration': { 'temperature': ..., 'topP': ..., 'topK': 123, 'maximumLength': 123, 'stopSequences': [ 'string', ] }, 'overrideLambda': 'string', 'promptCreationMode': 'DEFAULT'|'OVERRIDDEN', 'parserMode': 'DEFAULT'|'OVERRIDDEN' }, 'modelInvocationOutput': { 'traceId': 'string', 'parsedResponse': { 'rationale': 'string', 'isValid': True|False } } }, 'orchestrationTrace': { 'rationale': { 'traceId': 'string', 'text': 'string' }, 'invocationInput': { 'traceId': 'string', 'invocationType': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH', 'actionGroupInvocationInput': { 'actionGroupName': 'string', 'verb': 'string', 'apiPath': 'string', 'parameters': [ { 'name': 'string', 'type': 'string', 'value': 'string' }, ], 'requestBody': { 'content': { 'string': [ { 'name': 'string', 'type': 'string', 'value': 'string' }, ] } } }, 'knowledgeBaseLookupInput': { 'text': 'string', 'knowledgeBaseId': 'string' } }, 'observation': { 'traceId': 'string', 'type': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH'|'ASK_USER'|'REPROMPT', 'actionGroupInvocationOutput': { 'text': 'string' }, 'knowledgeBaseLookupOutput': { 'retrievedReferences': [ { 'content': { 'text': 'string' }, 'location': { 'type': 'S3', 's3Location': { 'uri': 'string' } } }, ] }, 'finalResponse': { 'text': 'string' }, 'repromptResponse': { 'text': 'string', 'source': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'PARSER' } }, 
'modelInvocationInput': { 'traceId': 'string', 'text': 'string', 'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING', 'inferenceConfiguration': { 'temperature': ..., 'topP': ..., 'topK': 123, 'maximumLength': 123, 'stopSequences': [ 'string', ] }, 'overrideLambda': 'string', 'promptCreationMode': 'DEFAULT'|'OVERRIDDEN', 'parserMode': 'DEFAULT'|'OVERRIDDEN' } }, 'postProcessingTrace': { 'modelInvocationInput': { 'traceId': 'string', 'text': 'string', 'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING', 'inferenceConfiguration': { 'temperature': ..., 'topP': ..., 'topK': 123, 'maximumLength': 123, 'stopSequences': [ 'string', ] }, 'overrideLambda': 'string', 'promptCreationMode': 'DEFAULT'|'OVERRIDDEN', 'parserMode': 'DEFAULT'|'OVERRIDDEN' }, 'modelInvocationOutput': { 'traceId': 'string', 'parsedResponse': { 'text': 'string' } } }, 'failureTrace': { 'traceId': 'string', 'failureReason': 'string' } } }, 'internalServerException': { 'message': 'string' }, 'validationException': { 'message': 'string' }, 'resourceNotFoundException': { 'message': 'string' }, 'serviceQuotaExceededException': { 'message': 'string' }, 'throttlingException': { 'message': 'string' }, 'accessDeniedException': { 'message': 'string' }, 'conflictException': { 'message': 'string' }, 'dependencyFailedException': { 'message': 'string', 'resourceName': 'string' }, 'badGatewayException': { 'message': 'string', 'resourceName': 'string' } }), 'contentType': 'string', 'sessionId': 'string' }
Response Structure
(dict) --
InvokeAgent Response
completion (:class:`.EventStream`) --
Inference response from the model in the format specified in the Content-Type response header.
chunk (dict) --
Base64-encoded byte response
bytes (bytes) --
PartBody of the payload in bytes
attribution (dict) --
Citations associated with final agent response
citations (list) --
List of citations
(dict) --
Citation associated with the agent response
generatedResponsePart (dict) --
Generate response part
textResponsePart (dict) --
Text response part
text (string) --
Response part in text
span (dict) --
Span of text
start (integer) --
Start of span
end (integer) --
End of span
retrievedReferences (list) --
list of retrieved references
(dict) --
Retrieved reference
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
trace (dict) --
Trace part which contains the intermediate response for the customer
agentId (string) --
Identifier of the agent.
agentAliasId (string) --
Identifier of the agent alias.
sessionId (string) --
Identifier of the session.
trace (dict) --
Trace contains the intermediate response for the customer
preProcessingTrace (dict) --
Trace Part which contains information related to preprocessing step
modelInvocationInput (dict) --
Trace Part which contains information used to call Invoke Model
traceId (string) --
Identifier for trace
text (string) --
Prompt Message
type (string) --
types of prompts
inferenceConfiguration (dict) --
Configurations for controlling the inference response of an InvokeAgent API call
temperature (float) --
Controls randomness, higher values increase diversity
topP (float) --
Cumulative probability cutoff for token selection
topK (integer) --
Sample from the k most likely next tokens
maximumLength (integer) --
Maximum length of output
stopSequences (list) --
List of stop sequences
(string) --
overrideLambda (string) --
ARN of a Lambda.
promptCreationMode (string) --
Indicates whether the agent uses the default prompt or an overridden prompt
parserMode (string) --
Indicates whether the agent uses the default parser or an overridden parser
modelInvocationOutput (dict) --
Trace Part which contains information related to preprocessing
traceId (string) --
Identifier for trace
parsedResponse (dict) --
Trace Part which contains information if preprocessing was successful
rationale (string) --
Agent Trace Rationale String
isValid (boolean) --
Boolean value
orchestrationTrace (dict) --
Trace contains the intermediate response during orchestration
rationale (dict) --
Trace Part which contains information related to reasoning
traceId (string) --
Identifier for trace
text (string) --
Agent Trace Rationale String
invocationInput (dict) --
Trace Part which contains input details for action group or knowledge base
traceId (string) --
Identifier for trace
invocationType (string) --
types of invocations
actionGroupInvocationInput (dict) --
input to lambda used in action group
actionGroupName (string) --
Agent Trace Action Group Name
verb (string) --
Agent Trace Action Group Action verb
apiPath (string) --
Agent Trace Action Group API path
parameters (list) --
list of parameters included in action group invocation
(dict) --
parameters included in action group invocation
name (string) --
Name of parameter
type (string) --
Type of parameter
value (string) --
Value of parameter
requestBody (dict) --
Request Body Content Map
content (dict) --
Content type parameter map
(string) --
(list) --
list of parameters included in action group invocation
(dict) --
parameters included in action group invocation
name (string) --
Name of parameter
type (string) --
Type of parameter
value (string) --
Value of parameter
knowledgeBaseLookupInput (dict) --
Input to the knowledge base lookup
text (string) --
Agent Trace Action Group Lambda Invocation Output String
knowledgeBaseId (string) --
Agent Trace Action Group Knowledge Base Id
observation (dict) --
Trace Part which contains output details for action group or knowledge base or final response
traceId (string) --
Identifier for trace
type (string) --
types of observations
actionGroupInvocationOutput (dict) --
output from lambda used in action group
text (string) --
Agent Trace Action Group Lambda Invocation Output String
knowledgeBaseLookupOutput (dict) --
Output from the knowledge base lookup
retrievedReferences (list) --
list of retrieved references
(dict) --
Retrieved reference
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
finalResponse (dict) --
Agent finish output
text (string) --
Agent Trace Action Group Lambda Invocation Output String
repromptResponse (dict) --
Observation information if there were reprompts
text (string) --
Reprompt response text
source (string) --
Parsing error source
modelInvocationInput (dict) --
Trace Part which contains information used to call Invoke Model
traceId (string) --
Identifier for trace
text (string) --
Prompt Message
type (string) --
types of prompts
inferenceConfiguration (dict) --
Configurations for controlling the inference response of an InvokeAgent API call
temperature (float) --
Controls randomness, higher values increase diversity
topP (float) --
Cumulative probability cutoff for token selection
topK (integer) --
Sample from the k most likely next tokens
maximumLength (integer) --
Maximum length of output
stopSequences (list) --
List of stop sequences
(string) --
overrideLambda (string) --
ARN of a Lambda.
promptCreationMode (string) --
Indicates whether the agent uses the default prompt or an overridden prompt
parserMode (string) --
Indicates whether the agent uses the default parser or an overridden parser
postProcessingTrace (dict) --
Trace Part which contains information related to post processing step
modelInvocationInput (dict) --
Trace Part which contains information used to call Invoke Model
traceId (string) --
Identifier for trace
text (string) --
Prompt Message
type (string) --
types of prompts
inferenceConfiguration (dict) --
Configurations for controlling the inference response of an InvokeAgent API call
temperature (float) --
Controls randomness, higher values increase diversity
topP (float) --
Cumulative probability cutoff for token selection
topK (integer) --
Sample from the k most likely next tokens
maximumLength (integer) --
Maximum length of output
stopSequences (list) --
List of stop sequences
(string) --
overrideLambda (string) --
ARN of a Lambda.
promptCreationMode (string) --
Indicates whether the agent uses the default prompt or an overridden prompt
parserMode (string) --
Indicates whether the agent uses the default parser or an overridden parser
modelInvocationOutput (dict) --
Trace Part which contains information related to postprocessing
traceId (string) --
Identifier for trace
parsedResponse (dict) --
Trace Part which contains information if postprocessing was successful
text (string) --
Agent Trace Output String
failureTrace (dict) --
Trace Part which is emitted when agent trace could not be generated
traceId (string) --
Identifier for trace
failureReason (string) --
Agent Trace Failed Reason String
internalServerException (dict) --
This exception is thrown if there was an unexpected error during processing of request
message (string) --
Non Blank String
validationException (dict) --
This exception is thrown when the request's input validation fails
message (string) --
Non Blank String
resourceNotFoundException (dict) --
This exception is thrown when a resource referenced by the operation does not exist
message (string) --
Non Blank String
serviceQuotaExceededException (dict) --
This exception is thrown when a request is made beyond the service quota
message (string) --
Non Blank String
throttlingException (dict) --
This exception is thrown when the number of requests exceeds the limit
message (string) --
Non Blank String
accessDeniedException (dict) --
This exception is thrown when a request is denied per access permissions
message (string) --
Non Blank String
conflictException (dict) --
This exception is thrown when there is a conflict performing an operation
message (string) --
Non Blank String
dependencyFailedException (dict) --
This exception is thrown when a request fails because of a dependency such as a Lambda, Bedrock, or STS resource, due to a customer fault (e.g. bad configuration)
message (string) --
Non Blank String
resourceName (string) --
Non Blank String
badGatewayException (dict) --
This exception is thrown when a request fails because of a dependency such as a Lambda, Bedrock, or STS resource
message (string) --
Non Blank String
resourceName (string) --
Non Blank String
contentType (string) --
MIME type of the streaming response
sessionId (string) --
Identifier of the session.
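Because `completion` is an event stream in which each event has exactly one top-level key, a consumer typically loops over it and branches on that key. The sketch below shows one way to gather the answer bytes and any trace events. `collect_completion` is a hypothetical helper, and decoding the bytes as UTF-8 text is an assumption that the reference above does not confirm.

```python
def collect_completion(response):
    """Split an invoke_agent response into answer text and trace events.

    `response` is the dict returned by invoke_agent; iterating
    response['completion'] yields one event dict at a time.
    """
    chunks = []
    traces = []
    for event in response["completion"]:
        if "chunk" in event:
            # 'bytes' holds a piece of the agent's answer.
            chunks.append(event["chunk"]["bytes"])
        elif "trace" in event:
            # Trace events appear only when enableTrace=True was requested.
            traces.append(event["trace"])
    # Join before decoding so multi-byte characters split across
    # chunks are not corrupted. UTF-8 is an assumption here.
    return b"".join(chunks).decode("utf-8"), traces
```

The stream can only be consumed once, so a caller that needs both the text and the traces should collect them in a single pass as shown.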
RetrieveAndGenerate API
See also: AWS API Documentation
Request Syntax
client.retrieve_and_generate( sessionId='string', input={ 'text': 'string' }, retrieveAndGenerateConfiguration={ 'type': 'KNOWLEDGE_BASE', 'knowledgeBaseConfiguration': { 'knowledgeBaseId': 'string', 'modelArn': 'string' } }, sessionConfiguration={ 'kmsKeyArn': 'string' } )
sessionId (string) --
Identifier of the session.
input (dict) -- [REQUIRED]
Customer input of the turn
text (string) -- [REQUIRED]
Customer input of the turn in text
retrieveAndGenerateConfiguration (dict) --
Configures the retrieval and generation for the session.
type (string) -- [REQUIRED]
The type of RetrieveAndGenerate.
knowledgeBaseConfiguration (dict) --
Configurations for retrieval and generation for knowledge base.
knowledgeBaseId (string) -- [REQUIRED]
Identifier of the KnowledgeBase
modelArn (string) -- [REQUIRED]
Arn of a Bedrock model.
sessionConfiguration (dict) --
Configures common parameters of the session.
kmsKeyArn (string) -- [REQUIRED]
The KMS key arn to encrypt the customer data of the session.
Return type: dict
Response Syntax
{ 'sessionId': 'string', 'output': { 'text': 'string' }, 'citations': [ { 'generatedResponsePart': { 'textResponsePart': { 'text': 'string', 'span': { 'start': 123, 'end': 123 } } }, 'retrievedReferences': [ { 'content': { 'text': 'string' }, 'location': { 'type': 'S3', 's3Location': { 'uri': 'string' } } }, ] }, ] }
Response Structure
(dict) --
sessionId (string) --
Identifier of the session.
output (dict) --
Service response of the turn
text (string) --
Service response of the turn in text
citations (list) --
List of citations
(dict) --
Citation associated with the agent response
generatedResponsePart (dict) --
Generate response part
textResponsePart (dict) --
Text response part
text (string) --
Response part in text
span (dict) --
Span of text
start (integer) --
Start of span
end (integer) --
End of span
retrievedReferences (list) --
list of retrieved references
(dict) --
Retrieved reference
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
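The nested citation structure above pairs each generated text part with the references it was drawn from. The sketch below flattens that structure into simple tuples. `summarize_citations` is a hypothetical helper that works on a `retrieve_and_generate` response dict and assumes every retrieved reference carries an `s3Location`, the only location type listed in the response syntax.

```python
def summarize_citations(response):
    """Return a list of (generated_text, [source_uris]) tuples.

    `response` is the dict returned by retrieve_and_generate.
    """
    rows = []
    for citation in response.get("citations", []):
        part = citation["generatedResponsePart"]["textResponsePart"]
        uris = [
            ref["location"]["s3Location"]["uri"]
            for ref in citation.get("retrievedReferences", [])
        ]
        rows.append((part["text"], uris))
    return rows


# Usage (requires boto3 and AWS credentials; IDs and ARN are placeholders):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.retrieve_and_generate(
#       input={"text": "What is the refund policy?"},
#       retrieveAndGenerateConfiguration={
#           "type": "KNOWLEDGE_BASE",
#           "knowledgeBaseConfiguration": {
#               "knowledgeBaseId": "YOUR_KB_ID",
#               "modelArn": "YOUR_MODEL_ARN",
#           },
#       },
#   )
#   print(response["output"]["text"])
#   for text, uris in summarize_citations(response):
#       print(text, uris)
```

The helper reads the `text` field of each response part directly rather than slicing the output with `span`, since the reference above does not say whether `end` is inclusive.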