AWS API Changes

2023/11/28 - Agents for Amazon Bedrock Runtime - 3 new api methods

Changes: This release adds support for customization types, model life cycle status, and minor versions/aliases for model identifiers.

InvokeAgent (new)

Invokes an agent with the supplied input text and streams back its response.

See also: AWS API Documentation

Request Syntax

client.invoke_agent(
    sessionState={
        'sessionAttributes': {
            'string': 'string'
        },
        'promptSessionAttributes': {
            'string': 'string'
        }
    },
    agentId='string',
    agentAliasId='string',
    sessionId='string',
    endSession=True|False,
    enableTrace=True|False,
    inputText='string'
)

type sessionState

dict

param sessionState

Session state passed by the customer, as a Base64-encoded JSON string representation of SessionState.

sessionAttributes (dict) --

Session Attributes
- (string) --
  - (string) --
promptSessionAttributes (dict) --

Prompt Session Attributes
- (string) --
  - (string) --

type agentId

string

param agentId

[REQUIRED]

Identifier for Agent

type agentAliasId

string

param agentAliasId

[REQUIRED]

Identifier for Agent Alias

type sessionId

string

param sessionId

[REQUIRED]

Identifier used for the current session

type endSession

boolean

param endSession

End current session

type enableTrace

boolean

param enableTrace

Enable agent trace events for improved debugging

type inputText

string

param inputText

[REQUIRED]

Input data in the format specified in the Content-Type request header.
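
As a rough illustration of the parameters above, the following sketch assembles the keyword arguments for invoke_agent. The helper name build_invoke_agent_kwargs, the example identifiers, and the choice to generate a UUID when no session ID is supplied are all assumptions made for this example, not part of the API. The actual call is shown only in a comment, since it needs AWS credentials and a deployed agent.

```python
import uuid


def build_invoke_agent_kwargs(agent_id, agent_alias_id, input_text,
                              session_id=None, enable_trace=False,
                              end_session=False, session_attributes=None):
    """Assemble keyword arguments for client.invoke_agent (hypothetical helper)."""
    # agentId, agentAliasId and inputText are marked [REQUIRED] above.
    for name, value in (("agentId", agent_id),
                        ("agentAliasId", agent_alias_id),
                        ("inputText", input_text)):
        if not value:
            raise ValueError(f"{name} is required")

    kwargs = {
        "agentId": agent_id,
        "agentAliasId": agent_alias_id,
        # sessionId is also required; generate one when the caller has none.
        "sessionId": session_id or str(uuid.uuid4()),
        "inputText": input_text,
        "enableTrace": enable_trace,
        "endSession": end_session,
    }
    if session_attributes:
        kwargs["sessionState"] = {"sessionAttributes": dict(session_attributes)}
    return kwargs


kwargs = build_invoke_agent_kwargs(
    "AGENT123", "ALIAS456", "What is my order status?",
    session_attributes={"customerTier": "gold"},
)
# With AWS credentials configured, the call itself would look like:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.invoke_agent(**kwargs)
```

Reusing the same sessionId across calls is what ties successive requests to one session, so a real caller would normally store the generated value rather than create a new one each time.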

rtype

dict

returns

The response of this operation contains an EventStream member. When iterated, the EventStream yields events based on the structure below, where only one of the top-level keys is present for any given event.
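
Because each event carries exactly one top-level key, a consumer can dispatch on that key. The sketch below assumes events are plain dicts shaped like the response syntax on this page. classify_event and AgentStreamError are hypothetical names introduced for illustration, and the set of exception keys is copied from the response structure listed here.

```python
# Exception member names as listed in the response structure on this page.
EXCEPTION_KEYS = {
    "internalServerException", "validationException",
    "resourceNotFoundException", "serviceQuotaExceededException",
    "throttlingException", "accessDeniedException", "conflictException",
    "dependencyFailedException", "badGatewayException",
}


class AgentStreamError(Exception):
    """Hypothetical error type for exception events found in the stream."""

    def __init__(self, kind, message):
        super().__init__(f"{kind}: {message}")
        self.kind = kind


def classify_event(event):
    """Return the single top-level key of an event, raising on error events."""
    if len(event) != 1:
        raise ValueError(f"expected exactly one top-level key, got {sorted(event)}")
    (kind, payload), = event.items()
    if kind in EXCEPTION_KEYS:
        raise AgentStreamError(kind, payload.get("message", ""))
    return kind


# Hand-written events standing in for items yielded by the EventStream.
kinds = [classify_event(e) for e in ({"chunk": {"bytes": b"hi"}},
                                     {"trace": {"sessionId": "s-1"}})]
```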

Response Syntax

{
    'completion': EventStream({
        'chunk': {
            'bytes': b'bytes',
            'attribution': {
                'citations': [
                    {
                        'generatedResponsePart': {
                            'textResponsePart': {
                                'text': 'string',
                                'span': {
                                    'start': 123,
                                    'end': 123
                                }
                            }
                        },
                        'retrievedReferences': [
                            {
                                'content': {
                                    'text': 'string'
                                },
                                'location': {
                                    'type': 'S3',
                                    's3Location': {
                                        'uri': 'string'
                                    }
                                }
                            },
                        ]
                    },
                ]
            }
        },
        'trace': {
            'agentId': 'string',
            'agentAliasId': 'string',
            'sessionId': 'string',
            'trace': {
                'preProcessingTrace': {
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    },
                    'modelInvocationOutput': {
                        'traceId': 'string',
                        'parsedResponse': {
                            'rationale': 'string',
                            'isValid': True|False
                        }
                    }
                },
                'orchestrationTrace': {
                    'rationale': {
                        'traceId': 'string',
                        'text': 'string'
                    },
                    'invocationInput': {
                        'traceId': 'string',
                        'invocationType': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH',
                        'actionGroupInvocationInput': {
                            'actionGroupName': 'string',
                            'verb': 'string',
                            'apiPath': 'string',
                            'parameters': [
                                {
                                    'name': 'string',
                                    'type': 'string',
                                    'value': 'string'
                                },
                            ],
                            'requestBody': {
                                'content': {
                                    'string': [
                                        {
                                            'name': 'string',
                                            'type': 'string',
                                            'value': 'string'
                                        },
                                    ]
                                }
                            }
                        },
                        'knowledgeBaseLookupInput': {
                            'text': 'string',
                            'knowledgeBaseId': 'string'
                        }
                    },
                    'observation': {
                        'traceId': 'string',
                        'type': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH'|'ASK_USER'|'REPROMPT',
                        'actionGroupInvocationOutput': {
                            'text': 'string'
                        },
                        'knowledgeBaseLookupOutput': {
                            'retrievedReferences': [
                                {
                                    'content': {
                                        'text': 'string'
                                    },
                                    'location': {
                                        'type': 'S3',
                                        's3Location': {
                                            'uri': 'string'
                                        }
                                    }
                                },
                            ]
                        },
                        'finalResponse': {
                            'text': 'string'
                        },
                        'repromptResponse': {
                            'text': 'string',
                            'source': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'PARSER'
                        }
                    },
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    }
                },
                'postProcessingTrace': {
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    },
                    'modelInvocationOutput': {
                        'traceId': 'string',
                        'parsedResponse': {
                            'text': 'string'
                        }
                    }
                },
                'failureTrace': {
                    'traceId': 'string',
                    'failureReason': 'string'
                }
            }
        },
        'internalServerException': {
            'message': 'string'
        },
        'validationException': {
            'message': 'string'
        },
        'resourceNotFoundException': {
            'message': 'string'
        },
        'serviceQuotaExceededException': {
            'message': 'string'
        },
        'throttlingException': {
            'message': 'string'
        },
        'accessDeniedException': {
            'message': 'string'
        },
        'conflictException': {
            'message': 'string'
        },
        'dependencyFailedException': {
            'message': 'string',
            'resourceName': 'string'
        },
        'badGatewayException': {
            'message': 'string',
            'resourceName': 'string'
        }
    }),
    'contentType': 'string',
    'sessionId': 'string'
}
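
To show how the response syntax above might be consumed, here is a minimal sketch that joins the chunk payloads into text and sets trace events aside. The function name read_completion and the fake events are assumptions for this example; in real use the iterable would be response['completion']. Decoding as UTF-8 is also an assumption, since the actual encoding is whatever contentType reports.

```python
def read_completion(events):
    """Concatenate chunk bytes into text and collect trace events."""
    parts, traces = [], []
    for event in events:
        if "chunk" in event:
            parts.append(event["chunk"]["bytes"])
        elif "trace" in event:
            traces.append(event["trace"])
    # Join before decoding so a multi-byte character split across
    # two chunks is still decoded correctly (UTF-8 is assumed here).
    return b"".join(parts).decode("utf-8"), traces


# Stand-in for response["completion"]; real events come from the EventStream.
fake_events = [
    {"trace": {"agentId": "AGENT123", "trace": {"orchestrationTrace": {}}}},
    {"chunk": {"bytes": "Your order ".encode("utf-8")}},
    {"chunk": {"bytes": "has shipped.".encode("utf-8")}},
]
text, traces = read_completion(fake_events)
print(text)
```

Trace events only appear when the request sets enableTrace, so the traces list may well be empty.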

Response Structure

(dict) --

InvokeAgent Response
- completion (EventStream) --
  
  Inference response from the model in the format specified in the Content-Type response header.
  - chunk (dict) --
    
    Base64-encoded byte response
    - bytes (bytes) --
      
      PartBody of the payload in bytes
    - attribution (dict) --
      
      Citations associated with final agent response
      - citations (list) --
        
        List of citations
        
        - (dict) --

          Citation associated with the agent response
          - generatedResponsePart (dict) --

            Generated response part
            - textResponsePart (dict) --

              Text response part
              - text (string) --

                Response part in text
              - span (dict) --

                Span of text
                - start (integer) --

                  Start of span
                - end (integer) --

                  End of span
          - retrievedReferences (list) --

            List of retrieved references
            - (dict) --

              Retrieved reference
              - content (dict) --

                Content of a retrieval result.
                - text (string) --

                  Content of a retrieval result in text
              - location (dict) --

                The source location of a retrieval result.
                - type (string) --

                  The location type of a retrieval result.
                - s3Location (dict) --

                  The S3 location of a retrieval result.
                  - uri (string) --

                    URI of S3 location
  - trace (dict) --
    
    Trace Part which contains intermediate response for customer
    - agentId (string) --
      
      Identifier of the agent.
    - agentAliasId (string) --
      
      Identifier of the agent alias.
    - sessionId (string) --
      
      Identifier of the session.
    - trace (dict) --
      
      Trace contains intermediate response for customer
      - preProcessingTrace (dict) --
        
        Trace Part which contains information related to preprocessing step
        
        - modelInvocationInput (dict) --

          Trace Part which contains information used to call Invoke Model
          - traceId (string) --

            Identifier for trace
          - text (string) --

            Prompt Message
          - type (string) --

            types of prompts
          - inferenceConfiguration (dict) --

            Configurations for controlling the inference response of an InvokeAgent API call
            - temperature (float) --

              Controls randomness, higher values increase diversity
            - topP (float) --

              Cumulative probability cutoff for token selection
            - topK (integer) --

              Sample from the k most likely next tokens
            - maximumLength (integer) --

              Maximum length of output
            - stopSequences (list) --

              List of stop sequences
              - (string) --
          - overrideLambda (string) --

            ARN of a Lambda.
          - promptCreationMode (string) --

            indicates if agent uses default prompt or overridden prompt
          - parserMode (string) --

            indicates if agent uses default parser or overridden parser
        - modelInvocationOutput (dict) --

          Trace Part which contains information related to preprocessing
          - traceId (string) --

            Identifier for trace
          - parsedResponse (dict) --

            Trace Part which contains information if preprocessing was successful
            - rationale (string) --

              Agent Trace Rationale String
            - isValid (boolean) --

              Boolean value
      - orchestrationTrace (dict) --
        
        Trace contains intermediate response during orchestration
        - rationale (dict) --

          Trace Part which contains information related to reasoning
          - traceId (string) --

            Identifier for trace
          - text (string) --

            Agent Trace Rationale String
        - invocationInput (dict) --

          Trace Part which contains input details for action group or knowledge base
          - traceId (string) --

            Identifier for trace
          - invocationType (string) --

            types of invocations
          - actionGroupInvocationInput (dict) --

            input to lambda used in action group
            - actionGroupName (string) --

              Agent Trace Action Group Name
            - verb (string) --

              Agent Trace Action Group Action verb
            - apiPath (string) --

              Agent Trace Action Group API path
            - parameters (list) --

              list of parameters included in action group invocation
              - (dict) --

                parameters included in action group invocation
                - name (string) --

                  Name of parameter
                - type (string) --

                  Type of parameter
                - value (string) --

                  Value of parameter
            - requestBody (dict) --

              Request Body Content Map
              - content (dict) --

                Content type parameter map
                - (string) --
                  - (list) --

                    list of parameters included in action group invocation
                    - (dict) --

                      parameters included in action group invocation
                      - name (string) --

                        Name of parameter
                      - type (string) --

                        Type of parameter
                      - value (string) --

                        Value of parameter
          - knowledgeBaseLookupInput (dict) --

            Input to the knowledge base lookup
            - text (string) --

              Query text sent to the knowledge base
            - knowledgeBaseId (string) --

              Identifier of the knowledge base
        - observation (dict) --

          Trace Part which contains output details for action group or knowledge base or final response
          - traceId (string) --

            Identifier for trace
          - type (string) --

            types of observations
          - actionGroupInvocationOutput (dict) --

            output from lambda used in action group
            - text (string) --

              Agent Trace Action Group Lambda Invocation Output String
          - knowledgeBaseLookupOutput (dict) --

            Output of the knowledge base lookup
            - retrievedReferences (list) --

              list of retrieved references
              - (dict) --

                Retrieved reference
                - content (dict) --

                  Content of a retrieval result.
                  - text (string) --

                    Content of a retrieval result in text
                - location (dict) --

                  The source location of a retrieval result.
                  - type (string) --

                    The location type of a retrieval result.
                  - s3Location (dict) --

                    The S3 location of a retrieval result.
                    - uri (string) --

                      URI of S3 location
          - finalResponse (dict) --

            Agent finish output
            - text (string) --

              Text of the agent's final response
          - repromptResponse (dict) --

            Observation information if there were reprompts
            - text (string) --

              Reprompt response text
            - source (string) --

              Parsing error source
        - modelInvocationInput (dict) --

          Trace Part which contains information used to call Invoke Model
          - traceId (string) --

            Identifier for trace
          - text (string) --

            Prompt Message
          - type (string) --

            types of prompts
          - inferenceConfiguration (dict) --

            Configurations for controlling the inference response of an InvokeAgent API call
            - temperature (float) --

              Controls randomness, higher values increase diversity
            - topP (float) --

              Cumulative probability cutoff for token selection
            - topK (integer) --

              Sample from the k most likely next tokens
            - maximumLength (integer) --

              Maximum length of output
            - stopSequences (list) --

              List of stop sequences
              - (string) --
          - overrideLambda (string) --

            ARN of a Lambda.
          - promptCreationMode (string) --

            indicates if agent uses default prompt or overridden prompt
          - parserMode (string) --

            indicates if agent uses default parser or overridden parser
      - postProcessingTrace (dict) --
        
        Trace Part which contains information related to post processing step
        
        - modelInvocationInput (dict) --

          Trace Part which contains information used to call Invoke Model
          - traceId (string) --

            Identifier for trace
          - text (string) --

            Prompt Message
          - type (string) --

            types of prompts
          - inferenceConfiguration (dict) --

            Configurations for controlling the inference response of an InvokeAgent API call
            - temperature (float) --

              Controls randomness, higher values increase diversity
            - topP (float) --

              Cumulative probability cutoff for token selection
            - topK (integer) --

              Sample from the k most likely next tokens
            - maximumLength (integer) --

              Maximum length of output
            - stopSequences (list) --

              List of stop sequences
              - (string) --
          - overrideLambda (string) --

            ARN of a Lambda.
          - promptCreationMode (string) --

            indicates if agent uses default prompt or overridden prompt
          - parserMode (string) --

            indicates if agent uses default parser or overridden parser
        - modelInvocationOutput (dict) --

          Trace Part which contains information related to postprocessing
          - traceId (string) --

            Identifier for trace
          - parsedResponse (dict) --

            Trace Part which contains the parsed postprocessing response
            - text (string) --

              Agent Trace Output String
      - failureTrace (dict) --
        
        Trace Part which is emitted when agent trace could not be generated
        
        - traceId (string) --

          Identifier for trace
        - failureReason (string) --

          Agent Trace Failed Reason String
  - internalServerException (dict) --
    
    This exception is thrown if there was an unexpected error during processing of request
    - message (string) --
      
      Non Blank String
  - validationException (dict) --
    
    This exception is thrown when the request's input validation fails
    - message (string) --
      
      Non Blank String
  - resourceNotFoundException (dict) --
    
    This exception is thrown when a resource referenced by the operation does not exist
    - message (string) --
      
      Non Blank String
  - serviceQuotaExceededException (dict) --
    
    This exception is thrown when a request is made beyond the service quota
    - message (string) --
      
      Non Blank String
  - throttlingException (dict) --
    
    This exception is thrown when the number of requests exceeds the limit
    - message (string) --
      
      Non Blank String
  - accessDeniedException (dict) --
    
    This exception is thrown when a request is denied per access permissions
    - message (string) --
      
      Non Blank String
  - conflictException (dict) --
    
    This exception is thrown when there is a conflict performing an operation
    - message (string) --
      
      Non Blank String
  - dependencyFailedException (dict) --
    
    This exception is thrown when a request fails because of a dependency such as a Lambda, Bedrock, or STS resource, due to a customer fault (e.g. bad configuration)
    - message (string) --
      
      Non Blank String
    - resourceName (string) --
      
      Non Blank String
  - badGatewayException (dict) --
    
    This exception is thrown when a request fails because of a dependency such as a Lambda, Bedrock, or STS resource
    - message (string) --
      
      Non Blank String
    - resourceName (string) --
      
      Non Blank String
- contentType (string) --
  
  streaming response mimetype of the model
- sessionId (string) --
  
  Identifier used for the current session
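
The citation fields in the response structure above can be flattened into simple rows for display or logging. The following is a minimal sketch under the assumption that a chunk is a plain dict shaped like the response syntax. list_citations, the sample text, the span values, and the S3 URI are all made up for illustration.

```python
def list_citations(chunk):
    """Yield (cited_text, start, end, s3_uris) for each citation in a chunk."""
    citations = chunk.get("attribution", {}).get("citations", [])
    for citation in citations:
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        uris = []
        for ref in citation.get("retrievedReferences", []):
            # Only S3 locations appear in the response syntax on this page.
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri:
                uris.append(uri)
        yield part.get("text", ""), span.get("start"), span.get("end"), uris


# Illustrative chunk; the values are invented, not real service output.
sample_chunk = {
    "bytes": b"Returns are accepted within 30 days.",
    "attribution": {"citations": [{
        "generatedResponsePart": {"textResponsePart": {
            "text": "Returns are accepted within 30 days.",
            "span": {"start": 0, "end": 35},
        }},
        "retrievedReferences": [{
            "content": {"text": "Our return window is 30 days."},
            "location": {"type": "S3",
                         "s3Location": {"uri": "s3://example-bucket/policy.txt"}},
        }],
    }]},
}
rows = list(list_citations(sample_chunk))
```

Chunks without an attribution member simply yield no rows, so the same helper can be applied to every chunk in the stream.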

RetrieveAndGenerate (new)

RetrieveAndGenerate API