Amazon Bedrock Runtime

2026/06/15 - Amazon Bedrock Runtime - 1 new api methods

Changes  InvokeGuardrailChecks API evaluates prompts and responses against safety checks (content filters, prompt attacks, sensitive info) without creating guardrail resources. It's a detect-only API, returning numeric scores so you can build adaptive logic as per your application.

InvokeGuardrailChecks (new) Link ΒΆ

Evaluates messages against inline guardrail checks. You specify the check configurations directly in the request, and Amazon Bedrock returns per-check results with severity or confidence scores.

See also: AWS API Documentation

Request Syntax

client.invoke_guardrail_checks(
    messages=[
        {
            'role': 'user'|'assistant'|'system',
            'content': [
                {
                    'text': 'string'
                },
            ]
        },
    ],
    checks={
        'contentFilter': {
            'categories': [
                {
                    'category': 'VIOLENCE'|'HATE'|'SEXUAL'|'MISCONDUCT'|'INSULTS'
                },
            ]
        },
        'promptAttack': {
            'categories': [
                {
                    'category': 'JAILBREAK'|'PROMPT_INJECTION'|'PROMPT_LEAKAGE'
                },
            ]
        },
        'sensitiveInformation': {
            'entities': [
                {
                    'type': 'ADDRESS'|'AGE'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'CA_HEALTH_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'CREDIT_DEBIT_CARD_CVV'|'CREDIT_DEBIT_CARD_EXPIRY'|'CREDIT_DEBIT_CARD_NUMBER'|'DRIVER_ID'|'EMAIL'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'IP_ADDRESS'|'LICENSE_PLATE'|'MAC_ADDRESS'|'NAME'|'PASSWORD'|'PHONE'|'PIN'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'URL'|'USERNAME'|'US_BANK_ACCOUNT_NUMBER'|'US_BANK_ROUTING_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'US_PASSPORT_NUMBER'|'US_SOCIAL_SECURITY_NUMBER'|'VEHICLE_IDENTIFICATION_NUMBER'
                },
            ]
        }
    }
)
type messages:

list

param messages:

[REQUIRED]

The messages to evaluate against the specified guardrail checks. Each message includes a role and one or more content blocks.

  • (dict) --

    A message to evaluate against guardrail checks, containing a role and content blocks.

    • role (string) -- [REQUIRED]

      The role of the message sender.

    • content (list) -- [REQUIRED]

      The content blocks for the message.

      • (dict) --

        A content block within a message to evaluate.

        • text (string) --

          The text content to evaluate.

type checks:

dict

param checks:

[REQUIRED]

The inline check configurations that specify which guardrail checks to run against the messages.

  • contentFilter (dict) --

    The content filter check configuration.

    • categories (list) -- [REQUIRED]

      The content filter categories to evaluate.

      • (dict) --

        The configuration for a single content filter category to evaluate.

        • category (string) -- [REQUIRED]

          The content filter category to evaluate.

  • promptAttack (dict) --

    The prompt attack check configuration.

    • categories (list) -- [REQUIRED]

      The prompt attack categories to evaluate.

      • (dict) --

        The configuration for a single prompt attack category to evaluate.

        • category (string) -- [REQUIRED]

          The prompt attack category to evaluate.

  • sensitiveInformation (dict) --

    The sensitive information check configuration.

    • entities (list) -- [REQUIRED]

      The sensitive information entity types to detect.

      • (dict) --

        The configuration for a single sensitive information entity type to detect.

        • type (string) -- [REQUIRED]

          The PII entity type to detect.

rtype:

dict

returns:

Response Syntax

{
    'results': {
        'contentFilter': {
            'results': [
                {
                    'category': 'VIOLENCE'|'HATE'|'SEXUAL'|'MISCONDUCT'|'INSULTS',
                    'severityScore': 123.0
                },
            ]
        },
        'promptAttack': {
            'results': [
                {
                    'category': 'JAILBREAK'|'PROMPT_INJECTION'|'PROMPT_LEAKAGE',
                    'severityScore': 123.0
                },
            ]
        },
        'sensitiveInformation': {
            'results': [
                {
                    'type': 'ADDRESS'|'AGE'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'CA_HEALTH_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'CREDIT_DEBIT_CARD_CVV'|'CREDIT_DEBIT_CARD_EXPIRY'|'CREDIT_DEBIT_CARD_NUMBER'|'DRIVER_ID'|'EMAIL'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'IP_ADDRESS'|'LICENSE_PLATE'|'MAC_ADDRESS'|'NAME'|'PASSWORD'|'PHONE'|'PIN'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'URL'|'USERNAME'|'US_BANK_ACCOUNT_NUMBER'|'US_BANK_ROUTING_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'US_PASSPORT_NUMBER'|'US_SOCIAL_SECURITY_NUMBER'|'VEHICLE_IDENTIFICATION_NUMBER',
                    'confidenceScore': 123.0,
                    'beginOffset': 123,
                    'endOffset': 123,
                    'messageIndex': 123,
                    'contentIndex': 123
                },
            ],
            'truncated': True|False
        }
    },
    'usage': {
        'contentFilter': {
            'textUnits': 123
        },
        'promptAttack': {
            'textUnits': 123
        },
        'sensitiveInformation': {
            'textUnits': 123
        }
    }
}

Response Structure

  • (dict) --

    • results (dict) --

      The per-check results containing findings from the guardrail evaluation.

      • contentFilter (dict) --

        The content filter check results.

        • results (list) --

          The per-category content filter results.

          • (dict) --

            The evaluation result for a single content filter category.

            • category (string) --

              The content filter category that was evaluated.

            • severityScore (float) --

              The severity score for the category, ranging from 0.0 to 1.0. Higher values indicate greater severity.

      • promptAttack (dict) --

        The prompt attack check results.

        • results (list) --

          The per-category prompt attack results.

          • (dict) --

            The evaluation result for a single prompt attack category.

            • category (string) --

              The prompt attack category that was evaluated.

            • severityScore (float) --

              The severity score for the category, ranging from 0.0 to 1.0. Higher values indicate greater severity.

      • sensitiveInformation (dict) --

        The sensitive information check results.

        • results (list) --

          The detected sensitive information entities.

          • (dict) --

            The detection result for a single sensitive information entity found in the evaluated messages.

            • type (string) --

              The PII entity type that was detected.

            • confidenceScore (float) --

              The confidence score for the detection, ranging from 0.0 to 1.0. Higher values indicate greater confidence.

            • beginOffset (integer) --

              The start character offset of the detected entity within the content block.

            • endOffset (integer) --

              The end character offset of the detected entity within the content block.

            • messageIndex (integer) --

              The zero-based index of the message in the input messages array where the entity was detected.

            • contentIndex (integer) --

              The zero-based index of the content block within the message where the entity was detected.

        • truncated (boolean) --

          Specifies whether the results were truncated because the number of detected entities exceeded the maximum limit.

    • usage (dict) --

      The per-check text unit consumption for the guardrail evaluation.

      • contentFilter (dict) --

        The text unit usage for the content filter check.

        • textUnits (integer) --

          The number of text units consumed by the content filter check.

      • promptAttack (dict) --

        The text unit usage for the prompt attack check.

        • textUnits (integer) --

          The number of text units consumed by the prompt attack check.

      • sensitiveInformation (dict) --

        The text unit usage for the sensitive information check.

        • textUnits (integer) --

          The number of text units consumed by the sensitive information check.