Amazon Transcribe Service

2018/07/09 - Amazon Transcribe Service - 2 updated api methods

Changes: You can now specify an Amazon S3 output bucket to store the transcription of your audio file when you call the StartTranscriptionJob operation.

ListTranscriptionJobs (updated)
Changes (response)
{'TranscriptionJobSummaries': {'OutputLocationType': 'CUSTOMER_BUCKET | SERVICE_BUCKET'}}

Lists transcription jobs with the specified status.

See also: AWS API Documentation

Request Syntax

client.list_transcription_jobs(
    Status='IN_PROGRESS'|'FAILED'|'COMPLETED',
    JobNameContains='string',
    NextToken='string',
    MaxResults=123
)
type Status

string

param Status

When specified, returns only transcription jobs with the specified status.

type JobNameContains

string

param JobNameContains

When specified, the jobs returned in the list are limited to jobs whose name contains the specified string.

type NextToken

string

param NextToken

If the result of the previous request to ListTranscriptionJobs was truncated, include the NextToken to fetch the next set of jobs.

type MaxResults

integer

param MaxResults

The maximum number of jobs to return in the response. If there are fewer results in the list, this response contains only the actual results.
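The NextToken and MaxResults parameters combine into a simple paging loop. The following is a minimal sketch, assuming `client` is a boto3 Transcribe client (for example, one created with `boto3.client('transcribe')`). The helper name `iter_transcription_jobs` and its `page_size` default are illustrative and not part of the API.

```python
def iter_transcription_jobs(client, status=None, name_contains=None, page_size=50):
    """Yield every job summary, following NextToken until no more pages remain."""
    kwargs = {"MaxResults": page_size}
    # Only send the optional filters that the caller actually set.
    if status is not None:
        kwargs["Status"] = status
    if name_contains is not None:
        kwargs["JobNameContains"] = name_contains

    while True:
        page = client.list_transcription_jobs(**kwargs)
        yield from page.get("TranscriptionJobSummaries", [])
        token = page.get("NextToken")
        if not token:
            break  # no token means this was the last page
        kwargs["NextToken"] = token
```

Because the helper is a generator, a caller can stop early without fetching the remaining pages.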

rtype

dict

returns

Response Syntax

{
    'Status': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'TranscriptionJobSummaries': [
        {
            'TranscriptionJobName': 'string',
            'CreationTime': datetime(2015, 1, 1),
            'CompletionTime': datetime(2015, 1, 1),
            'LanguageCode': 'en-US'|'es-US',
            'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
            'FailureReason': 'string',
            'OutputLocationType': 'CUSTOMER_BUCKET'|'SERVICE_BUCKET'
        },
    ]
}

Response Structure

  • (dict) --

    • Status (string) --

      The requested status of the jobs returned.

    • NextToken (string) --

      The ListTranscriptionJobs operation returns a page of jobs at a time. The maximum size of the page is set by the MaxResults parameter. If there are more jobs in the list than the page size, Amazon Transcribe returns the NextToken token. Include the token in the next request to the ListTranscriptionJobs operation to return the next page of jobs.

    • TranscriptionJobSummaries (list) --

      A list of objects containing summary information for a transcription job.

      • (dict) --

        Provides a summary of information about a transcription job.

        • TranscriptionJobName (string) --

          The name of the transcription job.

        • CreationTime (datetime) --

          A timestamp that shows when the job was created.

        • CompletionTime (datetime) --

          A timestamp that shows when the job was completed.

        • LanguageCode (string) --

          The language code for the input speech.

        • TranscriptionJobStatus (string) --

          The status of the transcription job. When the status is COMPLETED, use the GetTranscriptionJob operation to get the results of the transcription.

        • FailureReason (string) --

          If the TranscriptionJobStatus field is FAILED, a description of the error.

        • OutputLocationType (string) --

          Indicates the location of the output of the transcription job.

          If the value is CUSTOMER_BUCKET, the location is the S3 bucket specified in the OutputBucketName field when the transcription job was started with the StartTranscriptionJob operation.

          If the value is SERVICE_BUCKET then the output is stored by Amazon Transcribe and can be retrieved using the URI in the GetTranscriptionJob response's TranscriptFileUri field.
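As an illustration of the new OutputLocationType field, the sketch below groups job summaries by where their output is stored. It assumes `summaries` is the TranscriptionJobSummaries list from a ListTranscriptionJobs response. The function name and the "UNKNOWN" fallback label are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def group_by_output_location(summaries):
    """Map each OutputLocationType value to the names of the jobs that use it."""
    groups = defaultdict(list)
    for job in summaries:
        # "UNKNOWN" is a made-up label for summaries that lack the field.
        location = job.get("OutputLocationType", "UNKNOWN")
        groups[location].append(job["TranscriptionJobName"])
    return dict(groups)
```

Jobs listed under CUSTOMER_BUCKET wrote their transcript to the bucket named when the job was started; for jobs under SERVICE_BUCKET, the transcript URI comes from GetTranscriptionJob.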

StartTranscriptionJob (updated)
Changes (request)
{'OutputBucketName': 'string'}

Starts an asynchronous job to transcribe speech to text.

See also: AWS API Documentation

Request Syntax

client.start_transcription_job(
    TranscriptionJobName='string',
    LanguageCode='en-US'|'es-US',
    MediaSampleRateHertz=123,
    MediaFormat='mp3'|'mp4'|'wav'|'flac',
    Media={
        'MediaFileUri': 'string'
    },
    OutputBucketName='string',
    Settings={
        'VocabularyName': 'string',
        'ShowSpeakerLabels': True|False,
        'MaxSpeakerLabels': 123
    }
)
type TranscriptionJobName

string

param TranscriptionJobName

[REQUIRED]

The name of the job. You can't use the strings "." or ".." in the job name. The name must be unique within an AWS account.

type LanguageCode

string

param LanguageCode

[REQUIRED]

The language code for the language used in the input media file.

type MediaSampleRateHertz

integer

param MediaSampleRateHertz

The sample rate, in Hertz, of the audio track in the input media file.

type MediaFormat

string

param MediaFormat

[REQUIRED]

The format of the input media file.

type Media

dict

param Media

[REQUIRED]

An object that describes the input media for a transcription job.

  • MediaFileUri (string) --

    The S3 location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:

    https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

    For example:

    https://s3-us-east-1.amazonaws.com/examplebucket/example.mp4

    https://s3-us-east-1.amazonaws.com/examplebucket/mediadocs/example.mp4

    For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.
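A small helper can assemble a MediaFileUri that follows the pattern of the two example URIs above. This is a sketch only: the function name is hypothetical, and it assumes the `s3-<region>` host form shown in those examples is the one your endpoint accepts.

```python
def media_file_uri(region, bucket, key):
    """Build an S3 media URI in the same shape as the documented examples."""
    # Drop a leading "/" so the bucket and key are joined by exactly one slash.
    return f"https://s3-{region}.amazonaws.com/{bucket}/{key.lstrip('/')}"


print(media_file_uri("us-east-1", "examplebucket", "mediadocs/example.mp4"))
# prints: https://s3-us-east-1.amazonaws.com/examplebucket/mediadocs/example.mp4
```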

type OutputBucketName

string

param OutputBucketName

The location where the transcription is stored.

If you set the OutputBucketName, Amazon Transcribe puts the transcription in the specified S3 bucket. When you call the GetTranscriptionJob operation, the operation returns this location in the TranscriptFileUri field. The S3 bucket must have permissions that allow Amazon Transcribe to put files in the bucket. For more information, see Permissions Required for IAM User Roles.

If you don't set the OutputBucketName, Amazon Transcribe generates a pre-signed URL, a shareable URL that provides secure access to your transcription, and returns it in the TranscriptFileUri field. Use this URL to download the transcription.
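The sketch below shows one way to build the request so that OutputBucketName is sent only when a bucket is given, which leaves the service-managed default in place otherwise. The helper name, its argument names, and the "en-US" default are assumptions made for illustration.

```python
def start_job_request(job_name, media_uri, media_format,
                      language_code="en-US", output_bucket=None):
    """Return keyword arguments for start_transcription_job."""
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
    }
    # Omit the key entirely when no bucket is given, so Amazon Transcribe
    # stores the output itself and returns a pre-signed URL.
    if output_bucket is not None:
        request["OutputBucketName"] = output_bucket
    return request
```

With a boto3 Transcribe client, the result can be passed as `client.start_transcription_job(**start_job_request(...))`.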

type Settings

dict

param Settings

A Settings object that provides optional settings for a transcription job.

  • VocabularyName (string) --

    The name of a vocabulary to use when processing the transcription job.

  • ShowSpeakerLabels (boolean) --

    Determines whether the transcription job should use speaker recognition to identify different speakers in the input audio. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.

  • MaxSpeakerLabels (integer) --

    The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers will be identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
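Because ShowSpeakerLabels and MaxSpeakerLabels must be set together, it can help to build the Settings dict in one place. This is a minimal sketch with a hypothetical helper name. It does not check the allowed range of MaxSpeakerLabels, which this page does not state.

```python
def speaker_settings(max_speakers=None, vocabulary_name=None):
    """Build a Settings dict that keeps the two speaker fields consistent."""
    settings = {}
    if vocabulary_name is not None:
        settings["VocabularyName"] = vocabulary_name
    if max_speakers is not None:
        # The two speaker fields are always emitted as a pair.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    return settings
```

Driving both fields from a single `max_speakers` argument means a caller cannot set one without the other.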

rtype

dict

returns

Response Syntax

{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'VocabularyName': 'string',
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123
        }
    }
}

Response Structure

  • (dict) --

    • TranscriptionJob (dict) --

      An object containing details of the asynchronous transcription job.

      • TranscriptionJobName (string) --

        The name of the transcription job.

      • TranscriptionJobStatus (string) --

        The status of the transcription job.

      • LanguageCode (string) --

        The language code for the input speech.

      • MediaSampleRateHertz (integer) --

        The sample rate, in Hertz, of the audio track in the input media file.

      • MediaFormat (string) --

        The format of the input media file.

      • Media (dict) --

        An object that describes the input media for the transcription job.

        • MediaFileUri (string) --

          The S3 location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-east-1.amazonaws.com/examplebucket/example.mp4

          https://s3-us-east-1.amazonaws.com/examplebucket/mediadocs/example.mp4

          For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.

      • Transcript (dict) --

        An object that describes the output of the transcription job.

        • TranscriptFileUri (string) --

          The location where the transcription is stored.

          Use this URI to access the transcription. If you specified an S3 bucket in the OutputBucketName field when you created the job, this is the URI of that bucket. If you chose to store the transcription in Amazon Transcribe, this is a shareable URL that provides secure access to that location.

      • CreationTime (datetime) --

        A timestamp that shows when the job was created.

      • CompletionTime (datetime) --

        A timestamp that shows when the job was completed.

      • FailureReason (string) --

        If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.

      • Settings (dict) --

        Optional settings for the transcription job. Use these settings to turn on speaker recognition, to set the maximum number of speakers that should be identified, and to specify a custom vocabulary to use when processing the transcription job.

        • VocabularyName (string) --

          The name of a vocabulary to use when processing the transcription job.

        • ShowSpeakerLabels (boolean) --

          Determines whether the transcription job should use speaker recognition to identify different speakers in the input audio. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.

        • MaxSpeakerLabels (integer) --

          The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers will be identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
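Since StartTranscriptionJob is asynchronous, callers usually poll GetTranscriptionJob until the status leaves IN_PROGRESS. The following is a rough sketch, assuming `client` is a boto3 Transcribe client. The function name, the polling interval, and the poll limit are arbitrary choices for illustration, not recommended values.

```python
import time


def wait_for_transcript(client, job_name, poll_seconds=10, max_polls=60,
                        sleep=time.sleep):
    """Poll the job until it finishes and return its TranscriptFileUri."""
    for _ in range(max_polls):
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription job failed"))
        sleep(poll_seconds)  # still IN_PROGRESS, wait before asking again
    raise TimeoutError(f"job {job_name!r} still in progress after {max_polls} polls")
```

The `sleep` argument exists only so the wait can be replaced in tests; in normal use the default `time.sleep` applies.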