2023/05/10 - Amazon EMR - 3 updated api methods
Changes: EMR Studio now supports programmatically executing notebooks on an EMR on EKS cluster. In addition, a notebook can now be executed by specifying its location in Amazon S3.
{'NotebookExecution': {'EnvironmentVariables': {'string': 'string'}, 'ExecutionEngine': {'ExecutionRoleArn': 'string'}, 'NotebookS3Location': {'Bucket': 'string', 'Key': 'string'}, 'OutputNotebookFormat': 'HTML', 'OutputNotebookS3Location': {'Bucket': 'string', 'Key': 'string'}}}
Provides details of a notebook execution.
See also: AWS API Documentation
Request Syntax
client.describe_notebook_execution( NotebookExecutionId='string' )
NotebookExecutionId (string) -- [REQUIRED]
The unique identifier of the notebook execution.
Return type: dict
Response Syntax
{
    'NotebookExecution': {
        'NotebookExecutionId': 'string',
        'EditorId': 'string',
        'ExecutionEngine': {
            'Id': 'string',
            'Type': 'EMR',
            'MasterInstanceSecurityGroupId': 'string',
            'ExecutionRoleArn': 'string'
        },
        'NotebookExecutionName': 'string',
        'NotebookParams': 'string',
        'Status': 'START_PENDING'|'STARTING'|'RUNNING'|'FINISHING'|'FINISHED'|'FAILING'|'FAILED'|'STOP_PENDING'|'STOPPING'|'STOPPED',
        'StartTime': datetime(2015, 1, 1),
        'EndTime': datetime(2015, 1, 1),
        'Arn': 'string',
        'OutputNotebookURI': 'string',
        'LastStateChangeReason': 'string',
        'NotebookInstanceSecurityGroupId': 'string',
        'Tags': [
            {
                'Key': 'string',
                'Value': 'string'
            },
        ],
        'NotebookS3Location': {
            'Bucket': 'string',
            'Key': 'string'
        },
        'OutputNotebookS3Location': {
            'Bucket': 'string',
            'Key': 'string'
        },
        'OutputNotebookFormat': 'HTML',
        'EnvironmentVariables': {
            'string': 'string'
        }
    }
}
Response Structure
(dict) --
NotebookExecution (dict) --
Properties of the notebook execution.
NotebookExecutionId (string) --
The unique identifier of a notebook execution.
EditorId (string) --
The unique identifier of the Amazon EMR Notebook that is used for the notebook execution.
ExecutionEngine (dict) --
The execution engine, such as an Amazon EMR cluster, used to run the Amazon EMR notebook and perform the notebook execution.
Id (string) --
The unique identifier of the execution engine. For an Amazon EMR cluster, this is the cluster ID.
Type (string) --
The type of execution engine. A value of EMR specifies an Amazon EMR cluster.
MasterInstanceSecurityGroupId (string) --
An optional unique ID of an Amazon EC2 security group to associate with the master instance of the Amazon EMR cluster for this notebook execution. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the Amazon EMR Management Guide.
ExecutionRoleArn (string) --
The execution role ARN required for the notebook execution.
NotebookExecutionName (string) --
A name for the notebook execution.
NotebookParams (string) --
Input parameters in JSON format passed to the Amazon EMR Notebook at runtime for execution.
Status (string) --
The status of the notebook execution.
START_PENDING indicates that the cluster has received the execution request but execution has not begun.
STARTING indicates that the execution is starting on the cluster.
RUNNING indicates that the execution is being processed by the cluster.
FINISHING indicates that execution processing is in the final stages.
FINISHED indicates that the execution has completed without error.
FAILING indicates that the execution is failing and will not finish successfully.
FAILED indicates that the execution failed.
STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.
STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.
STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
StartTime (datetime) --
The timestamp when notebook execution started.
EndTime (datetime) --
The timestamp when notebook execution ended.
Arn (string) --
The Amazon Resource Name (ARN) of the notebook execution.
OutputNotebookURI (string) --
The location of the notebook execution's output file in Amazon S3.
LastStateChangeReason (string) --
The reason for the latest status change of the notebook execution.
NotebookInstanceSecurityGroupId (string) --
The unique identifier of the Amazon EC2 security group associated with the Amazon EMR Notebook instance. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the Amazon EMR Management Guide.
Tags (list) --
A list of tags associated with a notebook execution. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters and an optional value string with a maximum of 256 characters.
(dict) --
A key-value pair containing user-defined metadata that you can associate with an Amazon EMR resource. Tags make it easier to associate clusters in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.
Key (string) --
A user-defined key, which is the minimum required information for a valid tag. For more information, see Tag.
Value (string) --
A user-defined value, which is optional in a tag. For more information, see Tag Clusters.
NotebookS3Location (dict) --
The Amazon S3 location that stores the notebook execution input.
Bucket (string) --
The Amazon S3 bucket that stores the notebook execution input.
Key (string) --
The key to the Amazon S3 location that stores the notebook execution input.
OutputNotebookS3Location (dict) --
The Amazon S3 location for the notebook execution output.
Bucket (string) --
The Amazon S3 bucket that stores the notebook execution output.
Key (string) --
The key to the Amazon S3 location that stores the notebook execution output.
OutputNotebookFormat (string) --
The output format for the notebook execution.
EnvironmentVariables (dict) --
The environment variables associated with the notebook execution.
(string) --
(string) --
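The status values above suggest a simple polling loop around describe_notebook_execution. The following is a minimal sketch, not an official pattern: the helper name wait_for_notebook_execution, its poll_seconds and max_polls parameters, and the choice of FINISHED, FAILED, and STOPPED as terminal statuses are assumptions based on the descriptions above. The client argument is expected to be an EMR client such as boto3.client('emr').

```python
import time

# Assumption: these statuses are treated as terminal because the descriptions
# above say the execution has completed, failed, or stopped.
TERMINAL_STATUSES = {"FINISHED", "FAILED", "STOPPED"}


def wait_for_notebook_execution(client, execution_id, poll_seconds=30, max_polls=120):
    """Poll describe_notebook_execution until a terminal status is reported.

    Returns the 'NotebookExecution' dict of the final response, or raises
    TimeoutError after max_polls attempts (a hypothetical safety limit).
    """
    for attempt in range(max_polls):
        response = client.describe_notebook_execution(NotebookExecutionId=execution_id)
        execution = response["NotebookExecution"]
        if execution["Status"] in TERMINAL_STATUSES:
            return execution
        # Sleep only when another poll will follow.
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    raise TimeoutError(f"{execution_id} did not reach a terminal status")
```

On return, fields such as LastStateChangeReason and OutputNotebookS3Location can be read from the result when the service includes them in the response.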
Request
{'ExecutionEngineId': 'string'}
Response
{'NotebookExecutions': {'ExecutionEngineId': 'string', 'NotebookS3Location': {'Bucket': 'string', 'Key': 'string'}}}
Provides summaries of all notebook executions. You can filter the list based on multiple criteria such as status, time range, and editor ID. Returns a maximum of 50 notebook executions and a marker to track the paging of a longer notebook execution list across multiple ListNotebookExecutions calls.
See also: AWS API Documentation
Request Syntax
client.list_notebook_executions(
    EditorId='string',
    Status='START_PENDING'|'STARTING'|'RUNNING'|'FINISHING'|'FINISHED'|'FAILING'|'FAILED'|'STOP_PENDING'|'STOPPING'|'STOPPED',
    From=datetime(2015, 1, 1),
    To=datetime(2015, 1, 1),
    Marker='string',
    ExecutionEngineId='string'
)
EditorId (string) --
The unique ID of the editor associated with the notebook execution.
Status (string) --
The status filter for listing notebook executions.
START_PENDING indicates that the cluster has received the execution request but execution has not begun.
STARTING indicates that the execution is starting on the cluster.
RUNNING indicates that the execution is being processed by the cluster.
FINISHING indicates that execution processing is in the final stages.
FINISHED indicates that the execution has completed without error.
FAILING indicates that the execution is failing and will not finish successfully.
FAILED indicates that the execution failed.
STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.
STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.
STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
From (datetime) --
The beginning of time range filter for listing notebook executions. The default is the timestamp of 30 days ago.
To (datetime) --
The end of time range filter for listing notebook executions. The default is the current timestamp.
Marker (string) --
The pagination token, returned by a previous ListNotebookExecutions call, that indicates the start of the list for this ListNotebookExecutions call.
ExecutionEngineId (string) --
The unique ID of the execution engine.
Return type: dict
Response Syntax
{
    'NotebookExecutions': [
        {
            'NotebookExecutionId': 'string',
            'EditorId': 'string',
            'NotebookExecutionName': 'string',
            'Status': 'START_PENDING'|'STARTING'|'RUNNING'|'FINISHING'|'FINISHED'|'FAILING'|'FAILED'|'STOP_PENDING'|'STOPPING'|'STOPPED',
            'StartTime': datetime(2015, 1, 1),
            'EndTime': datetime(2015, 1, 1),
            'NotebookS3Location': {
                'Bucket': 'string',
                'Key': 'string'
            },
            'ExecutionEngineId': 'string'
        },
    ],
    'Marker': 'string'
}
Response Structure
(dict) --
NotebookExecutions (list) --
A list of notebook executions.
(dict) --
Details for a notebook execution. The details include information such as the unique ID and status of the notebook execution.
NotebookExecutionId (string) --
The unique identifier of the notebook execution.
EditorId (string) --
The unique identifier of the editor associated with the notebook execution.
NotebookExecutionName (string) --
The name of the notebook execution.
Status (string) --
The status of the notebook execution.
START_PENDING indicates that the cluster has received the execution request but execution has not begun.
STARTING indicates that the execution is starting on the cluster.
RUNNING indicates that the execution is being processed by the cluster.
FINISHING indicates that execution processing is in the final stages.
FINISHED indicates that the execution has completed without error.
FAILING indicates that the execution is failing and will not finish successfully.
FAILED indicates that the execution failed.
STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.
STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.
STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
StartTime (datetime) --
The timestamp when notebook execution started.
EndTime (datetime) --
The timestamp when notebook execution ended.
NotebookS3Location (dict) --
The Amazon S3 location that stores the notebook execution input.
Bucket (string) --
The Amazon S3 bucket that stores the notebook execution input.
Key (string) --
The key to the Amazon S3 location that stores the notebook execution input.
ExecutionEngineId (string) --
The unique ID of the execution engine for the notebook execution.
Marker (string) --
A pagination token that a subsequent ListNotebookExecutions call can use to determine the next set of results to retrieve.
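Because each call returns at most 50 notebook executions plus a Marker, collecting a longer list means passing the returned Marker into the next call until no Marker comes back. A minimal sketch of that loop follows; the helper name iter_notebook_executions is hypothetical, and client is assumed to be an EMR client such as boto3.client('emr').

```python
def iter_notebook_executions(client, **filters):
    """Yield every notebook execution summary, following Marker across pages.

    filters are passed through unchanged, for example Status, From, To,
    EditorId, or ExecutionEngineId.
    """
    kwargs = dict(filters)
    while True:
        page = client.list_notebook_executions(**kwargs)
        yield from page.get("NotebookExecutions", [])
        marker = page.get("Marker")
        if not marker:
            # No Marker in the response means there are no further pages.
            break
        kwargs["Marker"] = marker
```

For example, iter_notebook_executions(client, Status='FAILED', ExecutionEngineId='j-HYPOTHETICAL') would walk all failed executions on one engine within the default time range, where the engine ID shown is a placeholder.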
{'EnvironmentVariables': {'string': 'string'}, 'ExecutionEngine': {'ExecutionRoleArn': 'string'}, 'NotebookS3Location': {'Bucket': 'string', 'Key': 'string'}, 'OutputNotebookFormat': 'HTML', 'OutputNotebookS3Location': {'Bucket': 'string', 'Key': 'string'}}
Starts a notebook execution.
See also: AWS API Documentation
Request Syntax
client.start_notebook_execution(
    EditorId='string',
    RelativePath='string',
    NotebookExecutionName='string',
    NotebookParams='string',
    ExecutionEngine={
        'Id': 'string',
        'Type': 'EMR',
        'MasterInstanceSecurityGroupId': 'string',
        'ExecutionRoleArn': 'string'
    },
    ServiceRole='string',
    NotebookInstanceSecurityGroupId='string',
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ],
    NotebookS3Location={
        'Bucket': 'string',
        'Key': 'string'
    },
    OutputNotebookS3Location={
        'Bucket': 'string',
        'Key': 'string'
    },
    OutputNotebookFormat='HTML',
    EnvironmentVariables={
        'string': 'string'
    }
)
EditorId (string) --
The unique identifier of the Amazon EMR Notebook to use for notebook execution.
RelativePath (string) --
The path and file name of the notebook file for this execution, relative to the path specified for the Amazon EMR Notebook. For example, if you specify a path of s3://MyBucket/MyNotebooks when you create an Amazon EMR Notebook for a notebook with an ID of e-ABCDEFGHIJK1234567890ABCD (the EditorId of this request), and you specify a RelativePath of my_notebook_executions/notebook_execution.ipynb, the location of the file for the notebook execution is s3://MyBucket/MyNotebooks/e-ABCDEFGHIJK1234567890ABCD/my_notebook_executions/notebook_execution.ipynb.
NotebookExecutionName (string) --
An optional name for the notebook execution.
NotebookParams (string) --
Input parameters in JSON format passed to the Amazon EMR Notebook at runtime for execution.
ExecutionEngine (dict) -- [REQUIRED]
Specifies the execution engine (cluster) that runs the notebook execution.
Id (string) -- [REQUIRED]
The unique identifier of the execution engine. For an Amazon EMR cluster, this is the cluster ID.
Type (string) --
The type of execution engine. A value of EMR specifies an Amazon EMR cluster.
MasterInstanceSecurityGroupId (string) --
An optional unique ID of an Amazon EC2 security group to associate with the master instance of the Amazon EMR cluster for this notebook execution. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the Amazon EMR Management Guide.
ExecutionRoleArn (string) --
The execution role ARN required for the notebook execution.
ServiceRole (string) -- [REQUIRED]
The name or ARN of the IAM role that is used as the service role for Amazon EMR (the Amazon EMR role) for the notebook execution.
NotebookInstanceSecurityGroupId (string) --
The unique identifier of the Amazon EC2 security group to associate with the Amazon EMR Notebook for this notebook execution.
Tags (list) --
A list of tags associated with a notebook execution. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters and an optional value string with a maximum of 256 characters.
(dict) --
A key-value pair containing user-defined metadata that you can associate with an Amazon EMR resource. Tags make it easier to associate clusters in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.
Key (string) --
A user-defined key, which is the minimum required information for a valid tag. For more information, see Tag.
Value (string) --
A user-defined value, which is optional in a tag. For more information, see Tag Clusters.
NotebookS3Location (dict) --
The Amazon S3 location for the notebook execution input.
Bucket (string) --
The Amazon S3 bucket that stores the notebook execution input.
Key (string) --
The key to the Amazon S3 location that stores the notebook execution input.
OutputNotebookS3Location (dict) --
The Amazon S3 location for the notebook execution output.
Bucket (string) --
The Amazon S3 bucket that stores the notebook execution output.
Key (string) --
The key to the Amazon S3 location that stores the notebook execution output.
OutputNotebookFormat (string) --
The output format for the notebook execution.
EnvironmentVariables (dict) --
The environment variables associated with the notebook execution.
(string) --
(string) --
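The RelativePath description above states how the notebook file location is derived from the notebook's base path, the EditorId, and the RelativePath. The sketch below only reproduces that string composition for illustration; resolve_notebook_path is a hypothetical helper, not part of the SDK, and the service performs the actual resolution.

```python
def resolve_notebook_path(notebook_base_path, editor_id, relative_path):
    """Join base path, editor ID, and relative path as in the documented example."""
    parts = [
        notebook_base_path.rstrip("/"),  # avoid a doubled slash after the base path
        editor_id,
        relative_path.lstrip("/"),       # tolerate a leading slash on the relative path
    ]
    return "/".join(parts)


# The values below are the ones used in the RelativePath description.
location = resolve_notebook_path(
    "s3://MyBucket/MyNotebooks",
    "e-ABCDEFGHIJK1234567890ABCD",
    "my_notebook_executions/notebook_execution.ipynb",
)
```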
Return type: dict
Response Syntax
{ 'NotebookExecutionId': 'string' }
Response Structure
(dict) --
NotebookExecutionId (string) --
The unique identifier of the notebook execution.
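As an illustration of the new S3-based parameters, the sketch below builds a start_notebook_execution request that points at a notebook in Amazon S3 instead of passing EditorId and RelativePath. The helper name start_notebook_from_s3 and its arguments are hypothetical, and whether a given combination of parameters is accepted depends on the service, so treat this as an assumption to verify rather than a definitive recipe. The client is assumed to be an EMR client such as boto3.client('emr').

```python
import json


def start_notebook_from_s3(client, engine_id, service_role, bucket, key,
                           params=None, env=None, output_location=None):
    """Start a notebook execution for a notebook stored in Amazon S3.

    Returns the NotebookExecutionId from the response.
    """
    request = {
        # ExecutionEngine (with Id) and ServiceRole are marked [REQUIRED] above.
        "ExecutionEngine": {"Id": engine_id, "Type": "EMR"},
        "ServiceRole": service_role,
        "NotebookS3Location": {"Bucket": bucket, "Key": key},
    }
    if params is not None:
        # NotebookParams is a JSON-formatted string, so serialize the dict.
        request["NotebookParams"] = json.dumps(params)
    if env:
        request["EnvironmentVariables"] = dict(env)
    if output_location is not None:
        # Expected shape: {'Bucket': ..., 'Key': ...}
        request["OutputNotebookS3Location"] = output_location
    response = client.start_notebook_execution(**request)
    return response["NotebookExecutionId"]
```

The returned identifier can then be passed as NotebookExecutionId to describe_notebook_execution to track the execution's status.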