2018/08/23 - AWS IoT Analytics - 4 updated api methods
Changes: AWS IoT Analytics announces three new features: (1) Bring Your Custom Container - import your custom-authored code containers. (2) Automate Container Execution - lets you automate the execution of containers hosting custom-authored analytical code or Jupyter Notebooks to perform continuous analysis. (3) Incremental Data Capture with Customizable Time Windows - enables users to perform analysis on new incremental data captured since the last analysis.
{'actions': {'containerAction': {'executionRoleArn': 'string', 'image': 'string', 'resourceConfiguration': {'computeType': 'ACU_1 | ACU_2', 'volumeSizeInGB': 'integer'}, 'variables': [{'datasetContentVersionValue': {'datasetName': 'string'}, 'doubleValue': 'double', 'name': 'string', 'outputFileUriValue': {'fileName': 'string'}, 'stringValue': 'string'}]}, 'queryAction': {'filters': [{'deltaTime': {'offsetSeconds': 'integer', 'timeExpression': 'string'}}]}}, 'retentionPeriod': {'numberOfDays': 'integer', 'unlimited': 'boolean'}, 'triggers': {'dataset': {'name': 'string'}}}
Response
{'retentionPeriod': {'numberOfDays': 'integer', 'unlimited': 'boolean'}}
Creates a data set. A data set stores data retrieved from a data store by applying a "queryAction" (a SQL query) or a "containerAction" (executing a containerized application). This operation creates the skeleton of a data set. The data set can be populated manually by calling "CreateDatasetContent" or automatically according to a "trigger" you specify.
See also: AWS API Documentation
Request Syntax
client.create_dataset( datasetName='string', actions=[ { 'actionName': 'string', 'queryAction': { 'sqlQuery': 'string', 'filters': [ { 'deltaTime': { 'offsetSeconds': 123, 'timeExpression': 'string' } }, ] }, 'containerAction': { 'image': 'string', 'executionRoleArn': 'string', 'resourceConfiguration': { 'computeType': 'ACU_1'|'ACU_2', 'volumeSizeInGB': 123 }, 'variables': [ { 'name': 'string', 'stringValue': 'string', 'doubleValue': 123.0, 'datasetContentVersionValue': { 'datasetName': 'string' }, 'outputFileUriValue': { 'fileName': 'string' } }, ] } }, ], triggers=[ { 'schedule': { 'expression': 'string' }, 'dataset': { 'name': 'string' } }, ], retentionPeriod={ 'unlimited': True|False, 'numberOfDays': 123 }, tags=[ { 'key': 'string', 'value': 'string' }, ] )
datasetName (string) -- [REQUIRED]
The name of the data set.
actions (list) -- [REQUIRED]
A list of actions that create the data set contents.
(dict) --
A "DatasetAction" object specifying the query that creates the data set content.
actionName (string) --
The name of the data set action by which data set contents are automatically created.
queryAction (dict) --
An "SqlQueryDatasetAction" object that contains the SQL query to modify the message.
sqlQuery (string) -- [REQUIRED]
A SQL query string.
filters (list) --
Pre-filters applied to message data.
(dict) --
Information which is used to filter message data, to segregate it according to the time frame in which it arrives.
deltaTime (dict) --
Used to limit data to that which has arrived since the last execution of the action. When you create data set contents using message data from a specified time frame, some message data may still be "in flight" when processing begins, and so will not arrive in time to be processed. Use this field to make allowances for the "in flight" time of your message data, so that data not processed from a previous time frame will be included with the next time frame. Without this, missed message data would be excluded from processing during the next time frame as well, because its timestamp places it within the previous time frame.
offsetSeconds (integer) -- [REQUIRED]
The number of seconds of estimated "in flight" lag time of message data.
timeExpression (string) -- [REQUIRED]
An expression by which the time of the message data may be determined. This may be the name of a timestamp field, or a SQL expression which is used to derive the time the message data was generated.
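As a rough illustration of how a "deltaTime" pre-filter fits into a "queryAction", the sketch below builds the nested structure as plain Python dictionaries. The helper names, the data store name, the timestamp field, and the offset value are all hypothetical; the text above does not state a sign convention for "offsetSeconds", so the number is only a placeholder.

```python
def delta_time_filter(offset_seconds, time_expression):
    """Build one entry for queryAction['filters'] (helper name is illustrative)."""
    if isinstance(offset_seconds, bool) or not isinstance(offset_seconds, int):
        raise TypeError("offsetSeconds must be an integer")
    if not time_expression:
        raise ValueError("timeExpression is required")
    return {
        "deltaTime": {
            "offsetSeconds": offset_seconds,
            "timeExpression": time_expression,
        }
    }


def query_action(sql_query, filters=None):
    """Build a queryAction dict; 'filters' is omitted when none are given."""
    action = {"sqlQuery": sql_query}
    if filters:
        action["filters"] = list(filters)
    return action


# Hypothetical data store and timestamp field; placeholder offset value.
example = query_action(
    "SELECT * FROM my_datastore",
    [delta_time_filter(60, "from_unixtime(ts)")],
)
```

The resulting dictionary can be used as the "queryAction" value of one entry in the "actions" list.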
containerAction (dict) --
Information which allows the system to run a containerized application in order to create the data set contents. The application must be in a Docker container along with any needed support libraries.
image (string) -- [REQUIRED]
The ARN of the Docker container stored in your account. The Docker container contains an application and needed support libraries and is used to generate data set contents.
executionRoleArn (string) -- [REQUIRED]
The ARN of the role which gives permission to the system to access needed resources in order to run the "containerAction". This includes, at minimum, permission to retrieve the data set contents which are the input to the containerized application.
resourceConfiguration (dict) -- [REQUIRED]
Configuration of the resource which executes the "containerAction".
computeType (string) -- [REQUIRED]
The type of the compute resource used to execute the "containerAction". Possible values are: ACU_1 (vCPU=4, memory=16GiB) or ACU_2 (vCPU=8, memory=32GiB).
volumeSizeInGB (integer) -- [REQUIRED]
The size (in GB) of the persistent storage available to the resource instance used to execute the "containerAction" (min: 1, max: 50).
variables (list) --
The values of variables used within the context of the execution of the containerized application (basically, parameters passed to the application). Each variable must have a name and a value given by one of "stringValue", "datasetContentVersionValue", or "outputFileUriValue".
(dict) --
An instance of a variable to be passed to the "containerAction" execution. Each variable must have a name and a value given by one of "stringValue", "datasetContentVersionValue", or "outputFileUriValue".
name (string) -- [REQUIRED]
The name of the variable.
stringValue (string) --
The value of the variable as a string.
doubleValue (float) --
The value of the variable as a double (numeric).
datasetContentVersionValue (dict) --
The value of the variable as a structure that specifies a data set content version.
datasetName (string) -- [REQUIRED]
The name of the data set whose latest contents will be used as input to the notebook or application.
outputFileUriValue (dict) --
The value of the variable as a structure that specifies an output file URI.
fileName (string) -- [REQUIRED]
The URI of the location where data set contents are stored, usually the URI of a file in an S3 bucket.
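The "containerAction" constraints described above (two compute types, a volume size between 1 and 50 GB, and variables that each carry a name plus a value) can be checked client-side before a request is sent. This is a minimal sketch, not the service's own validation: the helper name and placeholder ARNs are hypothetical, and accepting "doubleValue" as a value field and requiring exactly one value field per variable are assumptions about how to read "one of".

```python
VALID_COMPUTE_TYPES = ("ACU_1", "ACU_2")
# doubleValue is a documented field; treating it as a valid value is an assumption.
VALUE_KEYS = {
    "stringValue",
    "doubleValue",
    "datasetContentVersionValue",
    "outputFileUriValue",
}


def container_action(image, execution_role_arn, compute_type, volume_size_gb, variables=()):
    """Build a containerAction dict after basic client-side checks."""
    if compute_type not in VALID_COMPUTE_TYPES:
        raise ValueError("computeType must be ACU_1 or ACU_2")
    if not 1 <= volume_size_gb <= 50:
        raise ValueError("volumeSizeInGB must be between 1 and 50")
    for variable in variables:
        value_keys = set(variable) - {"name"}
        if "name" not in variable or len(value_keys) != 1 or not value_keys <= VALUE_KEYS:
            raise ValueError("each variable needs a name and exactly one value field")
    action = {
        "image": image,
        "executionRoleArn": execution_role_arn,
        "resourceConfiguration": {
            "computeType": compute_type,
            "volumeSizeInGB": volume_size_gb,
        },
    }
    if variables:
        action["variables"] = list(variables)
    return action


# Placeholder identifiers; substitute real values from your own account.
action = container_action(
    image="IMAGE_ARN_PLACEHOLDER",
    execution_role_arn="ROLE_ARN_PLACEHOLDER",
    compute_type="ACU_1",
    volume_size_gb=2,
    variables=[
        {"name": "threshold", "doubleValue": 0.75},
        {"name": "source", "datasetContentVersionValue": {"datasetName": "raw_readings"}},
        {"name": "report", "outputFileUriValue": {"fileName": "report.csv"}},
    ],
)
```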
triggers (list) --
A list of triggers. A trigger causes data set contents to be populated at a specified time interval or when another data set's contents are created. The list of triggers can be empty or contain up to five "DatasetTrigger" objects.
(dict) --
The "DatasetTrigger" that specifies when the data set is automatically updated.
schedule (dict) --
The "Schedule" when the trigger is initiated.
expression (string) --
The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch documentation.
dataset (dict) --
The data set whose content creation will trigger the creation of this data set's contents.
name (string) -- [REQUIRED]
The name of the data set whose content generation will trigger the new data set content generation.
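The two trigger kinds can be sketched as small dictionary builders, with a guard for the five-trigger limit mentioned above. The helper names and the upstream data set name are hypothetical, and the cron string is only an example of the CloudWatch schedule expression syntax.

```python
MAX_TRIGGERS = 5  # limit stated in the triggers description


def schedule_trigger(expression):
    """Trigger that fires on a schedule expression."""
    return {"schedule": {"expression": expression}}


def dataset_trigger(name):
    """Trigger that fires when another data set's contents are created."""
    return {"dataset": {"name": name}}


def build_triggers(*triggers):
    """Return the triggers as a list, rejecting more than five entries."""
    if len(triggers) > MAX_TRIGGERS:
        raise ValueError("at most %d triggers are allowed" % MAX_TRIGGERS)
    return list(triggers)


triggers = build_triggers(
    schedule_trigger("cron(0 12 * * ? *)"),  # example: every day at 12:00 UTC
    dataset_trigger("upstream_dataset"),     # hypothetical data set name
)
```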
retentionPeriod (dict) --
[Optional] How long, in days, message data is kept for the data set. If not given or set to null, the latest version of the dataset content plus the latest succeeded version (if they are different) are retained for at most 90 days.
unlimited (boolean) --
If true, message data is kept indefinitely.
numberOfDays (integer) --
The number of days that message data is kept. The "unlimited" parameter must be false.
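Since "numberOfDays" is only meaningful when "unlimited" is false, one way to avoid inconsistent combinations is to derive both fields from a single argument. The helper below is a hypothetical sketch; the lower bound of one day is an assumption, not something stated above.

```python
def retention_period(number_of_days=None):
    """Build a retentionPeriod dict.

    None means "keep indefinitely"; otherwise a positive day count is expected.
    """
    if number_of_days is None:
        return {"unlimited": True}
    if number_of_days < 1:  # assumed minimum, not taken from the reference text
        raise ValueError("numberOfDays must be at least 1")
    return {"unlimited": False, "numberOfDays": number_of_days}


keep_forever = retention_period()
keep_month = retention_period(30)
```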
tags (list) --
Metadata which can be used to manage the data set.
(dict) --
A set of key/value pairs which are used to manage the resource.
key (string) -- [REQUIRED]
The tag's key.
value (string) -- [REQUIRED]
The tag's value.
dict
Response Syntax
{ 'datasetName': 'string', 'datasetArn': 'string', 'retentionPeriod': { 'unlimited': True|False, 'numberOfDays': 123 } }
Response Structure
(dict) --
datasetName (string) --
The name of the data set.
datasetArn (string) --
The ARN of the data set.
retentionPeriod (dict) --
How long, in days, message data is kept for the data set.
unlimited (boolean) --
If true, message data is kept indefinitely.
numberOfDays (integer) --
The number of days that message data is kept. The "unlimited" parameter must be false.
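Putting the request pieces together, a call to "create_dataset" might look like the sketch below. The wrapper function, action naming scheme, data set name, and SQL text are hypothetical. A small stand-in client is used so the example runs without AWS credentials; in real use the client would come from `boto3.client("iotanalytics")`.

```python
def create_sql_dataset(client, dataset_name, sql_query, cron_expression=None, tags=None):
    """Call create_dataset with a single queryAction and return the response."""
    request = {
        "datasetName": dataset_name,
        "actions": [
            {
                "actionName": dataset_name + "_query",  # illustrative naming scheme
                "queryAction": {"sqlQuery": sql_query},
            }
        ],
    }
    if cron_expression is not None:
        request["triggers"] = [{"schedule": {"expression": cron_expression}}]
    if tags:
        request["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    return client.create_dataset(**request)


class RecordingClient:
    """Offline stand-in for the real client; it only records the request."""

    def __init__(self):
        self.calls = []

    def create_dataset(self, **kwargs):
        self.calls.append(kwargs)
        name = kwargs["datasetName"]
        # Placeholder ARN, not a real ARN format.
        return {"datasetName": name, "datasetArn": "arn:placeholder:" + name}


client = RecordingClient()
response = create_sql_dataset(
    client,
    "daily_temps",
    "SELECT * FROM my_datastore",
    cron_expression="cron(0 12 * * ? *)",
    tags={"team": "analytics"},
)
```

Because "triggers", "retentionPeriod", and "tags" are not marked required, the wrapper only adds them when a value is supplied.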
{'dataset': {'actions': {'containerAction': {'executionRoleArn': 'string', 'image': 'string', 'resourceConfiguration': {'computeType': 'ACU_1 | ACU_2', 'volumeSizeInGB': 'integer'}, 'variables': [{'datasetContentVersionValue': {'datasetName': 'string'}, 'doubleValue': 'double', 'name': 'string', 'outputFileUriValue': {'fileName': 'string'}, 'stringValue': 'string'}]}, 'queryAction': {'filters': [{'deltaTime': {'offsetSeconds': 'integer', 'timeExpression': 'string'}}]}}, 'retentionPeriod': {'numberOfDays': 'integer', 'unlimited': 'boolean'}, 'triggers': {'dataset': {'name': 'string'}}}}
Retrieves information about a data set.
See also: AWS API Documentation
Request Syntax
client.describe_dataset( datasetName='string' )
datasetName (string) -- [REQUIRED]
The name of the data set whose information is retrieved.
dict
Response Syntax
{ 'dataset': { 'name': 'string', 'arn': 'string', 'actions': [ { 'actionName': 'string', 'queryAction': { 'sqlQuery': 'string', 'filters': [ { 'deltaTime': { 'offsetSeconds': 123, 'timeExpression': 'string' } }, ] }, 'containerAction': { 'image': 'string', 'executionRoleArn': 'string', 'resourceConfiguration': { 'computeType': 'ACU_1'|'ACU_2', 'volumeSizeInGB': 123 }, 'variables': [ { 'name': 'string', 'stringValue': 'string', 'doubleValue': 123.0, 'datasetContentVersionValue': { 'datasetName': 'string' }, 'outputFileUriValue': { 'fileName': 'string' } }, ] } }, ], 'triggers': [ { 'schedule': { 'expression': 'string' }, 'dataset': { 'name': 'string' } }, ], 'status': 'CREATING'|'ACTIVE'|'DELETING', 'creationTime': datetime(2015, 1, 1), 'lastUpdateTime': datetime(2015, 1, 1), 'retentionPeriod': { 'unlimited': True|False, 'numberOfDays': 123 } } }
Response Structure
(dict) --
dataset (dict) --
An object that contains information about the data set.
name (string) --
The name of the data set.
arn (string) --
The ARN of the data set.
actions (list) --
The "DatasetAction" objects that automatically create the data set contents.
(dict) --
A "DatasetAction" object specifying the query that creates the data set content.
actionName (string) --
The name of the data set action by which data set contents are automatically created.
queryAction (dict) --
An "SqlQueryDatasetAction" object that contains the SQL query to modify the message.
sqlQuery (string) --
A SQL query string.
filters (list) --
Pre-filters applied to message data.
(dict) --
Information which is used to filter message data, to segregate it according to the time frame in which it arrives.
deltaTime (dict) --
Used to limit data to that which has arrived since the last execution of the action. When you create data set contents using message data from a specified time frame, some message data may still be "in flight" when processing begins, and so will not arrive in time to be processed. Use this field to make allowances for the "in flight" time of your message data, so that data not processed from a previous time frame will be included with the next time frame. Without this, missed message data would be excluded from processing during the next time frame as well, because its timestamp places it within the previous time frame.
offsetSeconds (integer) --
The number of seconds of estimated "in flight" lag time of message data.
timeExpression (string) --
An expression by which the time of the message data may be determined. This may be the name of a timestamp field, or a SQL expression which is used to derive the time the message data was generated.
containerAction (dict) --
Information which allows the system to run a containerized application in order to create the data set contents. The application must be in a Docker container along with any needed support libraries.
image (string) --
The ARN of the Docker container stored in your account. The Docker container contains an application and needed support libraries and is used to generate data set contents.
executionRoleArn (string) --
The ARN of the role which gives permission to the system to access needed resources in order to run the "containerAction". This includes, at minimum, permission to retrieve the data set contents which are the input to the containerized application.
resourceConfiguration (dict) --
Configuration of the resource which executes the "containerAction".
computeType (string) --
The type of the compute resource used to execute the "containerAction". Possible values are: ACU_1 (vCPU=4, memory=16GiB) or ACU_2 (vCPU=8, memory=32GiB).
volumeSizeInGB (integer) --
The size (in GB) of the persistent storage available to the resource instance used to execute the "containerAction" (min: 1, max: 50).
variables (list) --
The values of variables used within the context of the execution of the containerized application (basically, parameters passed to the application). Each variable must have a name and a value given by one of "stringValue", "datasetContentVersionValue", or "outputFileUriValue".
(dict) --
An instance of a variable to be passed to the "containerAction" execution. Each variable must have a name and a value given by one of "stringValue", "datasetContentVersionValue", or "outputFileUriValue".
name (string) --
The name of the variable.
stringValue (string) --
The value of the variable as a string.
doubleValue (float) --
The value of the variable as a double (numeric).
datasetContentVersionValue (dict) --
The value of the variable as a structure that specifies a data set content version.
datasetName (string) --
The name of the data set whose latest contents will be used as input to the notebook or application.
outputFileUriValue (dict) --
The value of the variable as a structure that specifies an output file URI.
fileName (string) --
The URI of the location where data set contents are stored, usually the URI of a file in an S3 bucket.
triggers (list) --
The "DatasetTrigger" objects that specify when the data set is automatically updated.
(dict) --
The "DatasetTrigger" that specifies when the data set is automatically updated.
schedule (dict) --
The "Schedule" when the trigger is initiated.
expression (string) --
The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch documentation.
dataset (dict) --
The data set whose content creation will trigger the creation of this data set's contents.
name (string) --
The name of the data set whose content generation will trigger the new data set content generation.
status (string) --
The status of the data set.
creationTime (datetime) --
When the data set was created.
lastUpdateTime (datetime) --
The last time the data set was updated.
retentionPeriod (dict) --
[Optional] How long, in days, message data is kept for the data set.
unlimited (boolean) --
If true, message data is kept indefinitely.
numberOfDays (integer) --
The number of days that message data is kept. The "unlimited" parameter must be false.
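A "describe_dataset" response is deeply nested, so a small helper that flattens the parts of interest can be convenient. The function below is a hypothetical sketch that works on the response dictionary alone; the sample response uses made-up names and placeholder values.

```python
def summarize_dataset(response):
    """Reduce a describe_dataset response to a few fields of interest."""
    dataset = response["dataset"]
    actions = []
    for action in dataset.get("actions", []):
        if "containerAction" in action:
            kind = "CONTAINER"
        elif "queryAction" in action:
            kind = "QUERY"
        else:
            kind = "UNKNOWN"
        actions.append((action.get("actionName"), kind))
    retention = dataset.get("retentionPeriod") or {}
    if retention.get("unlimited"):
        kept = "indefinitely"
    elif "numberOfDays" in retention:
        kept = "%d days" % retention["numberOfDays"]
    else:
        kept = "not specified"
    return {
        "name": dataset.get("name"),
        "status": dataset.get("status"),
        "actions": actions,
        "retention": kept,
    }


# Made-up response in the documented shape.
sample = {
    "dataset": {
        "name": "daily_temps",
        "status": "ACTIVE",
        "actions": [
            {"actionName": "q1", "queryAction": {"sqlQuery": "SELECT * FROM my_datastore"}},
        ],
        "retentionPeriod": {"unlimited": False, "numberOfDays": 30},
    }
}
summary = summarize_dataset(sample)
```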
{'datasetSummaries': {'actions': [{'actionName': 'string', 'actionType': 'QUERY | CONTAINER'}], 'triggers': [{'dataset': {'name': 'string'}, 'schedule': {'expression': 'string'}}]}}
Retrieves information about data sets.
See also: AWS API Documentation
Request Syntax
client.list_datasets( nextToken='string', maxResults=123 )
nextToken (string) --
The token for the next set of results.
maxResults (integer) --
The maximum number of results to return in this request.
The default value is 100.
dict
Response Syntax
{ 'datasetSummaries': [ { 'datasetName': 'string', 'status': 'CREATING'|'ACTIVE'|'DELETING', 'creationTime': datetime(2015, 1, 1), 'lastUpdateTime': datetime(2015, 1, 1), 'triggers': [ { 'schedule': { 'expression': 'string' }, 'dataset': { 'name': 'string' } }, ], 'actions': [ { 'actionName': 'string', 'actionType': 'QUERY'|'CONTAINER' }, ] }, ], 'nextToken': 'string' }
Response Structure
(dict) --
datasetSummaries (list) --
A list of "DatasetSummary" objects.
(dict) --
A summary of information about a data set.
datasetName (string) --
The name of the data set.
status (string) --
The status of the data set.
creationTime (datetime) --
The time the data set was created.
lastUpdateTime (datetime) --
The last time the data set was updated.
triggers (list) --
A list of triggers. A trigger causes data set content to be populated at a specified time interval or when another data set is populated. The list of triggers can be empty or contain up to five "DatasetTrigger" objects.
(dict) --
The "DatasetTrigger" that specifies when the data set is automatically updated.
schedule (dict) --
The "Schedule" when the trigger is initiated.
expression (string) --
The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch documentation.
dataset (dict) --
The data set whose content creation will trigger the creation of this data set's contents.
name (string) --
The name of the data set whose content generation will trigger the new data set content generation.
actions (list) --
A list of "DatasetActionSummary" objects.
(dict) --
actionName (string) --
The name of the action which automatically creates the data set's contents.
actionType (string) --
The type of action by which the data set's contents are automatically created.
nextToken (string) --
The token to retrieve the next set of results, or null if there are no more results.
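Because results are paged through "nextToken", listing every data set means calling "list_datasets" repeatedly until no token is returned. The generator below is a minimal sketch of that loop. The function name is hypothetical, and the `PagedClient` class is only an offline stand-in for `boto3.client("iotanalytics")` so that the example runs without credentials.

```python
def iter_dataset_summaries(client, page_size=100):
    """Yield every DatasetSummary, following nextToken until it is absent."""
    kwargs = {"maxResults": page_size}
    while True:
        page = client.list_datasets(**kwargs)
        yield from page.get("datasetSummaries", [])
        token = page.get("nextToken")
        if not token:
            return
        kwargs["nextToken"] = token


class PagedClient:
    """Stand-in client that serves pre-built pages; tokens are page indexes."""

    def __init__(self, pages):
        self._pages = pages

    def list_datasets(self, **kwargs):
        index = int(kwargs.get("nextToken", 0))
        page = {"datasetSummaries": self._pages[index]}
        if index + 1 < len(self._pages):
            page["nextToken"] = str(index + 1)
        return page


fake = PagedClient([
    [{"datasetName": "a"}, {"datasetName": "b"}],
    [{"datasetName": "c"}],
])
names = [summary["datasetName"] for summary in iter_dataset_summaries(fake)]
```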
{'actions': {'containerAction': {'executionRoleArn': 'string', 'image': 'string', 'resourceConfiguration': {'computeType': 'ACU_1 | ACU_2', 'volumeSizeInGB': 'integer'}, 'variables': [{'datasetContentVersionValue': {'datasetName': 'string'}, 'doubleValue': 'double', 'name': 'string', 'outputFileUriValue': {'fileName': 'string'}, 'stringValue': 'string'}]}, 'queryAction': {'filters': [{'deltaTime': {'offsetSeconds': 'integer', 'timeExpression': 'string'}}]}}, 'retentionPeriod': {'numberOfDays': 'integer', 'unlimited': 'boolean'}, 'triggers': {'dataset': {'name': 'string'}}}
Updates the settings of a data set.
See also: AWS API Documentation
Request Syntax
client.update_dataset( datasetName='string', actions=[ { 'actionName': 'string', 'queryAction': { 'sqlQuery': 'string', 'filters': [ { 'deltaTime': { 'offsetSeconds': 123, 'timeExpression': 'string' } }, ] }, 'containerAction': { 'image': 'string', 'executionRoleArn': 'string', 'resourceConfiguration': { 'computeType': 'ACU_1'|'ACU_2', 'volumeSizeInGB': 123 }, 'variables': [ { 'name': 'string', 'stringValue': 'string', 'doubleValue': 123.0, 'datasetContentVersionValue': { 'datasetName': 'string' }, 'outputFileUriValue': { 'fileName': 'string' } }, ] } }, ], triggers=[ { 'schedule': { 'expression': 'string' }, 'dataset': { 'name': 'string' } }, ], retentionPeriod={ 'unlimited': True|False, 'numberOfDays': 123 } )
datasetName (string) -- [REQUIRED]
The name of the data set to update.
actions (list) -- [REQUIRED]
A list of "DatasetAction" objects.
(dict) --
A "DatasetAction" object specifying the query that creates the data set content.
actionName (string) --
The name of the data set action by which data set contents are automatically created.
queryAction (dict) --
An "SqlQueryDatasetAction" object that contains the SQL query to modify the message.
sqlQuery (string) -- [REQUIRED]
A SQL query string.
filters (list) --
Pre-filters applied to message data.
(dict) --
Information which is used to filter message data, to segregate it according to the time frame in which it arrives.
deltaTime (dict) --
Used to limit data to that which has arrived since the last execution of the action. When you create data set contents using message data from a specified time frame, some message data may still be "in flight" when processing begins, and so will not arrive in time to be processed. Use this field to make allowances for the "in flight" time of your message data, so that data not processed from a previous time frame will be included with the next time frame. Without this, missed message data would be excluded from processing during the next time frame as well, because its timestamp places it within the previous time frame.
offsetSeconds (integer) -- [REQUIRED]
The number of seconds of estimated "in flight" lag time of message data.
timeExpression (string) -- [REQUIRED]
An expression by which the time of the message data may be determined. This may be the name of a timestamp field, or a SQL expression which is used to derive the time the message data was generated.
containerAction (dict) --
Information which allows the system to run a containerized application in order to create the data set contents. The application must be in a Docker container along with any needed support libraries.
image (string) -- [REQUIRED]
The ARN of the Docker container stored in your account. The Docker container contains an application and needed support libraries and is used to generate data set contents.
executionRoleArn (string) -- [REQUIRED]
The ARN of the role which gives permission to the system to access needed resources in order to run the "containerAction". This includes, at minimum, permission to retrieve the data set contents which are the input to the containerized application.
resourceConfiguration (dict) -- [REQUIRED]
Configuration of the resource which executes the "containerAction".
computeType (string) -- [REQUIRED]
The type of the compute resource used to execute the "containerAction". Possible values are: ACU_1 (vCPU=4, memory=16GiB) or ACU_2 (vCPU=8, memory=32GiB).
volumeSizeInGB (integer) -- [REQUIRED]
The size (in GB) of the persistent storage available to the resource instance used to execute the "containerAction" (min: 1, max: 50).
variables (list) --
The values of variables used within the context of the execution of the containerized application (basically, parameters passed to the application). Each variable must have a name and a value given by one of "stringValue", "datasetContentVersionValue", or "outputFileUriValue".
(dict) --
An instance of a variable to be passed to the "containerAction" execution. Each variable must have a name and a value given by one of "stringValue", "datasetContentVersionValue", or "outputFileUriValue".
name (string) -- [REQUIRED]
The name of the variable.
stringValue (string) --
The value of the variable as a string.
doubleValue (float) --
The value of the variable as a double (numeric).
datasetContentVersionValue (dict) --
The value of the variable as a structure that specifies a data set content version.
datasetName (string) -- [REQUIRED]
The name of the data set whose latest contents will be used as input to the notebook or application.
outputFileUriValue (dict) --
The value of the variable as a structure that specifies an output file URI.
fileName (string) -- [REQUIRED]
The URI of the location where data set contents are stored, usually the URI of a file in an S3 bucket.
triggers (list) --
A list of "DatasetTrigger" objects. The list can be empty or can contain up to five "DatasetTrigger" objects.
(dict) --
The "DatasetTrigger" that specifies when the data set is automatically updated.
schedule (dict) --
The "Schedule" when the trigger is initiated.
expression (string) --
The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch documentation.
dataset (dict) --
The data set whose content creation will trigger the creation of this data set's contents.
name (string) -- [REQUIRED]
The name of the data set whose content generation will trigger the new data set content generation.
retentionPeriod (dict) --
How long, in days, message data is kept for the data set.
unlimited (boolean) --
If true, message data is kept indefinitely.
numberOfDays (integer) --
The number of days that message data is kept. The "unlimited" parameter must be false.
None
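Since "actions" is required on "update_dataset", changing a single setting such as the retention period still means resending the action list. One possible approach, sketched below, is to start from a "describe_dataset" response and copy the existing actions and triggers into the update request. The helper name and sample data are hypothetical, and the assumption that the existing values should be resent unchanged is not spelled out in the text above.

```python
import copy


def build_update_request(describe_response, retention_days):
    """Build update_dataset keyword arguments that only change retention."""
    dataset = describe_response["dataset"]
    request = {
        "datasetName": dataset["name"],
        # Deep copies keep the original response untouched.
        "actions": copy.deepcopy(dataset["actions"]),
        "retentionPeriod": {"unlimited": False, "numberOfDays": retention_days},
    }
    if dataset.get("triggers"):
        request["triggers"] = copy.deepcopy(dataset["triggers"])
    return request


# Made-up describe_dataset response in the documented shape.
described = {
    "dataset": {
        "name": "daily_temps",
        "arn": "ARN_PLACEHOLDER",
        "actions": [
            {"actionName": "q1", "queryAction": {"sqlQuery": "SELECT * FROM my_datastore"}},
        ],
        "triggers": [{"schedule": {"expression": "cron(0 12 * * ? *)"}}],
        "status": "ACTIVE",
        "retentionPeriod": {"unlimited": True},
    }
}
update_kwargs = build_update_request(described, 14)
# With a real client: client.update_dataset(**update_kwargs)
```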