2023/08/30 - Amazon NeptuneData - 43 new api methods
Changes Allows customers to execute data plane actions like bulk loading graphs, issuing graph queries using Gremlin and openCypher directly from the SDK.
Cancels a Gremlin query. See Gremlin query cancellation for more information.
See also: AWS API Documentation
Request Syntax
client.cancel_gremlin_query( queryId='string' )
string
[REQUIRED]
The unique identifier that identifies the query to be canceled.
dict
Response Syntax
{ 'status': 'string' }
Response Structure
(dict) --
status (string) --
The status of the cancellation.
Manages the generation and use of RDF graph statistics.
See also: AWS API Documentation
Request Syntax
client.manage_sparql_statistics( mode='disableAutoCompute'|'enableAutoCompute'|'refresh' )
string
The statistics generation mode. One of: DISABLE_AUTOCOMPUTE , ENABLE_AUTOCOMPUTE , or REFRESH , the last of which manually triggers DFE statistics generation.
dict
Response Syntax
{ 'status': 'string', 'payload': { 'statisticsId': 'string' } }
Response Structure
(dict) --
status (string) --
The HTTP return code of the request. If the request succeeded, the code is 200.
payload (dict) --
This is only returned for refresh mode.
statisticsId (string) --
The ID of the statistics generation run that is currently occurring.
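As a minimal sketch of the refresh mode described above, the helper below triggers statistics generation and returns the run ID. The function name `refresh_sparql_statistics` is hypothetical, and `client` is assumed to be a NeptuneData client (or any object exposing the same method); the payload is treated as optional because it is only returned for refresh mode.

```python
def refresh_sparql_statistics(client):
    """Trigger a manual statistics run and return (status, statisticsId).

    `client` is assumed to expose manage_sparql_statistics as documented
    above. statisticsId is None when no payload is returned.
    """
    response = client.manage_sparql_statistics(mode="refresh")
    payload = response.get("payload") or {}
    return response.get("status"), payload.get("statisticsId")
```

A client could be created with something like `boto3.client("neptunedata", endpoint_url=...)`, where the endpoint URL is a placeholder for your own cluster endpoint.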
Lists active Gremlin queries. See Gremlin query status API for details about the output.
See also: AWS API Documentation
Request Syntax
client.list_gremlin_queries( includeWaiting=True|False )
boolean
If set to TRUE , the list returned includes waiting queries. The default is FALSE .
dict
Response Syntax
{ 'acceptedQueryCount': 123, 'runningQueryCount': 123, 'queries': [ { 'queryId': 'string', 'queryString': 'string', 'queryEvalStats': { 'waited': 123, 'elapsed': 123, 'cancelled': True|False, 'subqueries': {} } }, ] }
Response Structure
(dict) --
acceptedQueryCount (integer) --
The number of queries that have been accepted but not yet completed, including queries in the queue.
runningQueryCount (integer) --
The number of Gremlin queries currently running.
queries (list) --
A list of the current queries.
(dict) --
Captures the status of a Gremlin query (see the Gremlin query status API page).
queryId (string) --
The ID of the Gremlin query.
queryString (string) --
The query string of the Gremlin query.
queryEvalStats (dict) --
The query statistics of the Gremlin query.
waited (integer) --
Indicates how long the query waited, in milliseconds.
elapsed (integer) --
The number of milliseconds the query has been running so far.
cancelled (boolean) --
Set to TRUE if the query was cancelled, or FALSE otherwise.
subqueries (dict) --
The number of subqueries in this query.
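The listing and cancellation calls can be combined. The sketch below cancels every running Gremlin query whose `elapsed` time exceeds a threshold; the helper name `cancel_slow_gremlin_queries` and the threshold parameter are illustrative assumptions, not part of the API.

```python
def cancel_slow_gremlin_queries(client, max_elapsed_ms):
    """Cancel listed Gremlin queries that have run longer than max_elapsed_ms.

    `client` is assumed to expose list_gremlin_queries and
    cancel_gremlin_query as documented above. Returns a dict mapping each
    cancelled queryId to the status string returned by the cancel call.
    """
    response = client.list_gremlin_queries(includeWaiting=False)
    results = {}
    for query in response.get("queries", []):
        stats = query.get("queryEvalStats", {})
        already_cancelled = stats.get("cancelled", False)
        if stats.get("elapsed", 0) > max_elapsed_ms and not already_cancelled:
            reply = client.cancel_gremlin_query(queryId=query["queryId"])
            results[query["queryId"]] = reply.get("status")
    return results
```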
Gets information about a specified model transform job. See Use a trained model to generate new model artifacts .
See also: AWS API Documentation
Request Syntax
client.get_ml_model_transform_job( id='string', neptuneIamRoleArn='string' )
string
[REQUIRED]
The unique identifier of the model-transform job to be retrieved.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'status': 'string', 'id': 'string', 'baseProcessingJob': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' }, 'remoteModelTransformJob': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' }, 'models': [ { 'name': 'string', 'arn': 'string' }, ] }
Response Structure
(dict) --
status (string) --
The status of the model-transform job.
id (string) --
The unique identifier of the model-transform job to be retrieved.
baseProcessingJob (dict) --
The base data processing job.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
remoteModelTransformJob (dict) --
The remote model transform job.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
models (list) --
A list of the configuration information for the models being used.
(dict) --
Contains a Neptune ML configuration.
name (string) --
The configuration name.
arn (string) --
The ARN for the configuration.
Lists active openCypher queries. See Neptune openCypher status endpoint for more information.
See also: AWS API Documentation
Request Syntax
client.list_open_cypher_queries( includeWaiting=True|False )
boolean
When set to TRUE and other parameters are not present, causes status information to be returned for waiting queries as well as for running queries.
dict
Response Syntax
{ 'acceptedQueryCount': 123, 'runningQueryCount': 123, 'queries': [ { 'queryId': 'string', 'queryString': 'string', 'queryEvalStats': { 'waited': 123, 'elapsed': 123, 'cancelled': True|False, 'subqueries': {} } }, ] }
Response Structure
(dict) --
acceptedQueryCount (integer) --
The number of queries that have been accepted but not yet completed, including queries in the queue.
runningQueryCount (integer) --
The number of currently running openCypher queries.
queries (list) --
A list of current openCypher queries.
(dict) --
Captures the status of a Gremlin query (see the Gremlin query status API page).
queryId (string) --
The ID of the Gremlin query.
queryString (string) --
The query string of the Gremlin query.
queryEvalStats (dict) --
The query statistics of the Gremlin query.
waited (integer) --
Indicates how long the query waited, in milliseconds.
elapsed (integer) --
The number of milliseconds the query has been running so far.
cancelled (boolean) --
Set to TRUE if the query was cancelled, or FALSE otherwise.
subqueries (dict) --
The number of subqueries in this query.
Cancels a Neptune ML model training job. See Model training using the modeltraining command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-modeltraining.html).
See also: AWS API Documentation
Request Syntax
client.cancel_ml_model_training_job( id='string', neptuneIamRoleArn='string', clean=True|False )
string
[REQUIRED]
The unique identifier of the model-training job to be canceled.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
boolean
If set to TRUE , this flag specifies that all Amazon S3 artifacts should be deleted when the job is stopped. The default is FALSE .
dict
Response Syntax
{ 'status': 'string' }
Response Structure
(dict) --
status (string) --
The status of the cancellation.
Retrieves details about an inference endpoint. See Managing inference endpoints using the endpoints command .
See also: AWS API Documentation
Request Syntax
client.get_ml_endpoint( id='string', neptuneIamRoleArn='string' )
string
[REQUIRED]
The unique identifier of the inference endpoint.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'status': 'string', 'id': 'string', 'endpoint': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' }, 'endpointConfig': { 'name': 'string', 'arn': 'string' } }
Response Structure
(dict) --
status (string) --
The status of the inference endpoint.
id (string) --
The unique identifier of the inference endpoint.
endpoint (dict) --
The endpoint definition.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
endpointConfig (dict) --
The endpoint configuration.
name (string) --
The configuration name.
arn (string) --
The ARN for the configuration.
Cancels the creation of a Neptune ML inference endpoint. See Managing inference endpoints using the endpoints command .
See also: AWS API Documentation
Request Syntax
client.delete_ml_endpoint( id='string', neptuneIamRoleArn='string', clean=True|False )
string
[REQUIRED]
The unique identifier of the inference endpoint.
string
The ARN of an IAM role providing Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will be thrown.
boolean
If this flag is set to TRUE , all Neptune ML S3 artifacts should be deleted when the job is stopped. The default is FALSE .
dict
Response Syntax
{ 'status': 'string' }
Response Structure
(dict) --
status (string) --
The status of the cancellation.
Starts a Neptune bulk loader job to load data from an Amazon S3 bucket into a Neptune DB instance. See Using the Amazon Neptune Bulk Loader to Ingest Data .
See also: AWS API Documentation
Request Syntax
client.start_loader_job( source='string', format='csv'|'opencypher'|'ntriples'|'nquads'|'rdfxml'|'turtle', s3BucketRegion='us-east-1'|'us-east-2'|'us-west-1'|'us-west-2'|'ca-central-1'|'sa-east-1'|'eu-north-1'|'eu-west-1'|'eu-west-2'|'eu-west-3'|'eu-central-1'|'me-south-1'|'af-south-1'|'ap-east-1'|'ap-northeast-1'|'ap-northeast-2'|'ap-southeast-1'|'ap-southeast-2'|'ap-south-1'|'cn-north-1'|'cn-northwest-1'|'us-gov-west-1'|'us-gov-east-1', iamRoleArn='string', mode='RESUME'|'NEW'|'AUTO', failOnError=True|False, parallelism='LOW'|'MEDIUM'|'HIGH'|'OVERSUBSCRIBE', parserConfiguration={ 'string': 'string' }, updateSingleCardinalityProperties=True|False, queueRequest=True|False, dependencies=[ 'string', ], userProvidedEdgeIds=True|False )
string
[REQUIRED]
The source parameter accepts an S3 URI that identifies a single file, multiple files, a folder, or multiple folders. Neptune loads every data file in any folder that is specified.
The URI can be in any of the following formats.
s3://(bucket_name)/(object-key-name)
https://s3.amazonaws.com/(bucket_name)/(object-key-name)
https://s3.us-east-1.amazonaws.com/(bucket_name)/(object-key-name)
The object-key-name element of the URI is equivalent to the prefix parameter in an S3 ListObjects API call. It identifies all the objects in the specified S3 bucket whose names begin with that prefix. That can be a single file or folder, or multiple files and/or folders.
The specified folder or folders can contain multiple vertex files and multiple edge files.
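To illustrate the three URI formats listed above, here is a small sketch that splits a source URI into bucket name and object-key prefix. The helper `split_loader_source` is hypothetical and only recognizes the forms shown in this section; it is not how the loader itself parses the value.

```python
from urllib.parse import urlparse


def split_loader_source(source):
    """Split a loader source URI into (bucket_name, object_key_prefix).

    Handles only the formats listed above:
      s3://bucket/key
      https://s3.amazonaws.com/bucket/key
      https://s3.<region>.amazonaws.com/bucket/key
    """
    parsed = urlparse(source)
    if parsed.scheme == "s3":
        return parsed.netloc, parsed.path.lstrip("/")
    host = parsed.netloc
    if parsed.scheme == "https" and host.startswith("s3.") and host.endswith("amazonaws.com"):
        # Path-style URL: the first path segment is the bucket name.
        bucket, _, prefix = parsed.path.lstrip("/").partition("/")
        return bucket, prefix
    raise ValueError("unrecognized loader source URI: %r" % source)
```

The returned prefix plays the role of the prefix parameter in an S3 ListObjects call, so `data/` would match every object whose key begins with `data/`.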
string
[REQUIRED]
The format of the data. For more information about data formats for the Neptune Loader command, see Load Data Formats .
Allowed values
csv for the Gremlin CSV data format.
opencypher for the openCypher CSV data format.
ntriples for the N-Triples RDF data format.
nquads for the N-Quads RDF data format.
rdfxml for the RDFXML RDF data format.
turtle for the Turtle RDF data format.
string
[REQUIRED]
The Amazon region of the S3 bucket. This must match the Amazon Region of the DB cluster.
string
[REQUIRED]
The Amazon Resource Name (ARN) for an IAM role to be assumed by the Neptune DB instance for access to the S3 bucket. The IAM role ARN provided here should be attached to the DB cluster (see Adding the IAM Role to an Amazon Neptune Cluster).
string
The load job mode.
Allowed values : RESUME , NEW , AUTO .
Default value : AUTO .
RESUME – In RESUME mode, the loader looks for a previous load from this source, and if it finds one, resumes that load job. If no previous load job is found, the loader stops. The loader avoids reloading files that were successfully loaded in a previous job. It only tries to process failed files. If you dropped previously loaded data from your Neptune cluster, that data is not reloaded in this mode. If a previous load job loaded all files from the same source successfully, nothing is reloaded, and the loader returns success.
NEW – In NEW mode, the loader creates a new load request regardless of any previous loads. You can use this mode to reload all the data from a source after dropping previously loaded data from your Neptune cluster, or to load new data available at the same source.
AUTO – In AUTO mode, the loader looks for a previous load job from the same source, and if it finds one, resumes that job, just as in RESUME mode. If the loader doesn't find a previous load job from the same source, it loads all data from the source, just as in NEW mode.
boolean
failOnError – A flag to toggle a complete stop on an error.
Allowed values : "TRUE" , "FALSE" .
Default value : "TRUE" .
When this parameter is set to "FALSE" , the loader tries to load all the data in the location specified, skipping any entries with errors.
When this parameter is set to "TRUE" , the loader stops as soon as it encounters an error. Data loaded up to that point persists.
string
The optional parallelism parameter can be set to reduce the number of threads used by the bulk load process.
Allowed values :
LOW – The number of threads used is the number of available vCPUs divided by 8.
MEDIUM – The number of threads used is the number of available vCPUs divided by 2.
HIGH – The number of threads used is the same as the number of available vCPUs.
OVERSUBSCRIBE – The number of threads used is the number of available vCPUs multiplied by 2. If this value is used, the bulk loader takes up all available resources. This does not mean, however, that the OVERSUBSCRIBE setting results in 100% CPU utilization. Because the load operation is I/O bound, the highest CPU utilization to expect is in the 60% to 70% range.
Default value : HIGH
The parallelism setting can sometimes result in a deadlock between threads when loading openCypher data. When this happens, Neptune returns the LOAD_DATA_DEADLOCK error. You can generally fix the issue by setting parallelism to a lower setting and retrying the load command.
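The thread counts above reduce to simple arithmetic on the vCPU count. The sketch below encodes them and also offers a step-down order for retrying after a LOAD_DATA_DEADLOCK error. Both helper names are hypothetical, and the rounding (integer division with a floor of one thread) is an assumption, since the text does not say how fractional results are handled.

```python
def loader_threads(parallelism, vcpus):
    """Approximate the loader thread count for a parallelism setting.

    Assumption: fractional results round down, with a minimum of 1 thread.
    """
    if parallelism == "LOW":
        threads = vcpus // 8
    elif parallelism == "MEDIUM":
        threads = vcpus // 2
    elif parallelism == "HIGH":
        threads = vcpus
    elif parallelism == "OVERSUBSCRIBE":
        threads = vcpus * 2
    else:
        raise ValueError("unknown parallelism: %r" % parallelism)
    return max(1, threads)


# Order used when stepping down after a deadlock; None means no lower setting.
_NEXT_LOWER = {"OVERSUBSCRIBE": "HIGH", "HIGH": "MEDIUM", "MEDIUM": "LOW"}


def next_lower_parallelism(current):
    """Return the next lower parallelism setting, or None if already LOW."""
    return _NEXT_LOWER.get(current)
```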
dict
parserConfiguration – An optional object with additional parser configuration values. Each of the child parameters is also optional:
namedGraphUri – The default graph for all RDF formats when no graph is specified (for non-quads formats and NQUAD entries with no graph). The default is https://aws.amazon.com/neptune/vocab/v01/DefaultNamedGraph .
baseUri – The base URI for RDF/XML and Turtle formats. The default is https://aws.amazon.com/neptune/default .
allowEmptyStrings – Gremlin users need to be able to pass empty string values ("") as node and edge properties when loading CSV data. If allowEmptyStrings is set to false (the default), such empty strings are treated as nulls and are not loaded. If allowEmptyStrings is set to true , the loader treats empty strings as valid property values and loads them accordingly.
(string) --
(string) --
boolean
updateSingleCardinalityProperties is an optional parameter that controls how the bulk loader treats a new value for single-cardinality vertex or edge properties. This is not supported for loading openCypher data.
Allowed values : "TRUE" , "FALSE" .
Default value : "FALSE" .
By default, or when updateSingleCardinalityProperties is explicitly set to "FALSE" , the loader treats a new value as an error, because it violates single cardinality.
When updateSingleCardinalityProperties is set to "TRUE" , on the other hand, the bulk loader replaces the existing value with the new one. If multiple edge or single-cardinality vertex property values are provided in the source file(s) being loaded, the final value at the end of the bulk load could be any one of those new values. The loader only guarantees that the existing value has been replaced by one of the new ones.
boolean
This is an optional flag parameter that indicates whether the load request can be queued up or not.
You don't have to wait for one load job to complete before issuing the next one, because Neptune can queue up as many as 64 jobs at a time, provided that their queueRequest parameters are all set to "TRUE" .
If the queueRequest parameter is omitted or set to "FALSE" , the load request will fail if another load job is already running.
Allowed values : "TRUE" , "FALSE" .
Default value : "FALSE" .
list
This is an optional parameter that can make a queued load request contingent on the successful completion of one or more previous jobs in the queue.
Neptune can queue up as many as 64 load requests at a time, if their queueRequest parameters are set to "TRUE" . The dependencies parameter lets you make execution of such a queued request dependent on the successful completion of one or more specified previous requests in the queue.
For example, if load Job-A and Job-B are independent of each other, but load Job-C needs Job-A and Job-B to be finished before it begins, proceed as follows:
Submit load-job-A and load-job-B one after another in any order, and save their load-ids.
Submit load-job-C with the load-ids of the two jobs in its dependencies field:
Because of the dependencies parameter, the bulk loader will not start Job-C until Job-A and Job-B have completed successfully. If either one of them fails, Job-C will not be executed, and its status will be set to LOAD_FAILED_BECAUSE_DEPENDENCY_NOT_SATISFIED .
You can set up multiple levels of dependency in this way, so that the failure of one job will cause all requests that are directly or indirectly dependent on it to be cancelled.
(string) --
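The Job-A, Job-B, Job-C scenario described above might be sketched as follows. The helper `submit_with_dependencies` and the `common` dictionary of shared arguments are illustrative assumptions; the sketch relies only on the `loadId` entry of the response payload and on the `queueRequest` and `dependencies` parameters documented in this section.

```python
def submit_with_dependencies(client, common, source_a, source_b, source_c):
    """Queue two independent load jobs, then a third that depends on both.

    `common` holds the arguments shared by all three jobs (for example
    format, s3BucketRegion and iamRoleArn). Returns a tuple of
    (load IDs of the first two jobs, load ID of the dependent job).
    """
    dependency_ids = []
    for source in (source_a, source_b):
        reply = client.start_loader_job(source=source, queueRequest=True, **common)
        dependency_ids.append(reply["payload"]["loadId"])

    # Job-C is queued too, but will not start until both dependencies succeed.
    reply_c = client.start_loader_job(
        source=source_c,
        queueRequest=True,
        dependencies=dependency_ids,
        **common
    )
    return dependency_ids, reply_c["payload"]["loadId"]
```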
boolean
This parameter is required only when loading openCypher data that contains relationship IDs. It must be included and set to True when openCypher relationship IDs are explicitly provided in the load data (recommended).
When userProvidedEdgeIds is absent or set to True , an :ID column must be present in every relationship file in the load.
When userProvidedEdgeIds is present and set to False , relationship files in the load must not contain an :ID column. Instead, the Neptune loader automatically generates an ID for each relationship.
It's useful to provide relationship IDs explicitly so that the loader can resume loading after errors in the CSV data have been fixed, without having to reload any relationships that have already been loaded. If relationship IDs have not been explicitly assigned, the loader cannot resume a failed load if any relationship file has had to be corrected, and must instead reload all the relationships.
dict
Response Syntax
{ 'status': 'string', 'payload': { 'string': 'string' } }
Response Structure
(dict) --
status (string) --
The HTTP return code indicating the status of the load job.
payload (dict) --
Contains a loadId name-value pair that provides an identifier for the load operation.
(string) --
(string) --
Retrieves information about a Neptune ML model training job. See Model training using the modeltraining command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-modeltraining.html).
See also: AWS API Documentation
Request Syntax
client.get_ml_model_training_job( id='string', neptuneIamRoleArn='string' )
string
[REQUIRED]
The unique identifier of the model-training job to retrieve.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'status': 'string', 'id': 'string', 'processingJob': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' }, 'hpoJob': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' }, 'modelTransformJob': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' }, 'mlModels': [ { 'name': 'string', 'arn': 'string' }, ] }
Response Structure
(dict) --
status (string) --
The status of the model training job.
id (string) --
The unique identifier of this model-training job.
processingJob (dict) --
The data processing job.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
hpoJob (dict) --
The HPO job.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
modelTransformJob (dict) --
The model transform job.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
mlModels (list) --
A list of the configurations of the ML models being used.
(dict) --
Contains a Neptune ML configuration.
name (string) --
The configuration name.
arn (string) --
The ARN for the configuration.
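Because a training job runs asynchronously, a caller will typically poll this method. The sketch below shows one way to do that. The helper name, the polling interval, and in particular the terminal status strings are assumptions: this section does not list the possible `status` values, so pass the ones that apply to your jobs.

```python
import time


def wait_for_training_job(client, job_id,
                          terminal=("Completed", "Failed", "Stopped"),
                          poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_ml_model_training_job until the job reaches a terminal status.

    The default `terminal` values are assumed, not taken from the API
    reference. `sleep` is injectable so the loop can be tested without
    waiting. Returns the last response, or raises TimeoutError.
    """
    for _ in range(max_polls):
        response = client.get_ml_model_training_job(id=job_id)
        if response.get("status") in terminal:
            return response
        sleep(poll_seconds)
    raise TimeoutError("job %s did not finish after %d polls" % (job_id, max_polls))
```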
Cancels a specified model transform job. See Use a trained model to generate new model artifacts .
See also: AWS API Documentation
Request Syntax
client.cancel_ml_model_transform_job( id='string', neptuneIamRoleArn='string', clean=True|False )
string
[REQUIRED]
The unique ID of the model transform job to be canceled.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
boolean
If this flag is set to TRUE , all Neptune ML S3 artifacts should be deleted when the job is stopped. The default is FALSE .
dict
Response Syntax
{ 'status': 'string' }
Response Structure
(dict) --
status (string) --
The status of the cancellation.
Gets property graph statistics (Gremlin and openCypher).
See also: AWS API Documentation
Request Syntax
client.get_propertygraph_statistics()
dict
Response Syntax
{ 'status': 'string', 'payload': { 'autoCompute': True|False, 'active': True|False, 'statisticsId': 'string', 'date': datetime(2015, 1, 1), 'note': 'string', 'signatureInfo': { 'signatureCount': 123, 'instanceCount': 123, 'predicateCount': 123 } } }
Response Structure
(dict) --
status (string) --
The HTTP return code of the request. If the request succeeded, the code is 200. See Common error codes for DFE statistics request for a list of common errors.
payload (dict) --
Statistics for property-graph data.
autoCompute (boolean) --
Indicates whether or not automatic statistics generation is enabled.
active (boolean) --
Indicates whether or not DFE statistics generation is enabled at all.
statisticsId (string) --
Reports the ID of the current statistics generation run. A value of -1 indicates that no statistics have been generated.
date (datetime) --
The UTC time at which DFE statistics have most recently been generated.
note (string) --
A note about problems in the case where statistics are invalid.
signatureInfo (dict) --
A StatisticsSummary structure that contains:
signatureCount - The total number of signatures across all characteristic sets.
instanceCount - The total number of characteristic-set instances.
predicateCount - The total number of unique predicates.
signatureCount (integer) --
The total number of signatures across all characteristic sets.
instanceCount (integer) --
The total number of characteristic-set instances.
predicateCount (integer) --
The total number of unique predicates.
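As a small sketch of how the payload fields above might be read, the helper below reports whether any statistics have been generated, using the documented convention that a `statisticsId` of -1 means none exist. The function name and the shape of the returned dictionary are illustrative assumptions.

```python
def propertygraph_statistics_summary(client):
    """Summarize the property graph statistics payload.

    `client` is assumed to expose get_propertygraph_statistics as
    documented above. A statisticsId of "-1" (or a missing ID) is treated
    as "no statistics generated".
    """
    response = client.get_propertygraph_statistics()
    payload = response.get("payload") or {}
    statistics_id = payload.get("statisticsId")
    signature_info = payload.get("signatureInfo") or {}
    return {
        "has_statistics": statistics_id not in (None, "-1"),
        "auto_compute": payload.get("autoCompute", False),
        "predicate_count": signature_info.get("predicateCount"),
    }
```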
Creates a new Neptune ML model training job. See Model training using the modeltraining command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-modeltraining.html).
See also: AWS API Documentation
Request Syntax
client.start_ml_model_training_job( id='string', previousModelTrainingJobId='string', dataProcessingJobId='string', trainModelS3Location='string', sagemakerIamRoleArn='string', neptuneIamRoleArn='string', baseProcessingInstanceType='string', trainingInstanceType='string', trainingInstanceVolumeSizeInGB=123, trainingTimeOutInSeconds=123, maxHPONumberOfTrainingJobs=123, maxHPOParallelTrainingJobs=123, subnets=[ 'string', ], securityGroupIds=[ 'string', ], volumeEncryptionKMSKey='string', s3OutputEncryptionKMSKey='string', enableManagedSpotTraining=True|False, customModelTrainingParameters={ 'sourceS3DirectoryPath': 'string', 'trainingEntryPointScript': 'string', 'transformEntryPointScript': 'string' } )
string
A unique identifier for the new job. The default is an autogenerated UUID.
string
The job ID of a completed model-training job that you want to update incrementally based on updated data.
string
[REQUIRED]
The job ID of the completed data-processing job that has created the data that the training will work with.
string
[REQUIRED]
The location in Amazon S3 where the model artifacts are to be stored.
string
The ARN of an IAM role for SageMaker execution. This must be listed in your DB cluster parameter group or an error will occur.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
string
The type of ML instance used in preparing and managing training of ML models. This is a CPU instance chosen based on memory requirements for processing the training data and model.
string
The type of ML instance used for model training. All Neptune ML models support CPU, GPU, and multiGPU training. The default is ml.p3.2xlarge . Choosing the right instance type for training depends on the task type, graph size, and your budget.
integer
The disk volume size of the training instance. Both input data and the output model are stored on disk, so the volume size must be large enough to hold both data sets. The default is 0. If not specified or 0, Neptune ML selects a disk volume size based on the recommendation generated in the data processing step.
integer
Timeout in seconds for the training job. The default is 86,400 (1 day).
integer
Maximum total number of training jobs to start for the hyperparameter tuning job. The default is 2. Neptune ML automatically tunes the hyperparameters of the machine learning model. To obtain a model that performs well, use at least 10 jobs (in other words, set maxHPONumberOfTrainingJobs to 10). In general, the more tuning runs, the better the results.
integer
Maximum number of parallel training jobs to start for the hyperparameter tuning job. The default is 2. The number of parallel jobs you can run is limited by the available resources on your training instance.
list
The IDs of the subnets in the Neptune VPC. The default is None.
(string) --
list
The VPC security group IDs. The default is None.
(string) --
string
The Amazon Key Management Service (KMS) key that SageMaker uses to encrypt data on the storage volume attached to the ML compute instances that run the training job. The default is None.
string
The Amazon Key Management Service (KMS) key that SageMaker uses to encrypt the output of the processing job. The default is none.
boolean
Optimizes the cost of training machine-learning models by using Amazon Elastic Compute Cloud spot instances. The default is False .
dict
The configuration for custom model training. This is a JSON object.
sourceS3DirectoryPath (string) -- [REQUIRED]
The path to the Amazon S3 location where the Python module implementing your model is located. This must point to a valid existing Amazon S3 location that contains, at a minimum, a training script, a transform script, and a model-hpo-configuration.json file.
trainingEntryPointScript (string) --
The name of the entry point in your module of a script that performs model training and takes hyperparameters as command-line arguments, including fixed hyperparameters. The default is training.py .
transformEntryPointScript (string) --
The name of the entry point in your module of a script that should be run after the best model from the hyperparameter search has been identified, to compute the model artifacts necessary for model deployment. It should be able to run with no command-line arguments. The default is transform.py .
dict
Response Syntax
{ 'id': 'string', 'arn': 'string', 'creationTimeInMillis': 123 }
Response Structure
(dict) --
id (string) --
The unique ID of the new model training job.
arn (string) --
The ARN of the new model training job.
creationTimeInMillis (integer) --
The model training job creation time, in milliseconds.
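A request for this method needs only the two required parameters, but the hyperparameter-tuning defaults are low. The sketch below builds keyword arguments that follow the guidance above of using at least 10 tuning jobs. The helper name and its default values are assumptions chosen for illustration; only the parameter names come from the request syntax.

```python
def training_job_request(data_processing_job_id, train_model_s3_location,
                         hpo_jobs=10, parallel_jobs=2, timeout_seconds=86400):
    """Build keyword arguments for start_ml_model_training_job.

    Defaults mirror the documented values except hpo_jobs, which is raised
    from the documented default of 2 to the suggested minimum of 10.
    """
    if hpo_jobs < 1 or parallel_jobs < 1:
        raise ValueError("job counts must be positive")
    return {
        "dataProcessingJobId": data_processing_job_id,
        "trainModelS3Location": train_model_s3_location,
        "maxHPONumberOfTrainingJobs": hpo_jobs,
        "maxHPOParallelTrainingJobs": parallel_jobs,
        "trainingTimeOutInSeconds": timeout_seconds,
    }
```

The result can be unpacked into the call, for example `client.start_ml_model_training_job(**training_job_request(...))`, and the `id` in the response kept for later status checks.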
Manages the generation and use of property graph statistics.
See also: AWS API Documentation
Request Syntax
client.manage_propertygraph_statistics( mode='disableAutoCompute'|'enableAutoCompute'|'refresh' )
string
The statistics generation mode. One of: DISABLE_AUTOCOMPUTE , ENABLE_AUTOCOMPUTE , or REFRESH , the last of which manually triggers DFE statistics generation.
dict
Response Syntax
{ 'status': 'string', 'payload': { 'statisticsId': 'string' } }
Response Structure
(dict) --
status (string) --
The HTTP return code of the request. If the request succeeded, the code is 200.
payload (dict) --
This is only returned for refresh mode.
statisticsId (string) --
The ID of the statistics generation run that is currently occurring.
Cancels a specified load job. This is an HTTP DELETE request.
See Neptune Loader Get-Status API for more information.
See also: AWS API Documentation
Request Syntax
client.cancel_loader_job( loadId='string' )
string
[REQUIRED]
The ID of the load job to be deleted.
dict
Response Syntax
{ 'status': 'string' }
Response Structure
(dict) --
status (string) --
The cancellation status.
Deletes SPARQL statistics.
See also: AWS API Documentation
Request Syntax
client.delete_sparql_statistics()
dict
Response Syntax
{ 'statusCode': 123, 'status': 'string', 'payload': { 'active': True|False, 'statisticsId': 'string' } }
Response Structure
(dict) --
statusCode (integer) --
The HTTP response code: 200 if the delete was successful, or 204 if there were no statistics to delete.
status (string) --
The cancel status.
payload (dict) --
The deletion payload.
active (boolean) --
The current status of the statistics.
statisticsId (string) --
The ID of the statistics generation run that is currently occurring.
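The two documented response codes can be told apart with a short check. This is a sketch only: the helper name and the returned labels are hypothetical, and any code other than 200 or 204 is simply reported back rather than interpreted.

```python
def delete_sparql_statistics_outcome(client):
    """Delete SPARQL statistics and describe the result.

    Returns "deleted" for statusCode 200, "nothing-to-delete" for 204,
    and "unexpected:<code>" for anything else.
    """
    response = client.delete_sparql_statistics()
    code = response.get("statusCode")
    if code == 200:
        return "deleted"
    if code == 204:
        return "nothing-to-delete"
    return "unexpected:%s" % code
```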
Creates a new Neptune ML data processing job for processing the graph data exported from Neptune for training. See The dataprocessing command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-dataprocessing.html).
See also: AWS API Documentation
Request Syntax
client.start_ml_data_processing_job( id='string', previousDataProcessingJobId='string', inputDataS3Location='string', processedDataS3Location='string', sagemakerIamRoleArn='string', neptuneIamRoleArn='string', processingInstanceType='string', processingInstanceVolumeSizeInGB=123, processingTimeOutInSeconds=123, modelType='string', configFileName='string', subnets=[ 'string', ], securityGroupIds=[ 'string', ], volumeEncryptionKMSKey='string', s3OutputEncryptionKMSKey='string' )
string
A unique identifier for the new job. The default is an autogenerated UUID.
string
The job ID of a completed data processing job run on an earlier version of the data.
string
[REQUIRED]
The URI of the Amazon S3 location where you want SageMaker to download the data needed to run the data processing job.
string
[REQUIRED]
The URI of the Amazon S3 location where you want SageMaker to save the results of a data processing job.
string
The ARN of an IAM role for SageMaker execution. This must be listed in your DB cluster parameter group or an error will occur.
string
The Amazon Resource Name (ARN) of an IAM role that SageMaker can assume to perform tasks on your behalf. This must be listed in your DB cluster parameter group or an error will occur.
string
The type of ML instance used during data processing. Its memory should be large enough to hold the processed dataset. The default is the smallest ml.r5 type whose memory is ten times larger than the size of the exported graph data on disk.
integer
The disk volume size of the processing instance. Both input data and processed data are stored on disk, so the volume size must be large enough to hold both data sets. The default is 0. If not specified or 0, Neptune ML chooses the volume size automatically based on the data size.
integer
Timeout in seconds for the data processing job. The default is 86,400 (1 day).
string
One of the two model types that Neptune ML currently supports: heterogeneous graph models (heterogeneous ), and knowledge graph (kge ). The default is none. If not specified, Neptune ML chooses the model type automatically based on the data.
string
A data specification file that describes how to load the exported graph data for training. The file is automatically generated by the Neptune export toolkit. The default is training-data-configuration.json .
list
The IDs of the subnets in the Neptune VPC. The default is None.
(string) --
list
The VPC security group IDs. The default is None.
(string) --
string
The Amazon Key Management Service (Amazon KMS) key that SageMaker uses to encrypt data on the storage volume attached to the ML compute instances that run the training job. The default is None.
string
The Amazon Key Management Service (Amazon KMS) key that SageMaker uses to encrypt the output of the processing job. The default is none.
dict
Response Syntax
{ 'id': 'string', 'arn': 'string', 'creationTimeInMillis': 123 }
Response Structure
(dict) --
id (string) --
The unique ID of the new data processing job.
arn (string) --
The ARN of the data processing job.
creationTimeInMillis (integer) --
The time it took to create the new processing job, in milliseconds.
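As a rough illustration of the request above, the following sketch starts a job with the two required parameters and passes any optional ones through. It assumes `client` was created with something like `boto3.client('neptunedata', endpoint_url=...)`; the helper name `start_processing_job` is hypothetical and not part of the SDK.

```python
def start_processing_job(client, input_uri, output_uri, **optional):
    """Start a Neptune ML data processing job and return its ID.

    input_uri and output_uri map to the two required parameters.
    Entries in `optional` (for example modelType or configFileName)
    are forwarded only when their value is not None.
    """
    params = {
        "inputDataS3Location": input_uri,
        "processedDataS3Location": output_uri,
    }
    params.update({k: v for k, v in optional.items() if v is not None})
    response = client.start_ml_data_processing_job(**params)
    return response["id"]
```

Leaving optional parameters out entirely, rather than sending None, lets the service apply the defaults described above.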
Executes an openCypher explain request. See The openCypher explain feature for more information.
See also: AWS API Documentation
Request Syntax
client.execute_open_cypher_explain_query( openCypherQuery='string', parameters='string', explainMode='static'|'dynamic'|'details' )
string
[REQUIRED]
The openCypher query string.
string
The openCypher query parameters.
string
[REQUIRED]
The openCypher explain mode. Can be one of: static , dynamic , or details .
dict
Response Syntax
{ 'results': StreamingBody() }
Response Structure
(dict) --
results (:class:`.StreamingBody`) --
A text blob containing the openCypher explain results.
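Because results comes back as a streaming body rather than a string, it has to be read before use. The sketch below shows one way to do that; `client` is assumed to be a boto3 `neptunedata` client, the helper name `explain_open_cypher` is hypothetical, and UTF-8 decoding of the text blob is an assumption.

```python
VALID_EXPLAIN_MODES = ("static", "dynamic", "details")

def explain_open_cypher(client, query, mode="static"):
    """Run an openCypher explain request and return the report as text."""
    if mode not in VALID_EXPLAIN_MODES:
        raise ValueError(f"explainMode must be one of {VALID_EXPLAIN_MODES}")
    response = client.execute_open_cypher_explain_query(
        openCypherQuery=query,
        explainMode=mode,
    )
    # results is a file-like object; read() returns bytes.
    return response["results"].read().decode("utf-8")
```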
Gets a stream for a property graph.
With the Neptune Streams feature, you can generate a complete sequence of change-log entries that record every change made to your graph data as it happens. GetPropertygraphStream lets you collect these change-log entries for a property graph.
The Neptune streams feature needs to be enabled on your Neptune DB cluster. To enable streams, set the neptune_streams DB cluster parameter to 1 .
See Capturing graph changes in real time using Neptune streams .
See also: AWS API Documentation
Request Syntax
client.get_propertygraph_stream( limit=123, iteratorType='AT_SEQUENCE_NUMBER'|'AFTER_SEQUENCE_NUMBER'|'TRIM_HORIZON'|'LATEST', commitNum=123, opNum=123, encoding='gzip' )
integer
Specifies the maximum number of records to return. There is also a size limit of 10 MB on the response that can't be modified and that takes precedence over the number of records specified in the limit parameter. The response does include a threshold-breaching record if the 10 MB limit was reached.
The range for limit is 1 to 100,000, with a default of 10.
string
Can be one of:
AT_SEQUENCE_NUMBER – Indicates that reading should start from the event sequence number specified jointly by the commitNum and opNum parameters.
AFTER_SEQUENCE_NUMBER – Indicates that reading should start right after the event sequence number specified jointly by the commitNum and opNum parameters.
TRIM_HORIZON – Indicates that reading should start at the last untrimmed record in the system, which is the oldest unexpired (not yet deleted) record in the change-log stream.
LATEST – Indicates that reading should start at the most recent record in the system, which is the latest unexpired (not yet deleted) record in the change-log stream.
integer
The commit number of the starting record to read from the change-log stream. This parameter is required when iteratorType is AT_SEQUENCE_NUMBER or AFTER_SEQUENCE_NUMBER , and ignored when iteratorType is TRIM_HORIZON or LATEST .
integer
The operation sequence number within the specified commit to start reading from in the change-log stream data. The default is 1 .
string
If set to gzip , Neptune compresses the response using gzip encoding.
dict
Response Syntax
{ 'lastEventId': { 'string': 'string' }, 'lastTrxTimestampInMillis': 123, 'format': 'string', 'records': [ { 'commitTimestampInMillis': 123, 'eventId': { 'string': 'string' }, 'data': { 'id': 'string', 'type': 'string', 'key': 'string', 'value': {}, 'from': 'string', 'to': 'string' }, 'op': 'string', 'isLastOp': True|False }, ], 'totalRecords': 123 }
Response Structure
(dict) --
lastEventId (dict) --
Sequence identifier of the last change in the stream response.
An event ID is composed of two fields: a commitNum , which identifies a transaction that changed the graph, and an opNum , which identifies a specific operation within that transaction.
(string) --
(string) --
lastTrxTimestampInMillis (integer) --
The time at which the commit for the transaction was requested, in milliseconds from the Unix epoch.
format (string) --
Serialization format for the change records being returned. Currently, the only supported value is PG_JSON .
records (list) --
An array of serialized change-log stream records included in the response.
(dict) --
Structure of a property graph record.
commitTimestampInMillis (integer) --
The time at which the commit for the transaction was requested, in milliseconds from the Unix epoch.
eventId (dict) --
The sequence identifier of the stream change record.
(string) --
(string) --
data (dict) --
The serialized Gremlin or openCypher change record.
id (string) --
The ID of the Gremlin or openCypher element.
type (string) --
The type of this Gremlin or openCypher element. Must be one of:
vl - Vertex label for Gremlin, or node label for openCypher.
vp - Vertex properties for Gremlin, or node properties for openCypher.
e - Edge and edge label for Gremlin, or relationship and relationship type for openCypher.
ep - Edge properties for Gremlin, or relationship properties for openCypher.
key (string) --
The property name. For element labels, this is label .
value (dict) --
This is a JSON object that contains a value field for the value itself, and a datatype field for the JSON data type of that value.
from (string) --
If this is an edge (type = e ), the ID of the corresponding from vertex or source node.
to (string) --
If this is an edge (type = e ), the ID of the corresponding to vertex or target node.
op (string) --
The operation that created the change.
isLastOp (boolean) --
Only present if this operation is the last one in its transaction. If present, it is set to true. It is useful for ensuring that an entire transaction is consumed.
totalRecords (integer) --
The total number of records in the response.
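One way to page through the change log is to start at TRIM_HORIZON and then resume with AFTER_SEQUENCE_NUMBER from the lastEventId of the previous response. The sketch below assumes `client` is a boto3 `neptunedata` client and that lastEventId carries the commitNum and opNum fields as strings, as the response syntax suggests; the helper name `read_stream_page` is hypothetical.

```python
def read_stream_page(client, last_event_id=None, limit=10):
    """Fetch one page of change records.

    With no last_event_id, reading starts at the oldest untrimmed record.
    Otherwise it resumes right after the given event ID.
    Returns (records, last_event_id_of_this_page).
    """
    params = {"limit": limit}
    if last_event_id is None:
        params["iteratorType"] = "TRIM_HORIZON"
    else:
        params["iteratorType"] = "AFTER_SEQUENCE_NUMBER"
        # Event ID values are strings in the response; the request takes integers.
        params["commitNum"] = int(last_event_id["commitNum"])
        params["opNum"] = int(last_event_id["opNum"])
    response = client.get_propertygraph_stream(**params)
    return response["records"], response["lastEventId"]
```

A consumer that needs whole transactions could buffer records until it sees one with isLastOp present before acting on them.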
Returns a list of Neptune ML data processing jobs. See Listing active data-processing jobs using the Neptune ML dataprocessing command .
See also: AWS API Documentation
Request Syntax
client.list_ml_data_processing_jobs( maxItems=123, neptuneIamRoleArn='string' )
integer
The maximum number of items to return (from 1 to 1024; the default is 10).
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'ids': [ 'string', ] }
Response Structure
(dict) --
ids (list) --
A page listing data processing job IDs.
(string) --
The fast reset REST API lets you reset a Neptune graph quickly and easily, removing all of its data.
Neptune fast reset is a two-step process. First you call ExecuteFastReset with action set to initiateDatabaseReset . This returns a UUID token which you then include when calling ExecuteFastReset again with action set to performDatabaseReset . See Empty an Amazon Neptune DB cluster using the fast reset API .
See also: AWS API Documentation
Request Syntax
client.execute_fast_reset( action='initiateDatabaseReset'|'performDatabaseReset', token='string' )
string
[REQUIRED]
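The fast reset action. One of the following values: initiateDatabaseReset , which returns the token needed to perform the reset, or performDatabaseReset , which uses that token to perform the reset.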
string
The fast-reset token to initiate the reset.
dict
Response Syntax
{ 'status': 'string', 'payload': { 'token': 'string' } }
Response Structure
(dict) --
status (string) --
The status is only returned for the performDatabaseReset action, and indicates whether or not the fast reset request is accepted.
payload (dict) --
The payload is only returned by the initiateDatabaseReset action, and contains the unique token to use with the performDatabaseReset action to make the reset occur.
token (string) --
A UUID generated by the database in the initiateDatabaseReset action, and then consumed by the performDatabaseReset to reset the database.
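The two-step flow described above can be sketched as follows. This is a minimal illustration, assuming `client` is a boto3 `neptunedata` client; the helper name `fast_reset` is hypothetical. Running it against a real cluster removes all graph data.

```python
def fast_reset(client):
    """Run both fast reset steps and return the final status string."""
    # Step 1: ask the database for a reset token.
    initiated = client.execute_fast_reset(action="initiateDatabaseReset")
    token = initiated["payload"]["token"]
    # Step 2: hand the token back to actually perform the reset.
    performed = client.execute_fast_reset(
        action="performDatabaseReset",
        token=token,
    )
    return performed["status"]
```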
Check the status of the graph database on the host.
See also: AWS API Documentation
Request Syntax
client.get_engine_status()
dict
Response Syntax
{ 'status': 'string', 'startTime': 'string', 'dbEngineVersion': 'string', 'role': 'string', 'dfeQueryEngine': 'string', 'gremlin': { 'version': 'string' }, 'sparql': { 'version': 'string' }, 'opencypher': { 'version': 'string' }, 'labMode': { 'string': 'string' }, 'rollingBackTrxCount': 123, 'rollingBackTrxEarliestStartTime': 'string', 'features': { 'string': {} }, 'settings': { 'string': 'string' } }
Response Structure
(dict) --
status (string) --
Set to healthy if the instance is not experiencing problems. If the instance is recovering from a crash or from being rebooted and there are active transactions running from the latest server shutdown, status is set to recovery .
startTime (string) --
Set to the UTC time at which the current server process started.
dbEngineVersion (string) --
Set to the Neptune engine version running on your DB cluster. If this engine version has been manually patched since it was released, the version number is prefixed by Patch- .
role (string) --
Set to reader if the instance is a read-replica, or to writer if the instance is the primary instance.
dfeQueryEngine (string) --
Set to enabled if the DFE engine is fully enabled, or to viaQueryHint (the default) if the DFE engine is only used with queries that have the useDFE query hint set to true .
gremlin (dict) --
Contains information about the Gremlin query language available on your cluster. Specifically, it contains a version field that specifies the current TinkerPop version being used by the engine.
version (string) --
The version of the query language.
sparql (dict) --
Contains information about the SPARQL query language available on your cluster. Specifically, it contains a version field that specifies the current SPARQL version being used by the engine.
version (string) --
The version of the query language.
opencypher (dict) --
Contains information about the openCypher query language available on your cluster. Specifically, it contains a version field that specifies the current openCypher version being used by the engine.
version (string) --
The version of the query language.
labMode (dict) --
Contains Lab Mode settings being used by the engine.
(string) --
(string) --
rollingBackTrxCount (integer) --
If there are transactions being rolled back, this field is set to the number of such transactions. If there are none, the field doesn't appear at all.
rollingBackTrxEarliestStartTime (string) --
Set to the start time of the earliest transaction being rolled back. If no transactions are being rolled back, the field doesn't appear at all.
features (dict) --
Contains status information about the features enabled on your DB cluster.
(string) --
(dict) --
settings (dict) --
Contains information about the current settings on your DB cluster. For example, contains the current cluster query timeout setting (clusterQueryTimeoutInMs ).
(string) --
(string) --
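Several of the fields above are only present in some situations, so reading them defensively is a reasonable habit. The sketch below summarizes an engine status response; `client` is assumed to be a boto3 `neptunedata` client and the helper name `summarize_engine_status` is hypothetical.

```python
def summarize_engine_status(client):
    """Return a small summary of the instance's health and role."""
    status = client.get_engine_status()
    return {
        "healthy": status.get("status") == "healthy",
        "is_writer": status.get("role") == "writer",
        # rollingBackTrxCount is omitted when nothing is being rolled back.
        "rolling_back": status.get("rollingBackTrxCount", 0),
    }
```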
Retrieves information about a specified data processing job. See The dataprocessing command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-dataprocessing.html).
See also: AWS API Documentation
Request Syntax
client.get_ml_data_processing_job( id='string', neptuneIamRoleArn='string' )
string
[REQUIRED]
The unique identifier of the data-processing job to be retrieved.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'status': 'string', 'id': 'string', 'processingJob': { 'name': 'string', 'arn': 'string', 'status': 'string', 'outputLocation': 'string', 'failureReason': 'string', 'cloudwatchLogUrl': 'string' } }
Response Structure
(dict) --
status (string) --
Status of the data processing job.
id (string) --
The unique identifier of this data-processing job.
processingJob (dict) --
Definition of the data processing job.
name (string) --
The resource name.
arn (string) --
The resource ARN.
status (string) --
The resource status.
outputLocation (string) --
The output location.
failureReason (string) --
The failure reason, in case of a failure.
cloudwatchLogUrl (string) --
The CloudWatch log URL for the resource.
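A data processing job runs asynchronously, so callers typically poll this operation until the job finishes. The sketch below is one way to do that. The status strings treated as terminal ("Completed", "Failed", "Stopped") are an assumption, since this reference does not list the possible values; `client` is assumed to be a boto3 `neptunedata` client and `wait_for_processing_job` is a hypothetical helper.

```python
import time

def wait_for_processing_job(client, job_id, poll_seconds=30, max_polls=120,
                            terminal=("Completed", "Failed", "Stopped"),
                            sleep=time.sleep):
    """Poll a data processing job until it reaches a terminal status.

    The `terminal` values are assumed, not taken from this reference.
    Raises TimeoutError if no terminal status is seen within max_polls.
    """
    for _ in range(max_polls):
        response = client.get_ml_data_processing_job(id=job_id)
        if response["status"] in terminal:
            return response
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

When a job fails, the failureReason and cloudwatchLogUrl fields of processingJob are the natural place to look next.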
Gets RDF statistics (SPARQL).
See also: AWS API Documentation
Request Syntax
client.get_sparql_statistics()
dict
Response Syntax
{ 'status': 'string', 'payload': { 'autoCompute': True|False, 'active': True|False, 'statisticsId': 'string', 'date': datetime(2015, 1, 1), 'note': 'string', 'signatureInfo': { 'signatureCount': 123, 'instanceCount': 123, 'predicateCount': 123 } } }
Response Structure
(dict) --
status (string) --
The HTTP return code of the request. If the request succeeded, the code is 200. See Common error codes for DFE statistics request for a list of common errors.
payload (dict) --
Statistics for RDF data.
autoCompute (boolean) --
Indicates whether or not automatic statistics generation is enabled.
active (boolean) --
Indicates whether or not DFE statistics generation is enabled at all.
statisticsId (string) --
Reports the ID of the current statistics generation run. A value of -1 indicates that no statistics have been generated.
date (datetime) --
The UTC time at which DFE statistics have most recently been generated.
note (string) --
A note about problems in the case where statistics are invalid.
signatureInfo (dict) --
A StatisticsSummary structure that contains:
signatureCount - The total number of signatures across all characteristic sets.
instanceCount - The total number of characteristic-set instances.
predicateCount - The total number of unique predicates.
signatureCount (integer) --
The total number of signatures across all characteristic sets.
instanceCount (integer) --
The total number of characteristic-set instances.
predicateCount (integer) --
The total number of unique predicates.
Gets status information about a specified load job. Neptune keeps track of the most recent 1,024 bulk load jobs, and stores the last 10,000 error details per job.
See Neptune Loader Get-Status API for more information.
See also: AWS API Documentation
Request Syntax
client.get_loader_job_status( loadId='string', details=True|False, errors=True|False, page=123, errorsPerPage=123 )
string
[REQUIRED]
The load ID of the load job to get the status of.
boolean
Flag indicating whether or not to include details beyond the overall status (TRUE or FALSE ; the default is FALSE ).
boolean
Flag indicating whether or not to include a list of errors encountered (TRUE or FALSE ; the default is FALSE ).
The list of errors is paged. The page and errorsPerPage parameters allow you to page through all the errors.
integer
The error page number (a positive integer; the default is 1 ). Only valid when the errors parameter is set to TRUE .
integer
The number of errors returned in each page (a positive integer; the default is 10 ). Only valid when the errors parameter is set to TRUE .
dict
Response Syntax
{ 'status': 'string', 'payload': {} }
Response Structure
(dict) --
status (string) --
The HTTP response code for the request.
payload (dict) --
Status information about the load job.
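Since the error list is paged, a small wrapper that always sets errors together with page and errorsPerPage can help avoid sending paging parameters that would otherwise be invalid. This is a sketch only: `client` is assumed to be a boto3 `neptunedata` client, `get_load_errors_page` is a hypothetical helper, and the payload is returned as-is because its layout is not specified here.

```python
def get_load_errors_page(client, load_id, page=1, errors_per_page=10):
    """Return the status payload for one page of a load job's errors."""
    if page < 1 or errors_per_page < 1:
        raise ValueError("page and errors_per_page must be positive integers")
    response = client.get_loader_job_status(
        loadId=load_id,
        details=True,
        errors=True,  # page and errorsPerPage are only valid with errors=True
        page=page,
        errorsPerPage=errors_per_page,
    )
    return response["payload"]
```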
Cancels a specified openCypher query. See Neptune openCypher status endpoint for more information.
See also: AWS API Documentation
Request Syntax
client.cancel_open_cypher_query( queryId='string', silent=True|False )
string
[REQUIRED]
The unique ID of the openCypher query to cancel.
boolean
If set to TRUE , causes the cancellation of the openCypher query to happen silently.
dict
Response Syntax
{ 'status': 'string', 'payload': True|False }
Response Structure
(dict) --
status (string) --
The cancellation status of the openCypher query.
payload (boolean) --
The cancellation payload for the openCypher query.
Gets the status of a specified Gremlin query.
See also: AWS API Documentation
Request Syntax
client.get_gremlin_query_status( queryId='string' )
string
[REQUIRED]
The unique identifier that identifies the Gremlin query.
dict
Response Syntax
{ 'queryId': 'string', 'queryString': 'string', 'queryEvalStats': { 'waited': 123, 'elapsed': 123, 'cancelled': True|False, 'subqueries': {} } }
Response Structure
(dict) --
queryId (string) --
The ID of the query for which status is being returned.
queryString (string) --
The Gremlin query string.
queryEvalStats (dict) --
The evaluation status of the Gremlin query.
waited (integer) --
Indicates how long the query waited, in milliseconds.
elapsed (integer) --
The number of milliseconds the query has been running so far.
cancelled (boolean) --
Set to TRUE if the query was cancelled, or FALSE otherwise.
subqueries (dict) --
The number of subqueries in this query.
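The elapsed and cancelled fields make it possible to build a simple guard against runaway queries. The sketch below cancels a query once it has run longer than a caller-chosen threshold, using cancel_gremlin_query from the same client. `client` is assumed to be a boto3 `neptunedata` client and `cancel_if_slow` is a hypothetical helper.

```python
def cancel_if_slow(client, query_id, max_elapsed_ms):
    """Cancel a Gremlin query that has exceeded max_elapsed_ms.

    Returns True if a cancel request was sent, False otherwise.
    """
    status = client.get_gremlin_query_status(queryId=query_id)
    stats = status["queryEvalStats"]
    if stats["cancelled"] or stats["elapsed"] <= max_elapsed_ms:
        return False
    client.cancel_gremlin_query(queryId=query_id)
    return True
```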
Retrieves a list of the loadIds for all active loader jobs.
See also: AWS API Documentation
Request Syntax
client.list_loader_jobs( limit=123, includeQueuedLoads=True|False )
integer
The number of load IDs to list. Must be a positive integer greater than zero and not more than 100 (which is the default).
boolean
An optional parameter that can be used to exclude the load IDs of queued load requests when requesting a list of load IDs by setting the parameter to FALSE . The default value is TRUE .
dict
Response Syntax
{ 'status': 'string', 'payload': { 'loadIds': [ 'string', ] } }
Response Structure
(dict) --
status (string) --
Returns the status of the job list request.
payload (dict) --
The requested list of job IDs.
loadIds (list) --
A list of load IDs.
(string) --
Deletes statistics for Gremlin and openCypher (property graph) data.
See also: AWS API Documentation
Request Syntax
client.delete_propertygraph_statistics()
dict
Response Syntax
{ 'statusCode': 123, 'status': 'string', 'payload': { 'active': True|False, 'statisticsId': 'string' } }
Response Structure
(dict) --
statusCode (integer) --
The HTTP response code: 200 if the delete was successful, or 204 if there were no statistics to delete.
status (string) --
The cancel status.
payload (dict) --
The deletion payload.
active (boolean) --
The current status of the statistics.
statisticsId (string) --
The ID of the statistics generation run that is currently occurring.
Creates a new Neptune ML inference endpoint that lets you query one specific model that the model-training process constructed. See Managing inference endpoints using the endpoints command .
See also: AWS API Documentation
Request Syntax
client.create_ml_endpoint( id='string', mlModelTrainingJobId='string', mlModelTransformJobId='string', update=True|False, neptuneIamRoleArn='string', modelName='string', instanceType='string', instanceCount=123, volumeEncryptionKMSKey='string' )
string
A unique identifier for the new inference endpoint. The default is an autogenerated timestamped name.
string
The job ID of the completed model-training job that has created the model that the inference endpoint will point to. You must supply either the mlModelTrainingJobId or the mlModelTransformJobId .
string
The job ID of the completed model-transform job. You must supply either the mlModelTrainingJobId or the mlModelTransformJobId .
boolean
If set to true , update indicates that this is an update request. The default is false . You must supply either the mlModelTrainingJobId or the mlModelTransformJobId .
string
The ARN of an IAM role providing Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will be thrown.
string
Model type for training. By default the Neptune ML model is automatically based on the modelType used in data processing, but you can specify a different model type here. The default is rgcn for heterogeneous graphs and kge for knowledge graphs. The only valid value for heterogeneous graphs is rgcn . Valid values for knowledge graphs are: kge , transe , distmult , and rotate .
string
The type of Neptune ML instance to use for online servicing. The default is ml.m5.xlarge . Choosing the ML instance for an inference endpoint depends on the task type, the graph size, and your budget.
integer
The minimum number of Amazon EC2 instances to deploy to an endpoint for prediction. The default is 1.
string
The Amazon Key Management Service (Amazon KMS) key that SageMaker uses to encrypt data on the storage volume attached to the ML compute instances that run the training job. The default is None.
dict
Response Syntax
{ 'id': 'string', 'arn': 'string', 'creationTimeInMillis': 123 }
Response Structure
(dict) --
id (string) --
The unique ID of the new inference endpoint.
arn (string) --
The ARN for the new inference endpoint.
creationTimeInMillis (integer) --
The endpoint creation time, in milliseconds.
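None of the request parameters is marked required, yet one of the two job IDs has to be supplied, so checking that before calling the service gives a clearer error. The sketch below reads the rule as "at least one of the two", which is an interpretation of the wording above. `client` is assumed to be a boto3 `neptunedata` client and `create_endpoint` is a hypothetical helper.

```python
def create_endpoint(client, training_job_id=None, transform_job_id=None, **optional):
    """Create an inference endpoint and return its ID.

    Requires at least one of training_job_id or transform_job_id
    (an interpretation of the documented rule, not a confirmed one).
    """
    if training_job_id is None and transform_job_id is None:
        raise ValueError(
            "supply mlModelTrainingJobId or mlModelTransformJobId"
        )
    params = {k: v for k, v in optional.items() if v is not None}
    if training_job_id is not None:
        params["mlModelTrainingJobId"] = training_job_id
    if transform_job_id is not None:
        params["mlModelTransformJobId"] = transform_job_id
    return client.create_ml_endpoint(**params)["id"]
```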
Returns a list of model transform job IDs. See Use a trained model to generate new model artifacts .
See also: AWS API Documentation
Request Syntax
client.list_ml_model_transform_jobs( maxItems=123, neptuneIamRoleArn='string' )
integer
The maximum number of items to return (from 1 to 1024; the default is 10).
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'ids': [ 'string', ] }
Response Structure
(dict) --
ids (list) --
A page from the list of model transform IDs.
(string) --
Creates a new model transform job. See Use a trained model to generate new model artifacts .
See also: AWS API Documentation
Request Syntax
client.start_ml_model_transform_job( id='string', dataProcessingJobId='string', mlModelTrainingJobId='string', trainingJobName='string', modelTransformOutputS3Location='string', sagemakerIamRoleArn='string', neptuneIamRoleArn='string', customModelTransformParameters={ 'sourceS3DirectoryPath': 'string', 'transformEntryPointScript': 'string' }, baseProcessingInstanceType='string', baseProcessingInstanceVolumeSizeInGB=123, subnets=[ 'string', ], securityGroupIds=[ 'string', ], volumeEncryptionKMSKey='string', s3OutputEncryptionKMSKey='string' )
string
A unique identifier for the new job. The default is an autogenerated UUID.
string
The job ID of a completed data-processing job. You must include either dataProcessingJobId and a mlModelTrainingJobId , or a trainingJobName .
string
The job ID of a completed model-training job. You must include either dataProcessingJobId and a mlModelTrainingJobId , or a trainingJobName .
string
The name of a completed SageMaker training job. You must include either dataProcessingJobId and a mlModelTrainingJobId , or a trainingJobName .
string
[REQUIRED]
The location in Amazon S3 where the model artifacts are to be stored.
string
The ARN of an IAM role for SageMaker execution. This must be listed in your DB cluster parameter group or an error will occur.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Configuration information for a model transform using a custom model. The customModelTransformParameters object contains the following fields, which must have values compatible with the saved model parameters from the training job:
sourceS3DirectoryPath (string) -- [REQUIRED]
The path to the Amazon S3 location where the Python module implementing your model is located. This must point to a valid existing Amazon S3 location that contains, at a minimum, a training script, a transform script, and a model-hpo-configuration.json file.
transformEntryPointScript (string) --
The name of the entry point in your module of a script that should be run after the best model from the hyperparameter search has been identified, to compute the model artifacts necessary for model deployment. It should be able to run with no command-line arguments. The default is transform.py .
string
The type of ML instance used in preparing and managing training of ML models. This is an ML compute instance chosen based on memory requirements for processing the training data and model.
integer
The disk volume size of the training instance in gigabytes. The default is 0. Both input data and the output model are stored on disk, so the volume size must be large enough to hold both data sets. If not specified or 0, Neptune ML selects a disk volume size based on the recommendation generated in the data processing step.
list
The IDs of the subnets in the Neptune VPC. The default is None.
(string) --
list
The VPC security group IDs. The default is None.
(string) --
string
The Amazon Key Management Service (KMS) key that SageMaker uses to encrypt data on the storage volume attached to the ML compute instances that run the training job. The default is None.
string
The Amazon Key Management Service (KMS) key that SageMaker uses to encrypt the output of the processing job. The default is none.
dict
Response Syntax
{ 'id': 'string', 'arn': 'string', 'creationTimeInMillis': 123 }
Response Structure
(dict) --
id (string) --
The unique ID of the new model transform job.
arn (string) --
The ARN of the model transform job.
creationTimeInMillis (integer) --
The creation time of the model transform job, in milliseconds.
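The rule that a request needs either both job IDs or a training job name is easy to get wrong, so it can be worth validating before the call. The sketch below builds the request parameters under that rule; `build_transform_job_params` is a hypothetical helper, and preferring the job IDs when both forms are given is a design choice of this sketch rather than documented behavior.

```python
def build_transform_job_params(output_uri, data_processing_job_id=None,
                               training_job_id=None, training_job_name=None):
    """Build keyword arguments for start_ml_model_transform_job."""
    has_job_ids = (data_processing_job_id is not None
                   and training_job_id is not None)
    if not has_job_ids and training_job_name is None:
        raise ValueError(
            "supply dataProcessingJobId and mlModelTrainingJobId, "
            "or trainingJobName"
        )
    params = {"modelTransformOutputS3Location": output_uri}
    if has_job_ids:
        # Sketch-level choice: the job IDs win if both forms are supplied.
        params["dataProcessingJobId"] = data_processing_job_id
        params["mlModelTrainingJobId"] = training_job_id
    else:
        params["trainingJobName"] = training_job_name
    return params
```

The result can then be passed on as `client.start_ml_model_transform_job(**params)`.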
Lists Neptune ML model-training jobs. See Model training using the modeltraining command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-modeltraining.html).
See also: AWS API Documentation
Request Syntax
client.list_ml_model_training_jobs( maxItems=123, neptuneIamRoleArn='string' )
integer
The maximum number of items to return (from 1 to 1024; the default is 10).
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'ids': [ 'string', ] }
Response Structure
(dict) --
ids (list) --
A page of the list of model training job IDs.
(string) --
Executes a Gremlin Profile query, which runs a specified traversal, collects various metrics about the run, and produces a profile report as output. See Gremlin profile API in Neptune for details.
See also: AWS API Documentation
Request Syntax
client.execute_gremlin_profile_query( gremlinQuery='string', results=True|False, chop=123, serializer='string', indexOps=True|False )
string
[REQUIRED]
The Gremlin query string to profile.
boolean
If this flag is set to TRUE , the query results are gathered and displayed as part of the profile report. If FALSE , only the result count is displayed.
integer
If non-zero, causes the results string to be truncated at that number of characters. If set to zero, the string contains all the results.
string
If non-null, the gathered results are returned in a serialized response message in the format specified by this parameter. See Gremlin profile API in Neptune for more information.
boolean
If this flag is set to TRUE , the results include a detailed report of all index operations that took place during query execution and serialization.
dict
Response Syntax
{ 'output': StreamingBody() }
Response Structure
(dict) --
output (:class:`.StreamingBody`) --
A text blob containing the Gremlin Profile result. See Gremlin profile API in Neptune for details.
Lists existing inference endpoints. See Managing inference endpoints using the endpoints command .
See also: AWS API Documentation
Request Syntax
client.list_ml_endpoints( maxItems=123, neptuneIamRoleArn='string' )
integer
The maximum number of items to return (from 1 to 1024; the default is 10).
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
dict
Response Syntax
{ 'ids': [ 'string', ] }
Response Structure
(dict) --
ids (list) --
A page from the list of inference endpoint IDs.
(string) --
This command executes a Gremlin query. Amazon Neptune is compatible with Apache TinkerPop3 and Gremlin, so you can use the Gremlin traversal language to query the graph, as described under The Graph in the Apache TinkerPop3 documentation. More details can also be found in Accessing a Neptune graph with Gremlin .
See also: AWS API Documentation
Request Syntax
client.execute_gremlin_query( gremlinQuery='string', serializer='string' )
string
[REQUIRED]
Using this API, you can run Gremlin queries in string format much as you can using the HTTP endpoint. The interface is compatible with whatever Gremlin version your DB cluster is using (see the Tinkerpop client section to determine which Gremlin releases your engine version supports).
string
If non-null, the query results are returned in a serialized response message in the format specified by this parameter. See the GraphSON section in the TinkerPop documentation for a list of the formats that are currently supported.
dict
Response Syntax
{ 'requestId': 'string', 'status': { 'message': 'string', 'code': 123, 'attributes': {} }, 'result': {}, 'meta': {} }
Response Structure
(dict) --
requestId (string) --
The unique identifier of the Gremlin query.
status (dict) --
The status of the Gremlin query.
message (string) --
The status message.
code (integer) --
The HTTP response code returned from the Gremlin query request.
attributes (dict) --
Attributes of the Gremlin query status.
result (dict) --
The Gremlin query output from the server.
meta (dict) --
Metadata about the Gremlin query.
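A caller usually wants the result and an error if the embedded status is not a success. The sketch below does that, treating any 2xx code as success, which is an assumption rather than something this reference states. `client` is assumed to be a boto3 `neptunedata` client and `run_gremlin` is a hypothetical helper.

```python
def run_gremlin(client, query):
    """Execute a Gremlin query string and return its result dict."""
    response = client.execute_gremlin_query(gremlinQuery=query)
    status = response["status"]
    code = status["code"]
    # Assumption: any 2xx code in the embedded status means success.
    if not 200 <= code < 300:
        raise RuntimeError(f"Gremlin query failed ({code}): {status.get('message')}")
    return response["result"]
```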
Cancels a Neptune ML data processing job. See The dataprocessing command (https://docs.aws.amazon.com/neptune/latest/userguide/machine-learning-api-dataprocessing.html).
See also: AWS API Documentation
Request Syntax
client.cancel_ml_data_processing_job( id='string', neptuneIamRoleArn='string', clean=True|False )
string
[REQUIRED]
The unique identifier of the data-processing job.
string
The ARN of an IAM role that provides Neptune access to SageMaker and Amazon S3 resources. This must be listed in your DB cluster parameter group or an error will occur.
boolean
If set to TRUE , this flag specifies that all Neptune ML S3 artifacts should be deleted when the job is stopped. The default is FALSE .
dict
Response Syntax
{ 'status': 'string' }
Response Structure
(dict) --
status (string) --
The status of the cancellation request.
Gets a graph summary for an RDF graph.
See also: AWS API Documentation
Request Syntax
client.get_rdf_graph_summary( mode='basic'|'detailed' )
string
Mode can take one of two values: BASIC (the default), and DETAILED .
dict
Response Syntax
{ 'statusCode': 123, 'payload': { 'version': 'string', 'lastStatisticsComputationTime': datetime(2015, 1, 1), 'graphSummary': { 'numDistinctSubjects': 123, 'numDistinctPredicates': 123, 'numQuads': 123, 'numClasses': 123, 'classes': [ 'string', ], 'predicates': [ { 'string': 123 }, ], 'subjectStructures': [ { 'count': 123, 'predicates': [ 'string', ] }, ] } } }
Response Structure
(dict) --
statusCode (integer) --
The HTTP return code of the request. If the request succeeded, the code is 200.
payload (dict) --
Payload for an RDF graph summary response.
version (string) --
The version of this graph summary response.
lastStatisticsComputationTime (datetime) --
The timestamp, in ISO 8601 format, of the time at which Neptune last computed statistics.
graphSummary (dict) --
The graph summary of an RDF graph. See Graph summary response for an RDF graph .
numDistinctSubjects (integer) --
The number of distinct subjects in the graph.
numDistinctPredicates (integer) --
The number of distinct predicates in the graph.
numQuads (integer) --
The number of quads in the graph.
numClasses (integer) --
The number of classes in the graph.
classes (list) --
A list of the classes in the graph.
(string) --
predicates (list) --
A list of predicates in the graph, along with the predicate counts.
(dict) --
(string) --
(integer) --
subjectStructures (list) --
This field is only present when the request mode is DETAILED . It contains a list of subject structures.
(dict) --
A subject structure.
count (integer) --
Number of occurrences of this specific structure.
predicates (list) --
A list of predicates present in this specific structure.
(string) --
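The predicates list above is a list of single-entry dicts mapping a predicate to its count. One possible way to flatten it and rank the most frequent predicates is sketched below; the helper name top_predicates is hypothetical, and client is assumed to be a boto3 neptunedata client.

```python
def top_predicates(client, n=5):
    """Return up to n (predicate, count) pairs from the RDF graph summary, highest count first."""
    response = client.get_rdf_graph_summary(mode="basic")
    summary = response["payload"]["graphSummary"]
    counts = {}
    for entry in summary.get("predicates", []):
        # Each entry is a small dict of {predicate: count}; merge them into one mapping.
        counts.update(entry)
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Requesting mode="detailed" instead would also return subjectStructures, which this sketch does not read.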
Gets a stream for an RDF graph.
With the Neptune Streams feature, you can generate a complete sequence of change-log entries that record every change made to your graph data as it happens. GetSparqlStream lets you collect these change-log entries for an RDF graph.
The Neptune streams feature needs to be enabled on your Neptune DB cluster. To enable streams, set the neptune_streams DB cluster parameter to 1 .
See Capturing graph changes in real time using Neptune streams .
See also: AWS API Documentation
Request Syntax
client.get_sparql_stream( limit=123, iteratorType='AT_SEQUENCE_NUMBER'|'AFTER_SEQUENCE_NUMBER'|'TRIM_HORIZON'|'LATEST', commitNum=123, opNum=123, encoding='gzip' )
integer
Specifies the maximum number of records to return. There is also a size limit of 10 MB on the response that can't be modified and that takes precedence over the number of records specified in the limit parameter. The response does include a threshold-breaching record if the 10 MB limit was reached.
The range for limit is 1 to 100,000, with a default of 10.
string
Can be one of:
AT_SEQUENCE_NUMBER – Indicates that reading should start from the event sequence number specified jointly by the commitNum and opNum parameters.
AFTER_SEQUENCE_NUMBER – Indicates that reading should start right after the event sequence number specified jointly by the commitNum and opNum parameters.
TRIM_HORIZON – Indicates that reading should start at the last untrimmed record in the system, which is the oldest unexpired (not yet deleted) record in the change-log stream.
LATEST – Indicates that reading should start at the most recent record in the system, which is the latest unexpired (not yet deleted) record in the change-log stream.
integer
The commit number of the starting record to read from the change-log stream. This parameter is required when iteratorType is AT_SEQUENCE_NUMBER or AFTER_SEQUENCE_NUMBER , and ignored when iteratorType is TRIM_HORIZON or LATEST .
integer
The operation sequence number within the specified commit to start reading from in the change-log stream data. The default is 1 .
string
If set to gzip , Neptune compresses the response using gzip encoding.
dict
Response Syntax
{ 'lastEventId': { 'string': 'string' }, 'lastTrxTimestampInMillis': 123, 'format': 'string', 'records': [ { 'commitTimestampInMillis': 123, 'eventId': { 'string': 'string' }, 'data': { 'stmt': 'string' }, 'op': 'string', 'isLastOp': True|False }, ], 'totalRecords': 123 }
Response Structure
(dict) --
lastEventId (dict) --
Sequence identifier of the last change in the stream response.
An event ID is composed of two fields: a commitNum , which identifies a transaction that changed the graph, and an opNum , which identifies a specific operation within that transaction:
(string) --
(string) --
lastTrxTimestampInMillis (integer) --
The time at which the commit for the transaction was requested, in milliseconds from the Unix epoch.
format (string) --
Serialization format for the change records being returned. Currently, the only supported value is NQUADS .
records (list) --
An array of serialized change-log stream records included in the response.
(dict) --
A serialized SPARQL stream record capturing a change-log entry for the RDF graph.
commitTimestampInMillis (integer) --
The time at which the commit for the transaction was requested, in milliseconds from the Unix epoch.
eventId (dict) --
The sequence identifier of the stream change record.
(string) --
(string) --
data (dict) --
The serialized SPARQL change record. The serialization formats of each record are described in more detail in Serialization Formats in Neptune Streams .
stmt (string) --
Holds an N-QUADS statement expressing the changed quad.
op (string) --
The operation that created the change.
isLastOp (boolean) --
Only present if this operation is the last one in its transaction. If present, it is set to true. It is useful for ensuring that an entire transaction is consumed.
totalRecords (integer) --
The total number of records in the response.
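The iterator types and lastEventId described above suggest a simple way to page through the change log: start at TRIM_HORIZON, then resume with AFTER_SEQUENCE_NUMBER from the last event of each batch. The sketch below assumes client is a boto3 neptunedata client and that lastEventId carries its commitNum and opNum as strings that convert cleanly to integers. The generator name read_stream_batches is hypothetical.

```python
def read_stream_batches(client, max_batches=3, limit=100):
    """Yield lists of change records, resuming after the last event of each batch."""
    kwargs = {"iteratorType": "TRIM_HORIZON", "limit": limit}
    for _ in range(max_batches):
        response = client.get_sparql_stream(**kwargs)
        records = response.get("records", [])
        if not records:
            break
        yield records
        last = response["lastEventId"]
        # Continue right after the last event seen in this batch.
        kwargs = {
            "iteratorType": "AFTER_SEQUENCE_NUMBER",
            "commitNum": int(last["commitNum"]),
            "opNum": int(last["opNum"]),
            "limit": limit,
        }
```

On a real cluster, reading past the end of the stream may raise an exception rather than return an empty records list; this sketch does not handle that case and would need a try/except around the call.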
Retrieves the status of a specified openCypher query.
See also: AWS API Documentation
Request Syntax
client.get_open_cypher_query_status( queryId='string' )
string
[REQUIRED]
The unique ID of the openCypher query for which to retrieve the query status.
dict
Response Syntax
{ 'queryId': 'string', 'queryString': 'string', 'queryEvalStats': { 'waited': 123, 'elapsed': 123, 'cancelled': True|False, 'subqueries': {} } }
Response Structure
(dict) --
queryId (string) --
The unique ID of the query for which status is being returned.
queryString (string) --
The openCypher query string.
queryEvalStats (dict) --
The openCypher query evaluation status.
waited (integer) --
Indicates how long the query waited, in milliseconds.
elapsed (integer) --
The number of milliseconds the query has been running so far.
cancelled (boolean) --
Set to TRUE if the query was cancelled, or FALSE otherwise.
subqueries (dict) --
The number of subqueries in this query.
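As an illustration of reading the fields above, the following sketch formats a one-line status report. The helper name describe_query is hypothetical, client is assumed to be a boto3 neptunedata client, and labelling a non-cancelled query as "active" is a simplification made here.

```python
def describe_query(client, query_id):
    """Return a short human-readable status line for an openCypher query."""
    status = client.get_open_cypher_query_status(queryId=query_id)
    stats = status["queryEvalStats"]
    state = "cancelled" if stats.get("cancelled") else "active"
    return (
        f"{status['queryId']}: {state}, "
        f"waited {stats['waited']} ms, elapsed {stats['elapsed']} ms"
    )
```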
Executes an openCypher query. See Accessing the Neptune Graph with openCypher for more information.
Neptune supports building graph applications using openCypher, which is currently one of the most popular query languages among developers working with graph databases. Developers, business analysts, and data scientists like openCypher's declarative, SQL-inspired syntax because it provides a familiar structure in which to query property graphs.
The openCypher language was originally developed by Neo4j, then open-sourced in 2015 and contributed to the openCypher project under an Apache 2 open-source license.
See also: AWS API Documentation
Request Syntax
client.execute_open_cypher_query( openCypherQuery='string', parameters='string' )
string
[REQUIRED]
The openCypher query string to be executed.
string
The openCypher query parameters for query execution. See Examples of openCypher parameterized queries for more information.
dict
Response Syntax
{ 'results': {} }
Response Structure
(dict) --
results (dict) --
The openCypher query results.
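Because parameters is passed as a string, a parameterized query can be sketched as below. This assumes the string is a JSON-encoded map of parameter names to values, which is not stated in the text above, and that client is a boto3 neptunedata client. The helper name run_parameterized_query is hypothetical.

```python
import json


def run_parameterized_query(client, query, params=None):
    """Run an openCypher query, JSON-encoding params when provided, and return 'results'."""
    kwargs = {"openCypherQuery": query}
    if params:
        # Assumption: the service expects the parameter map serialized as JSON text.
        kwargs["parameters"] = json.dumps(params)
    response = client.execute_open_cypher_query(**kwargs)
    return response["results"]
```

A call might look like run_parameterized_query(client, "MATCH (n {name: $name}) RETURN n", {"name": "example"}), where the query and parameter name are purely illustrative.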
Executes a Gremlin Explain query.
Amazon Neptune has added a Gremlin feature named explain that provides a self-service tool for understanding the execution approach being taken by the Neptune engine for the query. You invoke it by adding an explain parameter to an HTTP call that submits a Gremlin query.
The explain feature provides information about the logical structure of query execution plans. You can use this information to identify potential evaluation and execution bottlenecks and to tune your query, as explained in Tuning Gremlin queries . You can also use query hints to improve query execution plans.
See also: AWS API Documentation
Request Syntax
client.execute_gremlin_explain_query( gremlinQuery='string' )
string
[REQUIRED]
The Gremlin explain query string.
dict
Response Syntax
{ 'output': StreamingBody() }
Response Structure
(dict) --
output (:class:`.StreamingBody`) --
A text blob containing the Gremlin explain result, as described in Tuning Gremlin queries .
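Since output is a StreamingBody rather than a plain string, the explain text has to be read from it. A minimal sketch, assuming client is a boto3 neptunedata client and that the blob is UTF-8 text (the encoding is an assumption). The helper name explain_gremlin is hypothetical.

```python
def explain_gremlin(client, query):
    """Return the Gremlin explain report for a query as a decoded string."""
    response = client.execute_gremlin_explain_query(gremlinQuery=query)
    # StreamingBody exposes read(), which returns the raw bytes of the body.
    raw = response["output"].read()
    return raw.decode("utf-8")
```

The body can only be read once, so keep the returned string if the report is needed again.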
Gets a graph summary for a property graph.
See also: AWS API Documentation
Request Syntax
client.get_propertygraph_summary( mode='basic'|'detailed' )
string
Mode can take one of two values: BASIC (the default), and DETAILED .
dict
Response Syntax
{ 'statusCode': 123, 'payload': { 'version': 'string', 'lastStatisticsComputationTime': datetime(2015, 1, 1), 'graphSummary': { 'numNodes': 123, 'numEdges': 123, 'numNodeLabels': 123, 'numEdgeLabels': 123, 'nodeLabels': [ 'string', ], 'edgeLabels': [ 'string', ], 'numNodeProperties': 123, 'numEdgeProperties': 123, 'nodeProperties': [ { 'string': 123 }, ], 'edgeProperties': [ { 'string': 123 }, ], 'totalNodePropertyValues': 123, 'totalEdgePropertyValues': 123, 'nodeStructures': [ { 'count': 123, 'nodeProperties': [ 'string', ], 'distinctOutgoingEdgeLabels': [ 'string', ] }, ], 'edgeStructures': [ { 'count': 123, 'edgeProperties': [ 'string', ] }, ] } } }
Response Structure
(dict) --
statusCode (integer) --
The HTTP return code of the request. If the request succeeded, the code is 200.
payload (dict) --
Payload containing the property graph summary response.
version (string) --
The version of this graph summary response.
lastStatisticsComputationTime (datetime) --
The timestamp, in ISO 8601 format, of the time at which Neptune last computed statistics.
graphSummary (dict) --
The graph summary.
numNodes (integer) --
The number of nodes in the graph.
numEdges (integer) --
The number of edges in the graph.
numNodeLabels (integer) --
The number of distinct node labels in the graph.
numEdgeLabels (integer) --
The number of distinct edge labels in the graph.
nodeLabels (list) --
A list of the distinct node labels in the graph.
(string) --
edgeLabels (list) --
A list of the distinct edge labels in the graph.
(string) --
numNodeProperties (integer) --
The number of distinct node properties in the graph.
numEdgeProperties (integer) --
The number of distinct edge properties in the graph.
nodeProperties (list) --
A list of the distinct node properties in the graph, along with the count of nodes where each property is used.
(dict) --
(string) --
(integer) --
edgeProperties (list) --
A list of the distinct edge properties in the graph, along with the count of edges where each property is used.
(dict) --
(string) --
(integer) --
totalNodePropertyValues (integer) --
The total number of usages of all node properties.
totalEdgePropertyValues (integer) --
The total number of usages of all edge properties.
nodeStructures (list) --
This field is only present when the requested mode is DETAILED . It contains a list of node structures.
(dict) --
A node structure.
count (integer) --
Number of nodes that have this specific structure.
nodeProperties (list) --
A list of the node properties present in this specific structure.
(string) --
distinctOutgoingEdgeLabels (list) --
A list of distinct outgoing edge labels present in this specific structure.
(string) --
edgeStructures (list) --
This field is only present when the requested mode is DETAILED . It contains a list of edge structures.
(dict) --
An edge structure.
count (integer) --
The number of edges that have this specific structure.
edgeProperties (list) --
A list of edge properties present in this specific structure.
(string) --
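To show how the summary fields above fit together, here is a small sketch that condenses the response into a flat dict. The helper name summarize_property_graph and the keys of the returned dict are hypothetical, and client is assumed to be a boto3 neptunedata client.

```python
def summarize_property_graph(client, mode="basic"):
    """Return a condensed view of the property graph summary."""
    response = client.get_propertygraph_summary(mode=mode)
    summary = response["payload"]["graphSummary"]
    node_props = {}
    for entry in summary.get("nodeProperties", []):
        # Each entry maps one property name to the number of nodes using it.
        node_props.update(entry)
    return {
        "nodes": summary["numNodes"],
        "edges": summary["numEdges"],
        "nodeLabels": summary.get("nodeLabels", []),
        "nodePropertyCounts": node_props,
        # nodeStructures is only present when the requested mode is detailed.
        "hasStructures": "nodeStructures" in summary,
    }
```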