2024/04/02 - 6 updated api methods
Changes Adding View related fields to responses of read-only Table APIs.
2024/02/05 - 2 updated api methods
Changes Introduce Catalog Encryption Role within Glue Data Catalog Settings. Introduce SASL/PLAIN as an authentication method for Glue Kafka connections
2023/12/22 - 3 updated api methods
Changes This release adds additional configurations for Query Session Context on the following APIs: GetUnfilteredTableMetadata, GetUnfilteredPartitionMetadata, GetUnfilteredPartitionsMetadata.
2023/11/30 - 2 updated api methods
Changes Adds observation and analyzer support to the GetDataQualityResult and BatchGetDataQualityResult APIs.
2023/11/16 - 5 new api methods
Changes Introduces new column statistics APIs to support statistics generation for tables within the Glue Data Catalog.
2023/11/14 - 6 new api methods
Changes Introduces new storage optimization APIs to support automatic compaction of Apache Iceberg tables.
2023/11/02 - 5 updated api methods
Changes This release introduces Google BigQuery Source and Target in AWS Glue CodeGenConfigurationNode.
2023/10/12 - 7 updated api methods
Changes Extending version control support to GitLab and Bitbucket from AWS Glue
2023/08/24 - 3 updated api methods
Changes Added API attributes that help in the monitoring of sessions.
2023/08/15 - 4 updated api methods
Changes AWS Glue Crawlers can now accept SerDe overrides from a custom CSV classifier. The two SerDe options are LazySimpleSerDe and OpenCSVSerDe. To let the crawler make the selection, choose "None".
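As a rough illustration of the three SerDe choices above, the sketch below builds the keyword arguments for a CSV classifier request without calling AWS. The helper name `csv_classifier_request` is hypothetical, and the `Serde` and `Delimiter` field names are assumptions that should be checked against the Glue API reference before use.

```python
# Hypothetical helper: only builds a request dictionary, never contacts AWS.
# The three allowed values come from the release note above.
ALLOWED_SERDES = {"LazySimpleSerDe", "OpenCSVSerDe", "None"}


def csv_classifier_request(name, serde="None", delimiter=","):
    """Return keyword arguments shaped like a CSV classifier creation request.

    serde="None" leaves the choice of SerDe to the crawler.
    """
    if serde not in ALLOWED_SERDES:
        raise ValueError(f"unsupported SerDe override: {serde!r}")
    return {
        "CsvClassifier": {
            "Name": name,
            "Delimiter": delimiter,
            "Serde": serde,  # assumed field name for the override
        }
    }


request = csv_classifier_request("orders-csv", serde="OpenCSVSerDe")
```

With the AWS SDK for Python, a dictionary like `request` would typically be unpacked into the Glue client's `create_classifier` call.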
2023/07/26 - 5 updated api methods
Changes Release Glue Studio Snowflake Connector Node for SDK/CLI
2023/07/24 - 5 updated api methods
Changes Added support for Data Preparation Recipe node in Glue Studio jobs
2023/07/21 - 5 updated api methods
Changes This release adds support for AWS Glue Crawler with Apache Hudi Tables, allowing Crawlers to discover Hudi Tables in S3 and register them in Glue Data Catalog for query engines to query against.
2023/07/17 - 3 updated api methods
Changes Adding new supported permission type flags to get-unfiltered endpoints that callers may pass to indicate support for enforcing Lake Formation fine-grained access control on nested column attributes.
2023/07/07 - 1 updated api methods
Changes This release enables customers to create new Apache Iceberg tables and associated metadata in Amazon S3 by using native AWS Glue CreateTable operation.
2023/06/29 - 5 updated api methods
Changes This release adds support for AWS Glue Crawler with Iceberg Tables, allowing Crawlers to discover Iceberg Tables in S3 and register them in Glue Data Catalog for query engines to query against.
2023/06/26 - 5 updated api methods
Changes Adds a timestamp starting position for Kinesis and Kafka data sources in a Glue streaming job
2023/06/19 - 12 updated api methods
Changes This release adds support for creating cross region table/database resource links
2023/05/30 - 21 updated api methods
Changes Added Runtime parameter to allow selection of Ray Runtime
2023/05/25 - 12 updated api methods
Changes Added ability to create data quality rulesets for shared, cross-account Glue Data Catalog tables. Added support for dataset comparison rules through a new parameter called AdditionalDataSources. Enhanced the data quality results with a map containing profiled metric values.
2023/05/16 - 2 updated api methods
Changes Add Support for Tags for Custom Entity Types
2023/05/09 - 5 updated api methods
Changes This release adds AmazonRedshift Source and Target nodes in addition to DynamicTransform OutputSchemas
2023/05/08 - 21 updated api methods
Changes No release notes were provided for this update.
2023/04/03 - 10 updated api methods
Changes Add support for database-level federation
2023/02/17 - 5 updated api methods
Changes Release of Delta Lake Data Lake Format for Glue Studio Service
2023/02/15 - 5 updated api methods
Changes Fix DirectJDBCSource not showing up in CLI code gen
2023/02/08 - 5 updated api methods
Changes DirectJDBCSource + Glue 4.0 streaming options
2023/01/19 - 5 updated api methods
Changes Release Glue Studio Hudi Data Lake Format for SDK/CLI
2022/12/15 - 5 updated api methods
Changes This release adds support for AWS Glue Crawler with native DeltaLake tables, allowing Crawlers to classify Delta Lake format tables and catalog them for query engines to query against.
2022/11/30 - 16 new 8 updated api methods
Changes This release adds support for AWS Glue Data Quality, which helps you evaluate and monitor the quality of your data and includes the API for creating, deleting, or updating data quality rulesets, runs and evaluations.
2022/11/29 - 5 updated api methods
Changes This release allows Custom Visual Transforms (Dynamic Transforms) to be created via the AWS Glue CLI/SDK.
2022/11/18 - 5 updated api methods
Changes AWS Glue Crawler: adds support for table-level and column-level comments with database-level datatypes for JDBC-based crawlers.
2022/10/27 - 4 updated api methods
Changes Added support for custom datatypes when using custom csv classifier.
2022/10/05 - 2 new 5 updated api methods
Changes This SDK release adds support for syncing Glue jobs with a source control provider. Additionally, a new parameter called SourceControlDetails is added to the Job model.
2022/09/22 - 5 updated api methods
Changes Added support for S3 Event Notifications for Catalog Target Crawlers.
2022/08/08 - 17 updated api methods
Changes Add an option to run non-urgent or non-time sensitive Glue Jobs on spare capacity
2022/07/14 - 21 updated api methods
Changes This release adds an additional worker type for Glue Streaming jobs.
2022/06/30 - 1 updated api methods
Changes This release adds tag as an input of CreateDatabase
2022/06/24 - 1 new api methods
Changes This release enables the new ListCrawls API for viewing the AWS Glue Crawler run history.
2022/05/17 - 5 updated api methods
Changes This release adds a new optional parameter called codeGenNodeConfiguration to CRUD job APIs that allows users to manage visual jobs via APIs. The updated CreateJob and UpdateJob will create jobs that can be viewed in Glue Studio as a visual graph. GetJob can be used to get codeGenNodeConfiguration.
2022/04/21 - 5 new api methods
Changes This release adds APIs to create, read, delete, list, and batch read Glue custom entity types
2022/04/14 - 6 updated api methods
Changes Adds Auto Scaling for jobs on Glue version 3.0 and later to dynamically scale compute resources. This SDK change provides customers with the auto-scaled DPU usage.
2022/03/18 - 9 new 3 updated api methods
Changes Added 9 new APIs for AWS Glue Interactive Sessions: ListSessions, StopSession, CreateSession, GetSession, DeleteSession, RunStatement, GetStatement, ListStatements, CancelStatement
2022/02/16 - 7 updated api methods
Changes Support for optimistic locking in UpdateTable
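Optimistic locking in general means an update succeeds only if the caller still holds the version it last read. The sketch below shows that idea with an in-memory store; `TableStore`, `VersionConflict`, and the `expected_version` argument are hypothetical names for illustration and are not the Glue API.

```python
class VersionConflict(Exception):
    """Raised when the caller's version is older than the stored one."""


class TableStore:
    """In-memory stand-in for a catalog; illustrative only."""

    def __init__(self):
        self._tables = {}  # table name -> (version, definition)

    def create(self, name, definition):
        self._tables[name] = (1, dict(definition))
        return 1

    def get(self, name):
        version, definition = self._tables[name]
        return version, dict(definition)

    def update(self, name, definition, expected_version):
        current_version, _ = self._tables[name]
        # Reject the write if someone else updated the table in between.
        if expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._tables[name] = (new_version, dict(definition))
        return new_version


store = TableStore()
store.create("orders", {"columns": ["id"]})

# Two writers read the same version, then both try to update.
version_a, definition_a = store.get("orders")
version_b, definition_b = store.get("orders")

definition_a["columns"].append("total")
store.update("orders", definition_a, expected_version=version_a)  # succeeds

try:
    store.update("orders", definition_b, expected_version=version_b)
    second_write_rejected = False
except VersionConflict:
    second_write_rejected = True  # stale version, so the writer must re-read
```

The losing writer would normally re-read the table, reapply its change, and retry with the new version.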
2022/02/02 - 5 updated api methods
Changes Launch Protobuf support for AWS Glue Schema Registry
2022/01/13 - 1 updated api methods
Changes This SDK release adds support to pass run properties when starting a workflow run
2022/01/05 - 3 new 19 updated api methods
Changes Add Delta Lake target support for Glue Crawler and 3rd Party Support for Lake Formation
2021/11/30 - 7 updated api methods
Changes Support for DataLake transactions
2021/10/15 - 5 updated api methods
Changes Enable S3 event-based crawler API.
2021/10/05 - 1 updated api methods
Changes This release adds tag as an input of CreateConnection
2021/08/23 - 9 new 2 updated api methods
Changes Add support for Custom Blueprints
2021/07/14 - 9 updated api methods
Changes Add support for Event Driven Workflows
2021/06/28 - 5 updated api methods
Changes Add JSON Support for Glue Schema Registry
2021/06/07 - 5 updated api methods
Changes Add SampleSize variable to S3Target to enable s3-sampling feature through API.
2021/03/29 - 1 updated api methods
Changes Allow Dots in Registry and Schema Names for CreateRegistry, CreateSchema; Fixed issue when duplicate keys are present and not returned as part of QuerySchemaVersionMetadata.
2021/02/23 - 1 updated api methods
Changes Updating the page size for Glue catalog getter APIs.
2020/12/22 - 2 updated api methods
Changes AWS Glue Find Matches machine learning transforms now support column importance scores.
2020/12/21 - 4 updated api methods
Changes Add 4 connection properties: SECRET_ID, CONNECTOR_URL, CONNECTOR_TYPE, CONNECTOR_CLASS_NAME. Add two connection types: MARKETPLACE, CUSTOM
2020/11/23 - 2 new 6 updated api methods
Changes Feature1 - Glue crawler adds data lineage configuration option. Feature2 - AWS Glue Data Catalog adds APIs for PartitionIndex creation and deletion as part of Enhancement Partition Management feature.
2020/11/19 - 20 new 14 updated api methods
Changes Adding support for Glue Schema Registry. The AWS Glue Schema Registry is a new feature that allows you to centrally discover, control, and evolve data stream schemas.
2020/10/27 - 3 updated api methods
Changes AWS Glue machine learning transforms now support encryption-at-rest for labels and trained models.
2020/10/21 - 5 updated api methods
Changes AWS Glue crawlers now support incremental crawls for the Amazon Simple Storage Service (Amazon S3) data source.
2020/10/05 - 5 updated api methods
Changes AWS Glue crawlers now support Amazon DocumentDB (with MongoDB compatibility) and MongoDB collections. You can choose to crawl the entire data set or only a small sample to reduce crawl time.
2020/10/01 - 1 updated api methods
Changes Adding additional optional map parameter to get-plan api
2020/09/21 - 1 new api methods
Changes Adding support to update multiple partitions of a table in a single request
2020/09/09 - 1 new 1 updated api methods
Changes Adding support for partitionIndexes to improve GetPartitions performance.
2020/08/10 - 6 updated api methods
Changes Starting today, you can further control orchestration of your ETL workloads in AWS Glue by specifying the maximum number of concurrent runs for a Glue workflow.
2020/08/07 - 9 updated api methods
Changes AWS Glue now adds support for Network connection type enabling you to access resources inside your VPC using Glue crawlers and Glue ETL jobs.
2020/07/27 - 1 new 4 updated api methods
Changes Add ability to manually resume workflows in AWS Glue providing customers further control over the orchestration of ETL workloads.
2020/07/07 - 1 new 19 updated api methods
Changes AWS Glue Data Catalog supports cross account sharing of tables through AWS Lake Formation
2020/06/25 - 6 new api methods
Changes This release adds new APIs to support column level statistics in AWS Glue Data Catalog
2020/06/12 - 5 updated api methods
Changes You can now choose to crawl the entire table or just a sample of records in DynamoDB when using AWS Glue crawlers. Additionally, you can also specify a scanning rate for crawling DynamoDB tables.
2020/06/03 - 2 updated api methods
Changes Adding databaseName in the response for GetUserDefinedFunctions() API.
2020/05/15 - 1 new 9 updated api methods
Changes Starting today, you can stop the execution of Glue workflows that are running. AWS Glue workflows are directed acyclic graphs (DAGs) of Glue triggers, crawlers and jobs. Using a workflow, you can design a complex multi-job extract, transform, and load (ETL) activity that AWS Glue can execute and track as a single entity.
2020/04/20 - 4 updated api methods
Changes Added a new ConnectionType "KAFKA" and a ConnectionProperty "KAFKA_BOOTSTRAP_SERVERS" to support Kafka connection.
2020/03/31 - 4 updated api methods
Changes Add two enums for MongoDB connection: Added "CONNECTION_URL" to "ConnectionPropertyKey" and added "MONGODB" to "ConnectionType"
2020/02/28 - 1 new 1 updated api methods
Changes AWS Glue adds resource tagging support for Machine Learning Transforms and adds a new API, ListMLTransforms to support tag filtering. With this feature, customers can use tags in AWS Glue to organize and control access to Machine Learning Transforms.
2020/02/12 - 5 updated api methods
Changes Adding ability to add arguments that cannot be overridden to AWS Glue jobs
2019/11/21 - 4 updated api methods
Changes This release adds support for Glue 1.0 compatible ML Transforms.
2019/09/19 - 4 updated api methods
Changes AWS Glue DevEndpoints now support GlueVersion, enabling you to choose Apache Spark 2.4.3 (in addition to Apache Spark 2.2.1). In addition to supporting the latest version of Spark, you will also have the ability to choose between Python 2 and Python 3.
2019/08/08 - 13 new 16 updated api methods
Changes You can now use AWS Glue to find matching records across a dataset, even without identifiers to join on, by using the new FindMatches ML Transform. Find related products, places, suppliers, customers, and more by teaching a custom machine learning transformation to identify matching records as part of your analysis, data cleaning, or master data management project, then adding the FindMatches transformation to your Glue ETL Jobs. If your problem is more along the lines of deduplication, you can use FindMatches in much the same way to identify customers who have signed up more than once, products that have accidentally been added to your product catalog more than once, and so forth. Using the FindMatches MLTransform, you can teach a Transform your definition of a duplicate through examples, and it will use machine learning to identify other potential duplicates in your dataset. As with data integration, you can then use your new Transform in your deduplication projects by adding the FindMatches transformation to your Glue ETL Jobs. This release also contains additional APIs that support AWS Lake Formation.
2019/07/26 - 2 new 1 updated api methods
Changes This release provides GetJobBookmark and GetJobBookmarks APIs. These APIs enable users to look at specific versions or all versions of the JobBookmark for a specific job. This release also enables resetting the job bookmark to a specific run via an enhancement of the ResetJobBookmark API.
2019/07/24 - 16 updated api methods
Changes This release provides GlueVersion option for Job APIs and WorkerType option for DevEndpoint APIs. Job APIs enable users to pick specific GlueVersion for a specific job and pin the job to a specific runtime environment. DevEndpoint APIs enable users to pick different WorkerType for memory intensive workload.
2019/06/20 - 11 new 5 updated api methods
Changes Starting today, you can use workflows in AWS Glue to author directed acyclic graphs (DAGs) of Glue triggers, crawlers and jobs. Workflows enable orchestration of your ETL workloads by building dependencies between Glue entities (triggers, crawlers and jobs). You can visually track the status of the different nodes in a workflow on the console, making it easier to monitor progress and troubleshoot issues. Also, you can share parameters across entities in the workflow.
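To make the DAG idea concrete, here is a minimal sketch that derives a valid run order from dependencies between entities. It uses Python's standard `graphlib` module rather than any Glue API, and the node names (`crawl_raw`, `clean_job`, and so on) are invented for illustration.

```python
from graphlib import TopologicalSorter

# Each key is a workflow node; its value is the set of nodes that must
# finish before it can start. All names here are hypothetical.
workflow = {
    "crawl_raw": set(),
    "clean_job": {"crawl_raw"},
    "aggregate_job": {"clean_job"},
    "report_job": {"clean_job"},
}


def run_order(dependencies):
    """Return one valid execution order for an acyclic dependency graph."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(workflow)
```

Because `aggregate_job` and `report_job` both depend only on `clean_job`, either may come first in `order`; a graph containing a cycle would raise `graphlib.CycleError` instead.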
2019/06/05 - 5 updated api methods
Changes Support specifying python version for Python shell jobs. A new parameter PythonVersion is added to the JobCommand data type.
2019/05/10 - 5 updated api methods
Changes AWS Glue now supports specifying existing catalog tables for a crawler to examine as a data source. A new parameter CatalogTargets is added to the CrawlerTargets data type.
2019/04/05 - 8 updated api methods
Changes AWS Glue now supports workerType choices in the CreateJob, UpdateJob, and StartJobRun APIs, to be used for memory-intensive jobs.
2019/03/26 - 4 updated api methods
Changes This new feature allows customers to add a customized CSV classifier with the classifier API. They can specify a custom delimiter and quote symbol, and control other behavior they'd like crawlers to have while recognizing CSV files.
2019/03/11 - 5 updated api methods
Changes CreateDevEndpoint and UpdateDevEndpoint now support Arguments to configure the DevEndpoint.
2019/02/22 - 11 new 4 updated api methods
Changes AWS Glue adds support for assigning AWS resource tags to jobs, triggers, development endpoints, and crawlers. Each tag consists of a key and an optional value, both of which you define. With this capability, you can use tags in AWS Glue to easily organize and identify your resources, create cost allocation reports, and control access to resources.
2019/01/18 - 7 updated api methods
Changes AllocatedCapacity field is being deprecated and replaced with MaxCapacity field
2018/12/12 - 4 updated api methods
Changes API Update for Glue: this update enables encryption of password inside connection objects stored in AWS Glue Data Catalog using DataCatalogEncryptionSettings. In addition, a new "HidePassword" flag is added to GetConnection and GetConnections to return connections without passwords.
2018/10/16 - 3 new api methods
Changes New Glue APIs for creating, updating, reading and deleting Data Catalog resource-based policies.
2018/09/26 - 1 new api methods
Changes AWS Glue now supports data encryption at rest for ETL jobs and development endpoints. With encryption enabled, when you run ETL jobs, or development endpoints, Glue will use AWS KMS keys to write encrypted data at rest. You can also encrypt the metadata stored in the Glue Data Catalog using keys that you manage with AWS KMS. Additionally, you can use AWS KMS keys to encrypt the logs generated by crawlers and ETL jobs as well as encrypt ETL job bookmarks. Encryption settings for Glue crawlers, ETL jobs, and development endpoints can be configured using the security configurations in Glue. Glue Data Catalog encryption can be enabled via the settings for the Glue Data Catalog.
2018/08/28 - 3 new api methods
Changes New Glue APIs for creating, updating, reading and deleting Data Catalog resource-based policies.
2018/08/25 - 5 new 18 updated api methods
Changes AWS Glue now supports data encryption at rest for ETL jobs and development endpoints. With encryption enabled, when you run ETL jobs, or development endpoints, Glue will use AWS KMS keys to write encrypted data at rest. You can also encrypt the metadata stored in the Glue Data Catalog using keys that you manage with AWS KMS. Additionally, you can use AWS KMS keys to encrypt the logs generated by crawlers and ETL jobs as well as encrypt ETL job bookmarks. Encryption settings for Glue crawlers, ETL jobs, and development endpoints can be configured using the security configurations in Glue. Glue Data Catalog encryption can be enabled via the settings for the Glue Data Catalog.
2018/07/30 - 4 updated api methods
Changes Glue Development Endpoints now support association of multiple SSH public keys with a development endpoint.
2018/07/10 - 6 updated api methods
Changes AWS Glue adds the ability to crawl DynamoDB tables.
2018/05/25 - 11 updated api methods
Changes AWS Glue now sends a delay notification to Amazon CloudWatch Events when an ETL job runs longer than the specified delay notification threshold.
2018/04/10 - 11 updated api methods
Changes AWS Glue now supports timeout values for ETL jobs. With this release, all new ETL jobs have a default timeout value of 48 hours. AWS Glue also now supports the ability to start a schedule or job events trigger when it is created.
2018/03/20 - 2 updated api methods
Changes API Updates for DevEndpoint: PublicKey is now optional for CreateDevEndpoint. The new DevEndpoint field PrivateAddress will be populated for DevEndpoints associated with a VPC.
2018/02/06 - 4 updated api methods
Changes This new feature allows customers to add a customized JSON classifier. They can specify a JSON path to indicate the object, array or field of the JSON documents they'd like crawlers to inspect when they crawl JSON files.
2018/01/19 - 3 new 1 updated api methods
Changes New AWS Glue DataCatalog APIs to manage table versions and a new feature to skip archiving of the old table version when updating a table.
2018/01/12 - 6 updated api methods
Changes Support is added to generate ETL scripts in Scala which can now be run by AWS Glue ETL jobs. In addition, the trigger API now supports firing when any conditions are met (in addition to all conditions). Also, jobs can be triggered based on a "failed" or "stopped" job run (in addition to a "succeeded" job run).
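The difference between firing on all conditions and firing on any condition can be sketched as a small evaluation function. This is an illustration of the logic only, not Glue's implementation; the function `predicate_fires` is hypothetical, and the `"AND"`/`"ANY"` values and the `JobName`/`State` keys are assumptions modeled on the trigger predicate structure.

```python
def predicate_fires(logical, conditions, job_states):
    """Decide whether a trigger would fire.

    logical    -- "AND" (all conditions must hold) or "ANY" (one is enough)
    conditions -- list of {"JobName": ..., "State": ...} dictionaries
    job_states -- mapping of job name to the state of its latest run
    """
    results = [
        job_states.get(condition["JobName"]) == condition["State"]
        for condition in conditions
    ]
    if not results:
        return False  # a trigger with no conditions never fires in this sketch
    if logical == "AND":
        return all(results)
    if logical == "ANY":
        return any(results)
    raise ValueError(f"unknown logical operator: {logical!r}")


# Hypothetical jobs: one succeeded, one failed.
states = {"load_orders": "SUCCEEDED", "load_users": "FAILED"}
conditions = [
    {"JobName": "load_orders", "State": "SUCCEEDED"},
    {"JobName": "load_users", "State": "SUCCEEDED"},
]

fires_on_all = predicate_fires("AND", conditions, states)
fires_on_any = predicate_fires("ANY", conditions, states)

# A condition can also watch for a failed run instead of a succeeded one.
fires_on_failure = predicate_fires(
    "AND", [{"JobName": "load_users", "State": "FAILED"}], states
)
```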
2017/11/16 - 8 updated api methods
Changes API update for AWS Glue. New crawler configuration attribute enables customers to specify crawler behavior. New XML classifier enables classification of XML data.
2017/10/24 - 1 new 4 updated api methods
Changes AWS Glue: Adding a new API, BatchStopJobRun, to stop one or more job runs for a specified Job.
2017/08/14 - 74 new api methods
Changes AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL. AWS Glue generates the code to execute your data transformations and data loading processes. AWS Glue generates Python code that is entirely customizable, reusable, and portable. Once your ETL job is ready, you can schedule it to run on AWS Glue's fully managed, scale-out Spark environment. AWS Glue provides a flexible scheduler with dependency resolution, job monitoring, and alerting. AWS Glue is serverless, so there is no infrastructure to buy, set up, or manage. It automatically provisions the environment needed to complete the job, and customers pay only for the compute resources consumed while running ETL jobs. With AWS Glue, data can be available for analytics in minutes.