2020/03/27 - AWSKendraFrontendService - 4 updated api methods
Changes The Amazon Kendra Microsoft SharePoint data source now supports include and exclude regular expressions and the change log feature. Include and exclude regular expressions let you provide a list of regular expressions that are matched against the display URL of SharePoint documents to include or exclude those documents, respectively. When you enable the change log feature, Amazon Kendra uses the SharePoint change log to determine which documents to update in the index.
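The include and exclude behavior described above can be sketched locally before any patterns are sent to the service. This is only an illustration, not the service's implementation: the helper name is_indexed and the example URLs are hypothetical, and it assumes that a pattern matches when it is found anywhere in the display URL and that an empty inclusion list includes everything.

```python
import re
from typing import Iterable


def is_indexed(display_url: str,
               inclusion_patterns: Iterable[str] = (),
               exclusion_patterns: Iterable[str] = ()) -> bool:
    """Return True if a document with this display URL would be kept.

    Assumptions (not confirmed by the API description): a pattern matches
    when re.search finds it anywhere in the URL, and an empty inclusion
    list means "include everything".
    """
    inclusion = list(inclusion_patterns)
    exclusion = list(exclusion_patterns)
    # A document that matches both lists is not indexed, so check exclusion first.
    if any(re.search(pattern, display_url) for pattern in exclusion):
        return False
    if inclusion:
        return any(re.search(pattern, display_url) for pattern in inclusion)
    return True


# Hypothetical display URLs used only for illustration.
urls = [
    "https://example.sharepoint.com/sites/hr/Shared Documents/policy.docx",
    "https://example.sharepoint.com/sites/hr/Shared Documents/drafts/policy-v2.docx",
    "https://example.sharepoint.com/sites/eng/Shared Documents/design.pdf",
]
kept = [u for u in urls if is_indexed(u, [r"/sites/hr/"], [r"/drafts/"])]
```

With these inputs only the first URL is kept: the second matches both an inclusion and an exclusion pattern, and the third matches no inclusion pattern.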
{'Configuration': {'SharePointConfiguration': {'ExclusionPatterns': ['string'], 'InclusionPatterns': ['string'], 'UseChangeLog': 'boolean'}}}
Creates a data source that you use with an Amazon Kendra index.
You specify a name, connector type and description for your data source. You can choose between an S3 connector, a SharePoint Online connector, and a database connector.
You also specify configuration information such as document metadata (author, source URI, and so on) and user context information.
CreateDataSource is a synchronous operation. The operation returns 200 if the data source was successfully created. Otherwise, an exception is raised.
See also: AWS API Documentation
Request Syntax
client.create_data_source( Name='string', IndexId='string', Type='S3'|'SHAREPOINT'|'DATABASE', Configuration={ 'S3Configuration': { 'BucketName': 'string', 'InclusionPrefixes': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'DocumentsMetadataConfiguration': { 'S3Prefix': 'string' }, 'AccessControlListConfiguration': { 'KeyPath': 'string' } }, 'SharePointConfiguration': { 'SharePointVersion': 'SHAREPOINT_ONLINE', 'Urls': [ 'string', ], 'SecretArn': 'string', 'CrawlAttachments': True|False, 'UseChangeLog': True|False, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DocumentTitleFieldName': 'string' }, 'DatabaseConfiguration': { 'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL', 'ConnectionConfiguration': { 'DatabaseHost': 'string', 'DatabasePort': 123, 'DatabaseName': 'string', 'TableName': 'string', 'SecretArn': 'string' }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'ColumnConfiguration': { 'DocumentIdColumnName': 'string', 'DocumentDataColumnName': 'string', 'DocumentTitleColumnName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ChangeDetectingColumns': [ 'string', ] }, 'AclConfiguration': { 'AllowedGroupsColumnName': 'string' } } }, Description='string', Schedule='string', RoleArn='string' )
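As a rough sketch of how the request syntax above might be filled in for a SharePoint Online data source that uses the new UseChangeLog, InclusionPatterns, and ExclusionPatterns fields: the helper names, the regular expressions, and all argument values are hypothetical placeholders. The client argument is expected to be a boto3 Kendra client, for example boto3.client("kendra").

```python
def sharepoint_data_source_request(index_id, name, site_urls, secret_arn, role_arn):
    """Build keyword arguments for client.create_data_source (SharePoint Online)."""
    return {
        "Name": name,
        "IndexId": index_id,
        "Type": "SHAREPOINT",
        "RoleArn": role_arn,
        "Configuration": {
            "SharePointConfiguration": {
                "SharePointVersion": "SHAREPOINT_ONLINE",
                "Urls": list(site_urls),
                "SecretArn": secret_arn,
                "CrawlAttachments": True,
                # Fields added in this release; the patterns are examples only.
                "UseChangeLog": True,
                "InclusionPatterns": [r".*/Shared Documents/.*"],
                "ExclusionPatterns": [r".*/drafts/.*"],
            }
        },
    }


def create_sharepoint_data_source(client, **request_args):
    """Call CreateDataSource and return the Id of the new data source."""
    request = sharepoint_data_source_request(**request_args)
    response = client.create_data_source(**request)
    return response["Id"]


# Placeholder values; replace them with identifiers from your own account.
request = sharepoint_data_source_request(
    index_id="index-id-placeholder",
    name="hr-sharepoint",
    site_urls=["https://example.sharepoint.com/sites/hr"],
    secret_arn="arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
    role_arn="arn:aws:iam::123456789012:role/example-kendra-role",
)
```

Keeping the request construction separate from the call makes it possible to inspect or log the configuration before anything is sent.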
Name (string) -- [REQUIRED]
A unique name for the data source. A data source name can't be changed without deleting and recreating the data source.
IndexId (string) -- [REQUIRED]
The identifier of the index that should be associated with this data source.
Type (string) -- [REQUIRED]
The type of repository that contains the data source.
Configuration (dict) -- [REQUIRED]
The connector configuration information that is required to access the repository.
S3Configuration (dict) --
Provides information to create a connector for a document repository in an Amazon S3 bucket.
BucketName (string) -- [REQUIRED]
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
(string) --
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia .
(string) --
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the AWS S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source.
KeyPath (string) --
Path to the AWS S3 bucket that contains the ACL files.
SharePointConfiguration (dict) --
Provides information necessary to create a connector for a Microsoft SharePoint site.
SharePointVersion (string) -- [REQUIRED]
The version of Microsoft SharePoint that you are using as a data source.
Urls (list) -- [REQUIRED]
The URLs of the Microsoft SharePoint site that contains the documents that should be indexed.
(string) --
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Microsoft SharePoint Data Source . For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
CrawlAttachments (boolean) --
TRUE to include attachments to documents stored in your Microsoft SharePoint site in the index; otherwise, FALSE .
UseChangeLog (boolean) --
Set to TRUE to use the Microsoft SharePoint change log to determine the documents that need to be updated in the index. Depending on the size of the SharePoint change log, using the change log may take Amazon Kendra longer than determining the changed documents with the Amazon Kendra document crawler.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft SharePoint attributes to custom fields in the Amazon Kendra index. You must first create the index fields using the UpdateIndex operation before you map SharePoint attributes. For more information, see Mapping Data Source Fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DatabaseConfiguration (dict) --
Provides information necessary to create a connector for a database.
DatabaseEngineType (string) -- [REQUIRED]
The type of database engine that runs the database.
ConnectionConfiguration (dict) -- [REQUIRED]
The information necessary to connect to a database.
DatabaseHost (string) -- [REQUIRED]
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) -- [REQUIRED]
The port that the database uses for connections.
DatabaseName (string) -- [REQUIRED]
The name of the database containing the document data.
TableName (string) -- [REQUIRED]
The name of the table that contains the document data.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source . For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
ColumnConfiguration (dict) -- [REQUIRED]
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) -- [REQUIRED]
The column that provides the document's unique identifier.
DocumentDataColumnName (string) -- [REQUIRED]
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ChangeDetectingColumns (list) -- [REQUIRED]
One to five columns that indicate when a document in the database has changed.
(string) --
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) -- [REQUIRED]
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query operation.
Description (string) --
A description for the data source.
Schedule (string) --
Sets the frequency at which Amazon Kendra checks the documents in your repository and updates the index. If you don't set a schedule, Amazon Kendra does not periodically update the index. You can call the StartDataSourceSyncJob operation to update the index.
RoleArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of a role with permission to access the data source. For more information, see IAM Roles for Amazon Kendra.
dict
Response Syntax
{ 'Id': 'string' }
Response Structure
(dict) --
Id (string) --
A unique identifier for the data source.
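For the database connector, the nested DatabaseConfiguration structure can be assembled in a similar way. The sketch below is illustrative only: the helper name database_configuration and its parameters are hypothetical, and the local check simply mirrors the "one to five columns" wording for ChangeDetectingColumns rather than any validation the service is confirmed to perform.

```python
def database_configuration(engine, host, port, database, table, secret_arn,
                           id_column, data_column, change_columns,
                           title_column=None):
    """Build the Configuration argument for a DATABASE data source."""
    change_columns = list(change_columns)
    # Mirrors the documented "one to five columns" constraint locally.
    if not 1 <= len(change_columns) <= 5:
        raise ValueError("ChangeDetectingColumns needs one to five column names")

    columns = {
        "DocumentIdColumnName": id_column,
        "DocumentDataColumnName": data_column,
        "ChangeDetectingColumns": change_columns,
    }
    if title_column is not None:
        # Optional field, so it is only sent when a value was given.
        columns["DocumentTitleColumnName"] = title_column

    return {
        "DatabaseConfiguration": {
            "DatabaseEngineType": engine,
            "ConnectionConfiguration": {
                "DatabaseHost": host,
                "DatabasePort": port,
                "DatabaseName": database,
                "TableName": table,
                "SecretArn": secret_arn,
            },
            "ColumnConfiguration": columns,
        }
    }


# Placeholder connection details for illustration.
configuration = database_configuration(
    engine="RDS_MYSQL",
    host="db.example.internal",
    port=3306,
    database="docs",
    table="articles",
    secret_arn="arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
    id_column="id",
    data_column="body",
    change_columns=["updated_at"],
)
```

The resulting dictionary would be passed as Configuration together with Type='DATABASE' in the CreateDataSource call.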
{'ClientToken': 'string'}
Creates a new Amazon Kendra index. Index creation is an asynchronous operation. To determine if index creation has completed, check the Status field returned from a call to DescribeIndex. The Status field is set to ACTIVE when the index is ready to use.
Once the index is active, you can index your documents using the BatchPutDocument operation or using one of the supported data sources.
See also: AWS API Documentation
Request Syntax
client.create_index( Name='string', RoleArn='string', ServerSideEncryptionConfiguration={ 'KmsKeyId': 'string' }, Description='string', ClientToken='string' )
Name (string) -- [REQUIRED]
The name for the new index.
RoleArn (string) -- [REQUIRED]
An IAM role that gives Amazon Kendra permissions to access your Amazon CloudWatch logs and metrics. This is also the role used when you use the BatchPutDocument operation to index documents from an Amazon S3 bucket.
ServerSideEncryptionConfiguration (dict) --
The identifier of the AWS KMS customer managed key (CMK) to use to encrypt data indexed by Amazon Kendra. Amazon Kendra doesn't support asymmetric CMKs.
KmsKeyId (string) --
The identifier of the AWS KMS customer master key (CMK). Amazon Kendra doesn't support asymmetric CMKs.
Description (string) --
A description for the index.
ClientToken (string) --
A token that you provide to identify the request to create an index. Multiple calls to the CreateIndex operation with the same client token will create only one index.
This field is autopopulated if not provided.
dict
Response Syntax
{ 'Id': 'string' }
Response Structure
(dict) --
Id (string) --
The unique identifier of the index. Use this identifier when you query an index, set up a data source, or index a document.
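Because index creation is asynchronous, a caller typically creates the index and then polls its Status. The following is a minimal sketch under a few assumptions: the helper name create_index_and_wait, the polling interval, and the retry limit are arbitrary choices, and treating a FAILED status as terminal is an assumption rather than something stated above. The client argument is expected to be a boto3 Kendra client.

```python
import time
import uuid


def create_index_and_wait(client, name, role_arn,
                          poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Create an index and block until its Status is ACTIVE."""
    # Reusing the same token on a retry should not create a second index.
    token = str(uuid.uuid4())
    response = client.create_index(Name=name, RoleArn=role_arn, ClientToken=token)
    index_id = response["Id"]

    for _ in range(max_polls):
        status = client.describe_index(Id=index_id)["Status"]
        if status == "ACTIVE":
            return index_id
        if status == "FAILED":
            # Assumption: FAILED is terminal, so polling stops here.
            raise RuntimeError(f"index {index_id} failed to create")
        sleep(poll_seconds)

    raise TimeoutError(f"index {index_id} was not ACTIVE after {max_polls} polls")
```

The sleep parameter exists only so that the waiting behavior can be replaced in tests; in normal use the default time.sleep is sufficient.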
{'Configuration': {'SharePointConfiguration': {'ExclusionPatterns': ['string'], 'InclusionPatterns': ['string'], 'UseChangeLog': 'boolean'}}}
Gets information about an Amazon Kendra data source.
See also: AWS API Documentation
Request Syntax
client.describe_data_source( Id='string', IndexId='string' )
Id (string) -- [REQUIRED]
The unique identifier of the data source to describe.
IndexId (string) -- [REQUIRED]
The identifier of the index that contains the data source.
dict
Response Syntax
{ 'Id': 'string', 'IndexId': 'string', 'Name': 'string', 'Type': 'S3'|'SHAREPOINT'|'DATABASE', 'Configuration': { 'S3Configuration': { 'BucketName': 'string', 'InclusionPrefixes': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'DocumentsMetadataConfiguration': { 'S3Prefix': 'string' }, 'AccessControlListConfiguration': { 'KeyPath': 'string' } }, 'SharePointConfiguration': { 'SharePointVersion': 'SHAREPOINT_ONLINE', 'Urls': [ 'string', ], 'SecretArn': 'string', 'CrawlAttachments': True|False, 'UseChangeLog': True|False, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DocumentTitleFieldName': 'string' }, 'DatabaseConfiguration': { 'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL', 'ConnectionConfiguration': { 'DatabaseHost': 'string', 'DatabasePort': 123, 'DatabaseName': 'string', 'TableName': 'string', 'SecretArn': 'string' }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'ColumnConfiguration': { 'DocumentIdColumnName': 'string', 'DocumentDataColumnName': 'string', 'DocumentTitleColumnName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ChangeDetectingColumns': [ 'string', ] }, 'AclConfiguration': { 'AllowedGroupsColumnName': 'string' } } }, 'CreatedAt': datetime(2015, 1, 1), 'UpdatedAt': datetime(2015, 1, 1), 'Description': 'string', 'Status': 'CREATING'|'DELETING'|'FAILED'|'UPDATING'|'ACTIVE', 'Schedule': 'string', 'RoleArn': 'string', 'ErrorMessage': 'string' }
Response Structure
(dict) --
Id (string) --
The identifier of the data source.
IndexId (string) --
The identifier of the index that contains the data source.
Name (string) --
The name that you gave the data source when it was created.
Type (string) --
The type of the data source.
Configuration (dict) --
Information that describes where the data source is located and how the data source is configured. The specific information in the description depends on the data source provider.
S3Configuration (dict) --
Provides information to create a connector for a document repository in an Amazon S3 bucket.
BucketName (string) --
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
(string) --
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia .
(string) --
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the AWS S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source.
KeyPath (string) --
Path to the AWS S3 bucket that contains the ACL files.
SharePointConfiguration (dict) --
Provides information necessary to create a connector for a Microsoft SharePoint site.
SharePointVersion (string) --
The version of Microsoft SharePoint that you are using as a data source.
Urls (list) --
The URLs of the Microsoft SharePoint site that contains the documents that should be indexed.
(string) --
SecretArn (string) --
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Microsoft SharePoint Data Source . For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
CrawlAttachments (boolean) --
TRUE to include attachments to documents stored in your Microsoft SharePoint site in the index; otherwise, FALSE .
UseChangeLog (boolean) --
Set to TRUE to use the Microsoft SharePoint change log to determine the documents that need to be updated in the index. Depending on the size of the SharePoint change log, using the change log may take Amazon Kendra longer than determining the changed documents with the Amazon Kendra document crawler.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft SharePoint attributes to custom fields in the Amazon Kendra index. You must first create the index fields using the UpdateIndex operation before you map SharePoint attributes. For more information, see Mapping Data Source Fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DatabaseConfiguration (dict) --
Provides information necessary to create a connector for a database.
DatabaseEngineType (string) --
The type of database engine that runs the database.
ConnectionConfiguration (dict) --
The information necessary to connect to a database.
DatabaseHost (string) --
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) --
The port that the database uses for connections.
DatabaseName (string) --
The name of the database containing the document data.
TableName (string) --
The name of the table that contains the document data.
SecretArn (string) --
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source . For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
ColumnConfiguration (dict) --
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) --
The column that provides the document's unique identifier.
DocumentDataColumnName (string) --
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ChangeDetectingColumns (list) --
One to five columns that indicate when a document in the database has changed.
(string) --
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) --
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query operation.
CreatedAt (datetime) --
The Unix timestamp of when the data source was created.
UpdatedAt (datetime) --
The Unix timestamp of when the data source was last updated.
Description (string) --
The description of the data source.
Status (string) --
The current status of the data source. When the status is ACTIVE, the data source is ready to use. When the status is FAILED, the ErrorMessage field contains the reason that the data source failed.
Schedule (string) --
The schedule on which Amazon Kendra updates the data source.
RoleArn (string) --
The Amazon Resource Name (ARN) of the role that enables the data source to access its resources.
ErrorMessage (string) --
When the Status field value is FAILED, the ErrorMessage field contains a description of the error that caused the data source to fail.
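One way to read the response structure above is to pull out the status, any error message, and the SharePoint settings added in this release. The helper below is a hypothetical sketch: its name and the shape of the returned summary are illustrative, and it assumes that optional fields may be absent from the response, which is why .get with defaults is used.

```python
def summarize_data_source(client, index_id, data_source_id):
    """Return a small summary of a data source's status and SharePoint filters."""
    response = client.describe_data_source(Id=data_source_id, IndexId=index_id)
    status = response["Status"]
    # Assumption: optional keys can be missing, so defaults are supplied.
    sharepoint = response.get("Configuration", {}).get("SharePointConfiguration", {})
    return {
        "status": status,
        # ErrorMessage is only meaningful when the status is FAILED.
        "error": response.get("ErrorMessage") if status == "FAILED" else None,
        "use_change_log": sharepoint.get("UseChangeLog", False),
        "inclusion_patterns": sharepoint.get("InclusionPatterns", []),
        "exclusion_patterns": sharepoint.get("ExclusionPatterns", []),
    }
```

A caller might log the returned dictionary after each sync, or stop early when the "error" entry is not None.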
{'Configuration': {'SharePointConfiguration': {'ExclusionPatterns': ['string'], 'InclusionPatterns': ['string'], 'UseChangeLog': 'boolean'}}}
Updates an existing Amazon Kendra data source.
See also: AWS API Documentation
Request Syntax
client.update_data_source( Id='string', Name='string', IndexId='string', Configuration={ 'S3Configuration': { 'BucketName': 'string', 'InclusionPrefixes': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'DocumentsMetadataConfiguration': { 'S3Prefix': 'string' }, 'AccessControlListConfiguration': { 'KeyPath': 'string' } }, 'SharePointConfiguration': { 'SharePointVersion': 'SHAREPOINT_ONLINE', 'Urls': [ 'string', ], 'SecretArn': 'string', 'CrawlAttachments': True|False, 'UseChangeLog': True|False, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DocumentTitleFieldName': 'string' }, 'DatabaseConfiguration': { 'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL', 'ConnectionConfiguration': { 'DatabaseHost': 'string', 'DatabasePort': 123, 'DatabaseName': 'string', 'TableName': 'string', 'SecretArn': 'string' }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'ColumnConfiguration': { 'DocumentIdColumnName': 'string', 'DocumentDataColumnName': 'string', 'DocumentTitleColumnName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ChangeDetectingColumns': [ 'string', ] }, 'AclConfiguration': { 'AllowedGroupsColumnName': 'string' } } }, Description='string', Schedule='string', RoleArn='string' )
Id (string) -- [REQUIRED]
The unique identifier of the data source to update.
Name (string) --
The name of the data source to update. The name of the data source can't be updated. To rename a data source you must delete the data source and re-create it.
IndexId (string) -- [REQUIRED]
The identifier of the index that contains the data source to update.
Configuration (dict) --
Configuration information for an Amazon Kendra data source.
S3Configuration (dict) --
Provides information to create a connector for a document repository in an Amazon S3 bucket.
BucketName (string) -- [REQUIRED]
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
(string) --
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia .
(string) --
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the AWS S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source.
KeyPath (string) --
Path to the AWS S3 bucket that contains the ACL files.
SharePointConfiguration (dict) --
Provides information necessary to create a connector for a Microsoft SharePoint site.
SharePointVersion (string) -- [REQUIRED]
The version of Microsoft SharePoint that you are using as a data source.
Urls (list) -- [REQUIRED]
The URLs of the Microsoft SharePoint site that contains the documents that should be indexed.
(string) --
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Microsoft SharePoint Data Source . For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
CrawlAttachments (boolean) --
TRUE to include attachments to documents stored in your Microsoft SharePoint site in the index; otherwise, FALSE .
UseChangeLog (boolean) --
Set to TRUE to use the Microsoft SharePoint change log to determine the documents that need to be updated in the index. Depending on the size of the SharePoint change log, using the change log may take Amazon Kendra longer than determining the changed documents with the Amazon Kendra document crawler.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft SharePoint attributes to custom fields in the Amazon Kendra index. You must first create the index fields using the UpdateIndex operation before you map SharePoint attributes. For more information, see Mapping Data Source Fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DatabaseConfiguration (dict) --
Provides information necessary to create a connector for a database.
DatabaseEngineType (string) -- [REQUIRED]
The type of database engine that runs the database.
ConnectionConfiguration (dict) -- [REQUIRED]
The information necessary to connect to a database.
DatabaseHost (string) -- [REQUIRED]
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) -- [REQUIRED]
The port that the database uses for connections.
DatabaseName (string) -- [REQUIRED]
The name of the database containing the document data.
TableName (string) -- [REQUIRED]
The name of the table that contains the document data.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source . For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
ColumnConfiguration (dict) -- [REQUIRED]
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) -- [REQUIRED]
The column that provides the document's unique identifier.
DocumentDataColumnName (string) -- [REQUIRED]
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ChangeDetectingColumns (list) -- [REQUIRED]
One to five columns that indicate when a document in the database has changed.
(string) --
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) -- [REQUIRED]
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query operation.
Description (string) --
The new description for the data source.
Schedule (string) --
The new update schedule for the data source.
RoleArn (string) --
The Amazon Resource Name (ARN) of the new role to use when the data source is accessing resources on your behalf.
None
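To turn on the change log for an existing SharePoint data source, one cautious approach is to read the current configuration, change only the UseChangeLog flag, and send the whole structure back. This is a sketch, not a confirmed recipe: the helper name enable_change_log is hypothetical, and the read-modify-write pattern assumes that the Configuration passed to UpdateDataSource replaces the stored configuration as a whole, which the description above does not state.

```python
import copy


def enable_change_log(client, index_id, data_source_id):
    """Set UseChangeLog to True on an existing SharePoint data source."""
    current = client.describe_data_source(Id=data_source_id, IndexId=index_id)
    # Copy so the describe response itself is left untouched.
    configuration = copy.deepcopy(current["Configuration"])

    sharepoint = configuration.get("SharePointConfiguration")
    if sharepoint is None:
        raise ValueError("data source has no SharePointConfiguration")

    sharepoint["UseChangeLog"] = True
    # Assumption: the full configuration is sent back, not only the changed field.
    client.update_data_source(
        Id=data_source_id,
        IndexId=index_id,
        Configuration=configuration,
    )
    return configuration
```

The same pattern could be used to adjust InclusionPatterns or ExclusionPatterns on an existing data source.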