Ingesting Data in DMM
DMM ingests and processes a model’s training, prediction, and ground truth data to monitor the model. For this to work, you need to set up a Data Source from which DMM can read the data. Data sources set up in your DMM Organization are accessible to all users within that Organization and can be used to feed data to any model in that Organization.
In the current release, DMM supports reading data from AWS S3 buckets.
Note: DMM uses the Data Source as the source of truth for reading data files and doesn’t store the raw data in any of its internal databases. This means users should not delete or overwrite old data files. Overwriting files affects drift calculations and causes mismatches between the calculated drift values and the drift trends when Date Filters are applied in the Analyze UI.
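One way to avoid overwriting existing files is to give every upload a unique object key. The sketch below is only an illustration of that idea, not a DMM feature: the function name `versioned_key`, the timestamp format, and the example prefix are all assumptions.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import Optional


def versioned_key(prefix: str, filename: str, now: Optional[datetime] = None) -> str:
    """Build an S3 object key that embeds a UTC timestamp.

    Hypothetical helper: a new key per upload means earlier files
    in the bucket are never replaced.
    """
    # Use the supplied (timezone-aware) time, or the current time, in UTC.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = moment.strftime("%Y%m%dT%H%M%SZ")
    path = PurePosixPath(filename)
    name = f"{path.stem}_{stamp}{path.suffix}"
    # Only prepend the prefix when one was given, to avoid a leading slash.
    cleaned = prefix.strip("/")
    return f"{cleaned}/{name}" if cleaned else name
```

For example, calling `versioned_key("predictions/", "batch.csv")` at different times yields different keys, so a later upload leaves the earlier file in place.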
Setting up an S3 Data Source
The Data Source section enables you to add and configure each S3 bucket as a new data source. You can link multiple buckets to your DMM Organization. To use a connected S3 bucket with DMM, the following permissions and access are required:
READ, WRITE access to the Bucket.
Permission to perform the following actions:
Follow these steps to add an S3 bucket as a Data Source:
Go to the Data Sources section and click ‘Add Data Source’.
Provide a name for the data source and fill in the Bucket Name and Region of the S3 bucket.
For authentication, provide the Access Key and Secret Token for the S3 bucket.
Select a Server-Side Encryption mode. DMM supports the following modes (refer to the AWS SSE docs for more information):
AES256 - This scheme implements Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3). AWS uses the 256-bit Advanced Encryption Standard (AES-256) cipher to encrypt your data.
AWS:KMS - This scheme implements Server-Side Encryption with Customer Master Keys (CMKs) stored in AWS Key Management Service (SSE-KMS).
None - In this scheme, Server-Side Encryption is not enabled on AWS S3.
Click Add to create the Data Source.
Click on the newly created Data Source to make sure you can see files present in the S3 bucket on DMM.
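If the files do not appear in DMM, it can help to confirm outside DMM that the same credentials can list the bucket. The following is a minimal sketch, assuming the third-party `boto3` library is installed; the helper names `build_client` and `list_object_keys` are hypothetical, and the credentials you pass are your own.

```python
from typing import Any, List


def build_client(region: str, access_key: str, secret_key: str) -> Any:
    """Create an S3 client from the same values entered in the DMM form."""
    import boto3  # third-party dependency, imported lazily; assumed installed

    return boto3.client(
        "s3",
        region_name=region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def list_object_keys(s3_client: Any, bucket: str, max_keys: int = 20) -> List[str]:
    """Return up to max_keys object keys from the bucket."""
    response = s3_client.list_objects_v2(Bucket=bucket, MaxKeys=max_keys)
    # The "Contents" entry is absent when the bucket has no objects.
    return [obj["Key"] for obj in response.get("Contents", [])]
```

An access-denied error from `list_object_keys` suggests the credentials lack read access to the bucket, while an empty list means the bucket is reachable but holds no files yet.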
You can use the detailed view of the data source to upload new files to the S3 bucket directly from DMM. Alternatively, you can upload files to this data source by using the AWS CLI, APIs, UI, or any other means supported by AWS.
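When uploading through the AWS APIs, the encryption mode chosen for the data source can be passed along with the upload. The sketch below assumes the third-party `boto3` library and default AWS credential resolution; `sse_extra_args` and `upload_with_sse` are hypothetical helper names, and the optional KMS key ID is something you would supply yourself.

```python
from typing import Dict, Optional


def sse_extra_args(mode: Optional[str], kms_key_id: Optional[str] = None) -> Dict[str, str]:
    """Translate a DMM-style encryption mode into boto3 upload ExtraArgs."""
    normalized = (mode or "none").strip().lower()
    if normalized == "none":
        return {}  # no server-side encryption requested
    if normalized == "aes256":
        return {"ServerSideEncryption": "AES256"}  # SSE-S3
    if normalized == "aws:kms":
        args = {"ServerSideEncryption": "aws:kms"}  # SSE-KMS
        if kms_key_id:
            args["SSEKMSKeyId"] = kms_key_id
        return args
    raise ValueError(f"Unsupported encryption mode: {mode!r}")


def upload_with_sse(local_path: str, bucket: str, key: str,
                    mode: Optional[str], kms_key_id: Optional[str] = None) -> None:
    """Upload a local file to S3 using the requested encryption mode."""
    import boto3  # third-party dependency, imported lazily; assumed installed

    client = boto3.client("s3")  # assumes credentials are configured locally
    client.upload_file(local_path, bucket, key,
                       ExtraArgs=sse_extra_args(mode, kms_key_id))
```

Keeping the mode-to-arguments mapping in its own function makes it easy to check without touching AWS, and an unrecognized mode fails early instead of silently uploading unencrypted.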
The delete operation is supported on the Data Source.