Amazon Macie is a machine learning service from AWS that attempts to identify sensitive data in S3 buckets.
From the AWS Service Description:
Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS.
As organizations manage growing volumes of data, identifying and protecting their sensitive data at scale can become increasingly complex, expensive, and time-consuming. Amazon Macie automates the discovery of sensitive data at scale and lowers the cost of protecting your data. Macie automatically provides an inventory of Amazon S3 buckets including a list of unencrypted buckets, publicly accessible buckets, and buckets shared with AWS accounts outside those you have defined in AWS Organizations. Then, Macie applies machine learning and pattern matching techniques to the buckets you select to identify and alert you to sensitive data, such as personally identifiable information (PII). Macie’s alerts, or findings, can be searched and filtered in the AWS Management Console and sent to Amazon CloudWatch Events for easy integration with existing workflow or event management systems, or to be used in combination with AWS services, such as AWS Step Functions to take automated remediation actions. (sourced from the AWS product page)
Macie is a regional service, so you need to make sure you’re enabling and monitoring it in every region you use. Also, not all regions support Macie, so factor that into your decision about which regions to allow S3 buckets to be created in.
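As a rough illustration of the regional check described above, here is a minimal Python sketch. The function names and the idea of an "allowed regions" list are hypothetical, not part of any AWS tooling. The boto3 helpers assume credentials with permission to call `macie2:GetMacieSession`, and they treat any client error as "not enabled", which is a simplification.

```python
def macie_region_gaps(allowed_regions, macie_regions, enabled_regions):
    """Compare the regions where S3 buckets are allowed against Macie coverage.

    Returns (unsupported, not_enabled):
      unsupported: allowed regions where Macie is not offered at all
      not_enabled: allowed regions where Macie is offered but not turned on
    """
    allowed = set(allowed_regions)
    supported = set(macie_regions)
    unsupported = sorted(allowed - supported)
    not_enabled = sorted((allowed & supported) - set(enabled_regions))
    return unsupported, not_enabled


def macie_supported_regions():
    """Regions that boto3's bundled endpoint data lists for the macie2 service."""
    import boto3  # imported lazily so the pure helper above needs no AWS SDK

    return boto3.session.Session().get_available_regions("macie2")


def macie_enabled_regions(regions):
    """Regions (out of `regions`) where this account's Macie session is ENABLED."""
    import boto3
    from botocore.exceptions import ClientError

    enabled = []
    for region in regions:
        client = boto3.client("macie2", region_name=region)
        try:
            if client.get_macie_session()["status"] == "ENABLED":
                enabled.append(region)
        except ClientError:
            # Assumption: an error here means Macie is not enabled in this
            # region (or the caller lacks access); count it as a gap either way.
            pass
    return enabled
```

A possible way to use it: pass your own list of permitted bucket regions as `allowed_regions`, `macie_supported_regions()` as `macie_regions`, and `macie_enabled_regions(...)` as `enabled_regions`. Anything in the first returned list is a candidate for blocking bucket creation; anything in the second is a region where Macie still needs to be enabled.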
Macie is quite expensive. While the recent relaunch of Macie cut the price by 80%, you’re still paying $1/GB of data processed. Mandating this service across a large dataset might not be cost-effective.
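To get a feel for the numbers before mandating Macie everywhere, a back-of-the-envelope estimate can help. The sketch below assumes the flat $1/GB figure quoted above with no pricing tiers or free allowance, and it assumes every byte in a bucket gets processed. The bucket names and sizes are made up for illustration; check current AWS pricing before relying on the result.

```python
PRICE_PER_GB_USD = 1.00  # flat rate quoted above; assumes no tiers or free allowance


def estimate_macie_cost(bucket_sizes_gb, price_per_gb=PRICE_PER_GB_USD):
    """Estimate the cost of having Macie process each bucket in full.

    bucket_sizes_gb maps bucket name -> size in GB.
    Returns (per_bucket_costs, total_cost) in USD.
    """
    per_bucket = {}
    for name, size_gb in bucket_sizes_gb.items():
        if size_gb < 0:
            raise ValueError(f"negative size for bucket {name!r}")
        per_bucket[name] = size_gb * price_per_gb
    return per_bucket, sum(per_bucket.values())


# Hypothetical buckets, sizes in GB (using 1 TB = 1,024 GB).
example_buckets = {
    "customer-uploads": 2 * 1024,   # 2 TB
    "application-logs": 50 * 1024,  # 50 TB
}
costs, total = estimate_macie_cost(example_buckets)
print(costs)
print(f"Estimated total: ${total:,.2f}")
```

Even in this small example, the log bucket dominates the bill, which is the kind of dataset where a blanket mandate may not be worth the cost.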
Effectively Leveraging Macie