
Amazon Simple Storage Service (S3) connector for files

The Continuous Compliance Engine supports connecting to files and mainframe datasets stored in S3 with either of the following connectors:

  • AWS S3 Connector: Use this connector for native AWS S3 storage.

  • Other S3 Compatible Storage Connector: Use this for object storage from vendors that support the S3 protocol (e.g., GCP).


Configuring an AWS S3 Connector

  1. connectorName: Specifies the name of the connector.

  2. environmentId: Indicates the identifier of the environment where the connector will be configured.

  3. fileType: Denotes the type of files to be managed by the connector.

  4. connectionInfo: This section contains details necessary for establishing a connection to the S3 service.

    • connectionMode: Specifies the mode of connection, which is set to "AWS_S3", indicating that the connector targets an S3 bucket.

    • prefix: Indicates the prefix to be used for identifying files within the S3 bucket.

    • delimiter: Specifies the delimiter used in the file paths within the S3 bucket.

    • awsRegion: Specifies the AWS region where the S3 bucket is located.

    • awsBucketName: Specifies the name of the S3 bucket to connect to.

    • awsAuthType: The Continuous Compliance Engine supports two authentication methods for connecting to S3: AWS secret-based authentication (AWS_SECRET) and AWS role-based authentication (AWS_ROLE).

Secret-based authentication requires:

  • awsAccessKey: The access key is an alphanumeric string that uniquely identifies your AWS account.

  • awsSecretKey: The secret key is associated with your access key and is used for signing requests to AWS services.

For more information related to prefix and delimiter, please refer to the Amazon Simple Storage Service documentation.
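As a rough illustration of the prefix and delimiter semantics, the sketch below (plain Python, no AWS calls; the bucket keys are hypothetical) mimics how S3's ListObjectsV2 operation groups keys into contents and common prefixes:

```python
def list_keys(keys, prefix="", delimiter=""):
    """Simplified local model of S3 ListObjectsV2 grouping.

    Keys under `prefix` are returned directly unless the remainder of the
    key contains `delimiter`, in which case the key is rolled up into a
    "common prefix" (analogous to a folder).
    """
    contents, common = [], []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            group = prefix + rest.split(delimiter)[0] + delimiter
            if group not in common:
                common.append(group)
        else:
            contents.append(key)
    return contents, common

# Hypothetical bucket contents
keys = ["json/a.json", "json/archive/b.json", "csv/c.csv"]
contents, common = list_keys(keys, prefix="json/", delimiter="/")
# contents -> ["json/a.json"]; common -> ["json/archive/"]
```

With `prefix="json/"` and `delimiter="/"`, the connector sees only the objects directly under `json/`, while deeper "folders" such as `json/archive/` are rolled up.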

AWS S3 Connection Mode

Sample payloads

  • AWS Secret
    To connect to S3, Continuous Compliance requires a user's AWS Access Key and Secret Key for authentication.

CODE
{
   "connectorName":"JSON S3",
   "environmentId":37,
   "fileType":"JSON",
   "connectionInfo":{
      "connectionMode":"AWS_S3",
      "prefix":"json/",
      "delimiter":"/",
      "awsRegion":"us-west-2",
      "awsBucketName":"masking-bucket-name",
      "awsAuthType":"AWS_SECRET",
      "awsAccessKey":"<Your AWS Access Key>",
      "awsSecretKey":"<Your AWS Secret Key>"
   }
}
  • AWS Role
    To establish secure communication between a masking engine hosted on an EC2 instance and S3, Continuous Compliance leverages instance profiles. This approach eliminates the need for static access keys and enhances security by dynamically providing temporary credentials.

CODE
{
   "connectorName":"JSON S3",
   "environmentId":37,
   "fileType":"JSON",
   "connectionInfo":{
      "connectionMode":"AWS_S3",
      "prefix":"json/",
      "delimiter":"/",
      "awsRegion":"us-west-2",
      "awsBucketName":"masking-bucket-name",
      "awsAuthType":"AWS_ROLE"
   }
}
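The two payloads above differ only in their authentication fields. As an illustrative sketch (the helper name and defaults are assumptions, not part of the official API reference), a small function can assemble either variant of the request body:

```python
import json

def build_aws_s3_connector(connector_name, environment_id, file_type,
                           bucket, region, prefix="", delimiter="/",
                           access_key=None, secret_key=None):
    """Assemble a connector payload matching the samples above.

    If access_key/secret_key are supplied, awsAuthType is AWS_SECRET;
    otherwise AWS_ROLE (instance-profile credentials) is assumed.
    """
    info = {
        "connectionMode": "AWS_S3",
        "prefix": prefix,
        "delimiter": delimiter,
        "awsRegion": region,
        "awsBucketName": bucket,
    }
    if access_key and secret_key:
        info["awsAuthType"] = "AWS_SECRET"
        info["awsAccessKey"] = access_key
        info["awsSecretKey"] = secret_key
    else:
        info["awsAuthType"] = "AWS_ROLE"
    return {
        "connectorName": connector_name,
        "environmentId": environment_id,
        "fileType": file_type,
        "connectionInfo": info,
    }

payload = build_aws_s3_connector("JSON S3", 37, "JSON",
                                 bucket="masking-bucket-name",
                                 region="us-west-2", prefix="json/")
print(json.dumps(payload, indent=2))  # serializes to the AWS_ROLE sample above
```

The resulting dictionary serializes to the same JSON as the sample payloads and can be used as the body of the connector-creation request.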

Configuring an Other S3 Compatible Storage Connector

  1. connectorName: Specifies the name of the connector.

  2. environmentId: Indicates the identifier of the environment where the connector will be configured.

  3. fileType: Denotes the type of files to be managed by the connector.

  4. connectionInfo: This section contains details necessary for establishing a connection to the S3 compatible storage service.

    1. connectionMode: Specifies the mode of connection, which is set to "S3_COMPATIBLE", indicating that the connector targets S3-compatible storage.

    2. s3CompatibleDetails:

      1. bucketName: Specifies the name of the S3-compatible storage bucket to connect to.

      2. region: Specifies the region where the S3-compatible storage bucket is located.

      3. prefix: Indicates the prefix to be used for identifying files within the S3-compatible bucket.

      4. delimiter: Specifies the delimiter used in the file paths within the S3-compatible bucket.

      5. accessKey: The access key is an alphanumeric string that uniquely identifies your account.

      6. secretKey: The secret key is associated with your access key and is used for signing requests to the storage service.

      7. serviceEndpoint: Specifies the URL of the storage service you want to connect to.


Other S3 Compatible Connection Mode

At present, this connection mode is supported only for storage that is compatible with GCP-hosted buckets. S3-compatible buckets hosted on other clouds may not be compatible.

S3-compatible cloud storage should support the following APIs:

  • Bucket APIs

    • ListBuckets

  • Object APIs

    • DeleteObject

    • GetObject

    • PutObject

    • ListObject

    • CreateObject

  • Multipart Upload APIs

    • InitiateMultipartUpload

    • CompleteMultipartUpload

    • UploadPart

    • ListParts

Required permissions for connecting to GCP object storage

At a minimum, the following permissions are required for accessing the object storage on the GCP cloud:

CODE
storage.buckets.list
storage.multipartUploads.abort
storage.multipartUploads.create
storage.multipartUploads.list
storage.multipartUploads.listParts
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
storage.objects.update

Sample payloads

To connect to Other S3 Compatible storage, Continuous Compliance requires a user's Access Key and Secret Key for authentication.

CODE
{
   "connectorName":"S3 Compatible storage",
   "environmentId":37,
   "fileType":"JSON",
   "connectionInfo":{
      "connectionMode":"S3_COMPATIBLE",
      "s3CompatibleDetails" : {
        "prefix":"json/",
        "delimiter":"/",
        "region":"us-west-2",
        "bucketName":"s3-compatible-bucket-name",
        "accessKey":"<Your Access Key>",
        "secretKey":"<Your Secret Key>"
      }
   }
}
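Note that the sample above omits the serviceEndpoint field described in the field list. As a sketch (the helper function is hypothetical; https://storage.googleapis.com is GCP's S3-interoperable XML API endpoint, while the remaining values are placeholders), a payload that includes serviceEndpoint can be assembled as follows:

```python
import json

def build_s3_compatible_connector(connector_name, environment_id, file_type,
                                  bucket, region, access_key, secret_key,
                                  service_endpoint, prefix="", delimiter="/"):
    """Assemble an S3_COMPATIBLE connector payload, including serviceEndpoint."""
    return {
        "connectorName": connector_name,
        "environmentId": environment_id,
        "fileType": file_type,
        "connectionInfo": {
            "connectionMode": "S3_COMPATIBLE",
            "s3CompatibleDetails": {
                "prefix": prefix,
                "delimiter": delimiter,
                "region": region,
                "bucketName": bucket,
                "accessKey": access_key,
                "secretKey": secret_key,
                "serviceEndpoint": service_endpoint,
            },
        },
    }

payload = build_s3_compatible_connector(
    "S3 Compatible storage", 37, "JSON",
    bucket="s3-compatible-bucket-name", region="us-west-2",
    access_key="<Your Access Key>", secret_key="<Your Secret Key>",
    service_endpoint="https://storage.googleapis.com")  # GCP's S3-interoperable endpoint
print(json.dumps(payload, indent=2))
```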

S3 upload sizing

S3 supports object sizes up to 5 TB. When uploading objects larger than 100 MB, Amazon recommends using a multipart upload. As the name implies, a multipart upload breaks the object into smaller parts where each is assigned a part number. S3 supports breaking an object into at most 10,000 parts.

When uploading a masked object, Continuous Compliance uses a multipart upload. The part size is calculated by multiplying 20% times the masking job’s maximum memory and then dividing by the masking job’s number of streams. For example, if maximum memory is 2 GB and the number of streams is 1, then the part size would be (20% * 2 GB) / 1 = 400 MB.

When working with a large object, you must configure the masking job’s maximum memory and streams so that the object can be uploaded in at most 10,000 parts. For example, if you need to mask a 5 TB object (the maximum size for an S3 object), then the part size must be greater than or equal to 5 TB / 10,000 parts ≅ 525 MB. If you plan to use 1 stream, then the maximum memory should be (1 * 525 MB) / 0.2 = 2,625 MB or more.
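The sizing arithmetic above can be captured in a short calculator (a sketch only; the function names are illustrative, units are in MB, and the figures mirror the document's round numbers):

```python
import math

MAX_PARTS = 10_000       # S3's multipart upload part limit
MEMORY_FRACTION = 0.2    # 20% of job maximum memory is used for upload buffers

def part_size_mb(max_memory_mb, streams):
    """Part size used for a masked-object multipart upload."""
    return MEMORY_FRACTION * max_memory_mb / streams

def required_memory_mb(object_size_mb, streams):
    """Minimum job memory so the object fits within MAX_PARTS parts."""
    min_part = math.ceil(object_size_mb / MAX_PARTS)
    return streams * min_part / MEMORY_FRACTION

# The text's 2 GB / 1-stream example: part size is about 400 MB
print(part_size_mb(2000, 1))
# The text's 5 TB / 1-stream example: about 2,625 MB of job memory is needed
print(required_memory_mb(5 * 1024 * 1024, 1))
```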

For more information related to multipart uploads, please refer to the Amazon Simple Storage Service multipart upload documentation.
