Best practices for Delphix data masking
Hypervisor and host
The Continuous Data and Continuous Compliance functions are combined into a single OVA and require additional consideration for installation and configuration.
When configuring the Delphix software for the first time via the Engine Setup wizard, the engine can be set up as either Continuous Data or Continuous Compliance.
The standard configuration for a dedicated Continuous Compliance Engine:
8 vCPUs.
64 GB RAM minimum, 128 GB RAM or more recommended.
The Delphix Engine image requires 127 GB storage for the system disk.
50 GB minimum storage for the data storage pool must be added during the initial configuration via the Engine Setup wizard. There are situations where more storage may be required, particularly when mapping algorithms are in use. Additional storage can be added later, if needed.
The VMDK for the engine OS is often stored on the same VMFS volume as the VM definition file (VMX). In that case, the VMFS volume must have sufficient space to hold the VMX configuration, the VMDK for the system disk, and any VMware logging.
Additional VMFS space for swap/paging may be required if RAM reservations are not enabled (the VM will not start if reservations are lacking and disk space is not available for swap).
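The minimums listed above can be turned into a quick pre-deployment check. The following is a minimal sketch in Python, not a Delphix tool: the VmSpec class, its field names, and check_compliance_engine_spec are hypothetical names chosen for illustration, and the thresholds are copied from the standard configuration above.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the standard configuration listed above.
MIN_VCPUS = 8
MIN_RAM_GB = 64
RECOMMENDED_RAM_GB = 128
SYSTEM_DISK_GB = 127
MIN_DATA_POOL_GB = 50


@dataclass
class VmSpec:
    """Hypothetical description of a planned VM (not a Delphix API object)."""
    vcpus: int
    ram_gb: int
    system_disk_gb: int
    data_pool_gb: int


def check_compliance_engine_spec(spec: VmSpec) -> List[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    if spec.vcpus < MIN_VCPUS:
        findings.append(f"vCPUs: have {spec.vcpus}, need at least {MIN_VCPUS}")
    if spec.ram_gb < MIN_RAM_GB:
        findings.append(f"RAM: have {spec.ram_gb} GB, need at least {MIN_RAM_GB} GB")
    elif spec.ram_gb < RECOMMENDED_RAM_GB:
        findings.append(
            f"RAM: {spec.ram_gb} GB meets the minimum but is below "
            f"the recommended {RECOMMENDED_RAM_GB} GB"
        )
    if spec.system_disk_gb < SYSTEM_DISK_GB:
        findings.append(
            f"System disk: have {spec.system_disk_gb} GB, need {SYSTEM_DISK_GB} GB"
        )
    if spec.data_pool_gb < MIN_DATA_POOL_GB:
        findings.append(
            f"Data pool: have {spec.data_pool_gb} GB, need at least {MIN_DATA_POOL_GB} GB"
        )
    return findings


if __name__ == "__main__":
    planned = VmSpec(vcpus=8, ram_gb=64, system_disk_gb=127, data_pool_gb=50)
    for finding in check_compliance_engine_spec(planned):
        print(finding)
```

The check covers only the fixed minimums. Extra VMFS space for swap, logging, or mapping-algorithm storage depends on the environment and is not modeled here.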
CPU utilization
One vCPU per concurrent masking job is considered a best practice, although the actual need depends on the algorithms used. Some algorithms are compute-bound, such as those using AES encryption, while others are lookups and tend to generate more I/O.
Memory utilization
The Continuous Compliance Engine uses its memory to cache data. More memory will provide better performance.
A minimum of 1 GB per masking job is required. More memory is needed when multiple parallel streams are used in a job. Delphix recommends a minimum of 1 GB per stream. More than 1 GB per stream may be needed if the database rows are particularly wide (1,000 rows or more), or if the masking job uses algorithms with a large state, for example a lookup file with 500,000 values.
Depending on the memory settings of the Continuous Compliance Engine and its JVMs, an increase in parallel workloads will require more memory. Data is cached either directly or through Kettle, so larger lookups for the algorithms require more memory. If you encounter performance issues, memory is the first thing to evaluate.
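The CPU and memory rules of thumb above (one vCPU per concurrent masking job, at least 1 GB per job, and 1 GB per stream) can be combined into a rough estimate. This is a minimal sketch in Python under those stated rules only. The function names are hypothetical, and the result is a lower bound for job working memory: it does not include memory for the engine OS, caching, or large algorithm state.

```python
from typing import List, Tuple

GB_PER_STREAM = 1.0  # recommended minimum per stream, from the text above


def estimate_job_memory_gb(streams: int, gb_per_stream: float = GB_PER_STREAM) -> float:
    """Lower-bound memory for one masking job: never less than 1 GB."""
    if streams < 1:
        raise ValueError("a job needs at least one stream")
    return max(1.0, streams * gb_per_stream)


def estimate_concurrent_load(
    streams_per_job: List[int], gb_per_stream: float = GB_PER_STREAM
) -> Tuple[int, float]:
    """Return (vcpus, memory_gb) for jobs that run at the same time.

    streams_per_job holds one entry per concurrent job, giving its stream count.
    """
    vcpus = len(streams_per_job)  # one vCPU per concurrent job
    memory_gb = sum(estimate_job_memory_gb(s, gb_per_stream) for s in streams_per_job)
    return vcpus, memory_gb


if __name__ == "__main__":
    # Three concurrent jobs using 1, 4, and 2 streams respectively.
    vcpus, memory_gb = estimate_concurrent_load([1, 4, 2])
    print(f"vCPUs: {vcpus}, job memory: {memory_gb} GB")
```

For wide rows or large lookup files, pass a larger gb_per_stream value. The right figure has to come from observing the actual jobs.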
Network and I/O
Continuous Compliance leverages the target DB server and VDB for most of the workload. This means the Continuous Compliance Engine can be I/O bound waiting for the DB server. As long as the engine can read the data faster than it can process it, this is not an issue. Slow networks with numerous hops between the DB server and the Continuous Compliance Engine can cause performance problems. Co-locating the engine with the DB server is recommended in these cases.
Masking VDB tuning
Always start with the tuning recommendations for target servers and VDBs. If the VDB is not performing well, the performance of data masking will suffer.
For Oracle, it is critical to select NOARCHIVELOG mode and tune the online redo log size at provision time.
For SQL Server, the VDB should use the SIMPLE recovery model, with the log file and TempDB sized appropriately.
Backup of a Continuous Compliance Engine
Virtual machine level backups are recommended.