ASDD features and support
The ASDD profiler was introduced in Continuous Compliance version 9.0 and represents the future direction for sensitive data discovery. It offers a number of advantages over the legacy profiler, but it currently has some limitations as well.
The introduction of the ASDD profiler does not make any changes to the legacy profiler. Existing profiling jobs should continue to function as they have in the past.
ASDD Features
The ASDD profiler uses classifiers rather than search and type expressions. Classifiers support more features and configuration options than expressions.
The LIST classifier framework is new and has no equivalent functionality in the legacy profiler.
The TYPE classifier framework uses standard Java SQLType values to identify data types, which should provide broad support for type detection across all database variants.
The PATH classifier supports exact matching and can be configured to consider table name in addition to column name when matching.
The REGEX classifier supports detection of LUHN check digits for data level recognition of credit card numbers.
The ASDD profiler provides better matching when the number of rows in a table is less than the target number of rows for profiling, and in general provides more nuanced confidence values in profiling results.
The ASDD profiler attempts to retrieve more data values when a large fraction of the data values for a column are null or empty. The threshold to trigger an additional query is controlled by the application setting ASDD/DefaultNullFilterThreshold.
The ASDD profiler supports statistical sampling for Oracle and SQL Server databases, so that the data sampled will better reflect the full range of values for each column across the entire table.
Sample Percentage
When data sampling is employed, the sample percentage is always set to 1%. If this percentage does not yield enough rows, the query is performed again without sampling.
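The fallback described in this note can be sketched as follows. This is a minimal illustration, not the profiler's actual code: the `fetch` callback, the function names, and the reading of "enough rows" as "at least the target number of rows" are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[int, ...]
# Hypothetical callback: (sample_percent or None for no sampling, row limit) -> rows
Fetch = Callable[[Optional[float], int], List[Row]]

SAMPLE_PERCENT = 1.0  # the note above states the sample percentage is always 1%


def fetch_rows_for_profiling(fetch: Fetch, target_rows: int) -> List[Row]:
    """Try a 1% sample first; query again without sampling if it falls short."""
    rows = fetch(SAMPLE_PERCENT, target_rows)
    if len(rows) >= target_rows:  # assumption: "enough" means the target was met
        return rows
    return fetch(None, target_rows)


def make_fake_fetch(table_size: int):
    """Build a stand-in for a database query that records each call it receives."""
    calls: List[Optional[float]] = []

    def fetch(sample_percent: Optional[float], limit: int) -> List[Row]:
        calls.append(sample_percent)
        if sample_percent is None:
            available = table_size
        else:
            available = int(table_size * sample_percent / 100)
        return [(i,) for i in range(min(available, limit))]

    return fetch, calls


# A 500-row table yields only 5 rows at 1%, so a second, unsampled query runs.
small_fetch, small_calls = make_fake_fetch(500)
small_rows = fetch_rows_for_profiling(small_fetch, target_rows=100)
```

With a much larger table, the 1% sample alone would satisfy the target and no second query would be issued.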
The ASDD Standard profile set contains data level logic by default, allowing some columns containing sensitive information to be identified even if the column names are not meaningful.
New or improved REGEX classifiers are provided for the Zip Code and Email Address domains.
New LIST classifiers are provided for the First and Last Name, US City, US State, and Country domains.
Classifiers and profile sets using them may be exported and imported using the Engine Sync feature. Classifiers are included when the Export Settings action is performed from the Environments tab.
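As background for the LUHN check-digit detection mentioned above for the REGEX classifier, the standard Luhn algorithm can be sketched as shown here. This illustrates the general technique only. How the ASDD profiler implements or configures the check is not described in this section, and the function name is chosen for the example.

```python
def luhn_is_valid(candidate: str) -> bool:
    """Return True if `candidate` (digits, optional spaces or hyphens) passes Luhn."""
    compact = candidate.replace(" ", "").replace("-", "")
    if len(compact) < 2 or not (compact.isascii() and compact.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit (the check digit) and double every second digit.
    for position, ch in enumerate(reversed(compact)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the doubled value
        total += digit
    return total % 10 == 0


# A well-known test card number passes; changing its last digit makes it fail.
passes = luhn_is_valid("4111 1111 1111 1111")
fails = luhn_is_valid("4111 1111 1111 1112")
```

A check like this lets a data-level classifier tell plausible card numbers apart from arbitrary 16-digit strings, which a pattern match alone cannot do.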
ASDD Limitations
The primary limitation of the ASDD profiler is that it is not yet supported for all connectors. The UI will report an error if the user attempts to save a job using an ASDD profile set with an unsupported connector.
Currently, the following conditions must be met to use the ASDD profiler:
The connector must be a built-in (not extended) database connector variant: Oracle, SQL Server, MySQL, Postgres, Sybase, or DB2 LUW. DB2 iSeries and Mainframe are not yet supported. File profiling is not supported.
The connector must not use Kerberos authentication.
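The conditions above can be expressed as a simple pre-check, for example in a script that screens connectors before jobs are created. This is a sketch only: `ConnectorInfo` and its fields are hypothetical and do not correspond to any product API or data model. Only the list of supported variants is taken from the text above.

```python
from dataclasses import dataclass
from typing import List

# Built-in database variants listed above as supported by the ASDD profiler.
SUPPORTED_VARIANTS = {"Oracle", "SQL Server", "MySQL", "Postgres", "Sybase", "DB2 LUW"}


@dataclass
class ConnectorInfo:
    """Hypothetical description of a connector, used only for this example."""
    kind: str            # "database" or "file"
    variant: str         # e.g. "Oracle" or "Mainframe"
    is_extended: bool    # True for extended (not built-in) connectors
    uses_kerberos: bool


def asdd_blockers(connector: ConnectorInfo) -> List[str]:
    """Return the reasons a connector cannot be used; an empty list means none."""
    blockers: List[str] = []
    if connector.kind != "database":
        blockers.append("only database connectors are supported (no file profiling)")
    elif connector.is_extended:
        blockers.append("extended connectors are not supported")
    elif connector.variant not in SUPPORTED_VARIANTS:
        blockers.append(f"variant {connector.variant!r} is not yet supported")
    if connector.uses_kerberos:
        blockers.append("Kerberos authentication is not supported")
    return blockers


oracle_ok = asdd_blockers(ConnectorInfo("database", "Oracle", False, False))
mainframe_kerberos = asdd_blockers(ConnectorInfo("database", "Mainframe", False, True))
```

Returning every blocking reason, rather than stopping at the first, makes it easier to report all problems with a connector at once.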
It is currently not possible to manage classifier configuration via the UI. The API client must be used to create, modify, and delete classifier instances; refer to this section for details.