Algorithms

Introduction to Masking Algorithms

This article provides a brief outline of the different available algorithm options, along with other general algorithm information. More specific algorithm details can be explored in the Out-Of-The-Box Algorithm Instances or Algorithm Frameworks sections.

An algorithm plugin can be configured through the graphical user interface by entering the plugin's required configuration in JSON format. For more information, visit the General UI for Extended Algorithms page.

Algorithm options

Out-of-the-box algorithm instances

Out-of-the-box algorithm instances are pre-configured, ready-to-use algorithms. Instances that list a related framework in the table below can be customized using that extensible framework. For more information on algorithm instance extensibility, see the Extensible Algorithms page.

Algorithm Instances	Extensible?	Related Framework
dlpx-core:ABARoutingNumber SL	X	Secure Lookup
AccNoLookup	X	Secure Lookup
AddrLookup	X	Secure Lookup
AddrLine2Lookup	X	Secure Lookup
dlpx-core:Age SL	X	Secure Lookup
BusinessLegalEntityLookup	X	Secure Lookup
dlpx-core:CM Alpha-Numeric	X	Character Mapping
dlpx-core:CM Digits	X	Character Mapping
dlpx-core:CM Numeric	X
CommentLookup	X	Secure Lookup
Credit Card	X	Payment Card
Date Shift Discrete	X
Date Shift Fixed	X	Date Shift
Date Shift Variable	X
DrivingLicenseNoLookup	X	Secure Lookup
DummyHospitalNameLookup	X	Secure Lookup
EmailLookup	X	Secure Lookup
dlpx-core:Email SL	X	Email
dlpx-core:Email Unique	X	Email
dlpx-core:FirstName	X	Name
FirstNameLookup	X	Secure Lookup
dlpx-core:FullName	X	Full Name
FullNMLookup	X	Secure Lookup
LastCommaFirstLookup	X	Secure Lookup
dlpx-core:LastName	X	Name
LastNameLookup	X	Secure Lookup
dlpx-core:Lat_Long Coordinates	X	Regex Decompose
NullValueLookup	X
dlpx-core:Phone Unique	X
dlpx-core:Phone US	X
RandomValueLookup	X	Secure Lookup
dlpx-core:Redact Digits-Zero	X
RepeatFirstDigit	X
SchoolNameLookup	X	Secure Lookup
SecureShuffle	X
dlpx-core:TimeRange	X	Segment Mapping
dlpx-core:SwiftCode SL	X	Secure Lookup
USCitiesLookup	X	Secure Lookup
USstatecodesLookup	X	Secure Lookup
USstatesLookup	X	Secure Lookup
WebURLsLookup	X	Secure Lookup

Algorithm frameworks

Algorithm frameworks allow you to create algorithm instances with a custom configuration. For more information on algorithm framework extensibility, see the Extensible Algorithms page. More information on multi-column algorithms can be found on the Using Multi-Column Algorithms page.

Algorithm Framework	Extensible?	Multi-Column?	Out of the Box Instances
Binary Lookup	X
Character Mapping	X		dlpx-core:CM Alpha-Numeric dlpx-core:CM Digits
Character Replacement	X
Data Cleansing	X
Date Replacement	X
Date Shift	X		Date Shift Fixed
Dependent Date Shift	X	X
Email	X		dlpx-core:Email Unique dlpx-core:Email SL
Free Text Redaction	X
Full Name	X		dlpx-core:FullName
Mapping	X
Min Max	X
Name	X		dlpx-core:FirstName dlpx-core:LastName
Numeric Expression	X
Payment Card	X		Credit Card
Regex Decompose	X		dlpx-core:Lat_Long Coordinates
Secure Lookup	X		See the Secure Lookup (out of the box algorithm instances) page.
Segment Mapping	X		dlpx-core:TimeRange
String Algorithm Chain	X
Tokenization	X

Configuring your algorithms

Algorithm settings

The Settings > Algorithm tab displays algorithm Names along with Type and Description. This is where you add (create) new algorithms. The default algorithms and any algorithms you have defined appear on this tab.

The algorithms on the screen can be filtered or sorted by the various informational fields by clicking on the respective fields. More information on grid filtering and sorting can be found here.

Sortable fields are Algorithm Name, Framework Name, and Mask Type.
Filterable fields are Algorithm Name, Framework Name, Mask Type, and Owner.

Creating new algorithms

If none of the default algorithms meet your needs, you might want to create a new algorithm. An algorithm that you create is called a "user-defined algorithm".

Algorithm Frameworks give you the ability to quickly and easily define the algorithms you want, directly on the Settings page. After you create an algorithm, your algorithm will be available to all users.

To add an algorithm:

Click the + Algorithm button from the top-right corner above the Algorithms grid.
Enter the Algorithm Name and Description, select a Framework, and click Next.
Enter the configuration required by the algorithm framework, and click Next.
View the summary on the last step to confirm the changes.
Click Save.

Algorithm creation and editing run as Async Tasks. In some cases, the Async Task for creating or editing an algorithm might take some time. When that happens, you can select the Run in background option when it appears in the wizard. This closes the wizard and runs the task in the background so that you can continue with other work.

Editing algorithms

Administrators and users with EDIT Algorithm permission assigned in their Role may edit any user-defined algorithm on the system.

The following algorithm instances cannot be modified:

Instances that ship with and are defined by the system
Instances defined by algorithm plugins

To edit an algorithm, click the (…) button to the right of the corresponding row under the Actions column and select the Edit option.

The algorithm name and framework type cannot be changed after creation. You can edit the configuration on the second step of the wizard and click Save to apply the changes.

Viewing algorithms

Algorithm instances that are defined by the system or by algorithm plugins can only be viewed. Click the (…) button to the right of the corresponding row under the Actions column and select the View option.

Deleting algorithms

Click the (…) button to the right of the corresponding row under the Actions column and select the Delete option to delete an algorithm.

Algorithm instances that are defined by the system or by algorithm plugins cannot be deleted.

Algorithm keys

Most masking algorithms include a key as part of their configuration. Changing this key changes the output of these algorithms. For example, if the FirstNameLookup algorithm masks “Michelle” to “Rachael,” changing the algorithm’s key might cause it to mask “Michelle” to “Ben”.
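
The effect of a key can be illustrated with a minimal sketch. This is not the product's lookup implementation; the `lookup_mask` function, the HMAC-based index selection, and the replacement list are all assumptions made for illustration. It only shows the general idea that the same input and key always give the same output, while a different key can select a different replacement.

```python
import hashlib
import hmac

# Hypothetical replacement values; a real lookup list would be much larger.
NAMES = ["Rachael", "Ben", "Priya", "Tomas", "Aiko"]


def lookup_mask(value: str, key: bytes, replacements=NAMES) -> str:
    """Pick a replacement deterministically from the input value and the key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    # Reduce the first 8 bytes of the keyed digest to an index into the list.
    index = int.from_bytes(digest[:8], "big") % len(replacements)
    return replacements[index]


# Same value and same key: the masked result is repeatable.
first = lookup_mask("Michelle", b"original-key")
again = lookup_mask("Michelle", b"original-key")

# A different key may (but is not guaranteed to) pick a different name.
after_key_change = lookup_mask("Michelle", b"randomized-key")
```

Because the key feeds the digest, rotating it changes the mapping for the whole list at once, which is why previously masked data and newly masked data will generally no longer agree after a key change.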

An algorithm’s key can be randomized using the following API endpoint:

PUT https://host.example.com/masking/api/algorithms/{algorithmName}/randomize-key
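
As a rough sketch, the request above could be prepared from Python's standard library as follows. The helper name `build_randomize_key_request` is hypothetical, and two details are assumptions rather than facts taken from this page: that the API token is sent in an `Authorization` header, and that algorithm names containing spaces or colons (such as dlpx-core:Phone US) should be percent-encoded in the path.

```python
from urllib.parse import quote
from urllib.request import Request


def build_randomize_key_request(base_url: str, algorithm_name: str, auth_token: str) -> Request:
    """Build (but do not send) the PUT request for the randomize-key endpoint."""
    # Assumption: the algorithm name is percent-encoded as a single path segment.
    encoded_name = quote(algorithm_name, safe="")
    url = f"{base_url.rstrip('/')}/algorithms/{encoded_name}/randomize-key"
    # Assumption: authentication uses an Authorization header.
    return Request(url, method="PUT", headers={"Authorization": auth_token})


request = build_randomize_key_request(
    "https://host.example.com/masking/api",
    "dlpx-core:Phone US",
    "example-token",  # placeholder value
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the final URL before anything changes on the engine.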

Multi-column algorithms

Overview

Multi-column algorithms are a special kind of algorithm that allows a single algorithm assignment to span multiple columns or fields in inventory. This allows coordinated masking of multiple fields; for example, masking two date-time values while preserving the interval between them.

The Dependent Date Shift algorithm is an example of a multi-column algorithm.
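
The interval-preservation idea can be sketched in a few lines. This is only an illustration of why both values must be handled in one assignment; it is not the Dependent Date Shift algorithm itself, and the function name `shift_together` and the fixed offset are assumptions.

```python
from datetime import datetime, timedelta


def shift_together(start: datetime, end: datetime, offset_days: int):
    """Shift both date-time values by the same offset so their interval is unchanged."""
    offset = timedelta(days=offset_days)
    return start + offset, end + offset


admitted = datetime(2021, 3, 1, 9, 30)
discharged = datetime(2021, 3, 4, 17, 0)
masked_admitted, masked_discharged = shift_together(admitted, discharged, offset_days=11)

# The masked values differ from the originals, but the gap between them is identical.
interval_preserved = (masked_discharged - masked_admitted) == (discharged - admitted)
```

If the two columns were masked by independent single-column assignments, each could receive a different shift and the interval would not survive.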

Usage

Each multi-column algorithm defines a set of Logical Fields; these logical fields are assigned to the actual fields or columns in inventory, defining how each value will be treated by the algorithm. A particular logical field may be read-only, indicating that it is considered as input but not masked by the multi-column algorithm, and/or optional, meaning the logical field is not required in order for the masking assignment to be complete. Furthermore, the Algorithm Group number allows a multi-column algorithm to be assigned multiple times in the same table or file-format, with the group number indicating which set(s) of logical fields should be processed together as a single assignment.

Incomplete multi-column masking assignments in the inventory may not be detected until such time as a masking job is executed using that inventory. It is important to review each multi-column assignment carefully to ensure that for each Algorithm Group, each non-optional Logical Field is assigned to a column or field in the table or file-format.
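
The review described above can be expressed as a simple check. The sketch below is not part of the product; the `LogicalField` and `Assignment` types, the field names, and the `find_incomplete_groups` helper are hypothetical and only model the rule that every Algorithm Group needs each non-optional Logical Field assigned.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalField:
    name: str
    optional: bool = False


@dataclass(frozen=True)
class Assignment:
    column: str          # actual column or field in the inventory
    logical_field: str   # logical field it is assigned to
    group: int           # Algorithm Group number


def find_incomplete_groups(logical_fields, assignments):
    """Return {group: [missing required logical fields]} for incomplete groups."""
    required = {f.name for f in logical_fields if not f.optional}
    assigned = defaultdict(set)
    for a in assignments:
        assigned[a.group].add(a.logical_field)
    return {
        group: sorted(required - names)
        for group, names in assigned.items()
        if required - names
    }


# Hypothetical logical fields for a two-date algorithm with one optional input.
fields = [LogicalField("start_date"), LogicalField("end_date"), LogicalField("note", optional=True)]
assignments = [
    Assignment("ADMIT_DT", "start_date", group=1),
    Assignment("DISCHARGE_DT", "end_date", group=1),
    Assignment("ORDER_DT", "start_date", group=2),  # group 2 never assigns end_date
]
incomplete = find_incomplete_groups(fields, assignments)
```

Here group 1 is complete even though the optional field is unassigned, while group 2 is reported because a required field is missing.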

Limitations

Multi-column algorithms may only be applied in inventories for data connectors where entire rows or records are processed as a unit.

Specific limitations:

Multi-column algorithms are not supported for XML file masking.
Multi-column algorithm assignments must be contained within a single Record Type for delimited and fixed-width files.
Multi-column algorithm assignments must not cross redefines in VSAM copybooks.
Multi-column algorithms may not be called by other algorithms through the algorithm chaining feature.

Algorithm frameworks overview

Choosing an algorithm framework

See the Algorithm Frameworks section for a detailed description of each Algorithm Framework. The algorithm framework you choose will depend on the format of the data and your internal data security guidelines.

Choosing between character and segment mapping frameworks

The Character Mapping algorithm is intended to replace Segment Mapping for many use cases. That said, it does not replicate every feature of that algorithm, so the specific masking application will determine which one is appropriate.

Reasons to choose Character Mapping over Segment Mapping:

Character Mapping can mask all characters in the first Unicode plane. Segment Mapping can only mask "[a-zA-Z]" and "[0-9]".
Character Mapping automatically preserves all non-masked characters. Segment Mapping requires configuration of preserve characters. Character Mapping is much easier to use when the data is potentially "dirty" or not consistently formatted.
Character Mapping can process preserve ranges in reverse, allowing the last positions of an input to be preserved when inputs have different lengths. Segment Mapping preserve ranges are always processed from the beginning of input.
Character Mapping uses a more complex masking computation, so that every maskable position influences every other position in the masked value. Segment Mapping pre-computes the permutations for each segment independently.
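
Two of the points above, automatic preservation of non-masked characters and preserve ranges counted from the end of the input, can be illustrated with a toy keyed substitution. This is deliberately not the Character Mapping computation: it maps each digit independently, so it does not reproduce the property that every maskable position influences every other position. The names `keyed_permutation` and `mask_digits` and the `preserve_last` parameter are assumptions.

```python
import hashlib
import hmac
import random

DIGITS = "0123456789"


def keyed_permutation(key: bytes, alphabet: str) -> dict:
    """Derive a repeatable one-to-one mapping over `alphabet` from `key`."""
    digest = hmac.new(key, alphabet.encode("utf-8"), hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    shuffled = list(alphabet)
    rng.shuffle(shuffled)
    return dict(zip(alphabet, shuffled))


def mask_digits(value: str, key: bytes, preserve_last: int = 0) -> str:
    """Substitute digits; leave other characters and the last N positions as-is."""
    mapping = keyed_permutation(key, DIGITS)
    # Counting the preserve range from the end works for inputs of any length.
    cut = max(0, len(value) - preserve_last)
    masked_head = "".join(mapping.get(ch, ch) for ch in value[:cut])
    return masked_head + value[cut:]


# Punctuation and spacing survive without any preserve configuration,
# and the final four characters are kept for both input lengths.
long_form = mask_digits("(555) 123-4567", b"demo-key", preserve_last=4)
short_form = mask_digits("123-4567", b"demo-key", preserve_last=4)
```

Characters outside the mapped alphabet pass through unchanged, which is the behavior that makes this style of masking tolerant of inconsistently formatted data.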

Reasons to choose Segment Mapping over Character Mapping:

Segment Mapping can mask different parts of the input differently, depending on position. Character Mapping always masks the same groups of characters regardless of position.
Segment Mapping can map inputs to different outputs at a position, like { A, B, C, D } -> { W, X, Y, Z }, by specifying different Input and Mask values. This is not possible with Character Mapping.
Segment Mapping supports numeric segments, with up to 6-digit segments masked to a specific range. Character Mapping doesn't allow this kind of range limiting.