Skip to main content
Skip table of contents

Algorithms

Introduction to Masking algorithms

This article provides a brief outline of the different algorithm options that are available, along with other general algorithm information. More specific algorithm details can be explored in the Out-Of-The-Box Algorithm Instances or Algorithm Frameworks sections.

An algorithm plugin can be configured through the graphical user interface by entering the plugin's required configuration in JSON format. For more information, visit the General UI for Extended Algorithms page.

Algorithm options

Out-of-the-box algorithm instances

Out-of-the-box algorithm instances are pre-configured ready to use algorithms. The out-of-the-box algorithms with related frameworks can be customized using the corresponding extensible frameworks. For more information on algorithm instance extensibility, see the Extensible algorithms page.

Algorithm frameworks

Algorithm frameworks allow for the creation of algorithm instances with a custom configuration. For more information on algorithm framework extensibility, see the Extensible Algorithms page. More information on multi-column algorithms can be found in the Using Multi-Column Algorithms page.

Configuring your own algorithms

Algorithm settings

The Algorithm tab displays algorithm Names along with Type and Description. This is where you add (create) new algorithms. The default algorithms and any algorithms you have defined appear on this tab.

At the top of the page, Nonconforming Data behavior is displayed to specify how all algorithms should behave if they encounter data values in an unexpected format. Mark job as Failed instructs algorithms to throw an exception that will result in the job failing. Mark job as Succeeded instructs algorithms to ignore the non-conformant data and not throw an exception. Note that Mark job as Succeeded will result in the non-conformant data not being masked should the job succeed, but the Monitor page will display a warning that can be used to report the non-conformant data events.

Creating new algorithms

If none of the default algorithms meet your needs, you might want to create a new algorithm. An algorithm that you create is called a "user-defined algorithm".

Algorithm Frameworks give you the ability to quickly and easily define the algorithms you want, directly on the Settings page. After you create an algorithm, your algorithm will be available to all users.

To add an algorithm:

  1. In the upper right-hand corner of the Algorithm settings tab, click Add Algorithm.

  2. Select an algorithm type.

  3. Complete the form to the right to name and describe your new algorithm.

  4. Click Save.

Editing algorithms

Administrators, as well as users with EDIT Algorithm permission assigned in their Role, may edit any user-defined algorithm on the system.

The following algorithm instances cannot be modified:

  • Instances that ship with and are defined by the system

  • Instances defined by algorithm plugins

Algorithms Keys

Most masking algorithms include a key as part of their configuration. Changing this key changes the output of these algorithms. For example, if the FirstNameLookup algorithm masks “Michelle” to “Rachael,” changing the algorithm’s key might cause it to mask “Michelle” to “Ben”.

An algorithm’s key can be randomized using the following API endpoint:

CODE
PUT https://host.example.com/masking/api/algorithms/{algorithmName}/randomize-key

Multi-column algorithms

Overview

Multi-column algorithms are a special kind of algorithm that allow a single algorithm assignment to be made spanning multiple columns or fields in inventory. This allows coordinated masking of multiple fields - for example, masking two date-time values while preserving the interval between them.

The Dependent Date Shift algorithm is an example of a multi-column algorithm.

Usage

Each multi-column algorithm defines a set of Logical Fields; these logical fields are assigned to the actual fields or columns in inventory, defining how each value will be treated by the algorithm. A particular logical field may be read-only, indicating that it is considered as input but not masked by the multi-column algorithm, and/or optional, meaning the logical field is not required in order for the masking assignment to be complete. Furthermore, the Algorithm Group number allows a multi-column algorithm to be assigned multiple times in the same table or file-format, with the group number indicating which set(s) of logical fields should be processed together as a single assignment.

Incomplete multi-column masking assignments in the inventory may not be detected until such time as a masking job is executed using that inventory. It is important to review each multi-column assignment carefully to ensure that for each Algorithm Group, each non-optional Logical Field is assigned to a column or field in the table or file-format.

Limitations

Multi-column algorithms may only be applied in inventories for data connectors where entire rows or records are processed as a unit.

Specific limitations:

  • Multi-column algorithms are not supported for XML file masking.

  • Multi-column algorithm assignments must be contained with a single Record Type for delimited and fixed-width files.

  • Multi-column algorithm assignments must not cross redefines in VSAM copybooks.

  • Multi-column algorithms may not be called by other algorithms through the algorithm chaining feature.

Algorithm frameworks overview

Choosing an algorithm framework

See the Algorithm Frameworks section for a detailed description of each Algorithm Framework. The algorithm framework you choose will depend on the format of the data and your internal data security guidelines.

Choosing between character and segment mapping frameworks

The Character Mapping algorithm is intended to replace Segment Mapping for many use cases. That said, it does not replicate every feature of that algorithm, so the specific masking application will determine which one is appropriate.

Reasons to choose Character Mapping over Segment Mapping:

  • Character Mapping can mask all characters in the first Unicode plane. Segment Mapping can only mask "[a-zA-Z]" + "[0-9]"

  • Character Mapping automatically preserves all non-masked characters. Segment Mapping requires configuration of preserve characters. Character Mapping is much easier to use when the data is potentially "dirty" or not consistently formatted.

  • Character Mapping can process preserve ranges in reverse, allowing the last positions of an input to be preserved when inputs have different lengths. Segment Mapping preserve ranges are always processed from the beginning of input.

  • Character Mapping uses a more complex masking computation, so that every maskable position influences every other position in the masked value. Segment Mapping pre-computes the permutations for each segment independently.

Reasons to choose Segment Mapping over Character Mapping:

  • Segment mapping can mask different parts of the input, determined by position, differently. Character Mapping always masks the same groups of characters regardless of position.

  • Segment mapping can map inputs to different outputs at a position, like { A, B, C, D } -> { W, X, Y, Z } by specifying different Input and Mask values. This is not possible with Character Mapping.

  • Segment mapping supports numeric segments, with up to 6-digit segments masked to a specific range. Character Mapping doesn't allow this kind of range limiting.

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.