Managing inventories

An inventory describes all of the data present in a particular ruleset and defines the methods which will be used to secure it. Inventories typically include the table or file name, column/field name, data classification, and the chosen algorithm.

The inventory screen

From anywhere within an environment, click the Inventory tab to see the Inventory screen. This displays the inventory for the environment's rule sets.

Inventory settings

To specify the inventory settings:

On the left-hand side of the screen, select a Rule Set from the drop-down menu.

The UI has been updated for fixed-width and delimited inventories. Refer to the Delimited and fixed-width inventory settings section below.

Below the rule set selection is Contents, which lists all the tables or files defined for the corresponding rule set.
Select a table or file to create or edit the inventory of sensitive data. The Columns or Fields for that specific table or file appear.
If a column is a primary key (PK), foreign key (FK), or index (IDX), an icon indicating this will appear to the right of the column name. If there is a note for the column, a note icon will appear. Click the icon to read the note.
If you selected a table, metadata for the column appears as Data Type and Length (in parentheses). This information is read-only.
Choose the desired method for viewing the inventory:
- All Fields
  Displays all columns in the table or all fields in the file (allowing you to mark new columns or fields to be masked).
- Masked Fields
  Filters the list to just those columns or fields that are already marked for masking.
- Auto
  The default value. The profiling job can determine or update the algorithm assigned to a column and whether to mask the column.
- User
  The user's choice overrides the profiling job. The user manually updates the algorithm assignment and the mask/unmask option of the column. The Profiler ignores the column, so it is not updated as part of the profiling job.

Delimited and fixed-width inventory settings

Select a fixed or delimited ruleset to see an updated UI that corresponds with the selection.

Select or search for a file or file format in the file format dropdown to create or edit the inventory of sensitive data. The Record Types and Fields for that specific file will appear in the grid below.
The count next to the Record Type name reflects the total fields in that record.
You can filter and sort on the grid columns. All columns are filterable except record types.

Assigning algorithms

To set criteria for sensitive columns or fields:

Click the edit icon to the right of a column or field name.
From the Domain drop-down list, select the appropriate sensitive data element type.
The Delphix Masking Engine defaults to a Masking Algorithm as specified in the Settings screen. If necessary, you can override the default algorithm.
- To select a different masking algorithm, choose one from the Algorithm drop-down list. For detailed descriptions of these algorithms, see the Out-of-the-box algorithm instances article.
Select an ID Method:
- Auto
  The default value. The profiling job can determine or update whether to mask a column.
- User
  The user decides whether to mask/unmask a column. The user's choice overrides the profiling job. (The user masking is done after the profiling job is finished.)

In the delimited and fixed-width inventory fields, ID method is replaced with the Automatic updates checkbox. It is checked (enabled) by default, which is the same as setting “Auto”. Uncheck it to set “User”.

You can add/remove notes in the Notes text field.
When you are done, click Save. Edits do not take effect until they are saved.

If you select a DATESHIFT algorithm and are not masking a datetime or timestamp column, you must specify a Date Format. (This field only appears if a DATESHIFT algorithm is selected from the Masking Algorithm dropdown.) The default format is yyyy-MM-dd in the legacy UI.

A dropdown lets you add a new date format or select one from the existing list. Click the icon next to the dropdown for more suggestions on valid formats.

Managing a file inventory

Defining fields

You must select a delimited or fixed-width file connector from the Select Rule Set drop-down list on the left navigation pane, not a database.

To create new fields:

From an Environment Inventory tab, click the Go to File Format setting.

An information banner is added on the page to help users with navigation.

The Add Field button appears. Click it to open the Add Field dialogue.

If you select a DATESHIFT algorithm or multi-column algorithms, more fields will appear in the dialogue. A DATESHIFT algorithm allows you to pick a date format from the dropdown list or specify your own date format.

Fill out the form and click Save.
1. The Field Name and Formatting sections are mandatory.
2. The masking section is optional and can be edited later as well.
Newly added fields will be reflected under the selected record type on the page.
Once added, the fields can be Viewed, Edited, or Deleted using the Action menu (…) on the field.

The user will be prompted for confirmation on Delete.
View/Edit prompts a prefilled dialogue (similar to Add Fields) and the user can make edits as needed.

Inventory, Field, and Edit actions can be used for algorithm assignment and setting up automatic updates. However, editing record-type information will be disabled via inventory.

Record types

You can use record types to perform conditional masking of file records. If a file has different sets of records spread across multiple rows, the masking engine must be able to understand all the unique records. For example, suppose the first three columns of each row hold the same record (first name, last name, and age), but the last column of each row holds a unique record such as an IP address or an Ethernet address. In this case, you must create a new record type for every unique record present in the file, and assign a specific file format to all the record types. For more information on adding a record type, see the Managing record types article.

Record types can be managed only via the Formats settings; the Inventory screen does not allow adding, updating, or deleting record types.

Managing a mainframe inventory

Redefine conditions

For Mainframe data sets, the inventory also allows for the entry of Redefine Conditions, which are used to handle any occurrences of COBOL's REDEFINES construct that might appear in the copybook. In COBOL, the REDEFINES keyword allows an area of a record to be interpreted in multiple different ways. In the example below, for instance, each record can hold either the details of a person (PERSON-DET) or the details of a company (COMP-DET).

Depending on which group is present, different masking algorithms may need to be applied. Below is the inventory corresponding to this copybook, which allows algorithms to be selected separately for each group.

In order to do any masking, however, the Compliance Engine must be able to determine, for each record, which fields should be read, so that the correct algorithms can be applied. In order to do this, the masking engine uses Redefine Conditions, which are specified in the inventory. Redefine Conditions are boolean expressions that can reference any fields in the record when they are evaluated.

In the example copybook above, the field CUST-TYPE is used to indicate which group is present. If CUST-TYPE holds a 'P', a PERSON-DET group is present, and if it holds a 'C', COMP-DET is present. This can be expressed in the inventory by specifying a Redefine Condition with the value [CUST-TYPE]='P'. This expression indicates that, for each record read from the source file during the masking job, the value of the field CUST-TYPE should be read and compared against the string 'P'. If it is equal, the Compliance Engine will read from the record the fields subordinate to PERSON-DET, and will apply any masking algorithms specified on those fields. Similarly, a Redefine Condition with the value [CUST-TYPE]='C' should be applied to the COMP-DET field. Exactly one of the conditions should evaluate to 'true' for each group of redefined fields. For example, a copybook might have fields A, B REDEFINES A, and C REDEFINES A. Of the Redefine Conditions attached to A, B, and C, one and only one should evaluate to true for each record.
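
The "one and only one" rule can be illustrated with a short Python sketch. This is not the Compliance Engine's implementation: the record layout, the `REDEFINE_CONDITIONS` mapping, and the `select_group` helper are hypothetical names, and the conditions are modelled as plain Python predicates rather than parsed expressions.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical model: each redefined group maps to a predicate over the record.
# These mirror the conditions [CUST-TYPE]='P' and [CUST-TYPE]='C' from the text.
REDEFINE_CONDITIONS: Dict[str, Callable[[Record], bool]] = {
    "PERSON-DET": lambda rec: rec["CUST-TYPE"] == "P",
    "COMP-DET": lambda rec: rec["CUST-TYPE"] == "C",
}


def select_group(record: Record) -> str:
    """Return the single group whose Redefine Condition is true for this record."""
    matches: List[str] = [
        name for name, condition in REDEFINE_CONDITIONS.items() if condition(record)
    ]
    # Exactly one condition should hold for each group of redefined fields.
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching group, got {matches}")
    return matches[0]
```

In this sketch, a record whose CUST-TYPE is neither 'P' nor 'C' raises an error, which makes a gap in the conditions visible instead of silently skipping the record. How the engine itself reacts in that situation is not described here.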

Entering a redefine condition

Click on the orange REDEFINED or REDEF button next to the redefined or redefining field.
Enter a condition in the dialog box that appears. This is the expression that when evaluated to true, causes the subordinate fields to be read and (if they have algorithms assigned) masked.
Click Submit.

Format of redefine conditions

Redefine Conditions allow fields to be compared against either number or string literals. Square brackets [ ] enclosing a field name indicate a variable, which takes on the value of the named field:

[Field1] = 'An example String'

String literals can be enclosed in either single or double quotes. For fields that are numeric (e.g. PIC S99V9), the operators <, <=, >, and >= can be used in addition to the = operator:

[Field2] <= -10.5

Also, conditions can be joined using AND, OR, and NOT to form more complex conditions:

([Field3] > 2.5 AND [Field3] < 10) OR NOT [FIELD4] = 'Z'
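
To make the syntax concrete, here is a minimal Python sketch of an evaluator for expressions of this shape. It is an illustration built only from the rules described above, not the engine's parser: the function names (`tokenize`, `evaluate`), the operator precedence (NOT binds tighter than AND, which binds tighter than OR), and the exact-case lookup of field names are all assumptions.

```python
import operator
import re
from typing import Dict, List, Tuple, Union

Value = Union[str, float]
Token = Tuple[str, str]

# One token per match: [field], number, quoted string, operator, paren, keyword.
TOKEN_RE = re.compile(
    r"""\s*(?:
        \[(?P<field>[^\]]+)\]
      | (?P<number>-?\d+(?:\.\d+)?)
      | '(?P<sq>[^']*)'
      | "(?P<dq>[^"]*)"
      | (?P<op><=|>=|<|>|=)
      | (?P<paren>[()])
      | (?P<word>AND|OR|NOT)\b
    )""",
    re.VERBOSE,
)
OPS = {"=": operator.eq, "<": operator.lt, "<=": operator.le,
       ">": operator.gt, ">=": operator.ge}


def tokenize(text: str) -> List[Token]:
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at position {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens: List[Token], record: Dict[str, Value]):
        self.tokens, self.pos, self.record = tokens, 0, record

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_or(self) -> bool:
        result = self.parse_and()
        while self.peek() == ("word", "OR"):
            self.take()
            rhs = self.parse_and()  # always parse the right side to consume it
            result = result or rhs
        return result

    def parse_and(self) -> bool:
        result = self.parse_not()
        while self.peek() == ("word", "AND"):
            self.take()
            rhs = self.parse_not()
            result = result and rhs
        return result

    def parse_not(self) -> bool:
        if self.peek() == ("word", "NOT"):
            self.take()
            return not self.parse_not()
        if self.peek() == ("paren", "("):
            self.take()
            result = self.parse_or()
            if self.take() != ("paren", ")"):
                raise ValueError("missing closing parenthesis")
            return result
        return self.parse_comparison()

    def operand(self) -> Value:
        kind, text = self.take()
        if kind == "field":
            return self.record[text]  # [Name] takes the value of the named field
        if kind == "number":
            return float(text)
        if kind in ("sq", "dq"):
            return text
        raise ValueError(f"expected a field or literal, got {text!r}")

    def parse_comparison(self) -> bool:
        left = self.operand()
        kind, op = self.take()
        if kind != "op":
            raise ValueError(f"expected a comparison operator, got {op!r}")
        return OPS[op](left, self.operand())


def evaluate(condition: str, record: Dict[str, Value]) -> bool:
    parser = Parser(tokenize(condition), record)
    result = parser.parse_or()
    if parser.pos != len(parser.tokens):
        raise ValueError("unexpected trailing input")
    return result
```

For example, `evaluate("[Field2] <= -10.5", {"Field2": -11})` is true. The sketch expects numeric fields to be supplied as Python numbers and string fields as strings; it does not check that ordering operators are applied only to numeric fields.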

Importing and exporting an inventory

To export an inventory:

Click the Export icon in the upper right. The Export Inventory window appears with the name of the currently selected Rule Set as the Inventory Name and a corresponding .csv File Name.
Click Save.

A status pop-up appears. When the export operation is complete, you can click on the Download file name to access the inventory file.

To import an inventory:

In the upper right-hand corner, click the Import icon. The Import Inventory window appears.
Click Select to browse for the name of a comma-separated (.csv) file.
Click Save.

The inventory you imported appears in the Rule Set list for this environment.

Only one rule set can be imported at a time.
The format of an imported .csv file must exactly match the format of the exported inventory. If you plan to import an inventory, you should export it first and then update the exported file as needed before importing it.
After importing an inventory from an older version into a version 10.0.0.0 Compliance Engine, a rule set refresh is mandatory if the inventory has any document store type assignments, or if you need to perform document store type masking on columns from the imported inventory.
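
Because an imported file must match the exported format exactly, it can help to compare an edited file against the original export before importing it. The Python sketch below checks only that the header rows are identical; the `headers_match` helper is hypothetical, and a matching header does not guarantee that the rest of the file is valid.

```python
import csv
import io


def headers_match(exported_csv: str, edited_csv: str) -> bool:
    """Return True when both CSV texts have identical header rows (names and order)."""
    exported_header = next(csv.reader(io.StringIO(exported_csv)), None)
    edited_header = next(csv.reader(io.StringIO(edited_csv)), None)
    # An empty file has no header row and therefore never matches.
    return exported_header is not None and exported_header == edited_header
```

To use it, read the exported file and the edited copy as text and pass both strings to `headers_match` before clicking Import.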

Document store-type masking

This feature provides the ability to mask structured documents that are stored in database columns. This is done by marking a column as Structured and assigning a respective Document Store Type and File Format to it.

With the release of version 10.0.0.0 of the Continuous Compliance engine, document store type masking supports automatic datatype identification, using the JDBC SQL Type associated with each column. String and BLOB types are supported for document store type masking.

With the version 10.0.0.0 release:

In the case of existing rulesets, a ruleset refresh is mandatory before using Document Store Type masking.
Masking jobs whose rulesets have document store type assignments require a ruleset refresh. Without the refresh, the job will not be allowed to run.
Masking jobs whose rulesets have no document store type assignments do not need a ruleset refresh.
Ruleset refresh is not required for newly created rulesets.

The column type must be one of the following JDBC SQL Types: CHAR, NCHAR, VARCHAR, NVARCHAR, CLOB, NCLOB, LONGVARCHAR, LONGNVARCHAR, BLOB, SQLXML.
The BLOB type is not supported for MySQL databases.
The SQLXML type is supported only for Oracle databases.
The file format must be either XML or JSON.
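
The restrictions above can be summarized as a small Python check. This is only a restatement of the listed rules; the function name and the lowercase database identifiers ("mysql", "oracle") are hypothetical and do not correspond to any engine API.

```python
SUPPORTED_JDBC_TYPES = {
    "CHAR", "NCHAR", "VARCHAR", "NVARCHAR", "CLOB", "NCLOB",
    "LONGVARCHAR", "LONGNVARCHAR", "BLOB", "SQLXML",
}
SUPPORTED_FILE_FORMATS = {"XML", "JSON"}


def can_use_document_store_masking(jdbc_type: str, database: str, file_format: str) -> bool:
    """Apply the documented type, database, and file format restrictions."""
    jdbc_type, database = jdbc_type.upper(), database.lower()
    if jdbc_type not in SUPPORTED_JDBC_TYPES:
        return False
    if file_format.upper() not in SUPPORTED_FILE_FORMATS:
        return False
    if jdbc_type == "BLOB" and database == "mysql":
        return False  # BLOB is not supported for MySQL
    if jdbc_type == "SQLXML" and database != "oracle":
        return False  # SQLXML is supported only for Oracle
    return True
```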

Columns with a supported data type have a setting called Data Model, which can be set to either “Plain” or “Structured” values.

As shown in the image below, columns with “Plain” selected as the Data Model can be masked as a single value by assigning a Domain and Algorithm.

When the “Structured” value is selected for the Data Model, a Document Store Type and File Format can be assigned as shown in the image below.

The image below shows the Inventory screen for a rule set with a structured column. To quickly access an assigned File Format from this screen (books.xml in this example), click on the file format's name in the File Format panel in the lower left.

Multi-column algorithm support

With the release of version 10.0.0.0, Multi-column algorithms will be supported for JSON and XML document store type masking with limited buffer-data size.

Buffer size (in bytes) is calculated using the following formula:

((Max_memory_of_Job/No_of_streams_for_job)*CharStreamingBufferLimitRate)/100

The default values are used when the maximum memory and number of streams for the job are not defined.
Buffer-data size is configurable via the application setting CharStreamingBufferLimitRate under Mask group settings. For adjusting CharStreamingBufferLimitRate, refer to the Masking API client.

Fields with multi-column assignments must not exceed the buffer data size limit; if they do, the job will fail. To avoid this, configure the buffer size by adjusting CharStreamingBufferLimitRate.
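
As a worked example of the formula, the Python sketch below computes the buffer size for given job settings. It assumes the job's maximum memory is expressed in bytes (the formula's result is in bytes, but the input unit is not stated here), and the function and parameter names are hypothetical.

```python
def buffer_size_bytes(max_memory_bytes: float, streams: int, limit_rate: float) -> float:
    """((Max_memory_of_Job / No_of_streams_for_job) * CharStreamingBufferLimitRate) / 100"""
    if streams <= 0:
        raise ValueError("the number of streams must be positive")
    return (max_memory_bytes / streams) * limit_rate / 100
```

For instance, with a maximum memory of 1000 bytes, 4 streams, and a CharStreamingBufferLimitRate of 10, the buffer size is (1000 / 4) * 10 / 100 = 25 bytes. Doubling the rate doubles the buffer, while doubling the number of streams halves it.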

JSON file format

For details on multi-column algorithm support with the JSON file format, refer to the JSON file masking article.

XML file format

In XML document store type masking, multi-column algorithm assignments to XML elements are not validated at the time of assignment, because it can be difficult to determine whether an XML element is an array or a single element until all of the data has been read. Instead, the masking job fails immediately when it encounters an invalid multi-column assignment while running. Make sure algorithm assignments follow the rules below.

Multi-column algorithm for XML file masking is not supported.
Multi-column algorithm assignment to XML attributes is not supported.
Multi-column algorithm is not supported for XML elements where:
- The element is a type of array.
- Elements are part of different arrays.
- Elements are on different levels having one or more elements of type array.

Below is a sample XML file format with valid and invalid multi-column assignment examples.