
Create, view, edit, and delete jobs

This page covers how to create various jobs and manage them with actions like view, edit, and delete.

Create Masking Job

  1. Click the + Masking Job button at the top of the page. The Create Masking Job window appears.

  2. The Details step prompts for the following information:

    • Job Name: A free-form name for the job you are creating. Must be unique across the entire application.

    • Multi-Tenant: This option allows existing rule sets to be reused to mask identical schemas via different connectors. The connector is selected at job execution time.

      • Selective Data Distribution (SDD): This option must be enabled for the masking job to be eligible for use in VDB masked provisioning as part of the Continuous Data Engine’s SDD feature.

    • Masking Method: Select either In-Place or On-The-Fly.

      • In-Place jobs update the source environment with the masked values.

      • On-The-Fly jobs read unmasked data from the source environment and write the masked data to the target environment.

    • Rule Set: Select a rule set that this job will execute against.

    • Non-conforming Data

      • Stop job on first occurrence (optional): Select this checkbox to abort the job on the first occurrence of non-conforming data. By default, this checkbox is clear.

        • The job behavior depends on the settings specified on the Algorithm Settings page and on the individual algorithm pages, which define how the presence of non-conforming data is treated.

        • The setting on the Algorithm Settings page is global and can be overridden by the setting on an individual algorithm's page. These settings declare whether the presence of non-conforming data counts as a failure or a success for the job.

        • If these settings resolve to Mark job as Failed, the job is aborted on the first occurrence of non-conforming data. If they resolve to Mark job as Succeeded, the job is not aborted.
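How the global Algorithm Settings value, a per-algorithm override, and the Stop job on first occurrence checkbox interact can be sketched as a small decision function. This is an illustration only; the setting names and values below are hypothetical labels, not the product's API:

```python
from typing import Optional

def job_outcome(global_setting: str,
                algorithm_setting: Optional[str],
                stop_on_first: bool) -> str:
    """Decide how a job reacts to non-conforming data.

    Settings are "fail" or "succeed" (hypothetical labels); a per-algorithm
    setting, when present, overrides the global Algorithm Settings value.
    """
    effective = algorithm_setting if algorithm_setting is not None else global_setting
    if effective == "fail":
        if stop_on_first:
            return "abort on first non-conforming value"
        return "complete, then mark job as failed"
    return "complete and mark job as succeeded"
```

Note that when the effective setting treats non-conforming data as a success, the Stop job on first occurrence checkbox has no effect and the job is never aborted.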

  3. If the On-The-Fly masking method is selected, you will be prompted for Source information.

    • Source Environment: Select the Source Environment from which this job will get the data.

    • Source Connector: Select the Source Connector that provides the connection to the chosen Source Environment.

    • Database to File Job

      • Check this option when creating an On-The-Fly job with a database source and a file target. When checked, the Source Connector dropdown lists all database connectors; otherwise, it lists only file connectors.
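The Details-step fields above also map onto a job-creation request body when driving the engine programmatically. The sketch below assembles such a payload; the field names are inferred from the wizard labels on this page and should be verified against the engine's API documentation before use:

```python
def build_masking_job(job_name, ruleset_id, on_the_fly=False, multi_tenant=False):
    """Assemble a job-creation request body from the Details-step fields.

    Field names are illustrative, based on the wizard labels above;
    verify them against the engine's API documentation.
    """
    return {
        "jobName": job_name,            # must be unique across the application
        "rulesetId": ruleset_id,        # rule set the job executes against
        "onTheFlyMasking": on_the_fly,  # False = In-Place, True = On-The-Fly
        "multiTenant": multi_tenant,    # connector is selected at execution time
    }
```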

On-The-Fly Masking Jobs

  • Only certain combinations of connector types are supported.

  • On-The-Fly jobs where the source and target connectors are of the same type (e.g. Oracle to Oracle, delimited file to delimited file) are supported, as are jobs with a database source (e.g. Oracle, MS SQL) and a delimited-file target.

  • While creating a Database (source) to File (target) job, make sure to check the "Database to File Job" option in the UI, which displays all database connectors in the Source Connector dropdown.

  • The target tables or files must be created in advance and the names must match the names of the source tables or files. In the case of a database to delimited file job, the file names should match the table names.
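The supported combinations above reduce to a simple predicate: same-type pairs, plus database source to delimited-file target. A sketch (the connector-type names are illustrative labels, not product identifiers):

```python
# Illustrative connector-type labels, not product identifiers.
DATABASE_TYPES = {"oracle", "mssql", "db2"}
FILE_TYPES = {"delimited", "fixed_width", "xml"}

def is_supported_combination(source_type: str, target_type: str) -> bool:
    """Same-type pairs are supported, as is database -> delimited file."""
    if source_type == target_type:
        return True
    return source_type in DATABASE_TYPES and target_type == "delimited"
```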

  1. If the selected rule set is a database rule set, one extra step is added for configuring database options.

Database Options

Database Options with Enabled Tasks

  1. You will be prompted for the following information:

    • Commit Size (optional): The number of rows to process before issuing a commit to the database.

    • Update Threads: The number of update threads to run in parallel to update the target database.

      • Multiple threads should not be used if the masking job contains any table without an index. Multi-threaded masking jobs can lead to deadlocks in the database engine.

      • Multiple threads can cause database engine deadlocks for databases using T-SQL. If masking jobs fail and a deadlock error exists on the database engine, reduce the number of threads.

        • By default, this field is populated with the DefaultUpdateThreads value from the Application settings group "mask".

    • Batch Update (optional): Controls whether the database load phase that writes the masked data is performed in batches. The batch size is determined by the Commit Size value. This option is recommended because it typically improves masking job performance.

    • Truncate (optional; On-The-Fly jobs only): Whether the target database tables should be truncated before the masked data is loaded into the target database (after the masking phase completes).

    • Drop Indexes (optional): Whether to automatically drop indexes on columns that are being masked and automatically re-create the index when the masking job is completed. The default is for this checkbox to not be selected and therefore not perform automatic dropping of indexes.

    • Disable Triggers (optional): Whether to automatically disable database triggers. The default is for this checkbox to not be selected and therefore not perform automatic disabling of triggers.

    • Disable Constraints (optional): Whether to automatically disable database constraints. The default is for this checkbox to not be selected and therefore not perform automatic disabling of constraints. For more information about database constraints see Enabling and Disabling Database Constraints.

    • Enable Tasks (optional): Displays tasks implemented by the driver support plugin in use. The default is for each checkbox to be clear, so none of the tasks are performed. If the masking job being created is for a built-in connector with a built-in driver support plugin, the options displayed will be Disable Constraints, Disable Triggers, and Drop Indexes. For a full list of supported built-in connectors and information on specific built-in driver support plugins, see Built-in Driver Supports.

    • Pre SQL Script (optional): Specify a file that contains SQL statements to be run before the job starts. Click Browse to specify a file.

      • If you are editing the job and a Pre SQL Script file is already specified, you can click the Delete button to remove the file.

      • If you want to download the file, click the Download button.

      • The Delete and Download buttons only appear if a Pre SQL Script file was already specified.

      • For information about creating your Pre SQL Script files see Creating SQL statements to run before and after Jobs.

    • Post SQL Script (optional): Specify a file that contains SQL statements to be run after the job finishes. Click Browse to specify a file.

      • If you are editing the job and a Post SQL Script file is already specified, you can click the Delete button to remove the file.

      • If you want to download the file, click the Download button.

      • The Delete and Download buttons only appear if a Post SQL Script file was already specified.

      • For information about creating your Post SQL Script files see Creating SQL statements to run before and after Jobs.

The Pre/Post SQL Script step always connects to the database. If no script is loaded, the Pre/Post SQL step makes a connection with an empty statement.
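The interaction of Commit Size and Batch Update described above amounts to chunking the masked rows and committing once per chunk. A rough sketch of that load-phase behavior (the `db` handle and its methods are placeholders, not a real driver API):

```python
def load_masked_rows(db, rows, commit_size=10000):
    """Write masked rows in batches, committing after every commit_size rows.

    `db` is a placeholder handle with write/commit methods standing in
    for the engine's load phase; it is not a real driver API.
    """
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= commit_size:
            db.write(batch)   # batched INSERT/UPDATE of masked values
            db.commit()
            batch = []
    if batch:                 # flush the final partial batch
        db.write(batch)
        db.commit()
```

A larger commit size means fewer commits (usually faster) at the cost of larger transactions; disabling Batch Update corresponds to writing rows one at a time instead of per chunk.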

  1. After clicking Next, the next step of the wizard prompts for the following information:

    1. Number of Streams: The number of parallel streams to use when running the job. For example, you can select two streams to mask two tables in the Rule Set concurrently in the job instead of one table at a time.

      • Choosing the number of streams

        • Jobs, even with a single stream, will have separate execution threads for input, masking, and output logic.

        • While it is not necessary to increase the number of streams to engage multiple CPU cores in a job, doing so may increase overall job performance dramatically, depending on a number of factors.

        • These factors include the performance characteristics of the data source and target, the number of processor cores available to the Delphix Masking Engine, and the number and types of masking algorithms applied in the Rule Set. The memory requirements for a job increase proportionately with the number of streams.

      • By default, this field is populated with the DefaultStreams value from the Application settings group "mask".

    2. Row Limit: The number of data rows that may be in process simultaneously for each masking stream. For file jobs, this controls the number of delimited or fixed-width lines, mainframe records, or XML elements in process at one time.

      • Setting this value to 0 allows unlimited rows into each stream.

      • Setting this value to -1 or leaving it blank will select a default limit based on rule set type.

      • Choosing the Row Limit

        • The default Row Limit values have been selected to allow typical jobs to run successfully with the default job memory and streams number settings.

        • This assumes a maximum row or record size of approximately 2000 bytes with 100 masked columns. If masked row or record size, or column count, exceed these values, it may be necessary to either allocate more memory to the job by increasing Max Memory, or reduce the Row Limit to a smaller value. Conversely, if the masked rows are quite small and have few masking assignments, increasing the Row Limit may improve job performance. Remember to consider the worst case (the largest rows, the most masking assignments) table or file format in the Rule Set when making this determination.

    3. Feedback Size (optional): The number of rows to process before writing a message to the logs. Set this parameter to the appropriate level of detail required for the logs.

    4. Min Memory (MB) (optional): Minimum amount of memory to allocate for the job, in megabytes.

    5. Max Memory (MB) (optional): Maximum amount of memory to allocate for the job, in megabytes.

      • It is recommended to set Min/Max Memory to at least 1024 MB.

    6. Comments (optional): Add comments related to this masking job.

    7. Email (optional): Add e-mail address(es) to which to send status messages.

  2. After clicking Next, the last step of the wizard displays a Summary of the entered details.

  3. When you are finished, click Save.
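The sizing guidance above (streams, Row Limit, and the ~2000-byte row assumption) can be turned into a rough arithmetic sanity check when choosing Max Memory. This sketch is an illustration of the stated assumption, not the engine's actual memory model:

```python
def estimated_stream_memory_mb(streams, row_limit, row_size_bytes=2000):
    """Rough estimate of memory held by in-flight rows across all streams.

    Uses the ~2000-byte row assumption stated above; the engine's real
    memory model is more complex, so treat this only as a sanity check
    when tuning Row Limit and Max Memory.
    """
    return streams * row_limit * row_size_bytes / (1024 * 1024)
```

For example, two streams with a Row Limit of 10000 and 2000-byte rows hold roughly 38 MB of row data in flight; if your rows are much larger, either raise Max Memory or lower the Row Limit accordingly.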

Database Options

  • Some built-in connectors support the Disable Constraints, Disable Triggers, and Drop Indexes features (see the Data Source Support page).

  • For built-in connectors implemented using driver support plugins, these options are available via the Enable Tasks section. For a full list of built-in connectors using driver support plugins, see Built-in Driver Supports.

  • For all other built-in connectors, these features will appear as checkboxes.

Create Tokenization Job

To create a tokenization job, click the + Tokenization Job button at the top of the page in the Tokenize environment. The flow and fields are the same as for a masking job.

Create Re-Identification Job

To create a re-identification job, click the + Re-Identification Job button at the top of the page. The flow and fields are the same as for a masking job.

Edit job

To edit a job, click the actions button to the right of the corresponding row under the Actions column; the Edit action will be visible. If the user doesn't have permission to edit the job, the action is disabled.

Clicking the Edit action opens a wizard for editing the job. The job name cannot be changed after creation, so it is disabled for editing; all other fields can be edited and saved.

View job

To view a job, click the actions button to the right of the corresponding row under the Actions column; the View action will be visible.

Delete job

To delete a job, click the actions button to the right of the corresponding row under the Actions column; the Delete action will be visible. If the user doesn't have permission to delete the job, the action is disabled.

Clicking the Delete action prompts for confirmation. Click Confirm to delete the job.
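The view, edit, and delete actions above correspond to standard REST verbs against a job resource when using the engine's API instead of the UI. The sketch below only builds request shapes; the endpoint path is an assumption and must be checked against your engine's API documentation:

```python
# Assumed endpoint path for illustration; verify against your API docs.
BASE = "/masking/api/masking-jobs"

def view_job_request(job_id):
    return ("GET", f"{BASE}/{job_id}")

def edit_job_request(job_id, changes):
    # The job name is immutable after creation, so it must not be sent.
    assert "jobName" not in changes
    return ("PUT", f"{BASE}/{job_id}", changes)

def delete_job_request(job_id):
    return ("DELETE", f"{BASE}/{job_id}")
```

As in the UI, whether these requests succeed depends on the caller's permissions for the job in question.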
