
Data cleansing

See Data cleansing for more information about this algorithm framework.

Creating a data cleansing algorithm via API

  1. Retrieve the frameworkId for the Data Cleansing framework. This can be done via the following endpoint:

    algorithm   GET /algorithm/frameworks

    The framework information should look similar to the following:

    {
        "frameworkId": 24,
        "frameworkName": "Data Cleansing",
        "frameworkType": "STRING",
        "plugin": {
            "pluginId": 7,
            "pluginName": "dlpx-core",
            "pluginAuthor": "Delphix Engineering",
            "pluginType": "EXTENDED_ALGORITHM"
        }
    }
  2. Upload a lookup file via the following endpoint:

    fileUpload   POST /file-uploads

    Copy the fileReferenceId value returned in the Response Body.

  3. Create a Data Cleansing algorithm via the following endpoint:

    algorithm   POST /algorithms

    Use JSON-formatted input similar to the following example:

    {
        "algorithmName": "demoDataCleansing",
        "algorithmType": "COMPONENT",
        "frameworkId": 24,
        "algorithmExtension": {
            "lookupFile": {
                "uri": "delphix-file://upload/f_52b19f8a9125435a83a1237fa53aeaf5/sample.txt"
            },
            "delimiter": "=",
            "caseSensitive": false,
            "trimWhitespace": true
        }
    }
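
As a rough sketch of steps 1 and 3, the Python below picks the frameworkId out of a list of framework objects and assembles the request body for POST /algorithms. The helper names (find_framework_id, build_algorithm_payload) are hypothetical, and the sketch assumes the framework objects have already been extracted from the GET /algorithm/frameworks response into a plain list. Sending the request is left out, since the base URL and authentication depend on the engine setup.

```python
import json


def find_framework_id(frameworks, name="Data Cleansing"):
    """Return the frameworkId of the entry whose frameworkName matches `name`.

    Assumption: `frameworks` is a list of objects shaped like the
    framework information shown in step 1.
    """
    for framework in frameworks:
        if framework.get("frameworkName") == name:
            return framework["frameworkId"]
    raise LookupError(f"framework {name!r} not found")


def build_algorithm_payload(name, framework_id, file_reference_id,
                            delimiter="=", case_sensitive=True,
                            trim_whitespace=True):
    """Assemble a request body mirroring the JSON example in step 3."""
    return {
        "algorithmName": name,
        "algorithmType": "COMPONENT",
        "frameworkId": framework_id,
        "algorithmExtension": {
            # The fileReferenceId copied in step 2 goes into "uri".
            "lookupFile": {"uri": file_reference_id},
            "delimiter": delimiter,
            "caseSensitive": case_sensitive,
            "trimWhitespace": trim_whitespace,
        },
    }


if __name__ == "__main__":
    frameworks = [{"frameworkId": 24, "frameworkName": "Data Cleansing"}]
    payload = build_algorithm_payload(
        "demoDataCleansing",
        find_framework_id(frameworks),
        "delphix-file://upload/f_52b19f8a9125435a83a1237fa53aeaf5/sample.txt",
        case_sensitive=False,
    )
    print(json.dumps(payload, indent=4))
```

The printed JSON can be used as the body of the POST /algorithms request with whichever HTTP client is already in use.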

Data cleansing algorithm extension

  • lookupFile (required)

String. The fileReferenceId value returned from the fileUpload endpoint for uploading files to the Masking Engine. The file should contain a newline-separated list of {value, replacement} pairs, with each value separated from its replacement by the delimiter. No extraneous whitespace should be present.

  • delimiter (required, minLength=1, maxLength=50, default="=")

String. The delimiter string used to separate {value, replacement} pairs in the lookup file.

  • caseSensitive (optional, default=true)

Boolean. Whether the case of the input string must match the values in the lookup file.

  • trimWhitespace (optional, default=true)

Boolean. Whether to trim leading and trailing whitespace from the input string. Note: This must be true to cleanse fixed-width files and fixed-length database data types such as CHAR and NCHAR.
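
To make the lookup file format and the three options concrete, here is a small Python illustration. It is not the Masking Engine's implementation: parse_lookup and cleanse are hypothetical names, and the sketch assumes that each line is split at the first occurrence of the delimiter and that an input with no match in the lookup file is returned unchanged.

```python
def parse_lookup(text, delimiter="="):
    """Parse newline-separated value/replacement pairs into a dict."""
    if not 1 <= len(delimiter) <= 50:
        raise ValueError("delimiter must be 1 to 50 characters long")
    table = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # skip empty lines such as a trailing newline
        # Assumption: split at the first occurrence of the delimiter.
        value, sep, replacement = line.partition(delimiter)
        if not sep:
            raise ValueError(f"line {line_no}: delimiter not found")
        table[value] = replacement
    return table


def cleanse(value, table, case_sensitive=True, trim_whitespace=True):
    """Look up `value` in `table`, honoring the two boolean options."""
    key = value.strip() if trim_whitespace else value
    if case_sensitive:
        return table.get(key, value)
    # Case-insensitive match: compare lowercased keys.
    folded = {k.lower(): v for k, v in table.items()}
    # Assumption: an unmatched input passes through unchanged.
    return folded.get(key.lower(), value)


if __name__ == "__main__":
    table = parse_lookup("Ca=CA\nCalif=CA\nN.Y.=NY")
    print(cleanse("Calif     ", table))                     # padded CHAR-style value
    print(cleanse("calif", table, case_sensitive=False))
    print(cleanse("Calif     ", table, trim_whitespace=False))
```

The first call shows why trimWhitespace matters for fixed-length types: the padded value only matches the lookup entry once the trailing spaces are removed.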
