DLP Deidentify & Mask

gcp / gcp-deidentify-and-mask

Overview

De-identifies and masks sensitive data in a document, or in tagged nodes of a document, using Google Cloud DLP (Data Loss Prevention) services.

Parameters

| Parameter Name | Default | Required? | Type | Description |
|---|---|---|---|---|
| tag_name_re | .* | False | regex | Regular expression used to identify nodes for processing by tag name |
| tag_name_to_apply | replaced | True | string | The tag to apply to processed & replaced nodes |
| info_types | [] | False | list | The types of data that should be inspected and masked. If nothing is provided, types are selected automatically by the API. |
| masking_character | | False | string | The character to mask matching sensitive data with. |
| number_to_mask | 0 | False | number | The maximum number of sensitive characters to mask in a match. If omitted or set to zero, the API will default to no maximum. |
| get_all_content | False | False | boolean | The text content of the processed node will be built from the complete content, including the content of children. |
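To make the interaction between masking_character and number_to_mask concrete, the following is a minimal local sketch of the masking rule described in the parameter descriptions. It does not call the DLP API. The function name `mask_match` is hypothetical, the `*` default is an assumption of this sketch (the parameter list gives no default), and masking from the left is also an assumption; the real service may behave differently in details such as direction or skipped characters.

```python
def mask_match(match: str, masking_character: str = "*", number_to_mask: int = 0) -> str:
    """Illustrative only: mask up to number_to_mask characters of a match, left to right.

    The '*' default and the left-to-right direction are assumptions of this sketch,
    not documented behaviour of the cloud action.
    """
    if len(masking_character) != 1:
        raise ValueError("masking_character must be a single character")
    if number_to_mask < 0:
        raise ValueError("number_to_mask must not be negative")
    # Zero means "no maximum", so the whole match is masked.
    limit = len(match) if number_to_mask == 0 else min(number_to_mask, len(match))
    return masking_character * limit + match[limit:]


print(mask_match("555-0100"))          # whole match masked
print(mask_match("555-0100", "#", 3))  # only the first three characters masked
```

With number_to_mask left at 0 the entire match is replaced; with a positive value, only that many characters are replaced and the rest of the match is left readable.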

Code Samples

The following section provides an example of how you might use this cloud action in different languages.

YAML

The following is an example of how you register the action using YAML:

- name: Example Step
  organizationSlug: gcp
  slug: gcp-deidentify-and-mask
  options: {}

Python

The following is an example of adding the action to a Python pipeline

pipeline = Pipeline(FolderConnector(path=str(get_test_directory()), file_filter='*.*'))
pipeline.add_step(KodexaCloudService(slug='gcp/gcp-deidentify-and-mask', options={}, attach_source=False))
pipeline.set_sink(document_sink)
pipeline.run()
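The example above passes an empty options dictionary. As a hedged sketch of how the documented parameters might be assembled instead, the helper below builds an options dictionary and checks a few values before anything is sent. The helper name `build_options` is hypothetical and not part of the Kodexa library; the single-character check on masking_character is an assumption based on the parameter description; `EMAIL_ADDRESS` and `PHONE_NUMBER` are used as example DLP info type names.

```python
import re


def build_options(info_types=None, masking_character=None, number_to_mask=0,
                  tag_name_re=".*", tag_name_to_apply="replaced",
                  get_all_content=False):
    """Hypothetical helper: assemble the action's options from the documented parameters."""
    re.compile(tag_name_re)  # fail early if the tag-name pattern is not a valid regex
    if masking_character is not None and len(masking_character) != 1:
        # Assumption: the action expects exactly one masking character.
        raise ValueError("masking_character must be a single character")
    options = {
        "tag_name_re": tag_name_re,
        "tag_name_to_apply": tag_name_to_apply,
        "info_types": list(info_types or []),  # empty list: the API selects types itself
        "number_to_mask": number_to_mask,      # 0: no maximum
        "get_all_content": get_all_content,
    }
    if masking_character is not None:
        options["masking_character"] = masking_character
    return options


options = build_options(info_types=["EMAIL_ADDRESS", "PHONE_NUMBER"],
                        masking_character="#", number_to_mask=4)
print(options)
```

The resulting dictionary could then be passed as `options=options` in place of `options={}` in the `KodexaCloudService` call shown above.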


Java

The following is an example of adding the action to a Java pipeline:

InMemorySink sink = new InMemorySink();
Pipeline pipeline = new Pipeline(new FolderConnector("local_files", "*.*"));
// Organization slug and action slug for this action (gcp / gcp-deidentify-and-mask)
pipeline.addStep(new KodexaCloudService("gcp", "gcp-deidentify-and-mask",
    Options.start()
        .set("tag_name_re", ".*")
        .set("tag_name_to_apply", "replaced")
        .set("info_types", "[]")
        .set("masking_character", "")
        .set("number_to_mask", "0")
        .set("get_all_content", "False")
));
pipeline.setSink(sink);
PipelineContext context = pipeline.run();
