Data Masking

We test the application on our project using a development database created by scrubbing production data. That original scrub happened over 10 years ago. Our development data is starting to get a bit stale. It gets updated every year to try to keep pace with the new dates. But formats and structures change. This sounds like a job for Oracle Data Masking,

Data Masking is an add on pack for Enterprise Manager. It allows you to meet data privacy and data protection mandates. It replaces sensitive values with realistic replacements. Thus you can use production like data for non-production uses.

Data Masking can actually scan for sensitive data. It can make sure sensitive data does not leave the production environment. Its operation is also called scrambling or anonymization. The scrubbing is based on rules that you set up or choose from some preset formats.

Here are the data type supported for scrubbing:
  • Numeric
  • String
  • Date
  • LOB (requires Grid 11gR1)
These are the overall steps you take to perform the data masking:
  1. Clone production to a staging environment
  2. Select mask definition in staging
  3. Clone the masked staging data to a test environment
The predefined mask formats allow you to generate random numbers for replacements. You can also do some post processing to ensure the data is of believable format. Masking is controlled by the DM_FMTLIB package. Here are some presets for well known data types that are supported:
  • Credit card numbers
  • ISBN numbers
  • UPC codes
  • Social Security Numbers
  • Phone numbers
 Here are the roles you need to perform a masking operation:
  • Select catalog role
  • Select any dictionary
  • Execute privileges on DBMS_CRYPTO
Be aware that masking partitioned tables might cause some partition movement if you mask the partition key. There are some types that are not supported by masking:
  • Clustered tables
  • Object tables
  • XML tables
  • Virtual columns
Masking can happen quickly because of the following optimiations:
  • Bulk operations
  • Disabled logging
  • Running in parallel