Microsoft: No data for you!
We need data to move our project ahead. What are some strategies to consider?
In some cases, you do not need data that is structured identically to production data. Many public data sets (like these and these) are available for development, testing, and training. The downside, of course, is that public data sets come with a pre-existing structure, so they are not useful if your work depends on your existing data structure or needs any control over that structure.
To mitigate concerns about inappropriate disclosure of sensitive data, “scrubbing” or “masking” approaches can obscure, alter, or remove sensitive data from copies or snapshots of the data (for example, see my post on Dynamic Data Masking). The specific approaches vary with the nature and sensitivity of the data, and range from manual processes, to automated ones, to tools like Delphix. In general, you need to understand the data, especially which elements are sensitive, though more sophisticated approaches or tools can infer which parts of the data need protection by looking for patterns such as email addresses, social security or insurance numbers, and so on.
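As a rough illustration of the pattern-based approach described above, here is a minimal Python sketch that masks email addresses and US-style social security numbers in the string fields of a record. The patterns, the placeholder tokens, and the function names (`mask_text`, `mask_record`) are assumptions made for this example; they are not taken from Dynamic Data Masking, Delphix, or any other tool, and real data would need patterns tuned to its actual formats.

```python
import re

# Illustrative patterns only (assumptions for this sketch).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # matches only the NNN-NN-NNNN layout


def mask_text(text: str) -> str:
    """Replace anything that looks like an email or SSN with a placeholder."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    return text


def mask_record(record: dict) -> dict:
    """Return a copy of the record with every string field masked."""
    return {
        key: mask_text(value) if isinstance(value, str) else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    row = {"id": 7, "note": "Contact jane.doe@example.com, SSN 123-45-6789"}
    print(mask_record(row))
```

Working on a copy of each record, rather than editing it in place, keeps the original snapshot untouched, which fits the idea of masking copies of the data rather than the source.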
However, in highly regulated or very risk-averse environments, scrubbing or masking approaches may not be acceptable because of the concern: “What if we miss something, and inappropriate disclosure happens despite our scrubbing or masking measures?” Even something as simple as finding (and scrubbing or masking) permutations of personal identifiers in large text fields can be challenging to automate with very high accuracy across many records.
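To make the "what if we miss something" concern concrete, the sketch below runs one simple, assumed pattern for a social security number over a few free-text notes in which the same identifier is written in different ways. The sample notes, the pattern, and the helper name `find_missed` are all hypothetical and chosen only to show how easily variants slip past a single rule.

```python
import re

# One assumed pattern: digits grouped as NNN-NN-NNNN.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical free-text notes containing the same made-up identifier.
NOTES = [
    "SSN 123-45-6789",             # canonical layout
    "SSN 123 45 6789",             # spaces instead of hyphens
    "SSN 123456789",               # no separators at all
    "SSN one two three 45 6789",   # partly spelled out
]


def find_missed(notes: list[str]) -> list[str]:
    """Return the notes in which the pattern finds no identifier."""
    return [note for note in notes if not SSN_RE.search(note)]


if __name__ == "__main__":
    for note in find_missed(NOTES):
        print("not detected:", note)
```

Each new variant can be covered by widening the pattern, but every widening also raises the chance of false matches, which is part of why very high accuracy across many records is hard to reach.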
Read the entire article here, No data for you! – Patrick’s Azure Blog
via the fine folks at Microsoft