When enterprises developed their own applications, particularly applications built on structured databases with highly prescribed schemas and highly predictable input variables, testing them was a herculean task. Test engineers would create test environments (typically an exact replica of the final production environment, including the datacenter infrastructure) and test data, at considerable cost.
Compounding this challenge is the move from Waterfall to Agile development methodologies. Agile can decrease development time by over 90% and increase the number of builds in any given period. Applications also need to be tested against the highly virtualized (and again, unpredictable) nature of the underlying infrastructure on which they run.
It is also impossible to test the application manually every sprint, so automation is mandatory. While significant strides have been made in test automation, very little work has been done on automating the provisioning of test data for these tests.
What does this all mean? It means that testing cycles increase, and the granularity and thoroughness of testing can be compromised.
The market has not let this go unaddressed. A number of acquisitions associated with test data management occurred in 2015 alone:
- HP acquired Voltage (for its data protection capabilities, as opposed to test data management);
- Computer Associates (CA) acquired Grid-Tools and Rally; and
- Delphix acquired Axis Technologies.
The challenge is how to protect the innocent (or not so innocent) when gathering potential test data, whether artificially generated or drawn from production environments. Before either public or proprietary data can be used for testing, it must be masked in such a way that the masked data remains valid for the context in which it is tested. Identification numbers, phone numbers, and zip codes, for example, need to conform to the expected limits and formats.
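To make the idea of context-valid masking concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's method and not a standardized format-preserving encryption scheme. The function name `mask_preserving_format` and the `secret` key are hypothetical. The sketch replaces each digit with a digit and each letter with a letter of the same case, and leaves separators untouched, so a masked phone number or zip code keeps its original shape.

```python
import hashlib
import hmac
import string


def mask_preserving_format(value: str, secret: bytes) -> str:
    """Mask a value while keeping its format (hypothetical helper).

    Digits map to digits, ASCII letters map to letters of the same case,
    and every other character (dashes, spaces, ...) is left unchanged.
    The same input and secret always give the same masked output.
    """
    # Keyed hash of the whole value; its bytes drive the substitutions.
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]  # digest bytes repeat for long values
        if ch in string.digits:
            masked.append(string.digits[b % 10])
        elif ch in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[b % 26])
        elif ch in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[b % 26])
        else:
            masked.append(ch)  # keep separators so the format survives
    return "".join(masked)


if __name__ == "__main__":
    key = b"example-only-secret"  # assumption: a key managed outside the test data
    for sample in ("415-555-0132", "94105"):
        print(sample, "->", mask_preserving_format(sample, key))
```

Because the output is deterministic for a given key, the same source value masks to the same result wherever it appears, which can help keep joins between tables intact. A production tool would also need to enforce domain rules (valid area codes, check digits, value ranges) that this sketch does not attempt.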
Privacy is a hot topic around the world. In the first nine months of 2015, over 92 million records were hacked and stolen[1]. Moreover, because many new applications (e.g. Workday, Salesforce.com) were not migrated to the cloud but “born” there, the test data they require must also come from the cloud. Testing cloud applications in the cloud makes better sense.
Another vendor focusing on this area is Informatica. Its Test Data Management (TDM) solution is a mature product that integrates sensitive data discovery, business classification, and policy-driven data masking so that production data can be used safely in test/dev environments.
Solutions such as these make it easier for enterprises to use production data for testing while taking preventive measures to protect the privacy of customer and other sensitive data.
[1] Informationisbeautiful.net