Sanjoy Chowdhury
Data is the ‘new oil’. We have been hearing this phrase for quite some time now. And why not? Just as oil has been a prime driver of economies worldwide, data is taking up that space, if it has not done so already.
Data can almost ‘make’ or ‘break’ an enterprise in today’s world. This is because organizations are capturing, storing and analyzing a deluge of data to gain a competitive advantage in global as well as local markets.
This deluge of data contains many kinds of information, some of which is extremely sensitive in nature, such as social security numbers, credit card information, health records, sensitive communication between employees, product information, etc. And a lot of this data is stored online because it is needed for the day-to-day functioning of the organization.
This brings us to a very important point: the need to secure the data that is driving organizations’ bottom lines.
So what exactly is data security? It can be defined as the act of protecting data residing in computers, data storage devices, websites and other data repositories from unauthorized access and from intentional or accidental corruption by elements inside or outside an organization.
The importance of data security cannot be overemphasized, as the recent Ashley Madison data breach, which made headlines globally, and the class-action suits that poured in thereafter have shown. The breach has exposed the organization to lawsuits seeking hundreds of millions of dollars, as well as a loss of reputation, for failing to protect the sensitive and confidential data of its customers.
Apart from avoiding lawsuits and a loss of face, organizations are also legally bound to protect sensitive data under the data security rules and regulations of the countries in which they operate.
With the advent of Big Data, data, sensitive or otherwise, has also started residing on systems other than traditional databases. This has greatly increased the vulnerability of the data, as large volumes of sensitive data are being stored on distributed file systems for cost-effective online storage and analysis.
Thus securing ‘sensitive’ Big Data, with its volume, velocity and variety, poses a constant and distinct set of challenges for organizations’ data security. These challenges are compounded many times over by the fact that the adoption of cloud computing has resulted in organizations storing a lot of their data (sensitive and otherwise) in off-premises data centers, in the cloud.
This paradigm shift in the architecture of data storage and processing requires organizations to re-examine their data security mechanisms (processes and tools) to ensure that all sensitive data is well protected but still accessible to users who need it to perform their daily tasks. Depending on peripheral security mechanisms alone may not be the best way to protect sensitive data from being misused by entities within and outside the organization.
So how do organizations go about securing their data, specifically Big Data, other than by adopting standard peripheral security mechanisms like authentication, authorization and auditing? These methods, working in tandem, may to a certain extent deter access to sensitive data by outside elements and by entities within the organization, but they do not protect the data from internal entities that have legitimate access and intend to use the data illegitimately.
This is where security at the level of individual data items is of paramount importance. If, for example, the credit card data in a file is not visible to employees, then even an employee with legitimate access to that file will not be able to read the credit card numbers. The same holds for any other sensitive or confidential data.
Securing individual pieces of data is usually done using techniques like data masking and data encryption.
So what are data masking and data encryption?
Data masking is the process of converting ‘authentic’ or ‘actual’ data into a structurally similar but ‘inauthentic’ form, so that a person who has access to the ‘masked’ data will not be able to identify its ‘true’ value. Hence, the chance of misusing sensitive data that has been ‘masked’ is almost nil.
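To make the idea concrete, here is a minimal sketch of masking a credit card number in Python. It is only an illustration under stated assumptions, not a description of any particular masking product: the function name mask_card_number, the choice to keep the last four digits, and the use of random substitute digits are all hypothetical choices made for this example.

```python
import random
import string


def mask_card_number(card_number, keep_last=4, seed=None):
    """Return a structurally similar but inauthentic version of card_number.

    Assumption for this sketch: every digit except the last `keep_last`
    is replaced with a random digit, while separators such as '-' or ' '
    stay in place so the masked value keeps the original format.
    """
    rng = random.Random(seed)  # not cryptographically secure; illustration only
    total_digits = sum(ch in string.digits for ch in card_number)
    digits_to_mask = max(total_digits - keep_last, 0)

    masked = []
    digits_seen = 0
    for ch in card_number:
        if ch in string.digits:
            if digits_seen < digits_to_mask:
                masked.append(str(rng.randrange(10)))  # substitute digit
            else:
                masked.append(ch)  # one of the trailing digits we keep
            digits_seen += 1
        else:
            masked.append(ch)  # keep separators untouched
    return "".join(masked)


if __name__ == "__main__":
    print(mask_card_number("4111-1111-1111-1234"))
```

The masked value has the same length and layout as the original, so an application under test can still parse it, but the leading digits no longer reveal the real number. A real masking tool would typically also ensure that the same input is always masked consistently across tables; that is beyond this sketch.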
Similarly, encryption is a technique in which the ‘actual’ data (plain text) is converted to an unreadable form (called cipher text) using an encryption algorithm and a key. The data has to be decrypted with the appropriate key to convert it back to its original form.
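The round trip from plain text to cipher text and back can be sketched in a few lines of Python. The example below uses a one-time-pad style XOR with a random key as long as the message, chosen only because it needs nothing beyond the standard library. It is an assumption made for illustration, not the algorithm an enterprise would deploy; production systems would normally rely on a vetted cryptographic library and a standard algorithm such as AES. The function names generate_key and xor_bytes are hypothetical.

```python
import secrets


def generate_key(length):
    """Return `length` random bytes to use as a single-use key."""
    return secrets.token_bytes(length)


def xor_bytes(data, key):
    """XOR data with key byte by byte.

    The same call both encrypts and decrypts, because applying XOR
    with the same key twice restores the original bytes.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


if __name__ == "__main__":
    plain_text = "4111-1111-1111-1234".encode("utf-8")
    key = generate_key(len(plain_text))      # must never be reused

    cipher_text = xor_bytes(plain_text, key)  # unreadable without the key
    recovered = xor_bytes(cipher_text, key)   # decryption with the same key

    print(cipher_text.hex())
    print(recovered.decode("utf-8"))
```

Unlike masking, nothing is lost here: anyone holding the key can recover the exact original value, which is why protecting the key matters as much as protecting the data.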
Data masking is usually done on production data that needs to be used in a non-production, and consequently less secure, environment. The masked data is ‘production quality’ and can be used to test applications that will use the real data in the production environment.
Encryption is usually used when data is in ‘motion’, that is, being transferred from one system to another over the network, and needs to be returned to its original form once it has reached its destination.
Thus we see that peripheral security solutions, along with data masking and data encryption, can provide a multi-layered security framework that ensures more holistic data security for enterprises.