The Importance of Data Classification: A Critical Step in Protecting Information
Not every piece of data is created equal, and demands for data protection and storage capacity have been increasing exponentially. Many organizations, however, are not reacting fast enough to meet these demands. While enterprises need to invest labor, time and money in data protection and storage, it's crucial that they first get a real handle on classifying their data so that they do not allocate resources in the wrong places.
Information classification is the process of separating information into distinct categories or levels to which different controls, policies and requirements apply. Electronic health records, e-mail systems and picture archiving and communication systems alone hold terabytes of unclassified information. As a result, too many organizations protect, manage and store all of their information the same way, regardless of its importance.
From the time information is created until it is destroyed, it should be labeled with a classification designation to ensure it is protected, stored and managed appropriately.
With automation enhancements and interoperability initiatives occurring at a dizzying pace, information classification has become increasingly important. In healthcare, for example, the information security guidance required by the HITECH Act is a primary driver. This guidance specifies only two methods for securing protected health information, or PHI, in a manner that would avoid breach notification provisions: encryption or destruction.
Before organizations encrypt data, they should determine whether the PHI is considered "data at rest" or "data in motion." It will be difficult, if not impossible, to make this determination without identifying and classifying the PHI.
The first step to take before classifying any information is for stakeholders and data owners to define the levels of classification and what attributes constitute each one. Consider classifying information into one of four categories to make the description as intuitive as possible: public information, internal use only, confidential information or restricted information.
There are several stakeholders in the classification effort. The legal, compliance, human resources and medical records departments, as well as business or process owners, will all need to be consulted in determining how information should be classified and in which category it belongs.
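As a rough illustration, the four levels described above could be written down as an ordered scale with a set of controls attached to each. This is a minimal sketch: the control names and values are hypothetical placeholders that stakeholders would replace, and the rule that a mixed record takes the highest level present is an assumption of the sketch, not something the guidance prescribes.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels named in the article, ordered least to most sensitive."""
    PUBLIC = 1
    INTERNAL_USE_ONLY = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical controls per level; real values come from the stakeholders
# and data owners who define each classification.
CONTROLS = {
    Classification.PUBLIC: {"encrypt_at_rest": False, "access": "anyone"},
    Classification.INTERNAL_USE_ONLY: {"encrypt_at_rest": False, "access": "workforce"},
    Classification.CONFIDENTIAL: {"encrypt_at_rest": True, "access": "need-to-know"},
    Classification.RESTRICTED: {"encrypt_at_rest": True, "access": "named individuals"},
}


def controls_for(level: Classification) -> dict:
    """Return the control set that applies to a classification level."""
    return CONTROLS[level]


def strictest(levels) -> Classification:
    """Assumption of this sketch: mixed content takes the highest level present."""
    return max(levels)
```

Ordering the levels numerically makes it possible to compare them directly, so a question such as "is this at least confidential?" becomes a simple comparison.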
Once classifications are defined, the locations of the various information types must be pinpointed so that the proper protection and storage requirements can be determined and implemented. Data loss prevention technology, or DLP, can help automate the identification and protection of any type of data by mapping the location of information throughout the organization.
Policies define what type of data is considered sensitive, what actions are allowed and what protections are required. A DLP system can create and manage policies and generate reports that support these policies.
Policies can also specify what actions need to be taken when sensitive data is found. Through policy, a DLP system can support records retention schedules by identifying the specified types of data and their locations so that they can be properly archived or destroyed.
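To make the idea of a policy concrete, the sketch below models one as a small record and checks it against a retention schedule. It is not how any particular DLP product represents policies. The field names, the "billing_record" label and the seven-year figure are all hypothetical and carry no legal weight.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Policy:
    data_type: str              # hypothetical label, e.g. "billing_record"
    sensitive: bool             # is this type of data considered sensitive?
    allowed_actions: frozenset  # what users may do with it
    retention_years: int        # illustrative only, not a legal requirement


def retention_action(policy: Policy, created: date, today: date) -> str:
    """Return 'retain' or 'archive_or_destroy' according to the schedule."""
    target_year = created.year + policy.retention_years
    try:
        expires = created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        expires = created.replace(year=target_year, day=28)
    return "archive_or_destroy" if today >= expires else "retain"


# Hypothetical example policy.
BILLING = Policy("billing_record", True, frozenset({"read"}), 7)
```

Calling `retention_action(BILLING, date(2010, 1, 1), date(2018, 1, 1))` reports that the record is past its schedule, which is the point at which archiving or destruction would be considered.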
Once the policies are defined, the DLP system uses them to find sensitive information. Because detection is not perfect, both false positives and false negatives can occur. A false positive happens when data that is not sensitive is mistakenly identified as sensitive, and a false negative happens when sensitive data is mistakenly identified as non-sensitive. In the case of a false negative, sensitive data ends up unprotected, which may result in its loss or compromise.
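A toy detector shows how both kinds of error arise. The sketch assumes a single hypothetical rule, a pattern shaped like a U.S. Social Security number, and a handful of made-up sample strings. Real DLP products use far richer detection than one regular expression.

```python
import re

# Hypothetical rule: anything shaped like a U.S. Social Security number.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def is_flagged(text: str) -> bool:
    """True when the rule considers the text sensitive."""
    return SSN_LIKE.search(text) is not None


def score(samples):
    """Count outcomes for (text, truly_sensitive) pairs."""
    counts = {"true_pos": 0, "false_pos": 0, "true_neg": 0, "false_neg": 0}
    for text, truly_sensitive in samples:
        flagged = is_flagged(text)
        if flagged and truly_sensitive:
            counts["true_pos"] += 1
        elif flagged:
            counts["false_pos"] += 1   # harmless data treated as sensitive
        elif truly_sensitive:
            counts["false_neg"] += 1   # sensitive data left unprotected
        else:
            counts["true_neg"] += 1
    return counts


# Made-up samples for illustration only.
SAMPLES = [
    ("Patient SSN: 123-45-6789", True),           # caught
    ("Part number 555-12-3456 in stock", False),  # false positive: same shape
    ("Patient SSN 123456789", True),              # false negative: no dashes
    ("Lunch menu for Friday", False),             # correctly ignored
]
```

Running `score(SAMPLES)` yields one result of each kind. The part number is flagged only because it has the same shape as the rule, and the undashed number slips through because it does not.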
Once a DLP system has found sensitive data, it takes whatever actions its policy requires. Blocking and encrypting are two common actions. For example, a DLP system might block the transfer of data onto a USB drive. Another action is encrypting the sensitive data so that only authorized users can decrypt it: if a user saves a file that contains sensitive data, the DLP system might encrypt the file before it is saved.
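The two actions mentioned here, blocking and encrypting, can be pictured as a simple decision made for each event. In the sketch below, the event names, the destination strings and the fall-through to "allow" are all assumptions made for illustration. It only decides which action applies and does not perform any blocking or encryption itself.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ENCRYPT = "encrypt"


def decide(event: str, destination: str, contains_sensitive: bool) -> Action:
    """Pick an action for one event, mirroring the two examples in the text."""
    if not contains_sensitive:
        return Action.ALLOW
    if event == "transfer" and destination == "usb":
        # Block copying sensitive data onto a USB drive.
        return Action.BLOCK
    if event == "save":
        # Encrypt the file before it is written.
        return Action.ENCRYPT
    # No rule matched. Allowing by default is an assumption of this sketch;
    # an organization may well prefer a stricter default.
    return Action.ALLOW
```

For instance, `decide("transfer", "usb", True)` returns `Action.BLOCK`, while the same transfer of non-sensitive data is allowed.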
Information labeling ensures that information gets tagged according to the defined levels: restricted, confidential, internal use only or public. The label should stay with the information for its entire life cycle so that it is protected, stored and managed appropriately at every stage.
Without information classification, healthcare organizations will remain ill-prepared to prevent a confidentiality breach and will find it difficult to respond to legal hold and discovery orders.
Without information classification, healthcare organizations also will continue to pay more than necessary in storage infrastructure and facility costs to house and maintain duplicate and out-of-date information. Ultimately, organizations will realize that it is neither efficient nor sustainable to apply the same level of protection, storage and management requirements to all of their information.
Brian Evans, CISSP, CISM, CISA, CGEIT, is director, program management, at CynergisTek, a healthcare information security consulting firm.