
From PLI’s Course Handbook

Ninth Annual Institute on Privacy and Security Law

#14648


Protecting Data: Security, Minimization and Anonymization

Michael Hintze

Microsoft Corporation


Michael Hintze

Associate General Counsel, Legal & Corporate Affairs

Microsoft Corporation

Michael Hintze leads a team within Microsoft Corporation’s Legal and Corporate Affairs (LCA) group that focuses on a number of regulatory and public policy issues, including privacy, security, telecom, online safety, and free expression matters worldwide. He joined Microsoft in early 1998, originally to work on export controls and the regulation of encryption technologies. Soon thereafter, his role expanded to include privacy and related issues, which has been the focus of his practice for most of his time at Microsoft.

Prior to joining Microsoft, Mr. Hintze was an associate with the Washington, D.C.-based law firm of Steptoe & Johnson LLP, where his practice focused on export controls and commercial matters for technology companies. He joined the firm following a judicial clerkship on the Washington State Supreme Court.

Mr. Hintze is a graduate of the University of Washington and the Columbia University School of Law, where he served as Editor-in-Chief of the Columbia Human Rights Law Review. He has published numerous articles on a wide range of subjects including data privacy, U.S. export regulations, and capital punishment.


Protecting Data:
Security, Minimization and Anonymization

Michael Hintze

Microsoft Corp.

Inherent in the protection of privacy is the prevention of the unauthorized access, use or disclosure of personal data. Hence, data security is a critical part of data protection. However, traditional notions of data security are not the only means of preventing unauthorized access or use of data. Concepts such as collection limitations, data minimization, data destruction, pseudonymization and anonymization can also play an enormous role in the reduction of these risks.

I.  Information Security

A.  The Relationship Between Privacy and Security

It will be helpful to clarify what we mean by the terms “privacy” and “security” for the purposes of this discussion. Security involves the protection of data from unauthorized access, use, or disclosure, whereas privacy involves the constraints placed on authorized access, use, or disclosure of that data.[1]

Privacy and security experts have characterized the relationship between privacy and security in different ways. But it is clear that they are closely intertwined.

To put it most bluntly, you can’t have privacy without security. Your organization may have top-notch privacy policies and practices, but if a hacker gets in and steals your customer records, your customers’ privacy has been seriously compromised. The opposite is not necessarily true, however: you can have excellent security in your organization and still have lousy privacy practices. If your privacy policy states that you will sell your customers’ information to the highest bidder, or post it on the internet at your whim, then even without any breach of security, your customers don’t have much privacy.

Another example of the way in which privacy and security are intertwined is found in the security breach notification requirements that have been enacted in recent years. User notice and transparency are core privacy concepts based on the idea that if users understand what is happening with the data about them, they can make more informed decisions and take steps to protect themselves. Bills that have been characterized as data security bills have focused on these same concepts, requiring companies to notify users in the event of a security breach involving certain kinds of personal information. The idea is that the affected individuals can use this information to take steps to protect themselves (e.g., closing a compromised financial account, monitoring account statements for suspicious activity, ordering credit reports, etc.). The notice and transparency ideas underlying these data security requirements are thus very familiar to privacy professionals.

B.  The Sources of Data Security Obligations

There are many sources for data security obligations. For example, contractual obligations may drive a company’s approach to data security. This may especially be the case if the company is acting as a service provider that accesses or processes data on behalf of another company.

For merchants, banks, payment processors, and others that accept or process payment card information, the Payment Card Industry (PCI) standards create a number of specific obligations. The standards contain 12 categories of technical requirements, each supported by detailed controls.

The statutory and regulatory requirements range from the very general to the highly prescriptive. For example, companies that are subject to EU rules will find that they must address the data security requirements contained in national data protection statutes. These typically mirror the very general requirements of the European Data Protection Directive, which requires “appropriate technical and organizational measures to protect personal data against accidental or unlawful destruction or accidental loss, alteration, unauthorized disclosure or access.” The Directive also provides that the level of security must be proportionate to the risks represented by the nature of the data.

For companies in the U.S. financial industry, the Safeguards Rule under the Gramm-Leach-Bliley (GLB) Act provides a bit more guidance. These regulations require financial institutions to develop an information security program designed to protect the security, confidentiality, and integrity of personal information collected from or about customers. Among other things, the program must include: (1) a designated security organization, (2) appropriate administrative, technical, and physical safeguards, (3) analysis and remediation of risks in each area of operations, and (4) a requirement that service providers meet the same standards.

An even more prescriptive set of requirements applies to certain entities in the U.S. health care sector. The Health Insurance Portability and Accountability Act (HIPAA) Security Regulations include a comprehensive and prescriptive set of security requirements for covered health care entities. These include standards for physical and personnel security, technical security standards for electronically stored information, and technical communication security standards. Specific items such as access controls, audit controls, data authentication, and entity authentication are described within these regulations.

While not established by statute or regulation, the standards developed by the Federal Trade Commission (FTC) are increasingly important. The FTC’s authority over data security is based on Section 5 of the FTC Act, which prohibits “unfair or deceptive” business practices. The Commission’s earlier data security cases were all based on the notion of “deception”: making promises or representations regarding data security to which the company didn’t or couldn’t adhere. However, more recent cases, beginning with the BJ’s Wholesale Club case, have been based on “unfairness,” requiring reasonable security for consumer information even in the absence of any security representations or promises.

In its growing body of security cases, the FTC has required companies to adopt comprehensive information security programs, with requirements that in effect mirror those found in the GLB Safeguards Rule. Additionally, the FTC has provided guidance on its website for creating a sound data security plan (http://www.ftc.gov/infosecurity/); following that guidance should be viewed as a sound way of avoiding an FTC consent order.

C.  Elements of an Information Security Program

Drawing on the FTC guidance and the various data security regulations described above, it is advisable to develop an information security program that addresses all of these requirements.

First and foremost, all elements of an information security program must be documented. But documenting a program is not enough; it must be rolled out effectively within the organization. This means having a broad communication plan to create awareness of the program, as well as formal (and mandatory) training for all employees who can access personal information or who build or operate systems that collect, store, or process personal information. Additionally, and most importantly, the program must be built into the company’s business processes. By building security checkpoints and requirements into the processes the business already uses, there is a much greater chance that the program will be followed.

An information security program should designate dedicated personnel who are accountable for carrying out the program. It should require that the business conduct risk assessments to determine the areas of greatest risk. And it should establish written standards that include safeguards designed to control identified risks. Such standards would typically include:

·  Firewalls

·  Intrusion detection

·  Virus protection

·  Authentication and access controls

·  Encryption of data in transit and in storage (illustrated in the sketch following this list)

·  Auditability
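
To make the last encryption item concrete, the following is a minimal sketch of encrypting a record at rest, using the Fernet recipe from Python’s cryptography library. The record contents and the in-memory key handling are illustrative assumptions; in a real deployment the key would be managed in a key vault or hardware security module rather than generated alongside the data.

```python
# A minimal sketch of encrypting data at rest, using the Python
# "cryptography" library's Fernet recipe (AES-based authenticated
# encryption). Key management is deliberately simplified here: in
# practice the key lives in a key vault or HSM, not next to the data.
from cryptography.fernet import Fernet

# Generate (once) and securely store a symmetric key.
key = Fernet.generate_key()
cipher = Fernet(key)

# Hypothetical customer record serialized for storage.
record = b"name=Jane Doe;email=jane@example.com"

token = cipher.encrypt(record)    # ciphertext that is safe to persist
restored = cipher.decrypt(token)  # recovery requires access to the key
assert restored == record
```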

It’s also critical to not overlook the “low tech” elements of physical security. All the firewalls, passwords, and encryption won’t help if the back door to the building is unlocked and an intruder can walk in and sit down in front of a computer that is already logged into the network.

Vendor management is also a critical component of a comprehensive security program. If vendors are used to collect, store, or process personal information, the contracts with those vendors should address the security requirements those vendors must meet. But putting language in a contract is not always enough. Depending upon the nature and sensitivity of the data involved, and perhaps on the level of confidence you have in the particular vendor, it may be useful to have a more complete program to oversee the vendor’s activities, including periodic audits or assessments.

An information security program should also include periodic compliance assessments and audits on the company’s own systems and processes. And finally, it should include incident response processes and plans that are developed and established in advance—before a problem occurs.

II.  Other Methods of Protecting Data

Traditional notions of data security, and the development of an information security program, provide important protections for the data that a company holds. But another way to reduce the security risk and protect data is to reduce the amount and/or sensitivity of the data that is held. There are several different ways to achieve this.

A.  Limitations on Collection

A common and memorable phrase that describes the principle of collection limitation is: “if you can’t protect it, don’t collect it.” To take that idea one step further, even if you think you can protect it, collecting data that you don’t need creates unnecessary risk and expense. It seems obvious that if there is not an articulated need for the data, it shouldn’t be collected. But all too often, businesses err on the side of collecting more data because it might turn out to be useful. More often than not, that additional data merely creates a security risk, while providing little or no business value. That is not a smart tradeoff.

B.  Data Retention and Destruction

Don’t keep data longer than you need it. Some data may be very useful to collect, but its utility diminishes over time. Given that the cost of data storage is continually decreasing, the business may not think to delete data that is no longer useful unless prodded to do so. But like data that is collected without an articulated need, data that is retained without an articulated need results in an ongoing security risk without any significant business justification. Again, not a smart tradeoff.

When data is deleted or destroyed, it must be done securely. Just as paper documents containing personal data should not be placed, unshredded, into a dumpster, electronic records should not be disposed of carelessly. Backup tapes, hard drives, and other electronic storage media should not simply be thrown out, recycled, or sold as surplus without being thoroughly scrubbed. And that means more than simply deleting the data, because deleted data can often be recovered. Thus, it pays to make sure that appropriate techniques are used to destroy the data so that it cannot be recovered.
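
As one simplified illustration, the sketch below overwrites a file’s contents with random bytes before removing it. The function name and the three-pass default are illustrative assumptions, and this approach has real limits: on SSDs and journaling or copy-on-write file systems, in-place overwrites do not reliably destroy all copies, so dedicated wiping tools or physical media destruction may be required.

```python
# A simplified sketch of "scrubbing" a file before deletion by
# overwriting its contents with random bytes. Caveat: on SSDs and
# journaling/copy-on-write file systems, in-place overwrites do not
# reliably destroy every copy of the data; dedicated wiping tools or
# physical destruction of the media are safer options.
import os
import secrets

def scrub_and_delete(path: str, passes: int = 3) -> None:
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(secrets.token_bytes(size))
            f.flush()
            os.fsync(f.fileno())  # push the overwrite to disk
    os.remove(path)
```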

C.  Post-Collection Data Minimization

Once data is collected, and before it is destroyed, it is worth examining the nature of the data and comparing it to the business need. In many cases, it may be possible to reduce the precision or sensitivity of the information you maintain. For example, is aggregate data sufficient rather than individual records? It might be necessary to collect a lot of information at the individual user level, but the business need is to understand how a service is used at an aggregate level. If that is the case, then once the aggregate reports are generated, there may be no need to retain the individual-level data. So a process can be developed to quickly aggregate the data and to delete the underlying user logs. Doing so greatly reduces the privacy implications of a security problem.
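
As a sketch of what such a process might look like, the example below generates the aggregate report the business needs and then deletes the underlying user-level records. The log fields and the per-feature counts are hypothetical; the point is the pattern of aggregating quickly and discarding the raw logs.

```python
# Hypothetical user-level log records collected by a service.
from collections import Counter

user_logs = [
    {"user_id": "u1", "feature": "search"},
    {"user_id": "u2", "feature": "search"},
    {"user_id": "u1", "feature": "upload"},
]

# Step 1: generate the aggregate report the business actually needs,
# i.e., usage counts per feature with no per-user detail.
report = Counter(entry["feature"] for entry in user_logs)
# Counter({'search': 2, 'upload': 1})

# Step 2: once the report is produced, delete the underlying
# individual-level data so it no longer poses a security risk.
user_logs.clear()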

Another example would be a company that collects date of birth in order to determine whether or not the user is an adult. For those users who are adults, if there is no reason to know the actual age, it may be advisable to store only an attribute indicating that the user is an adult and to delete the actual date of birth. The retained, lower-precision data is less sensitive than the data originally collected, so the security risk is reduced.
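
A sketch of this technique appears below. The 18-year threshold, the field names, and the record layout are illustrative assumptions.

```python
# A sketch of replacing a stored date of birth with a lower-precision
# "is an adult" attribute. The 18-year threshold and the record layout
# are illustrative assumptions.
from datetime import date

ADULT_AGE = 18

def minimize_record(record: dict) -> dict:
    dob = record.pop("date_of_birth")  # drop the sensitive field
    today = date.today()
    age = today.year - dob.year - (
        (today.month, today.day) < (dob.month, dob.day)
    )
    record["is_adult"] = age >= ADULT_AGE  # keep only what is needed
    return record

user = {"id": "u1", "date_of_birth": date(1990, 5, 17)}
print(minimize_record(user))  # {'id': 'u1', 'is_adult': True}
```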

D.  Anonymization and Pseudonymization

Finally, security and privacy risks can be reduced through the use of “anonymization” or “pseudonymization” techniques.

There are different degrees of anonymization or pseudonymization, and they provide different levels of protection. For example, some techniques have been characterized as partial versus complete anonymization, or as reversible versus irreversible anonymization. Typically, these differences come down to whether the technique makes it impossible to connect the data to an individual, or whether it merely obscures the connection.[2]
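
The distinction can be illustrated in code. In the sketch below, encrypting an identifier under a retained key is reversible (anyone holding the key can recover the identifier), while hashing it with a salt that is then discarded is, for practical purposes, irreversible. The specific identifier and techniques are illustrative assumptions, not a complete anonymization scheme.

```python
# Illustrative contrast between reversible and irreversible techniques.
import hashlib
import secrets
from cryptography.fernet import Fernet

user_id = b"customer-12345"

# Reversible: whoever holds the key can recover the original ID.
key = Fernet.generate_key()
pseudonym = Fernet(key).encrypt(user_id)
assert Fernet(key).decrypt(pseudonym) == user_id

# Irreversible: hash the ID with a random salt, then discard the salt.
# Without the salt, the token cannot practically be linked back to the
# original identifier (though this also destroys linkability across
# records, which is the tradeoff irreversibility implies).
salt = secrets.token_bytes(16)
token = hashlib.sha256(salt + user_id).hexdigest()
del salt  # once discarded, the mapping cannot be reconstructed
```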

A couple of examples may help illustrate the differences.

Imagine a simple scenario where a company maintains two separate customer databases. One, Database A, contains customer records that have all the information that would traditionally be considered PII, i.e., data that, by itself, could be used to identify or contact the individual (name, e-mail address, phone number, etc.). The second database, Database B, contains records on these same customers, but only demographic attributes (e.g., age, gender, zip code, and favorite color). Records in the two databases share a common unique ID number that allows them to be correlated. Database B could not be said to be truly anonymized, since the company could always use the unique ID to connect its records to the identifiable information in Database A. But it might be considered “pseudonymous” in that someone who has access only to Database B could not identify the individuals involved. One can see real advantages to such an architecture, since it would be reasonable to apply a lower level of security and access controls to Database B. An intruder who gains access to Database B would not be able to compromise the individuals’ privacy without also gaining access to Database A.
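
A toy version of this architecture might look like the following sketch. The field names, the demographic attributes, and the use of a random token as the shared ID are illustrative assumptions.

```python
# Toy version of the two-database architecture described above.
import secrets

database_a = {}  # identifiable data (PII): tightly controlled access
database_b = {}  # demographic attributes only: pseudonymous

def add_customer(name, email, age, gender, zip_code, favorite_color):
    uid = secrets.token_hex(16)  # shared pseudonymous ID
    database_a[uid] = {"name": name, "email": email}
    database_b[uid] = {
        "age": age, "gender": gender,
        "zip": zip_code, "favorite_color": favorite_color,
    }
    return uid

uid = add_customer("Jane Doe", "jane@example.com", 34, "F", "98052", "blue")

# Someone with access only to Database B sees demographics and an
# opaque ID; identifying the individual also requires Database A.
print(database_b[uid])
```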