Data security lessons from the Equifax incident

The year 2017 was marked by what could easily be called the fall of an empire. Equifax, one of the largest players in the credit data business, suffered a deep breach that affected millions of people around the world. The personal data of at least 143 million people was stolen: Social Security numbers, dates of birth and addresses, all of it “personally identifiable information”, or PII. A group of hackers using highly sophisticated tools had a whole summer to extract databases, design tools, analyze data, and avoid detection by the Equifax security team. In less than three months, a group of people whose motives and actions are still being investigated was able to settle deeply into several computer systems of a company that everyone assumed was safe.

The ramifications of such a massive theft of personal data extend far beyond the company that was breached. The millions of people whose data is compromised and exposed end up paying the consequences of security problems specific to databases, frameworks, and applications. The data becomes a gold mine for scammers, who use that information to commit more crimes.

Then there is the legal problem of how our personal data may be used. In the US, a bill to regulate it was introduced in the Senate in 2019, but it has not yet been debated.

Three years after the breach, uncertainty about the fate of the personal data of millions of people still lingers. Several tools can tell us whether our data was stolen, but none can tell us whether it will be used to commit crimes.

That is why cases like Equifax, a breach that could happen again at any other entity that stores our data, help us understand the big picture. By analyzing the breach step by step, we can protect ourselves and help prevent another one, from both sides: as consumers and as those who store the data.

The origins of data

What we define as personally identifiable information (PII) has expanded over time. What was originally just your Social Security number or email address now also includes your IP address, the biometrics of your face, and your geolocation. Regulations such as the GDPR in Europe have broadened that spectrum with the intention of protecting our data as widely as possible. However, even with regulations in place, it’s probable that our data will be stolen en masse in one way or another.

Part of the information we provide to organizations is mandatory, but we give it willingly. We must give our Social Security number to our bank to apply for credit, for example. Several entities, public and private, need our personal data to function properly. Nonetheless, data has levels of importance: your email address receiving spam is not the same as someone posing as you to commit fraud using your SSN.

At the same time, there is data that we provide unintentionally. Social networks, Facebook in particular but also many of the other platforms we use, hide powerful biometric analysis engines. Many health entities store our medical data, which is also considered important, without our express authorization. In Equifax's case, the credit company and its competitors (primarily Experian and TransUnion) have the law on their side. Current US law gives them the power to store your date of birth, Social Security number, driver's license number, employment and purchase history, credit card details, and plenty of other sensitive data.

With that information in their hands, Equifax and its competitors can get a general idea of our credit landscape, that is, how much we can pay. Those companies then sell that information to financial institutions so they can determine our creditworthiness. This information reaches Equifax because we live in a society based on capital, where ordinary citizens use financial instruments such as credit cards and loans, and the institutions behind those products require a centralized credit check. Equifax did not knock on our door to ask for our data: it collected it and stored it to sell to the highest bidder. This is important for two reasons:

The data that we unintentionally provide is not our responsibility.
Data that we provide voluntarily can be safeguarded or withheld, where possible.

On the one hand, any person, entity or service that is responsible for our data must take charge of all eventualities within the framework stipulated by the law of each country and the terms of each platform. An example is the use of our biometric data for criminal profiling, which requires regulation and definition. This is especially important where neural networks are trained or suspects are detected using data that at first glance is public, such as photos on our social networks, but that on closer inspection turns out to be private biometric data.

On the other hand, as citizens, it’s also our responsibility to keep strict control over the data we disclose. We have to proactively prevent the data we voluntarily disclose from being used to commit crimes. If someone asks us for our PII, we have the right to safeguard it, to know what it will be used for, and to withhold it if we believe that our privacy may be breached.

The Equifax data breach, in short

Equifax Inc. headquarters — AP Photo/Mike Stewart

Having defined what was lost in the Equifax data breach, it’s time to visualize what happened in broad terms.

Most studies of the subject attribute the initial breach to a vulnerability in Apache Struts, a popular piece of enterprise backend software. According to an article published by Bloomberg, it was Nike Zheng, a Chinese cybersecurity researcher, who discovered the vulnerability. He reported it to Apache, which posted a fix on its site on March 6. However, in less than 24 hours that information was already on popular hacking sites, and the community began taking advantage of the vulnerability. Four days later, an attacker found a vulnerable machine in Atlanta, and the rest is history.

The chain reaction allowed the hackers to take advantage of multiple security issues at Equifax. CSO puts it in context: "Like plane crashes, major infosec disasters are typically the result of multiple failures". The errors include, but are not limited to:

The attackers used the vulnerability described above, which had already been disclosed but had not been patched on a multitude of company computers.
The attackers entered a computer that served as a server for a web portal, and were able to access the rest of the infrastructure due to poor segmentation between them.
Usernames and passwords for users and accounts were stored in plain text.
One of the key encryption certificates in one of the security tools being used had not been renewed, allowing attackers to exfiltrate encrypted information for months.
The breach was not disclosed by executives, who knew the consequences and even sold shares of the company.
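
The plain-text password failure in the list above is one of the easier ones to illustrate. What follows is a minimal sketch, not a description of anything Equifax ran: it assumes Python's standard library, PBKDF2 with SHA-256 as the hashing scheme, and hypothetical function names and parameters chosen only for this example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative parameters (assumptions for this sketch); tune for your hardware.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) to store in place of the plain password."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # fresh random salt for each account
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


if __name__ == "__main__":
    salt, key = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, key))
    print(verify_password("a wrong guess", salt, key))
```

Only the salt and the derived key would be stored. An attacker who copies that table still has to guess each password separately, instead of simply reading them.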

What happened to the data?

This is where the story becomes something of a mystery. After almost every major breach, it’s possible to find the stolen data on the dark web, being sold to the highest bidder. However, the Equifax breach data, contrary to expectations, was never released to the public.

It instead ended up on the hard drives of actors much more interested in espionage than theft. An investigation by the US government, whose results were divulged only recently, revealed that the hackers were part of a Chinese military cell, which has used the stolen data to profile important people in companies and government. According to various theories, such profiling would allow them to identify officers in financial trouble and blackmail or bribe them, with the intent of facilitating espionage.

Another precedent supports this theory: a New York Times investigation links the cyberattack on the Marriott hotel chain to the same effort by the Chinese government to create a financial database of politicians, business leaders and other key people for profiling, extortion and blackmail.

But what really happened to the Equifax data? Did it appear publicly? Was it traded? According to an investigation published by CNBC, the Equifax data is "The city of Atlantis or the Holy Grail": impossible to find. One of the researchers, who spent nearly a year and a half searching the dark web for clues about the whereabouts of the data, stated that the content of the breach was never traded. Hackers who steal data to sell it usually bet on speed: the faster they sell illegally obtained databases, the less likely they are to be caught, and the less likely the breached institutions are to render the data useless first. That is why the breached data is believed to be in the hands of the hackers who stole it, or of state actors such as those in the Chinese government already mentioned.

Of course, there are examples to the contrary. The site Have I Been Pwned lists dozens of data breaches, including the Yahoo breach that occurred in 2013, whose data was sold for about $300,000 on the dark web to at least three malicious actors, two of them scammers. Another example is Tumblr, which also suffered a breach in which 65 million emails and passwords were stolen and subsequently sold on the dark web.

What did we learn from the Equifax data breach?

Although the ramifications were not completely disastrous for the affected users, breaches such as this one draw our attention to several points related to data security and privacy.

Personally identifiable information is valuable, and must be protected at all costs

Even if we are constantly redefining what PII is, personal data must be treated with the care it deserves. This cuts many ways, from users themselves choosing what information to disclose, to governments pushing adequate legislation to protect our data from hackers and malicious actors.

Important data must be fragmented and encrypted

The classification and fragmentation of personal data must be a key issue in every organization that handles it. It’s a step that many CISOs consider tedious and bureaucratic, but it allows deep control. For example, personal data could be organized into properly inventoried layers: customer data, confidential company data, and public data. In this way, a hacker with access to a web server can steal only a piece of the cake instead of the entire thing.
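
The layered inventory described above can be sketched in a few lines of Python. Everything in this example is hypothetical: the field names, the service names, and the clearance table are assumptions chosen to mirror the three layers mentioned (public, confidential company data, customer data), not a real schema.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1       # confidential company data
    CUSTOMER_PII = 2   # customer data such as Social Security numbers


# Hypothetical inventory: every stored field is assigned to a layer.
FIELD_INVENTORY = {
    "branch_address": Sensitivity.PUBLIC,
    "quarterly_forecast": Sensitivity.INTERNAL,
    "customer_name": Sensitivity.CUSTOMER_PII,
    "customer_ssn": Sensitivity.CUSTOMER_PII,
}

# Hypothetical clearances: the highest layer each service may read.
SERVICE_CLEARANCE = {
    "public_web_portal": Sensitivity.PUBLIC,
    "reporting_service": Sensitivity.INTERNAL,
    "credit_file_service": Sensitivity.CUSTOMER_PII,
}


def readable_fields(service: str, record: dict) -> dict:
    """Return only the fields of `record` that `service` is cleared to read.

    Fails closed: unknown services get PUBLIC clearance, and fields missing
    from the inventory are treated as the most sensitive layer.
    """
    clearance = SERVICE_CLEARANCE.get(service, Sensitivity.PUBLIC)
    return {
        name: value
        for name, value in record.items()
        if FIELD_INVENTORY.get(name, Sensitivity.CUSTOMER_PII) <= clearance
    }


if __name__ == "__main__":
    record = {
        "branch_address": "1 Example Street",
        "quarterly_forecast": "confidential",
        "customer_ssn": "000-00-0000",
    }
    print(readable_fields("public_web_portal", record))
```

With a table like this, a compromised web portal account can read only the public layer. In a real system the same boundaries would also have to be enforced at the network and database level, not just in application code.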

The same goes for encryption. A data hierarchy and inventory allow us to assign different levels of data protection, and to assign responsibility for public and private keys to the users and parts of the organization that need them.
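
One way to picture separate keys per layer is to derive an independent key for each one from a single master secret. The sketch below swaps in symmetric keys derived with HKDF (RFC 5869) rather than the public and private key pairs mentioned above, purely to stay within Python's standard library. The layer names and the `layer_key` helper are hypothetical, and the master key is assumed to live in a key management system rather than in code.

```python
import hashlib
import hmac


def hkdf_sha256(master_key: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract step: RFC 5869 uses a zero-filled salt when none is supplied.
    prk = hmac.new(salt or b"\x00" * hash_len, master_key, hashlib.sha256).digest()
    # Expand step: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def layer_key(master_key: bytes, layer: str) -> bytes:
    """Derive a 32-byte key bound to one data layer (hypothetical helper)."""
    return hkdf_sha256(master_key, info=b"data-layer:" + layer.encode("utf-8"))


if __name__ == "__main__":
    master = bytes(range(32))  # placeholder only; never hard-code a real key
    customer_key = layer_key(master, "customer")
    company_key = layer_key(master, "company")
    print(customer_key != company_key)
```

Because each layer gets its own key, leaking the key that protects confidential company data does not expose customer data, and each team can be given only the key for the layer it is responsible for.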

If a breach occurs, report it to your users as soon as you can

This point remains a legally gray area due to the unwillingness to legislate firmly regarding data privacy. However, it should be a moral imperative to communicate to users quickly in the event of a major breach, especially if it involves personally identifiable information.

In most cases, as the Equifax breach showed, all those days you think you’re “winning” by not communicating turn into suspicious practices such as influence peddling or information concealment, which are punishable by law.