Making Risk Assessments Useful

Michael Wojcik


October 1, 2010

In his award-winning 1921 dissertation, economist Frank Knight made an important distinction between risk and uncertainty. According to Knight, "risk" refers to a situation in which the probability of an outcome can be determined. "Uncertainty," by contrast, refers to an event with a probability that cannot be known. Thus, Knight showed that while economists wanted to present their field of study as an exact science, it is not.

Unfortunately for information security professionals, IT security falls largely within Knight's uncertainty category. IT risk assessment is an inexact science since risks are rarely quantifiable.

Take, for example, risks introduced by software vulnerabilities. I can say with near certainty that the desktop system I am currently using has vulnerabilities. I can say this even if I have patched all known vulnerabilities. How do I know this? Because my software vendor sends out periodic security alerts and releases patches to fix these vulnerabilities. Vulnerabilities often are not found until years after the software goes into service, but they were always there. They were just unknown until someone maliciously exploited them or the vendor became aware of their existence.

At its core, risk assessment has a straightforward methodology: multiply the magnitude of a loss by the probability that loss will occur. Obviously, quantifying risk probabilities is integral to risk assessment. But if these risk probabilities are unknowable, are we wasting our time conducting risk assessments? To quote Dirty Harry, "a man's gotta know his limitations." Ultimately, risk assessments still can be useful even when uncertainty abounds. We just need to have our priorities in order and not let the minutiae overshadow the larger approach.
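The multiplication described above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_loss` and every dollar figure and probability below are hypothetical, chosen to show how far the result swings when the probability is a guess.

```python
def expected_loss(magnitude: float, probability: float) -> float:
    """Classic risk figure: size of the loss times the chance it occurs."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return magnitude * probability


# Hypothetical loss from a single breach, in dollars.
loss_if_breached = 2_000_000

# Three equally defensible guesses at an unknowable probability.
for guess in (0.01, 0.05, 0.20):
    print(f"p = {guess:.2f} -> expected loss ${expected_loss(loss_if_breached, guess):,.0f}")
```

With the magnitude held fixed, the answer differs by a factor of 20 between the lowest and highest guess, which is the practical meaning of Knight's uncertainty for this formula.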

Humans have a zero-risk bias. This means we will opt to eliminate a small risk entirely rather than achieve a larger reduction in a more significant risk. This is because we tend to choose more certain benefits (even if they are small) over larger, less certain benefits.
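A small worked example makes the bias concrete. The sketch below assumes, purely for illustration, that the probabilities are known; all names and numbers are hypothetical.

```python
def expected_loss(magnitude: float, probability: float) -> float:
    """Size of the loss times the chance it occurs."""
    return magnitude * probability


def risk_reduction(magnitude: float, p_before: float, p_after: float) -> float:
    """Drop in expected loss when a control lowers the probability."""
    return expected_loss(magnitude, p_before) - expected_loss(magnitude, p_after)


# Option A: completely eliminate a minor risk (2% chance of a $50,000 loss).
eliminate_small = risk_reduction(50_000, 0.02, 0.0)

# Option B: trim a major risk (30% chance of a $1,000,000 loss) down to 20%.
trim_large = risk_reduction(1_000_000, 0.30, 0.20)

print(f"Eliminating the small risk saves about ${eliminate_small:,.0f}")
print(f"Trimming the large risk saves about ${trim_large:,.0f}")
```

Under these assumed figures, option B removes roughly a hundred times more expected loss, yet the zero-risk bias pulls us toward option A because it brings one risk to exactly zero.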

I was once involved with a large federal system that used dozens of servers. The certification team ran a routine vulnerability scan and noticed two things. First, a number of the servers did not have critical patches installed. Second, the servers that were missing patches were not all missing the same ones. When the certification team reported the issue, they identified the risks as "server XYZ is missing critical patch ABC."

What was the problem with this approach? They did not understand the true issue, which was that the overall security program was not set up to systematically detect vulnerabilities. In industry jargon, it was not operating under proper "configuration management," which means that the system is monitored for changes and adapted as necessary.

This is not to say we should not fix any unpatched systems as soon as possible. Of course we should. But we also cannot shy away from identifying and mitigating the larger risk -- the flaws in security program monitoring -- even if it is far more difficult to fix and its long-term benefits are less tangible.

Recognizing this is key to adopting a mentality in which we admit that, if we are to continue to do risk assessments, we should at least try to make them useful. This means understanding human cognitive biases and attempting to ask questions that dig deeper into the risks.

It also means taking into account "black swan" events that could cause significant harm -- even if their likelihood is considered low. Case in point: Barings Bank.

Barings Bank was the first merchant investment bank in England. Established in 1762, the bank quickly expanded to conduct business throughout the world. It was the bank of royal families, financing the Napoleonic wars and the Louisiana Purchase. In 1818, French statesman Duc de Richelieu said, "There are six great powers in Europe: England, France, Russia, Austria, Prussia and the Baring brothers."

Close to two centuries later, the bank collapsed almost overnight. During an audit in February 1995, bank examiners realized that a young trader in the Singapore office, Nick Leeson, had fraudulently covered up losses of £827 million, the equivalent of about $1.3 billion and twice the bank's available trading capital. The once-great bank could not recover. It was soon sold to Dutch bank ING for just £1.

The Barings Bank debacle was a typical black swan event. In his 2007 book The Black Swan, Nassim Nicholas Taleb defines black swan events as having three characteristics: (1) they are unpredictable or surprising for most observers, (2) they have a major impact, and (3) after the fact, they are readily explainable and, in hindsight, seem almost obvious.

Black swans are troubling for all risk professionals. But they are particularly problematic for IT security professionals because an IT security program is expensive enough when it is merely tasked with safeguarding against predictable threats. Organizations generally just do not have the additional resources to defend against outliers even if they could predict them.

But once the Barings insider threat became well-publicized, it ceased to be a black swan. Organizations scrambled to decrease their vulnerability to insider threats. Make no mistake, there were -- and still are -- thousands of organizations vulnerable to this type of insider action. In the vast majority of these organizations, however, the insiders are honest, do not know how to exploit the vulnerabilities, or would not cause havoc on such a large scale. While it is a known fact that employees can be malicious -- even sociopathic -- the likelihood of their doing something on that scale is unknowable.

This reactive re-prioritization of a risk that was formerly unquantifiable represents an over-reliance on a risk-based approach rather than a systematic, controls-based approach. Security professionals generally do not step back to see the real issues. Too often they get bogged down in the weeds of risk assessment findings.

To further illustrate this, it is useful to look at how companies report risks in annual reports. Here is an example from GM's 10-K statement: "We have determined that our internal controls over financial reporting are currently ineffective. The lack of effective internal controls could adversely affect our financial condition and ability to carry out our strategic business plan."

Using this as a model for the earlier example in which server XYZ was missing critical patch ABC, our risk statement might be something like: "We have determined that our configuration management controls for responding to and patching system vulnerabilities are ineffective. The lack of effective configuration management controls could adversely affect our ability to maintain confidentiality, integrity and availability of the data of our financial systems."

A few things are worth noticing about this method. It uses the term "we" because a risk is something that is described in terms of an organization's business. It is not something that affects a server or even a data center. The server does not care about the risk. Management does (or at least should). And this approach puts the risk in language management can understand.

Management may justifiably be hesitant to think about risk assessments in a more creative, systematic way. But in the end, the most effective way to protect the company is from the top down.

Michael Wojcik, CISSP, is a manager in the risk and compliance practice of the international technology consulting firm Acumen Solutions, Inc., where he has been working on cloud security issues.