In recent years, there has been a growing public discussion about the need to learn how to embrace failure. Silicon Valley has even coined a neat term for it, and the notion of “failing fast” has found its way into many a technology company’s vision statement. As long as you learn from it, the thinking goes, it does not really count as failure.
Which is all well and good in technology development, but if said failure results in, oh, the loss of the entire business, then one might reasonably argue that doing it quickly is perhaps a tad less valuable, rather than more.
In other words, the notion is limited to certain aspects of the business. Typically it reflects the idea that when trying new things, it is better to try a lot of them and rapidly discard the ones that do not work as one would have hoped. Like throwing spaghetti at the wall, “failing fast” in such environments can dramatically increase innovation (or, if managed poorly, chaos, which in itself is an interesting failure mode worthy of an entire book – one probably written by somebody else). This sort of continued iteration can help uncover good ideas that could have significant market value and, just as importantly, avoid large investments in blind alleys that end up leading nowhere.
One big underlying assumption here is, of course, that this sort of thing is usually very much confined to a lab setting. The exposure to this “engineered failure” process is pretty much limited to research and development budgets, which allows it to be fairly contained, and those costs are quickly repaid once something sticks (to that proverbial wall).
It’s smart business, really.
But what if we try to take this idea and apply it to the realm of technology operations?
It is not as crazy as it sounds, actually. One of the best emerging concepts in security is that of “cyber resilience”, a term used to describe an environment built to withstand and contain as many failure modes as possible, and to do so in such a fashion that it can be brought back up quickly. In their book The Fifth Domain, Richard Clarke and Robert Knake discuss this idea at length, doing so in very grand terms that, in addition to applying to large enterprises, also pertain to the largest scale of operations we humans have ever created – national governments.
The idea itself is simple enough. One of the truisms of security is that you simply cannot build a system that will successfully prevent all attacks. It is not actually possible, and anyone who claims otherwise is trying to sell you something (I’m looking at you, security product and services vendors). There are many reasons why this is true, but let me state the most obvious, the one about asymmetric warfare: in the world of security, the defender has to be right every time, whereas the attacker only has to be right once. And when the battle itself takes place over copper and fiber wiring, with no actual troops to deploy or supply lines to maintain,* and at little to no cost to the attacking army, time itself dictates that a determined enough party will ultimately succeed.
* Caffeine and candy notwithstanding.

So instead of only trying to prevent bad things from happening, the thinking behind cyber resilience is that we do what we reasonably can to stop malicious parties, to delay the impact of their actions, to alert us as quickly as possible that damage is accruing, and to have mechanisms in place to recover quickly and minimize the overall disruption. This is effective for many reasons, but let me again state the obvious one: in a well-prepared organization, the cost of recovery is dramatically reduced for the defender, and if executed properly, the equation actually flips around. This is true because of the value of time; rapid recovery resets the clock for the attacker, who now has to spend all that time and effort again to find a new way through, if they are so inclined, while the defender is back to normal operations and has already plugged the holes that allowed the initial attempt to succeed. If the environment is resilient enough, then this delay-and-repeat cycle can fairly reliably be measured in weeks or months.
The attacker, unless particularly determined (and it is then a completely different matter than what one usually encounters in the private sector), will hopefully end up frustrated and choose to go after an easier target.
Alright, this was already a longer intro to this chapter than I planned for.
Story time.
***
Back in the Oughts, I knew someone who was a very strong technical security person. He lived in a different country, one that once was but is no longer part of the EU,* and he got a job offer from a large financial institution, the likes of which I have never seen before, nor since.
* How’s that for hiding incriminating detail? I mean, there is no way you would ever be able to guess to which country I am referring. I am so smart!

The role itself was director of IT security (the title “Chief Information Security Officer (CISO)” had not yet become popular, and the field was still overwhelmingly viewed as a technology one, focused on network and system administration). The job paid well enough, at the equivalent of about $200,000 per year, but that was not what made it interesting.
What did was the fascinating “success or failure clause” baked into it. It was really simple, and worked like this: over each period of 12 calendar months during his employment, there was one single measurement of success – or failure. If a certain set of systems were not breached, fully or partially (including even minor sensitive data leaks), during that time period, then he would receive a bonus. If they were, then he would not get the bonus (and, let’s be fair, he would also most likely lose his job).
The bonus itself was what made it remarkable, in that it amounted to about $1,000,000.
I have lost touch with the fellow since then, unfortunately, but I do know that he did not sleep much in the first year of this contract.
Amazing stuff, really, and other than the shock value (for me, anyway), utterly meaningless from a security perspective. Because while it seems like an excellent way to motivate someone to do the best they can (and potentially give them the biggest ulcers ever recorded), it ignores the fact that breaches are often a matter of luck rather than effort, and the reality is captured by the well-known adage: if someone wants to get you badly enough, they will. More importantly, the question remains: say the worst did happen. Other than saving themselves the cost of the bonus for that year, how would this change the final outcome for the financial institution, beyond having someone to readily fire?*
* Let’s be honest; they would fire the security leader anyway. It is one of the unspoken and sad truths about being a CISO in an organization that views it as a “defensive role” – if the walls crumble, it is their head that rolls. For one thing, it allows everyone else to save face.

In this vein, there is another story that I would like to share with respect to one of my own customers. We shall call him Dudley.†
† Of course the name is made up. I told you I was going to do that. Plus I like Dudley. It is such a cool name. Dudley. Dudley. Woot!

It was a Thursday evening, one week before Thanksgiving, and I was in Dallas attending my favorite conference, as I do every year.‡ A new event was about to start when my phone lit up with an incoming call from Dudley, and because he was an important client, I stepped aside to a relatively quiet corner and picked it up.
‡ See if you can guess which conference it is by doing some Internet sleuthing. I will give you a hint: it has nothing at all to do with security, and everything to do with my favorite hobby. If you figure it out, and are so inclined, write to me on LinkedIn and tell me what my “handle” is for this conference, which matches the one I use on the sponsoring organization’s website.

“Barak, listen. I would never bother you like this except I really need you to help me figure something out”. He knew this was my “break time” and how essential it was to my mental stability, and so I knew this had to be important.
“I got this weird incoming message on LinkedIn from some company in Europe I never heard of, who said they may have seen chatter about our company being targeted by hackers, and I don’t know what to make of it”.
“Is it from someone you know or a complete stranger?” I asked.
“Stranger. But they said they knew our CEO”.
“Did they give you any information at all beyond that intro?”
“No. They said I should reach out to them if I wanted to get it”.
“Sounds like some elaborate social engineering deal to me”, I mused. But it sounded like nothing I had ever heard before. We agreed that he would ask them for more information, and also verify with the CEO their claim of knowing him, and that we would talk again after that. We hung up and I went back to the event.
You will note that my own response was mostly suspicious, rather than alarmed. I am not patting myself on the back here, because it turns out that the outreach was indeed legitimate. It came from a player in the (back then) extremely new commercial threat intelligence field, and it was completely honest: the only reason they reached out and offered Dudley this valuable insight for free was that their CEO knew the CEO of Dudley’s company personally.
In other words, I initially read this all wrong. To be even more candid, being where I was, my desire to get back to my fun activities was also extremely strong, probably further reinforcing my tendency to view this message as suspect. I really, really didn’t want it to be real, and I therefore quickly determined that it likely wasn’t real, thereby mirroring the decision-making process followed by many before me (and many more since and forever more) when a crisis was unfolding in front of their very eyes. My highly developed “gut sense” failed me, and as a result, I almost failed Dudley.
Well, the conference pretty much ended for me that night.
By the next day and throughout the weekend I was on conference calls with the team back in San Francisco figuring out actions to take against the developing attack.
But here is why I love sharing this story.
Ready?
We had, at his company, i...