Chances are
excellent that your incident response plan has a glaring omission regarding
one of the most critical aspects of success during an incident.
There has been an immense amount of time and
treasure expended on what a proper incident response plan should look
like. Just throw “incident response plan”
into your favorite search engine and you’ll get pages and pages of content. You’ll
see all sorts of advice on how the various steps and phases of an incident
response plan should play out and quite a bit of thought being put into things
such as collecting contact information, identifying stakeholders and roles, inventorying
the tools to be used, determining secure communication methods (because you’re
assuming the baddies got your email servers early and often), and the like. Great stuff.
Does any of your plan talk about how to take
care of your people during a major incident? I’m talking about those incidents
that are measured in weeks or months, where the all-hands-to-the-pump, 24/7
phase of the response is itself measured in days or weeks. Once these incidents kick off, it’s too late
for the preparation stage. It’s showtime
and there is an immense amount of stress on everyone on the team,
whether it’s the CISO who is constantly being asked for updates by senior
executives who are seeing their career dissipation lights cranked up
to about a quarter million lumens or the lowest-level incident responder who is
cranking out digital forensic images or poring over network logs.
An incident response plan for major incidents
isn’t fit for purpose unless it addresses how your incident
responders, your border
collies, will be fed, watered, and rested. An
organization should have a catering plan in place before an incident so that
it can start getting a steady stream of food and drink to the people who are
going to be putting in an immense number of hours around the clock getting
things under control.
If it’s a large organization (or a really nice
startup in Palo Alto), chances are excellent that there is already an on-site
cafeteria for employees that probably offers catering services. The incident response plan should specify how
to engage those people and who the points of contact are. You’re also going to want to talk to them
before an incident to make sure that you can get food to cover a long-term,
around-the-clock response.
If you don’t have anything on-site, you’re going
to want to identify several external catering options, understand how to
engage them on short notice for an extended response, and find out how
scalable their services are, since you might be feeding a very large team. Their contact information, billing methods,
and the like should be part of your incident response plan. You also need to
discuss the available menu options with your catering providers before an
incident. It’s important to give your people healthy food during an incident to
keep them going. Just saying you are
going to order a steady stream of pizza from the take-out place down the road
for weeks on end isn’t a great option.
You want healthy options that keep your people fueled
up, feeling good, and ready to chase bad guys out of your network.
You also want to make sure you are providing
your people with a variety of non-caffeinated drink options in addition to the
endless gallons of caffeinated sugar water or energy drinks that fuel most
major incident responses.
Keep in mind that you are going to be feeding
not only your employees, but also any consultants who parachute in to help you out
of your bind. There is a lot of dietary diversity
these days, so you’ll want to make sure you have options for people who need them
for medical, religious, or cultural reasons.
Popular options include vegetarian and gluten-free food, which works out
well because you can get fantastic stuff that fits either diet and that
everyone will enjoy.
The other thing that needs to be covered is
transportation for your people. Drowsy
driving is a thing, and it’s a thing you want nothing to
do with during an incident. Ride-sharing
services have made this much easier, especially in major metropolitan
areas. The goal is to make sure you can
get your people safely and efficiently back and forth between home (or the
hotel rooms they are calling home during the incident) and work. Most of your
people will be driving into work, but if they are too tired to drive because
they ended up working a day or more straight without sleep, it’s probably not a
great idea to let them drive home, and your plan should address that fact.
Which reminds me of an important point. If you
have people staying up for days on end, you’re very likely understaffed
for your incident and you need to fix that quickly or you’re asking for more
problems. My general rule is that I
don’t do forensics after ten hours because my chances of making mistakes go up
dramatically. I’ve lost count of the
number of times that I struggled with something during a forensic exam at the
end of a very long day only to solve the issue in the first fifteen minutes of
being back in the office after getting some sleep.
As always, the keys to success are people,
processes, and tools, and your incident planning should reflect that fact.
"Does any of your plan talk about how to take care of your people during a major incident?"
They rarely do. As an incident responder, I most often engage with folks who don't have an IR plan, or if they do, it's a dust-covered binder that they point to when a compliance assessor asks.
I've had a number of incidents over the years where I've gone on-site and the first thing I've recommended...after watching someone try to type a "simple" command or email address several times...is that everyone go home and get some rest.
It doesn't take someone with my background to recognize...or maybe it does...that if you're dealing with an issue that you don't understand, on a network that you (think you) own (but don't fully understand), with an adversary who's not only operating on his own time frame, but is able to react to stimulus...then the initial anxiety is only going to get worse, and fuel that thought process that "more is better".
I was once engaged with an IT director who was under considerable pressure from the higher ups to engage in 24x7 ops (while the "higher ups" were nowhere to be found...) and had been doing so for about 10 days. Now, there were no shifts...so the team was getting a bit of rest basically while "no one was looking"...not good. The IT director was downloading images to an ext HDD, and the copy operation (over the internal network) had stabilized at about 8+ hrs remaining. I suggested that at that point, everyone make a hard stop and get some rest...to which the IT dir asked me who'd be analyzing the images while we were resting. Yes...those images...the ones currently being downloaded, that would require just a bit more than 8 hrs to finish.
Due to fatigue, we ran into other issues...the compromised systems were not "supposed" to be on the internal network, and even with hard data (netflow, logs, etc) to support that, it took them about 2 days to accept it.
Fear and anxiety quickly lead to fatigue, mistakes, apathy, and burnout. All of these can be avoided with the appropriate instrumentation and visibility into your infrastructure.
Great commentary. That also reminds me that one of the things that needs to be thought about well in advance is how to plug your consultants into your system as seamlessly as possible. Who are they going to be reporting to? *How* will they be communicated with? (because you're likely not using your corporate email for the incident response portion of things) and how are they going to get access to your network, whose tools are they using, etc, etc.
You don't want your expensive cavalry showing up and then losing several days because they don't have access to your network and there is confusion about which tools are going to be used by whom during the response.
Great Article Eric!