Incident Response: Gathering the Facts
Not Knowing Numbers Behind Event Makes Risk Assessment Hard
Organizations must gather as much information as possible to make informed decisions in order to respond to IT incidents more effectively, says ENISA's Marnix Dekker.
By not collecting data on incidents such as breaches and communications failures, organizations, industry thought-leaders and government officials cannot make policy and assess risks to any reasonable degree, says Dekker, who works on cloud and smartphone security at the European Network and Information Security Agency.
A recent report issued from ENISA focused on mobile- and land-based network threats and highlighted the benefits of information sharing.
"This is the first time that incidents were collected at the EU level and fed back to the different countries," he says in an interview with Information Security Media Group [transcript below]. "This is the first step in improving transparency of cybersecurity incidents."
Organizations can use the information on certain threats to gauge their overall impact and learn what the root causes were.
Collecting incident information is important to see what the issues are. "Then you look at where you can take security measures to improve and mitigate these incidents," Dekker says.
In this interview, Dekker discusses:
- ENISA's incident report;
- The importance of information availability;
- Voluntary IT security standards.
Dekker works at ENISA on cloud security and smartphone security. He has a degree in theoretical physics and a Ph.D. in computer science. His doctoral thesis proposes new, more flexible, access control for collaborative work environments such as medical health record systems.
Before joining ENISA, Dekker worked for the business consultancy KPMG in the Netherlands as an identity management architect and IT auditor. He designed the new version of DigiD, a digital identity for citizens. At KPMG he also reviewed the deployment of a large cloud and outsourcing service for a critical governmental agency.
ENISA Incident Report
ERIC CHABROW: Take a few moments to summarize the major findings of the study and what you found most fascinating about it.
MARNIX DEKKER: I have to say that the data that we analyzed was still quite limited. Perhaps the fact that we collected the data and that we were able to issue a report was a bigger feat than the data itself. We had some first conclusions from the data: that mobile networks were often affected by power cuts, that many users were affected when there were outages, and that very often, in the EU at least, outages involve big natural phenomena. What we saw very often was that there was a big storm which led to power cuts which led to long-lasting outages of electronic communication services. What we would like to stress is the fact that this is the first time that incidents are collected at the EU level and fed back to the different EU countries, and this is the first step in improving transparency of cybersecurity incidents.
CHABROW: Why is that important?
DEKKER: In the past, everyone always focused on incident response, and our organization, the security agency of the EU, also focused on improving incident response processes. In the past, we used to work a lot with computer emergency response teams to make sure that they improve their capabilities, that they work better together and that they collaborate so that response to incidents and crises is better.
But at the same time, what we saw a lot is that once the incident is over - when the pizza boxes are empty, you had a lot of long nights trying to address the incident, doing a lot of malware engineering - at the end of the day, nothing happened with the incident data. There was no process that assessed what the overall impact was of the incidents, put it all together [and learned] what were the root causes. There was no process that collected that data for sharing with the industry, policymakers or even with the public. That's really surprising.
In the EU, you can go to any country and ask a politician if they know how many incidents there were in the banking sector and what their total impact was, but they don't know the answer. It's difficult to make policy and to even assess the risks of cybersecurity incidents without knowing the numbers behind them. The focus of the last two years was really to improve the transparency of the incidents, knowing that sometimes incidents are sensitive, and trying to stress that only by sharing more data on the incidents, analyzing and aggregating these incidents and showing what the overall impacts are can we understand better what our big issues are.
CHABROW: It looks like there were four categories that the study looked into: fixed telephony, fixed Internet, mobile telephony and mobile Internet. Now you're talking about the providers of these services. Talk about users. They're sometimes using the same device to access these platforms. Why don't you define what they are, at least these four different groups?
DEKKER: The directive that really is the legal backing for this kind of work focuses on electronic communication services. What we chose are the four services that are most commonly regulated services across the EU member states. Just to give you an example, in some countries, television broadcasting is also seen as an electronic communications service and in some countries there are national regulators that regulate this service and they want to know about incidents and supervise providers of these services. This is not the case in all EU countries. What you see with the four services is really the subset of services that's being considered the core of electronic communication services across the EU.
CHABROW: You use the term cybersecurity, but I suspect you're using it very broadly, security as making sure things operate as they should operate. We're not necessarily looking at malicious behavior because one of the things that struck me in looking at this report was that the root causes for a lot of these incidents were things such as hardware and software failure. For overall numbers, that represents nearly half of all these incidents. Then, natural phenomena represents about a third and malicious attacks was just six percent.
DEKKER: I completely agree. I can give you some informal arguments. For one reason, our press office likes to use the term cyber and I think this happens in any organization. We prefer to call these plainly security incidents. That's a more neutral term and it doesn't refer to nation-state actors, criminal groups and so on. That term cyber reminds me of Stuxnet.
There's another aspect to this as well. The telecom regulators in Europe are mostly concerned with making sure that all the networks are up and running. Many of the regulators are not too concerned with issues like hacking and flaws in encryption. They delegate that job to data protection authorities. I have to tell you that the telecom regulators got into security quite recently. Only a couple of years ago they started to really think about security of the networks. They're slowly getting into this field and slowly you see more and more telecom regulators focused on things like encryption and other aspects of security.
Availability of Information
CHABROW: What's important to a business is keeping things up and running. Whether it's through a malicious attack or whether it's through a power failure, it's the responsibility of those in charge of those systems to keep them operating. In the past month here in the United States, we've had distributed-denial-of-service attacks that have crippled online banking services. The root cause here would be a malicious attack. But the bottom line here is banks providing services or not being able to provide services. Do people in charge have to take a more holistic approach?
DEKKER: Security experts like to look at both aspects of security. One aspect is up-time, if the systems are up and running so that you can rely on them to do the job for you, and the other aspect is that the systems don't do anything you don't want them to do. Generally, the availability part of security is important, as well as confidentiality and integrity. But I agree that the availability is very essential for this society in general. Specifically, when we look at infrastructure - because when you look at banking these are services that rely on telecom infrastructure and data centers - the most important part of infrastructure is that it's resilient in a sense that it's up and running. With confidentiality and encryption, customers can still deal with those themselves. In that sense, I agree that cyber attacks, confidentiality issues and integrity issues are important aspects, but at the same time when we talk about incidents affecting infrastructure, availability is an even more important aspect.
CHABROW: The reporting mechanism is new for this kind of study that you're doing, and I believe that almost half of the nations are just starting to do the reporting. That's why these numbers aren't really reliable, but you're trying to infer certain trends in there, and malicious attacks are low down. Do you think that you'll be seeing more of those as more reporting comes in? Do you think the threat will increase or do you think it will stay about the same?
DEKKER: I think we will see even less malicious attacks, actually.
CHABROW: Why's that?
DEKKER: We're focusing on the big outages, first of all. We're focusing on the infrastructure, which means the wire that goes from business to business, or from a business to operator, but not beyond that. For example, if you have an issue with your router or if someone attacks your PC, these are all incidents outside the scope of the operator; they're beyond its scope. If you look plainly at the wire that goes from a business to another business or to an operator and so on, and if you only look at the big outages, you see that these big outages are often caused by software flaws at core routers, digging machines that caught cables or power outages, and that's what we're seeing a lot.
CHABROW: How about in the mobile platform? Is it the same thing?
DEKKER: Mobile networks are more complex, there are more dependencies and the outages seem to be bigger. There's more IT equipment involved, and that's a funny term. For years, they have been switching from standard telco equipment slowly to more and more what you call IT equipment, more IP-based. That kind of equipment is developed differently and isn't developed with the same resilience requirements in mind. It's being developed in a fairly open market where bringing products to market has priority over a very conservative approach in making sure that the product can deliver.
You see that on the one hand the complexity of this network is getting bigger and small flaws can have bigger impact. You see the equipment itself seems to be less reliable, and you also see that the operators have less knowledge to deal with the incidents. Whenever there's a software flaw, for example because some software vendor issued the wrong patch for a piece of router firmware, they don't really know what to do. They don't really know if they can roll back or not, and sometimes it's impossible and you see that takes eight hours, or even a full day, and the operator really doesn't have the staff or the crew anymore to fix these issues. That's an interesting development for sure.
CHABROW: To characterize what you're saying, at least when it comes to the infrastructure, whether it's mobile or land-based, the real threats are the same things that have been around for years - the infrastructure itself, the wiring, natural disasters or even accidents - rather than something malicious happening.
DEKKER: It has been around for ages, but the problem is that the impact is now getting bigger and bigger. Recently, there was a big fire in the Netherlands. We had this long discussion, which may be interesting for your audience to know a little bit about this incident. We were really borrowing from the Federal Communications Commission. They've been doing incident reporting for a while. Ten years ago we would do information security not because of the incidents. One would get some ISO standard and look at the chapters of the ISO standard and start doing these things one by one. You would do something on human resource security. You would do something on business continuity. But you didn't really have any incidents, not because there were no attackers.
Slowly, on our side of the Atlantic, the younger security experts are trying to take a more risk-based approach. They're going to make sure they register all the incidents and they're going to address all the incidents. It's what the FCC has been doing with the telecom operators in the United States. They're taking the same approach. We had a lengthy conference call yesterday with them. It's very useful for us to speak and work with them, and we have frequent calls. They have many more years of experience. They have been doing this for 5-10 years more and they were also stressing yesterday that if you start by collecting the incidents and then you see what the issues are, then you look at where you can take security measures to improve and mitigate these incidents. That's really a very pragmatic approach and it allows you to get this cycle of where you see if your measures are successful. If you register, you see if they're successful by looking at the incidents and then you take these new measures.
Coming back to what I was going to explain earlier, what we're doing is really not only collecting these incidents. We discuss these incidents with the different regulators of the EU member states and we address them. We take a couple of big outages as examples. The day before yesterday, we were in Germany with these regulators discussing, among other things, this annual report. We took one big incident that had happened recently, which was a big outage in the western part of the Netherlands around Rotterdam, also involving The Hague, which is the seat of the International Tribunal and also the seat of the Dutch government. A very large area was affected by the mobile telephony outage. Many politicians and ministers have Vodafone subscriptions, so you can imagine the political impact that it had. And it lasted hours. The reason for this incident was really just a fire in an adjacent building of a Vodafone switch building.
What you see is the economic impact of such an outage is huge. There are all sorts of systems that are now relying on public communications. There are alarm systems that use a simple SIM card to dial home notifying that there's a burglar. There are trains and trams that are using normal telephony as a way to communicate with each other. Politicians use it. The banking sector uses it. It's pretty clear that society is more and more dependent on simple telephony.
What's funny that came out of this discussion is that the politicians were just not accepting the fact that they couldn't make their phone calls anymore, because they were saying, "There's not only Vodafone. There's also T-Mobile. There are also some other large Dutch operators. Here I am in a room and I'm an important person and I have a subscription with the wrong operator and there's someone standing next to me with a telephone from another operator and they can make a phone call. Why can he make a phone call and I cannot make a phone call?"
Just to push this point a bit further, they say, "When I'm abroad, I have no problem calling with the network of another operator. So why, if I'm in the Netherlands, can I not use the network of another operator in case of an outage?" We understand the technical difficulties, the legal aspects and the impact that what we call national roaming would have on competition and international markets. But at the same time, we also like to step into the shoes of a citizen that has no idea about technology or a politician and we could just say that in fact, there are redundant networks. There are several operators covering the same area in a country. Why not? Why do we even have to have outages and why, when there's a big outage, can't we just switch over to another network? These are very interesting discussions and the Dutch have now started this discussion with the operators, and I'm sure they're going to find a solution out of this. Then, what we're going to do is take this solution, explain it in more detail and make sure that the other member states and countries in the EU can learn from this experience and follow the same path to reduce the impact of outages on society.
Voluntary IT Security Standards
CHABROW: A big debate here in the United States is the role of government establishing either regulation or voluntary standards on IT security. Do you find that to be an interesting debate or perplexing debate at all that's happening here?
DEKKER: In the States, there's a very elegant way out. Just make a phone call with people from the FCC and learn from their experience. We're taking the same approach here. Yes, it's very difficult to mandate security measures. As we all know, you can write down in the standards saying you have to have a firewall and we've been doing this for years. Then the auditor would come, check if you have a firewall and add another check on your checklist. But at the end of the day, your security didn't improve at all because you have the wrong rules in the firewall. Or, for example, there's something wrong because the attack goes over a different port. It's very difficult to cast security requirements in stone. It's very difficult to enforce them. But we're not doing that at all in Europe. The same is going on in the FCC, so there's a high-level requirement to take appropriate security measures and to do risk management to address the risks.
On the other side, there's also a legal requirement to report incidents and that requirement is cast in stone. There's no way out. You get big fines if you do not report the incident. But actual details of the security measures which have to be taken, these are not mandatory. They're just recommendations. The way you check back on whether or not these recommendations are good or whether or not companies are following up on these recommendations is you check back on them by using the incidents.
What the FCC does is they meet every three months with the operators of the telecom industry and they discuss what would be best practices to implement to improve certain aspects of the telecom infrastructure. That's not binding at all. Yes, it's an interesting discussion. Very often you hear this - it's a subtle point and it's very important to stress - that we cannot mandate ever a set of security measures, and that's really wrong. We've seen in the past that it has not worked with the ISO 27000. That has never really produced any useful results. It didn't really improve your security, but there are ways out of this and the FCC is a good example; what we're doing in Europe right now in the telecom sector is the same example. You work with the industry and you derive recommended best practices. The only part that's really mandatory is the incident reporting because that allows you to get feedback on if things are working.