Predictive Analysis in Cyberdefense
Converting Mass Amounts of Data into Actionable Intelligence
Predictive analysis is an emerging tool being used to identify potential cyberthreats against organizations. But what is predictive analysis and how does the process work?
"[Predictive analysis] is the ability to ingest large quantities of data from multiple sources ... and it's sifting through that sea of data to find the key pieces of information that are necessary to put together a hypothesis about what threats are potentially up to," Christopher Ling, senior vice president at the management consultancy Booz Allen, says in an interview with Information Security Media Group [transcript below].
Enterprises constantly accumulate large amounts of data through threat intelligence reports provided by security vendors and information gathered from their own networks. On an organization's network, threat actors typically move around for weeks or months, leaving signs that they were there. "That gives you two very broad pieces that you can gain insight [on]," he says.
"In other words, there's an individual that has intention to do something and they're starting to exercise a particular capability that leaves evidentiary marks inside the network itself," Ling says. "Through [this] process, you have the ability to have forewarning of potential attacks."
In the interview, Ling:
- Defines predictive analysis;
- Explains the art and science, including how to connect the dots;
- Describes the characteristics of a professional who conducts predictive analysis.
Ling leads Booz Allen's business in military intelligence, specializing in developing high-level strategies to improve intelligence support to operations, focusing on quantifying investments to create new value and improve capabilities. He won a National Science Foundation fellowship for advanced physics at Harvard University.
ERIC CHABROW: What is predictive analysis?
CHRISTOPHER LING: Predictive analysis is the ability to anticipate the next move of a threat and the advantage of that insight is the ability to align your security resources for maximum advantage.
CHABROW: Where has predictive analysis worked?
LING: Certainly in the intelligence community and then later in the military environment in cyber space. Now, I think it's becoming more of a calling need across commercial industry.
Software, Personnel Needed
CHABROW: Walk us through predictive analysis and what kind of software and personnel you need.
LING: It's a mixture of two things and I think you've hit on some of the key points. First, it's the science of it, which is the collecting of the dots. Then there's the art, the analytical portion of it, which is connecting the dots. There are many tools available today and data is becoming largely a commodity item. There's ability to ingest large quantities of data from multiple sources across multiple languages, and it's sifting through that sea of data to find the key pieces of information that are necessary to put together a puzzle or a hypothesis about what threats are potentially up to.
By creating this hit rate in terms of identifying and collecting these dots, being able to connect them in a specific pattern gives you an insight into how the threat's evolving. Then, once you have enough of these hits, you get some idea that you can anticipate what their next steps are going to be.
Example of Predictive Analysis
CHABROW: Can you walk us through a situation where some organization used this and how they dealt with the threat?
LING: What organizations are faced with right now is the ability to buy certain pieces of threat intelligence data that are available today. Mostly this comes in the form of reports, or it gives them an idea about volume chatter where there may be certain key words associated with that particular company. And all that is - I think you've heard the term before - is noise in the channel. The trouble with that is it's not predictive in a real sense that you can actually take action against that.
What you really have are tell-tale signs. As threats move through clients and networks, it's not so much that they have decided they're going to have an attack and it all happens within 24 hours or even a week. Usually, they're under a process of exploiting the network for a period of time, even months, and so they leave tell-tale signs as they move through the network. That, in addition to all this external data that I mentioned before, gives you two very broad pieces that you can gain insight on, because you can cue across those two elements and try to connect them.
Once you connect those, you have some idea that there's indeed context. In other words, there's an individual that has intention to do something and then also they're starting to exercise a particular capability that leaves evidentiary marks inside the network itself. By connecting these capabilities and intentions, that's where you get a first flag that there's a potential for something interesting to look at, and that's where the human really steps in and starts to do more detailed analysis. Through that process, you have the ability to have forewarning of potential attacks.
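[Editor's illustration, not part of the interview: the cross-cueing Ling describes can be pictured as a join between external reporting (intention) and internal network evidence (capability). The sketch below is a minimal, hypothetical example; the class names, fields and the `flag_for_analyst` function are assumptions made for illustration and do not represent any tool named in the interview.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalReport:
    """Hypothetical record built from vendor reports or outside chatter."""
    actor: str
    indicators: frozenset  # e.g. domains or addresses tied to the actor
    mentions_org: bool     # True if the chatter names our organization (intention)


@dataclass(frozen=True)
class InternalEvent:
    """Hypothetical record of a tell-tale sign seen inside the network."""
    host: str
    indicator: str


def flag_for_analyst(reports, events):
    """Return (actor, hosts) pairs where intention and capability coincide.

    A pair is flagged only when an external report names the organization
    AND at least one of its indicators has been observed internally. The
    result is a first flag for a human analyst, not a verdict.
    """
    flags = []
    for report in reports:
        if not report.mentions_org:
            continue  # no stated intention against us; treat as noise
        hosts = sorted({e.host for e in events if e.indicator in report.indicators})
        if hosts:
            flags.append((report.actor, hosts))
    return flags


reports = [
    ExternalReport("actor-a", frozenset({"bad.example"}), True),
    ExternalReport("actor-b", frozenset({"other.example"}), False),
]
events = [
    InternalEvent("ws-12", "bad.example"),
    InternalEvent("ws-07", "bad.example"),
    InternalEvent("ws-03", "other.example"),
]
flags = flag_for_analyst(reports, events)
```

[In this toy data only "actor-a" is flagged, because it is the only actor with both a stated interest in the organization and matching evidence on internal hosts. As Ling notes, the detailed analysis after that first flag remains a human task.]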
CHABROW: You talk about the market out there of reports. What kinds of reports are you talking about?
LING: Many of the large antivirus companies produce reports and they go into quite a bit of detail on specific cyber threats that exist, whether they be at the nation-state level or they're syndicated crime. They usually talk about the bad actors and what they're up to. What they're not really able to do is tie that to a specific target.
CHABROW: How are organizations monitoring their networks, or how should they be monitoring their networks, to look for these kinds of threats?
LING: Today, network monitoring usually happens within the network itself. What I mean by that is that they have lots of tools that monitor the volume of data, the type of data and its ingress and egress paths. But that's not necessarily connected to a lot of what goes on external to a company. There's a lot of chatter in social media groups, a lot of news sources and those sorts of things, but there isn't a way right now to directly connect those two together.
CHABROW: Why not?
LING: I think it's because continuous monitoring of a network, as it were, really evolved out of the IT process or the IT group. Most companies have a large IT segment of their organization and they outsource the actual hardware and software, and they could actually outsource the applications, and so when it came time when there was volume and everybody was worried about whether the system was up and running or not, or whether it needed six 9's or nine 9's to run on a continuous basis, the idea of continuous monitoring was more about the performance of the network and making sure it was at its peak levels all the time.
I think the task to actually do continuous monitoring was first put in place just with technical people who monitor the hygiene and the health of the network to make sure that it was performing well. Out of the growth of that, when it became clear that threats were not just some 16-year-old kid in his parents' garage, but it was becoming something more than that and it was a syndicated crime opportunity to extract data for financial gain, there became more focus on continuous monitoring. That capability obviously evolved, but it evolved initially out of just the performance aspect of it. It's always been an add-on or a bolt-on that I don't think was necessarily designed from the get-go, but was added on to an existing infrastructure.
Types of Skills Needed
CHABROW: You talked about the human element. What kinds of skills are needed to analyze this kind of information?
LING: It's a myriad of people that do this. I don't know if these people are trained so much as they're naturally born that way: people who have reverse-engineering skills, people who are very curious and people who also have the ability to have deep technical insight. [We also] need somebody that understands the cultural phenomenon and characteristics of certain things, [as well as] linguists. It's sort of a group of people that collectively have the skill set and are able to work collaboratively over time to identify threats, monitor threats over a period of time and try and figure out the context in which they're operating. They review historical patterns and see if the threats have established techniques, tactics and procedures that they use or if they're using specific pieces of malware that they're morphing over time.
We often think of malware that has DNA. We can usually trail back or have an audit trail about where it actually came from, who touched it and how it changed over time. All of this creates a knowledge-base on all the threat actors out there, what they're using and how they use it. When we see activity, it's almost like we can template that against what we have in our known databases. Is it something that existed previously or is it something completely new? Then, this collective group works together all the time, monitoring and working with one another to continuously develop the deeper understanding of the characteristics of that threat.
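[Editor's illustration, not part of the interview: "templating" observed activity against a knowledge base can be pictured as a similarity lookup. The sketch below uses Jaccard similarity over sets of observed traits, which is a technique chosen here for illustration only; the interview does not say which method is used. The function names, trait labels and the 0.5 threshold are all hypothetical.]

```python
def jaccard(a, b):
    """Jaccard similarity: size of the overlap divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_known(observed, knowledge_base, threshold=0.5):
    """Compare observed traits with each known family.

    Returns (family_name, score) for the best match at or above the
    threshold, or (None, best_score) when nothing is close enough, which
    suggests the activity may be new. The threshold is an arbitrary
    assumption for this sketch.
    """
    best_name, best_score = None, 0.0
    for name, traits in knowledge_base.items():
        score = jaccard(observed, traits)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Hypothetical knowledge base: family name -> set of trait labels.
knowledge_base = {
    "family-x": {"a", "b", "c", "d"},
    "family-y": {"e", "f"},
}
seen_before = closest_known({"a", "b", "c"}, knowledge_base)
brand_new = closest_known({"z"}, knowledge_base)
```

[Here the first sample shares three of four traits with "family-x" and is matched to it, while the second shares nothing with any known family and comes back unmatched. That mirrors Ling's question of whether something existed previously or is completely new.]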
Limited Resource of Professionals
CHABROW: You monitor a lot of what goes on in the military. Are there a sufficient number of people in government, in the Defense Department and intelligence agencies, that have those kinds of skills?
LING: Yes, I think for the government, and that has been the theme of the day. NSA has been stood up to protect military organizations. DHS has been stood up to manage the .gov realm. But there really isn't anybody in the government directly responsible for securing .com. That rests with the companies at the moment. While there has been significant expertise that has been stood up at NSA to protect .mil, and DHS is just beginning the process now for .gov, everybody is in competition for the same number of people, and right now that's a very limited resource.
CHABROW: Is it another problem we have with finding enough IT security folks out there?
Cost to Organizations
CHABROW: This all sounds costly. Can most organizations afford this?
LING: That's a very good question. The government certainly can because it's a matter of national security. I think DHS is scaling up at the moment and recognizing the importance, especially the critical infrastructure programs. For companies, that's where the challenge is coming in. This is the type of thing where I think industries have an opportunity, and the government has begun this by standing up these information-sharing groups around the critical infrastructures in financial services, communications and transportation. This is where a conglomerate of companies come together and share information on the types of threats that they see in their industry, and we actually have one for defense companies as well [called] the Defense Industrial Base, referred to as the DIB, so as threats come in, we share that with one another.
I think this is the type of situation where it's a little bit of a slightly different mindset to overcome the economic barriers of this all. I see in the future that security will be much more of a managed service, but a managed service that's tailored to individual clients so there will be a way to do the large data collection that I mentioned before in gathering the dots and connecting the dots, but then connecting the dots within the context of what each company's attack surface is, to provide clear insight into where the threat vectors are and where they can take actionable intelligence. That would be shared with that particular company. But the overall infrastructure could be a shared cost.
CHABROW: We're hearing a lot now in the president's executive order with at least one-way sharing of information from the government to industry, but perhaps legislation permits it going both ways in the industry. This will play a big role in predictive analysis.
LING: Yes. Collective security will be the base in which they generate the data set in which you can now apply some very sophisticated modeling and analytical tools to give you predictive intelligence.
CHABROW: Any other thoughts you have about predictive intelligence?
LING: It's a funny thing. In the past, the intelligence community really was spying - it still is today - taking information from our adversaries. It used to be that it had to capture data in motion. It was only when you went through the process of communicating and putting it out in the open through HF or those sorts of things that the data was available, and we had to try and get the data at that moment in movement. With the advent of computer systems, databases and the way we use e-mail and messaging today, there was an opportunity to capture data at rest. Now you can get vast quantities of data where it's all co-located in a single time, and the interesting thing about that is it really gave a boon to the market of really deep and fast analytics against huge data sets.
If we think about the success we've had against Osama Bin Laden, it was really finding one communication that was absolutely critical amongst billions of communications. If you think about the analytic capability, both in terms of functional breadth and depth of expertise in analytics to do that, there have been pioneering efforts inside the government to create that kind of capability and those tools for decades now. The ability to harness that and take that analytical expertise, even if it's without the classified data, and move that into the commercial arena, it's really an ability for the commercial arena not to spend a decade trying to figure out what predictive intelligence is, but to really garner the state-of-the-art best practices and leverage that with open data sets inside the commercial industry.
CHABROW: It seems like we just spent about ten minutes or so speaking about big data without even using that term.
LING: All data is big data these days, I guess.