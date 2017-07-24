There’s a lot of hype over machine learning and data science these days. It’s time to go beyond the buzzwords and find out how we, as a security community, can actually reap the benefits of machine learning: fine-tuned staffing, smarter budget decisions, automated and adaptive security incident response, and more accurate threat reporting and assessment.

Machine learning isn’t the end-all, be-all of security operations and program management. We aren’t going to stop a ransomware attack just because we have more data on how the last one occurred. It doesn’t matter how much data we have if we don’t understand it or know what to do with it. Data won’t get us anywhere without someone on staff (or a full staff, if you’re really serious) who can understand it, analyze it, and create a meaningful execution strategy using the insights it yields.

The Joint Chiefs of Staff have defined intelligence as, “Information and knowledge about an adversary obtained through observation, investigation, analysis, or understanding.” We need to apply that principle to today’s threat landscape. Here’s how.

Plan your work, and work your plan

The idea behind data science is to take the huge amount of information available today from all of our devices and networks and enrich it to make it actionable. Data science naturally relies on data, and the more the better, but quantity alone isn’t enough; quality matters just as much. Data science, like all analysis methodologies and techniques, is a “garbage in, garbage out” scenario. Thus, the result of our analysis (intelligence) is only as good as the information the system or systems ingest. Sure, we’re going to enrich a baseline data set with additional sources of information to improve the quality, accuracy, and potential actions that can be taken, but if all of our data sources are garbage, our result will ultimately be garbage.
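
To make the “garbage in, garbage out” point concrete, here is a minimal Python sketch of checking records for quality before enriching them. Everything in it is an assumption made for illustration: the field names (timestamp, src_ip, event_type), the reputation lookup table, and the function names is_valid and enrich are hypothetical and not taken from any particular product or feed.

```python
import ipaddress
from datetime import datetime

# Hypothetical schema: the fields every event must carry to be usable.
REQUIRED_FIELDS = ("timestamp", "src_ip", "event_type")


def is_valid(record):
    """Return True if the record has all required fields, a parseable
    ISO 8601 timestamp, and a well-formed source IP address."""
    if any(not record.get(field) for field in REQUIRED_FIELDS):
        return False
    try:
        datetime.fromisoformat(record["timestamp"])
        ipaddress.ip_address(record["src_ip"])
    except (ValueError, TypeError):
        return False
    return True


def enrich(records, reputation):
    """Split records into clean and rejected lists, and attach a
    reputation label (hypothetical lookup table) to each clean record."""
    clean, rejected = [], []
    for record in records:
        if is_valid(record):
            label = reputation.get(record["src_ip"], "unknown")
            clean.append({**record, "reputation": label})
        else:
            rejected.append(record)
    return clean, rejected


if __name__ == "__main__":
    events = [
        {"timestamp": "2017-07-24T10:15:00", "src_ip": "203.0.113.7",
         "event_type": "login_failure"},
        {"timestamp": "not-a-date", "src_ip": "203.0.113.7",
         "event_type": "login_failure"},
    ]
    clean, rejected = enrich(events, {"203.0.113.7": "known-bad"})
    print(len(clean), "clean;", len(rejected), "rejected")
```

Keeping the rejected records, rather than silently dropping them, is a deliberate choice in this sketch: a data source that produces a steady stream of rejects is itself a signal that the source needs attention before anyone trusts the analysis built on it.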

