In the world of big data, “predictive analytics” is all the buzz. You put tons of information into a big data machine, a magical algorithm does its part and BOOM – you suddenly predict the future, right? Well, not so fast. While the term predictive analytics has panache, it’s not an exact science; it’s the process of calculating probability based on the analysis of data points. Determining probable outcomes requires the synthesis of mountains of individual data points, a valid data mining methodology and a clear idea of the answers you are seeking. The data must be gathered, organized, cataloged and indexed before the analytics can begin. This is the less-than-glamorous side of predictive analytics, and it makes all the difference between success and failure in projecting outcomes.
Process vs. Exact Science
I recently attended two conferences and had the benefit of hearing experts from different industries speak about this topic in different ways. Both speakers touched on a similar theme: data collection is often tedious, time-consuming and difficult, but it is the foundation of data analytics and cannot be overlooked. At the Digital Government Institute Big Data Conference, Colonel Linda Jantzen from the Army Architecture Integration Center focused on the challenges of enterprise-wide data management. She pointed out that big data is “fragmented and disconnected,” with a jumble of different protocols, structures and syntax, all of which make it difficult to aggregate data for analytics. The architectures for managing data have to be customized and prioritized to work more efficiently. Jantzen advocated viewing data sharing and data management as a strategic asset, and organizing data with architectures that are interoperable, trusted, accessible and accurate. Similarly, the International Association of Chiefs of Police (IACP) conference offered a breakout session entitled, “Improving the Collection and Utility of Law Enforcement Use of Force Data.” Amy Blasher from the FBI described a current federal initiative collecting data from state and local police on how agencies use force. She emphasized the sheer complexity of aggregating data from literally hundreds of sources, and stressed that this is a necessary first step before any sort of analysis can be undertaken.
I walked away from both sessions with the realization that effective predictive algorithms are not enough. Organizations deploying data analytics also need the infrastructure, tools and processes to collect, manage and organize the data before any sort of analysis can be performed.
How to Get Started
A valid data mining methodology, as shown in this illustration, starts with determining business objectives and understanding the data. This initial phase focuses on understanding the project and the data, and translating that understanding into a clear problem definition.
The next step is data preparation and modeling. Raw data is collected, organized and cleansed, and various modeling techniques are selected and applied to the data sets identified for the business problem in step 1.
Finally, the model or models are evaluated based on their ability to provide statistically valid answers to the initial business question.
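To make those three steps concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the business question (how likely is a service request to be resolved late, given its category?), the sample records and the function names are all hypothetical, and the "model" is just an observed frequency rather than a production-grade algorithm. Note how much of the code deals with cleaning the data before any prediction happens.

```python
from collections import defaultdict

# Step 1 (hypothetical business question): how likely is a service
# request to be resolved late, given its category?

# Hypothetical raw records from different sources; some are incomplete
# or inconsistently formatted.
RAW = [
    {"category": "Network ", "late": "yes"},
    {"category": "network", "late": "no"},
    {"category": "NETWORK", "late": "yes"},
    {"category": "storage", "late": "no"},
    {"category": "Storage", "late": "no"},
    {"category": "storage", "late": None},   # missing outcome
    {"category": "", "late": "yes"},         # missing category
    {"category": "network", "late": "yes"},
    {"category": "storage", "late": "no"},
    {"category": "network", "late": "no"},
]

def cleanse(records):
    """Step 2a: normalize formatting and drop incomplete records."""
    clean = []
    for r in records:
        category = (r.get("category") or "").strip().lower()
        late = r.get("late")
        if not category or late not in ("yes", "no"):
            continue
        clean.append((category, late == "yes"))
    return clean

def fit(rows):
    """Step 2b: estimate P(late | category) as an observed frequency."""
    counts = defaultdict(lambda: [0, 0])  # category -> [late count, total]
    for category, late in rows:
        counts[category][0] += int(late)
        counts[category][1] += 1
    return {c: late_n / total for c, (late_n, total) in counts.items()}

def evaluate(model, rows, threshold=0.5):
    """Step 3: fraction of held-out rows the model predicts correctly."""
    hits = sum((model.get(c, 0.0) >= threshold) == late for c, late in rows)
    return hits / len(rows)

rows = cleanse(RAW)
train, test = rows[:5], rows[5:]   # simple split for illustration only
model = fit(train)
accuracy = evaluate(model, test)
print(model, accuracy)
```

In this toy data set, two of the ten raw records are discarded during cleansing, and the model is only as good as the rows that remain. A real project would use far more data, a proper train/test split and a statistically sound evaluation.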
“You get out what you put in.” While this can apply to just about anything, it’s especially true with data analytics. And it’s not just what you put in, but HOW you do it. By paying attention at this critical step, business and government organizations can gain answers to critical questions in a format that is practical and easy to consume.
For Colonel Jantzen, the data points that must be managed are as diverse as the mission objectives of the U.S. Army. For Ms. Blasher, the data points were derived from hundreds of state and local law enforcement agencies. But as different as the objectives of these two speakers may have been, the fundamentals are the same: without the timely and accurate collection and management of millions of data points, predictive analytics is not possible. For this reason, it is vital that every agency adopting big data analytics examine and understand how its systems collect and manage data. When you combine the right data, algorithms and collection strategies, predictive analytics becomes a powerful tool. Take a look at what Hitachi Visualization can do for law enforcement and what DataAdapt can do for organizations trying to make sense of and derive value from mountains of complex data. Contact ViON to learn more about the process and power of predictive analytics.