Big data, data mining, predictive analytics: these buzzwords have been all over the business and IT news for some time now. The idea behind them is to use statistics, mathematics and modern computer technology to automatically analyze huge amounts of data, recognize patterns and forecast future events.
Why am I writing a whole post about this topic?
Because this development is very exciting, and because at my previous startup, Placedise, I used these techniques myself to simulate advertising effects and predict their impact. For this reason, I want to tell you more about predictive analytics below.
Predictive analytics is a very broad term that describes not a specific technology but rather the idea outlined above. Using historical data from different areas to predict or simulate future events and behavior is nothing new. Insurance companies and banks have used statistics for decades to assess credit risk and the creditworthiness of individuals. Given the complexity of these questions, it is not sufficient to simply look at the last 30 bank transfers of a credit applicant. In addition to place of residence or age, numerous complementary statistics are used: for example, whether the person has recently made a similar application successfully, and how likely it is that the loan will be repaid in such cases. The situation is similar with advertising effects, which cannot be determined by simple scoring models with 7 parameters.
Such data analysis models are always built on historical data. This data can come from studies or directly from the companies concerned. What matters in the end is that the amount of data is as large as possible, since each additional piece of information increases the validity of the conclusions. As a result, these predictive models can ultimately be even more meaningful than a small “real” experiment that takes place “live”, in direct interaction with the object of interest.
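To make the "more data, more validity" point concrete, here is a minimal sketch. It assumes the simplest possible case: independent observations of a yes/no outcome, where the uncertainty of an observed share is given by the textbook standard error. The function name and the 90% figure are only illustrative, and the sketch covers random sampling error only; biased or unrepresentative data does not improve just by getting bigger.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion p based on n independent observations."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical example: an observed share of 90% at different sample sizes.
for n in (100, 10_000, 1_000_000):
    se = standard_error(0.9, n)
    # 1.96 standard errors is the usual approximate 95% margin of error.
    print(f"n={n:>9,}: 90% +/- {1.96 * se * 100:.2f} percentage points")
```

Every hundredfold increase in data shrinks the margin of error by a factor of ten, which is why a large historical data set can beat a small live experiment.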
Retail companies, for example, have great potential for such techniques, since they have usually recorded a lot of data thanks to their large sales volumes. But statistics can provide valuable results in other areas as well. Maybe you have heard of “Predictive Policing”, the idea of using big data to predict crime and prevent it. If this reminds you of the movie “Minority Report” with Tom Cruise, you’re right. Check out this article in The Economist and the video below.
But how is this even possible?
To really understand the issue, we must first get rid of the common misconception that life consists primarily of chaos and that decisions are unpredictable and completely free of any influence. Rather, life moves along very clear (even though complex) tracks. This is partly because many decisions are quite strongly dictated by external circumstances.
Imagine an intersection. A person walks towards it, and we want to figure out which road the person is going to cross first. Many would assume that this is subject to a kind of “free will”, and yes, in the end it somehow is. However, the decision is influenced by many parameters. It begins with which traffic light has just turned green. Historical data show, for example, that people then cross this road first in 90% of cases, at least if they have a similar level of education and social background. Now, our observed person could of course know this and deliberately violate the “rule”. However, a comprehensive model also takes this probability into account. Usually this is a pattern too, in the sense that people who deliberately violate rules elsewhere also do so at intersections. Accordingly, we would simply adjust the probabilities. After taking many other factors into account, we arrive at a result that predicts fairly accurately which road the person crosses first. A street vendor, for example, could use this information to choose the ideal position.
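The "adjust the probabilities" step can be sketched with the law of total probability. Only the 90% figure comes from the example above; the share of deliberate rule-breakers (5%) and their crossing probability (50%) are invented here purely for illustration, and a real model would estimate such values from data and include many more factors.

```python
def crossing_probability(p_typical: float,
                         p_rule_breaker: float,
                         share_rule_breakers: float) -> float:
    """Probability that a person crosses the road whose light just turned green.

    Law of total probability over two groups:
    P(cross) = P(typical) * P(cross | typical)
             + P(rule-breaker) * P(cross | rule-breaker)
    """
    share_typical = 1.0 - share_rule_breakers
    return share_typical * p_typical + share_rule_breakers * p_rule_breaker


# 0.90 is the figure from the example; 0.50 and 0.05 are hypothetical assumptions.
p = crossing_probability(p_typical=0.90, p_rule_breaker=0.50, share_rule_breakers=0.05)
print(f"Adjusted probability: {p:.0%}")
```

With these assumed numbers the adjusted probability comes out at 88%. Each additional factor would refine the estimate in the same way, by splitting the population into groups that behave differently.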
Critics would argue that in the end there is still no absolute statement, merely a probability. This is correct, but it is still a huge improvement over the alternative. In our example, you now know that the person crosses road A with a probability of 88%. This is significantly (!) better than simply guessing with a 50:50 chance! And isn’t it far better to lose only 12 out of 100 customers than 50 or more?!
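The arithmetic behind the 12-versus-50 comparison is simple enough to write down. This is a minimal sketch that assumes, as the example does, that every wrong prediction costs exactly one customer; the function name is made up for illustration.

```python
def expected_lost_customers(customers: int, hit_probability: float) -> int:
    """Expected number of customers missed when each one is predicted
    correctly with the given probability (one miss = one lost customer)."""
    return round(customers * (1.0 - hit_probability))


with_model = expected_lost_customers(100, 0.88)  # prediction from the model
by_guessing = expected_lost_customers(100, 0.50)  # plain 50:50 guess

print(f"Lost with the model: {with_model} of 100")
print(f"Lost by guessing:    {by_guessing} of 100")
```

Under these assumptions the model cuts the expected losses from 50 to 12 customers per 100, even though no single prediction is ever certain.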
This is how predictive analytics works, and it is applicable to virtually all areas of life. The big challenge is to gather and prepare the right data and to design automated analysis processes. Unfortunately, many existing services are currently so complex and so full of statistical terminology that it is difficult to use them in business without the appropriate experts. But this will change in the coming years thanks to many innovative companies.
Are you not yet convinced of the possibilities of modern data analysis?
Did you follow the 2014 Soccer World Cup in Brazil?
(A long time ago, I know, but still a good example!)
Did you bet (at least for yourself) on who would win?
Did you predict every game correctly?
Microsoft did, at least for the knockout stage (except for the third-place game), using the methods described above!