Every once in a while, watching a data scientist give a presentation is like watching kids put on their parents' clothes. All of the garments are on the right body parts, but nothing fits. In cases like these, a relationship has been applied to the data, but the presumed relationship is incorrect.
My favorite story on this point:
Grad student: Oh, look! The data describe a Gaussian!
Professor: No! A Gaussian describes the data!
Just think about it.
A physicist collects specific kinds of data (empirical measurements) to answer specific questions. When physicists speak of analytics with respect to data, they mean something very much like the idea of mathematical “proof” from your geometry classes. It's from these kinds of analytical exercises that a physical scientist uncovers relationships in the data and, from these relationships, builds models.
Your data scientists will do better when they're thinking in terms of finding rules and laws within the data sets they are given. They can experiment with various kinds of statistical or fitting applications, but these should be used as numerical experiments designed to uncover what the data have actually measured.
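As a rough illustration of what such a numerical experiment might look like, here is a minimal Python sketch. It assumes just two candidate descriptions of a data set (a Gaussian and an exponential) and asks which one the data favor by comparing maximum log-likelihoods. The function names and the choice of candidates are hypothetical, picked only for illustration; a real analysis would start from what the measurements physically represent.

```python
import math
import statistics


def gaussian_log_likelihood(data):
    """Maximum log-likelihood of the data under a Gaussian model."""
    n = len(data)
    sigma = statistics.pstdev(data)  # maximum-likelihood estimate of sigma
    # At the maximum-likelihood mean and sigma, the squared-residual term
    # sums to exactly n / 2.
    return -0.5 * n * (math.log(2 * math.pi * sigma ** 2) + 1)


def exponential_log_likelihood(data):
    """Maximum log-likelihood under an exponential model (assumes positive data)."""
    n = len(data)
    mean = statistics.fmean(data)  # the maximum-likelihood rate is 1 / mean
    return -n * (math.log(mean) + 1)


def better_model(data):
    """Name the candidate model with the higher maximum log-likelihood."""
    gauss = gaussian_log_likelihood(data)
    expo = exponential_log_likelihood(data)
    return "gaussian" if gauss > expo else "exponential"
```

Calling `better_model` on waiting times between random events, for instance, should favor the exponential, no matter how tidy a fitted bell curve looks on a slide. The comparison only ranks the candidates you hand it; it cannot tell you that neither one describes the data.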
In the physical sciences, basic analytics helps us to understand small groups of observations. Great analytics leads to models that yield a totally different kind of intellectual power. To achieve predictive analytics, you need models.
There are two important kinds of predictive models. The first kind is a system model that shows you all of the repeatable features in a large-scale context. This kind of model made it possible to predict the total solar eclipse of August 21, 2017, and what it looked like from any point in the US. That's pretty powerful.
The second kind of predictive model is even more powerful. Quantum physics provides a great example. Prior to quantum theory, no one had ever conceived of anything remotely resembling lasers or semiconducting materials. In this case, it was analytics applied to the model itself that led to the discovery of these two pillars of modern technology. That's the intellectual power to discover the “unknown unknowns.”
The lessons for data scientists are many. Most importantly, data science results absolutely must map back to real-world understanding. Formalism captured in an algorithm can usually spit out an answer. The question will always remain: “Does this answer make sense?”
Even in pure research contexts, it's all about problem solving.
Problem solving always begins with careful problem characterization.
Innovation is the art of turning a great solution into a great application.