A recent survey conducted by Gregory Piatetsky-Shapiro of KDnuggets found a “strong interest in online certificates and MS degrees in Analytics, Big Data, Data Mining, and Data Science, even among those who already have graduate degrees.”
According to Piatetsky’s analysis of graduate degrees in these fields, they started to emerge in 2007, with a big spike in 2012. The proliferation of these programs–which the survey’s findings indicate will continue unabated for the foreseeable future–prompted a strong response on Twitter from Meta Brown, a consultant and writer on business analytics, and rejoinders by Piatetsky and Shlomo Engelson Argamon, director of Illinois Institute of Technology’s new Master in Data Science program. Delighted, I convened via email a virtual panel on data science education.
Brown didn’t mince words: “More than thirty percent of student loans are now delinquent. Over 500 schools now have student loan default rates higher than their graduation rates. I have lost count of the number of my personal acquaintances who have obtained advanced degrees with the aim of career advancement, yet found that advancement eluded them. Much of what’s on offer these days is of poor quality, and even academically excellent programs are no assurance of concrete rewards.”
Piatetsky disagreed, arguing that the proliferation of these programs is “a reasonable response to industry demand” and predicting that “market forces will take care of the weeding out of weaker degrees.” Argamon, for his part, had strong words of his own: “The development of data science degree programs is absolutely necessary. We all know that there is a clear need now and in the foreseeable future for many more data science professionals. These people need to have the strong statistical knowledge, software engineering ability, and communication skills which only a well-designed multidisciplinary program can provide. This distinct body of knowledge and skills cannot be conveyed by traditional educational programs such as statistics or computer science.”
Indeed, there is no doubt that the proliferation of these graduate programs is driven by various estimates of the meager current supply of, and the high expected future demand for, people skilled at data analysis (see MGI and Gartner). Yet Brown warns against jumping to conclusions regarding the expected return on investment in these programs and points to the lack of agreement on what constitutes data science: “Money is not the only reward of education, yet it is surely the primary selling point used to market data science programs, and the primary motivator for students. But there’s no clear definition of data science and no clear understanding of what knowledge employers are willing to pay for, or how much they will pay, now or in the future. Already I know many competent, diligent data analysts who are unemployed or underemployed. So, I am highly skeptical that the students who will invest their time and money in data science programs will reap the rewards they have been led to expect.”
Argamon agrees that “there are as yet no standards for data science curriculum content,” which leads to differences in the quality and nature of the educational experience in the programs on offer today. But unlike Brown, he finds a silver lining in the lack of consensus regarding data science and thinks that the very proliferation of these programs “shines the attention on data science education that will help us evolve such standards.”
There is more or less a consensus about the lack of consensus regarding a clear definition of data science. And there is no lack of complaints about the murky situation. See, for example, Robin Bloor’s “Data Science Rant”–“These two words, when conjoined, are utterly misleading.” Bloor’s rant is “philosophical,” but I can share empirical evidence about both the current fuzziness and the prominence of the term. I learned from a recent Department of Defense DARPA announcement, asking for proposals to investigate “the national security threat posed by public data,” that there are “principles of data science,” presumed to be understood by all potential proposers (the exact language is “Based on principles of data science, develop tools to characterize and assess the nature, persistence, and quality of the data”). Intrigued, I emailed Dr. Christopher White, the “Topic Author,” and asked what he meant by “Principles of Data Science” and whether he could point me to a relevant reference. His answer: “My usage was general, like ‘natural science.’ There isn’t a specific reference.”
Just as the lack of clarity and established principles does not stand in the way of using “data science” in requests for proposals, it has not slowed the sprouting of new graduate programs with “data science” in their names. In addition to the one at the Illinois Institute of Technology (IIT), there are, for example, the Master of Information and Data Science from UC Berkeley’s iSchool (online only), the MSc in Data Science and Management from Imperial College (London), the MS in Data Science from New York University, the Graduate Certificate of Advanced Studies in Data Science from Syracuse University’s School of Information Studies, and the Certification of Professional Achievement in Data Sciences from Columbia’s Institute for Data Sciences and Engineering.
Until the launch of these programs this year, “Business Analytics” was the most popular term in the names of graduate programs in this field. I asked IIT’s Argamon why they went with “data science” instead. “‘Data scientist’ as a job title is significantly trending upwards,” Argamon says, “while ‘analyst’ is trending downwards (as is the more specific title ‘data analyst’). A second consideration (perhaps related to these trends) is that ‘analytics’ is both too specific as well as too general a term. Data analysis is just one part of data science… In our view, the term ‘data science’ more clearly delineates the field we seek to teach: technically rigorous statistics, machine learning, and scalable computation, together with practical training in working with data, developing meaningful visualizations, and communicating usefully with a variety of non-technical stakeholders.”
Are we seeing the eclipse of a term—“business analytics”—which became a buzzword only six or seven years ago? The sudden emergence in 2007 of graduate programs in “business analytics” mentioned above can be attributed to a single paper that launched the term into buzzwordom: Tom Davenport’s Harvard Business Review article “Competing on Analytics,” published in January of 2006.
But “business analytics,” which replaced “data mining” as the term of choice for describing the analysis of data, was soon to be eclipsed by the rise of new players, new tools, and new technologies. “Big Data” changed the landscape. The “old” guard tried for a while to jump on the new bandwagon by using “big data analytics,” but the new players persisted and eventually took over the discussion with a new term: Data Science.
In October 2012, Davenport (with DJ Patil) heralded again the dawn of a new era by publishing “Data Scientist: The Sexiest Job of the 21st Century,” again in the Harvard Business Review. Davenport was also instrumental earlier in his career in the success of “business process re-engineering” and “Enterprise Resource Planning (ERP),” so his ability to call a major trend and his timing are indisputable. Guided by Davenport’s uncanny intuition (ignoring his hedging of bets with Analytics 3.0), and assuming that the rise of one buzzword signals the beginning of the decline of the one it is replacing, I conclude that the half-life of a buzzword (at least in the case of “analytics” or “business analytics”) is exactly 81 months, the time that passed between the publication of Davenport’s article on analytics and the one on data scientists.
“Data Science” could disappear, to be eclipsed by the next buzzword. But it may stick around, just like another incongruous joining of two words, “computer science,” did. This will happen, I predict, only if it becomes, like computer science, an established and well-defined discipline, with its own academic departments and all the necessary paraphernalia such as professional associations, journals, conferences, and awards.
[Originally published on Forbes.com]