Five key trends will likely cause a significant uptick in enterprise use of machine learning this year, according to Deloitte Global.
Large and medium-sized enterprises will step up their use of machine learning in 2018, Deloitte Global predicts, doubling the number of implementations and pilot projects that were underway in 2017. By 2020, that number will likely have doubled again. And as enabling technologies such as machine learning APIs and specialized hardware become available in the cloud, these advances will likely be increasingly within reach for small companies as well.
Machine learning is an artificial intelligence (AI), or cognitive, technology that uses data to enable systems to learn and improve from experience without being programmed explicitly. Despite considerable excitement and aggressive forecasts for these technologies, most enterprises using machine learning last year had only a handful of deployments and pilots underway, according to a 2017 Deloitte Consulting LLP survey.
By the end of this year, however, over two-thirds of large companies working with machine learning will likely have 10 or more implementations and a similar number of pilots, according to Deloitte Global. That growth will likely be made possible largely by progress in five key areas.
Increasing automation. Data scientists are typically the professionals responsible for putting machine learning to work within the enterprise, but their specialized skills are in high demand, and their time is generally in short supply. The good news is that much of what data scientists spend their time on—including data wrangling, exploratory data analysis, feature engineering, feature selection, predictive modeling, and model selection—can now be wholly or partially automated.
A growing number of tools and techniques for data science automation, offered by established companies as well as venture-backed startups, could help shrink the time required to execute a machine learning proof of concept from months to days. By making data scientists more productive, this growing automation will likely enable enterprises to increase their machine learning activities considerably.
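To make the idea concrete, here is a minimal sketch of one automatable step—model selection—using scikit-learn's GridSearchCV. The library, the toy data, and the parameter grid are our choices for illustration; they stand in for, and are far simpler than, the commercial automation tools the prediction refers to.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data standing in for an enterprise data set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Grid search automates part of model selection: it evaluates every
# hyperparameter combination with cross-validation and keeps the best,
# work a data scientist would otherwise do by hand.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 50], "max_depth": [3, None]},
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```

Tools in this category extend the same principle to feature engineering and algorithm choice, which is how a proof of concept can shrink from months to days.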
Less data required. Training a machine learning model can require millions of data elements, and acquiring and labeling them can be costly and time-consuming. Consider a model that learns from MRI images labeled with diagnoses, for example. It might cost more than $30,000 to hire a radiologist to review and label 1,000 images at a rate of six images an hour. Privacy and confidentiality concerns can also make it difficult to obtain the data in the first place.
Several emerging techniques, however, could reduce the amount of training data required. One involves the use of synthetic data, generated algorithmically to mimic the characteristics of the real data. A team at Deloitte Consulting tested a tool that was able to build an accurate model with only a fifth of the training data previously required; it synthesized the remaining 80 percent.
Synthetic training data can also open the door to crowdsourcing data science solutions. For instance, researchers at MIT used a real data set to create synthetic alternatives that could be used to crowdsource the development of predictive models without needing to disclose the original data set. In 11 out of 15 tests, the models developed from the synthetic data performed as well as those trained on the real data.
Another technique that could reduce the need for training data is transfer learning. With this approach, a machine learning model is pre-trained on one data set as a shortcut to learning a new data set in a similar domain, potentially reducing the number of training examples needed by several orders of magnitude.
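The mechanics can be sketched with a linear model that supports incremental fitting; warm-starting a scikit-learn SGDClassifier is our simple stand-in for the deep-network pre-training the prediction describes, and the two synthetic "tasks" are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# "Source" task: plenty of labeled data from a similar domain.
X_src, y_src = make_classification(n_samples=2000, n_features=20,
                                   random_state=0)
# "Target" task: the same feature space, but only a handful of labels.
X_tgt, y_tgt = make_classification(n_samples=40, n_features=20,
                                   random_state=1)

clf = SGDClassifier(random_state=0)

# Pre-train on the abundant source data...
clf.partial_fit(X_src, y_src, classes=np.array([0, 1]))

# ...then fine-tune on the small target sample, starting from the
# pre-trained weights rather than from scratch.
clf.partial_fit(X_tgt, y_tgt)

print(clf.score(X_tgt, y_tgt))
```

In deep learning, the same pattern—reuse learned representations, then fine-tune—is what allows a few hundred labeled examples to substitute for millions.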
Faster training. Also alleviating the training burden is the development of specialized hardware by many established and startup hardware manufacturers. This could dramatically increase the use of machine learning by enabling applications to consume less power while being more responsive, flexible, and capable.
Graphics processing units, or GPUs, have generally been the most common kind of machine learning chip in the past, and although they will likely continue to dominate in 2018, they have growing competition. By the end of 2018, more than 25 percent of all chips used to accelerate machine learning in the data center will likely be field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs), Deloitte Global predicts.
By speeding up the calculations and data transfer within the chip, these new contenders can slash the time required to train machine learning models, which in turn can bring down the associated costs. Early adopters of these specialized AI chips include several major technology vendors and research institutions, but adoption is spreading to sectors such as retail, financial services, and telecom as well.
More transparent results. Machine learning achievements get more impressive by the day, but many models suffer from a critical flaw: They are black boxes, meaning it is difficult or impossible to explain with confidence how they make their decisions. This makes them unsuitable or unpalatable for many applications, for reasons ranging from trust in the answers to regulatory compliance.
Some new techniques can help shine light on such models. MIT researchers, for instance, have demonstrated a method of training a model so that it delivers not just accurate predictions but also the rationales behind them.
As it becomes possible to build more interpretable models, companies in highly regulated industries such as financial services and life sciences can be expected to intensify their use of machine learning and significantly expand the number of pilots and deployments in the coming years. Some of the potential applications include credit scoring, recommendation engines, fraud detection, and disease diagnosis and treatment.