Forecasting user activities is one of the most common problems that business groups face. Various industries, such as cloud solution providers and online retailers, often need to forecast product consumption and user engagement; the data scientist team under Azure Identity faces the same challenges. With billions of user activities on the Azure cloud, accurately forecasting user behavior is essential to business planning. However, statistical time series models and machine learning approaches such as tree-based models fail to produce forecasts with satisfactory accuracy. Recently developed deep learning models can boost performance, but they remain hard to implement.
Tao Lu and Chenhui Hu focus on forecasting customer usage on a daily basis. In this use case study, they explain how Microsoft improved forecasting accuracy by 25% with dilated convolutional neural networks (CNNs) and reduced development time by 80% with a set of time series forecasting best practices.
Dilated CNNs, recently proposed for modeling sequence data, can achieve state-of-the-art performance in time series forecasting and are easier to train than recurrent neural networks. To speed up the model development process, Microsoft leverages the best practices of time series forecasting that another Microsoft team has created. This is a framework that provides standard workflows of various methods (e.g., statistical methods, traditional machine learning methods, and recently developed deep learning approaches). It also includes reusable utility functions for data standardization and feature engineering. With such utility functions, Microsoft converts the dataset into a standard format. Then, it implements the dilated CNN model and other baseline models by following the templates in the framework. Moreover, the company tunes the hyperparameters of each model with HyperDrive in Azure Machine Learning to achieve a fair comparison of model performance in terms of accuracy, running time, and cost.
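To make the idea of dilation concrete, here is a minimal pure-Python sketch of a causal dilated 1D convolution and the receptive field of a stack of such layers. It is an illustration only, not the model from the talk: the function names, the kernel size of 2, and the doubling dilation schedule (1, 2, 4, 8) are all assumptions chosen for the example.

```python
from typing import List, Sequence


def dilated_causal_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int
) -> List[float]:
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Values before the start of the series are treated as zero, so each
    output depends only on the current and earlier inputs (causal).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def stack_dilated_layers(
    x: Sequence[float], weights: Sequence[float], dilations: Sequence[int]
) -> List[float]:
    """Apply one layer per dilation rate, reusing the same (hypothetical) weights."""
    h = list(x)
    for d in dilations:
        h = dilated_causal_conv1d(h, weights, d)
    return h


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of past time steps (including the current one) seen by the top layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Assumed example settings: kernel size 2, dilation doubling at each layer.
dilations = [1, 2, 4, 8]
print(receptive_field(2, dilations))

# A unit impulse at t=0 shows how far one input value propagates forward.
impulse = [1.0] + [0.0] * 19
print(stack_dilated_layers(impulse, [1.0, 1.0], dilations))
```

With these assumed settings, four layers cover 16 daily observations, whereas four undilated layers with the same kernel size would cover only 5. This growth in context without extra parameters per layer is one reason dilated convolutions suit long time series.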
Tao Lu is a data scientist in the Cloud and AI organization at Microsoft. He has a strong background in applying machine learning and deep learning techniques to forecasting problems and deep domain knowledge in cloud identity and the financial services industry. He graduated from the University of Washington with a master’s degree in computational finance.
Chenhui Hu is a data scientist in the Cloud and AI Division at Microsoft. He works on developing machine learning and big data solutions for internal and external customers. His current interests include time series forecasting, natural language processing, large-scale machine learning, and deep learning. He received his PhD from Harvard University with a focus on imaging data mining and signal processing. He has presented his work at many international conferences, including O’Reilly AI Conference 2019 and O’Reilly Strata 2019.