Most companies collect and use text data in some part of their business operations. Some, such as Yelp and Twitter, have text data at the core of their platform, while most others use it behind the scenes to triage and respond to support requests and customer feedback. Top companies have achieved impressive results by switching to deep learning methods for text analysis. Companies making this shift, though, typically encounter a set of challenges, including determining which models to spend their time and money on, how to validate and explain model performance, and how model complexity affects ease of deployment.
Drawing on research gathered from conversations with 75+ teams from Google, Facebook, Amazon, Twitter, Salesforce, Airbnb, Capital One, Bloomberg, and others, Emmanuel Ameisen and Yan Kou share a guide for moving your company from traditional machine learning approaches, such as logistic regression on bag-of-words features, to more expressive deep learning models, such as convolutional neural networks and recurrent neural networks. These newer techniques allow companies to improve on many of the core algorithmic tasks that underlie a majority of key business operations, such as clustering (e.g., to identify topics in articles) and classification (e.g., to automatically forward support requests to the appropriate person). You’ll learn the trade-offs of different models in terms of power, complexity, and interpretability and understand how to choose the ones most appropriate for your projects.
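For readers unfamiliar with the traditional baseline named above, here is a minimal sketch of logistic regression on bag-of-words features, applied to routing support requests. It is an illustration only, not the speakers' implementation: the six tickets, the billing/technical labels, and every function name (`tokenize`, `build_vocabulary`, `vectorize`, `train`, `predict_proba`) are invented for this example, and the training loop is plain stochastic gradient descent written with the standard library.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:"

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def build_vocabulary(texts):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, epochs=500, lr=0.5):
    """Fit weights and bias with per-example gradient steps on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            score = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(score) - label
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def predict_proba(text, vocab, weights, bias):
    """Probability that a ticket belongs to the positive (billing) class."""
    features = vectorize(text, vocab)
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)

# Hypothetical toy data: 1 = billing request, 0 = technical request.
train_texts = [
    "I was charged twice on my invoice",
    "Please refund my last payment",
    "My invoice shows the wrong payment amount",
    "The app crashes when I log in",
    "I cannot log in after the update",
    "The page crashes on upload",
]
train_labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocabulary(train_texts)
X = [vectorize(text, vocab) for text in train_texts]
weights, bias = train(X, train_labels)

for ticket in ["refund my invoice payment", "the app crashes after update"]:
    p = predict_proba(ticket, vocab, weights, bias)
    print(f"{ticket!r}: P(billing) = {p:.2f}")
```

Each learned weight corresponds to exactly one word, which is why this kind of model is easy to inspect. It also ignores word order entirely, which is one of the limitations that convolutional and recurrent models are meant to address.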
Emmanuel Ameisen, a machine learning engineer at Stripe, implemented and deployed predictive analytics and machine learning solutions for Local Motion and Zipcar. Recently, he led Insight Data Science’s AI program, directing more than a hundred machine learning projects. Emmanuel holds graduate degrees in artificial intelligence, computer engineering, and management from three of France’s top schools.
Yan Kou is the director of product at Insight Data Science, which during her tenure launched the first professional education program on the US market for data science in healthcare. Over the past two years, Yan has directed 80+ data science projects on topics including consumer genomics, electronic medical records, natural language processing, deep learning, medical images, and wearables. Yan’s team is an official partner of Y Combinator and has partnered with many leading healthcare organizations, including Massachusetts General Hospital, Optum, the Broad Institute, Flatiron Health, and Biogen. Yan has a background in human genomics and five years of experience in data science and machine learning. Her research on complex human diseases such as cancer and autism has resulted in more than 2,000 citations. Yan was named to Forbes’s 30 Under 30 list in 2013.
©2018, O'Reilly Media, Inc.