In the past few months, NLP has witnessed several breakthroughs in transfer learning, namely ELMo, the OpenAI Transformer, universal language model fine-tuning (ULMFiT), and BERT. Pretrained models derived from these techniques have achieved state-of-the-art results on a wide range of NLP problems. The use of pretrained models has come a long way since the introduction of Word2vec and GloVe, and these two approaches are now considered shallow in comparison.
David Low introduces you to the history of transfer learning, which has been a tremendous success in the computer vision field because of the ImageNet competition. He explains why pretrained models are handy for tackling machine learning problems with limited data, and you’ll understand how they can be used as fixed feature extractors for downstream tasks and applications.
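As a rough illustration of the "fixed feature extractor" idea, here is a minimal Python sketch. It is not the speaker's code: the tiny `PRETRAINED` table is a hypothetical stand-in for real pretrained representations (such as those from ELMo or BERT), and its values are made up. The point it tries to show is that the pretrained part stays frozen while only a small classifier on top is trained on a handful of labeled examples.

```python
import math

# Hypothetical stand-in for pretrained word vectors (toy values, not from a real model).
PRETRAINED = {
    "great": [0.9, 0.1],
    "love": [0.8, 0.2],
    "good": [0.7, 0.3],
    "bad": [-0.8, 0.2],
    "awful": [-0.9, 0.1],
    "broken": [-0.7, 0.3],
    "product": [0.0, 0.5],
    "the": [0.0, 0.1],
}
DIM = 2


def extract_features(text):
    """Frozen feature extractor: average the pretrained vectors of known words."""
    vecs = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vecs:
        return [0.0] * DIM
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(samples, epochs=200, lr=0.5):
    """Train only a logistic-regression head; PRETRAINED is never updated."""
    w = [0.0] * DIM
    b = 0.0
    # Features are computed once because the extractor is fixed.
    feats = [(extract_features(text), label) for text, label in samples]
    for _ in range(epochs):
        for x, y in feats:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict(text, w, b):
    """Return 1 for positive sentiment, 0 for negative."""
    x = extract_features(text)
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Four labeled examples stand in for a small downstream training set.
train = [
    ("great product", 1),
    ("love the product", 1),
    ("awful product", 0),
    ("the product broken", 0),
]
w, b = train_classifier(train)
print(predict("good product", w, b), predict("bad product", w, b))
```

The words "good" and "bad" never appear in the training examples; the classifier can still handle them only because the frozen vectors already place them near "great" and "awful". Fine-tuning, by contrast, would also update the pretrained parameters.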
David walks you through the code and steps to fine-tune a transfer learning model so that it achieves state-of-the-art accuracy (92%) on a real-world sentiment classification problem: the Amazon reviews dataset. It takes just 1,000 training samples to produce a model that performs similarly to a fastText-based model trained on the full dataset of 3.6 million samples.
David Low is the cofounder and chief data scientist at Pand.ai, a company building an AI-powered chatbot to disrupt and shape the booming conversational commerce space with deep natural language processing. He represented Singapore and the National University of Singapore (NUS) in the 2016 Data Science Games held in France and clinched the top spot among Asian and American teams. David has been invited as a guest lecturer by NUS to conduct master classes on applied machine learning and deep learning topics. Throughout his career, David has worked on data science projects across the manufacturing, telco, ecommerce, and insurance industries, including sales forecast modeling and influencer detection; this work won him awards in several competitions and was featured on the website of the Infocomm Development Authority (IDA) of Singapore and in the NUS publication. Previously, he was a data scientist at IDA and was involved in research collaborations with Carnegie Mellon University (CMU) and Massachusetts Institute of Technology (MIT) on projects funded by the National Research Foundation and SMART. He competes on Kaggle and holds a top 0.2% worldwide ranking.
©2019, O'Reilly Media, Inc.