Presented by O’Reilly and Intel AI

Put AI to Work
June 18-21, 2019
Beijing, China

AI Conference 2019 Speakers

See the list of speakers from the AI Conference Beijing 2018. New speakers are added regularly, so please check back for the latest updates.

Sarah Aerni is a Director of Data Science at Salesforce Einstein, where she leads teams building AI-powered applications using autoML. Prior to Salesforce she led the healthcare & life science and Federal teams at Pivotal. Sarah obtained her PhD from Stanford University in Biomedical Informatics, performing research at the interface of biomedicine and machine learning. She also co-founded a company offering expert services in informatics to both academia and industry.

Presentations

Achieving Salesforce-Scale Machine Learning in Production (40-minute session)

At Salesforce Einstein, data science is an agile partner to over 100,000 customers. How do we achieve this scale? We share lessons learned in business, technology, and process along the way. Through use cases, oft-missed foundational elements for deployment, and the evaluations that must happen along the way, we will share how to achieve and sustain models in production, and where to go from there.

Jesse Anderson is a Big Data Engineering expert and trainer.

Presentations

Professional Kafka development (2-day Training)

Jesse Anderson leads a deep dive into Apache Kafka. You'll learn how Kafka works and how to create real-time systems with it. You'll also discover how to create consumers and publishers in Kafka and how to use Kafka Streams, Kafka Connect, and KSQL as you explore the Kafka ecosystem.

Chris Butler is the director of AI at Philosophie, where he leads the firm in human-centered AI engagements. Chris has over 19 years of product and business development experience at companies like Microsoft, KAYAK, and Waze. He was first introduced to AI through graph theory and genetic algorithms while studying computer systems engineering at Boston University and has worked on AI-related projects at his startup Complete Seating (data science and constraint programming), Horizon Ventures (advising portfolio companies like Affectiva), and Philosophie (AI consulting and coaching). He has created techniques like empathy mapping for the machine and confusion mapping to create cross-team alignment while building AI products.

Presentations

Design Thinking for AI (3-hour Tutorial)

Purpose, a well-defined problem, and trust from people are important factors in any system, especially those that employ AI. Chris Butler leads you through exercises that borrow from the principles of design thinking to help you create more impactful solutions and better team alignment.

Yijing Chen is a senior data scientist in the Cloud AI Group at Microsoft, where she works with external customers in areas such as energy demand forecast, user mobile behavioral analysis, retail demand forecast, energy theft detection, product pricing, and medical claim denial prediction as well as on other projects using various machine learning methods. Yijing holds an MA in statistics from Harvard University.

Presentations

Deep Learning for Time Series Forecasting (3-hour Tutorial)

Almost every business today uses forecasting to make better decisions and allocate resources more effectively. Deep learning has achieved a lot of success in computer vision, text, and speech processing, but has only recently been applied to time series forecasting. In this tutorial we show how and when to apply deep neural networks to time series forecasting. The tutorial will be delivered in Chinese and English.

Chin is a data engineer at Rakuten who originated and led the team building the data science platform.

Presentations

Best practice of building data science platform in Rakuten (40-minute session)

Data Science Platform is a suite of tools for exploring data, training models, and running GPU/CPU compute jobs in an isolated container environment. It provides one-click creation of machine learning environments, a powerful job scheduler, and a flexible "function as a service" component. It runs on Kubernetes and supports on-premises, cloud, and hybrid environments.

Bin Fan is a software engineer at Alluxio and a PMC member of the Alluxio project. Previously, Bin worked at Google building next-generation storage infrastructure, where he won Google’s Technical Infrastructure award. He holds a PhD in computer science from Carnegie Mellon University.

Presentations

AVA: a Cloud-Native Deep Learning Platform at Qiniu (40-minute session)

Atlab Lab at Qiniu Cloud focuses on deep learning for computer vision. Our team has built AVA, a high-performance, cost-effective, cloud-based training platform for deep learning. AVA deeply integrates an open source software stack, including TensorFlow, Caffe, and Alluxio, with KODO, our own cloud object storage.

Bas Geerdink is a programmer, scientist, and IT manager at ING, where he is responsible for the fast data systems that process and analyze streaming data. Bas has a background in software development, design, and architecture with broad technical experience from C++ to Prolog to Scala. His academic background is in artificial intelligence and informatics. Bas’s research on reference architectures for big data solutions was published at the IEEE conference ICITST 2013. He occasionally teaches programming courses and is a regular speaker at conferences and informal meetings.

Presentations

AI at ING: the why, how, and what of a data-driven enterprise (40-minute session)

AI is at the core of ING’s business. We are a data-driven enterprise, with ‘analytics skills’ as a top strategic priority. We are investing in AI, big data, and analytics to improve business processes such as balance forecasting, fraud detection, and customer relationship management. In this talk, Bas gives an overview of the use cases and technology to inspire the audience.

Chenhui Hu is a Data Scientist in the Cloud AI organization at Microsoft. His current interests include retail forecasting, inventory optimization, IoT data, and deep learning. He received his PhD from Harvard University, with a thesis on biomedical imaging data mining. He also has research experience in wireless networks and network data analysis. He is a recipient of the third IEEE ComSoc Asia-Pacific Outstanding Paper Award.

Presentations

Forecasting Customer Activities with Dilated Convolution Neural Networks: Use Case and Best Practices (40-minute session)

Forecasting customer activities is one of the most important and common business problems. On the Microsoft Azure Identity team, we forecast customer behavior based on billions of user activities. We will share how we improved forecasting accuracy by 25% with dilated convolutional neural networks and reduced development time by 80% with best practices for time series forecasting.

Alex Ingerman leads the product management team at Google Research, focusing on federated learning and other privacy-preserving technologies for machine learning. He joined Google in 2016 after working on products including an ML-as-a-service platform for developers, web-scale search, a content recommendation system, and immersive data-exploration environments. Alex holds a BS in computer science and an MS in medical engineering.

Presentations

The future of machine learning is decentralized (40-minute session)

Federated Learning is the approach of training ML models across a fleet of participating devices, without collecting their data in a central location. Alex Ingerman introduces Federated Learning, compares the traditional and federated ML workflows, and explores the current and upcoming use cases for decentralized machine learning, with examples from Google's deployment of this technology.

Jewel James is a product analyst at Go-jek.

Presentations

Using ML for personalizing Food Recommendations (40-minute session)

The story of how we prototyped a search framework that personalizes restaurant search results, using ML to learn what constitutes a relevant restaurant given a user's purchasing history.

Yangqing Jia is director of engineering for Facebook’s AI platform team, which develops general-purpose open source AI solutions that serve as the backbone of Facebook AI products, such as ranking, computer vision, natural language processing, speech recognition, mobile AI, and AR. He has been influential in developing an open source deep learning software stack, many of the components of which serve as the de facto industry standard in AI. He is the creator or cocreator of Caffe, TensorFlow, Caffe2, ONNX, and PyTorch 1.0. Lately, he has been focused on the design and evolution of the AI hardware and software ecosystem and the combination of AI research and conventional wisdom of computer science.

Presentations

Keynote with Yangqing Jia (Keynote)

Keynote with Yangqing Jia

Jialin Jiao is an experienced technical veteran in HD maps, autonomous driving, machine learning, search, and software engineering. He specializes in research and development in location-based services (LBS), HD maps, big data, machine learning, artificial intelligence, and search. He has worked at Uber (US headquarters), Microsoft's Bing search engine, and IBM TJ Watson Research. He currently works for Pony.ai, a startup building autonomous cars. He holds a bachelor’s degree in computer science from Sun Yat-sen University, a master’s degree in computer science from Shanghai Jiao Tong University, and a master’s degree in electronic engineering from the University of Michigan – Dearborn.
Mr. Jiao is an IEEE member, as well as a founding member of IEEE Computer Society’s Special Technical Community in Autonomous Driving (https://stc.computer.org/autonomousdriving/leadership/).

Presentations

Confidence Estimation for Deep Neural Networks (40-minute session)

While deep learning has been at the center of AI with unprecedented results, the predictions of deep neural networks usually do not come with a reliable, well-calibrated confidence score. Wrong but confident predictions pose great threats to critical real-life applications such as self-driving cars. This talk is a tutorial on, and comparison of, confidence estimation methods for deep neural networks.

Jing (Nicole) is a data scientist who is experienced with a range of machine learning and deep learning models, works with big data, and turns data and models into products and services that drive the business.

Presentations

Real-time product recommendations leveraging deep learning on Apache Spark in Office Depot (40-minute session)

This talk showcases how to build efficient recommender systems for the e-commerce industry using deep learning technologies.

Chaoguang has worked in distributed systems for more than 10 years. At IBM he worked on the first generation of the SSD tiered storage DS8000, and he was then chief architect of the all-flash storage Dorado Cache at Huawei. He currently leads the deep learning platform at Qiniu.

Presentations

AVA: a Cloud-Native Deep Learning Platform at Qiniu (40-minute session)

Atlab Lab at Qiniu Cloud focuses on deep learning for computer vision. Our team has built AVA, a high-performance, cost-effective, cloud-based training platform for deep learning. AVA deeply integrates an open source software stack, including TensorFlow, Caffe, and Alluxio, with KODO, our own cloud object storage.

Zhichao Li is a senior software engineer at Intel focused on distributed machine learning, especially large-scale analytical applications and infrastructure on Spark. He’s also an active contributor to Spark. Previously, Zhichao worked in Morgan Stanley’s FX Department.

Presentations

Analytics Zoo: Distributed TensorFlow and Keras on Apache Spark (3-hour Tutorial)

In this tutorial, we will show how to build and productionize deep learning applications for big data using Analytics Zoo (https://github.com/intel-analytics/analytics-zoo), a unified analytics + AI platform that seamlessly unites Spark, TensorFlow, Keras, and BigDL programs into an integrated pipeline. The tutorial draws on real-world use cases from JD.com, MLSListings, World Bank, Baosight, Midea/KUKA, and others.

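Ying Liu is a senior risk control manager at Abakus and co-founder of Abakus Kids, the group's science and technology education brand. She holds a bachelor's degree from Wuhan University and a master's degree in electronic engineering from Memorial University of Newfoundland in Canada. During her graduate studies abroad she worked as a researcher, published two SCI papers in radar signal processing, and attended three international academic conferences. After graduating at the end of 2014, she spent a year at Malvern College Qingdao teaching IGCSE/A-Level mathematics in English to L5-U6 students, and organized a youth programming club. In 2016 she joined Abakus (formerly Wecash) and worked for a year as a data scientist on feature engineering for internet finance credit assessment and on developing online models. Since 2017 she has been a senior risk control manager, focused on the company's data management and on mining the business value of data. She works closely with funding partners and with the finance, product operations, collections, and HR teams, designing A/B experiments to help them reduce costs and improve efficiency, in the hope of turning data science into broader business value.

Like all her peers in data science, she is eager to see AI technology take hold in more fields and make an impact. She also wants to encourage more women to join this exciting work. When designing AI products, as described in the book "The Design of Everyday Things," she advocates more humanistic care alongside the pursuit of rational efficiency: solving problems in a humane way, jointly improving the methodology for putting AI into practice, and benefiting people's future lives.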

Ying (Claire) Liu is a senior risk management manager at Abakus and co-founder of Abakus Kids (a new edtech brand). She received her B.Eng. in radio physics (radio wave propagation and antennas) from Wuhan University in 2007. She also completed an M.Eng. in electrical and computer engineering at Memorial University of Newfoundland in Canada in 2014.

* As a research student, she published two SCI papers and attended several international conferences on radar signal processing between 2012 and 2014.
* Back in China, she taught IGCSE/A-Level math to L5-U6 students and ran a web development club at Malvern College Qingdao in 2015.
* From 2016 to 2017, she worked as a data scientist at Abakus Group (Wecash China) on everything from feature engineering to model deployment, witnessing AI technology accelerate online lending in China.
* Since then, she has acted as both data officer and product manager: she has dug into the architecture of the company's data management platform, shared data knowledge across the company, and worked closely with the product, operations, finance, and collections teams to help people create more business value with data.
* In a second role, she is a cofounder of Abakus Kids, an education technology startup founded in 2018.

Her LinkedIn profile:
https://www.linkedin.com/in/claire-ying-liu-28948086/

Presentations

A Humane AI Solution to Improve Debt Collection (40-minute session)

Abakus's AI debt collection platform provides a friendly and humane product solution designed for the organization's frontline live agents. Agent training proceeds more smoothly in an AI-friendly culture. Our experiments showed that the performance of collection assistants improved significantly.

David Low is the co-founder and chief data scientist at Pand.ai, building AI-powered chatbots to disrupt and shape the booming conversational commerce space with deep natural language processing. He represented Singapore and the National University of Singapore (NUS) in Data Science Game’16 in France and took the top spot among teams from Asia and America. Recently David has been invited by NUS as a guest lecturer to conduct masterclasses on applied machine learning and deep learning topics. Prior to Pand.ai, he was a data scientist with the Infocomm Development Authority (IDA) of Singapore.

Throughout his career, David has engaged in data science projects in industries ranging from manufacturing, telco, and e-commerce to insurance. Some of his work, including sales forecast modeling and influencer detection, has won him awards in several competitions and has been featured on the IDA website and in an NUS publication. Earlier in his career, David was involved in research collaborations with Carnegie Mellon University (CMU) and the Massachusetts Institute of Technology (MIT) on separate projects funded by the National Research Foundation and SMART. As a pastime, he competed on Kaggle and achieved a top 0.2% worldwide ranking.

Presentations

The Unreasonable Effectiveness of Transfer Learning on NLP (40-minute session)

Transfer learning has proven to be a tremendous success in computer vision as a result of the ImageNet competition. In the past months, the natural language processing field has witnessed several breakthroughs with transfer learning, namely ELMo, Transformer, ULMFiT, and BERT. In this talk, David showcases the use of transfer learning in NLP applications with state-of-the-art accuracy.

Tao Lu is a Data Scientist in the Cloud and AI organization at Microsoft. He has a strong background in applying machine learning and deep learning techniques to forecasting problems, and deep domain knowledge of cloud identity and the financial services industry. He graduated from the University of Washington with a master's degree in Computational Finance.

Presentations

Forecasting Customer Activities with Dilated Convolution Neural Networks: Use Case and Best Practices (40-minute session)

Forecasting customer activities is one of the most important and common business problems. On the Microsoft Azure Identity team, we forecast customer behavior based on billions of user activities. We will share how we improved forecasting accuracy by 25% with dilated convolutional neural networks and reduced development time by 80% with best practices for time series forecasting.

Zhenxiao Luo is an engineering manager at Uber, where he runs the interactive analytics team. Previously, he led the development and operations of Presto at Netflix and worked on big data and Hadoop-related projects at Facebook, Cloudera, and Vertica. He holds a master’s degree from the University of Wisconsin-Madison and a bachelor’s degree from Fudan University.

Presentations

Query the planet: Geospatial big data analytics at Uber (40-minute session)

One of the distinct challenges for Uber is analyzing geospatial big data. Locations and trips provide insights that can improve business decisions and better serve users. Geospatial data analysis is particularly challenging, especially in a big data scenario. For these analytical requests, we must achieve efficiency, usability, and scalability in order to meet user needs and business requirements.

Robert Nishihara is a fourth-year PhD student working in the UC Berkeley RISELab with Michael Jordan. He works on machine learning, optimization, and artificial intelligence.

Presentations

Building reinforcement learning models and AI applications with Ray (3-hour Tutorial)

Ray is a general purpose framework for programming your cluster. We will lead a deep dive into Ray, walking you through its API and system architecture and sharing application examples, including several state-of-the-art AI algorithms.

Richard Ott is a data scientist in residence at the Data Incubator, where he gets to combine his interest in data with his love of teaching. Previously, he was a data scientist and software engineer at Verizon. Rich holds a PhD in particle physics from the Massachusetts Institute of Technology, which he followed with postdoctoral research at the University of California, Davis.

Presentations

Deep Learning with PyTorch (2-day Training)

PyTorch is a machine learning library for Python that allows users to build deep neural networks with great flexibility. Its easy-to-use API and seamless use of GPUs make it a sought-after tool for deep learning. This course will introduce the PyTorch workflow and demonstrate how to use it. Students will be equipped with the knowledge to build deep learning models using real-world datasets.

Vanja Paunic is a data scientist in the Algorithms and Data Science Group at Microsoft London. She works on building machine learning solutions with external companies utilizing Microsoft’s AI Cloud Platform. She holds a PhD in computer science with a focus on data mining in the biomedical domain from the University of Minnesota.

Presentations

Deep Learning for Time Series Forecasting (3-hour Tutorial)

Almost every business today uses forecasting to make better decisions and allocate resources more effectively. Deep learning has achieved a lot of success in computer vision, text, and speech processing, but has only recently been applied to time series forecasting. In this tutorial we show how and when to apply deep neural networks to time series forecasting. The tutorial will be delivered in Chinese and English.

Dmitry Pechyoni is a senior data scientist in the Cloud AI Group at Microsoft, where he works on building end-to-end data science solutions in various domains, including retail, energy management, and predictive maintenance. Previously, he built machine learning models for display advertising at Akamai and MediaMath. Dmitry holds a PhD in theoretical machine learning from the Technion – Israel Institute of Technology.

Presentations

Deep Learning for Time Series Forecasting (3-hour Tutorial)

Almost every business today uses forecasting to make better decisions and allocate resources more effectively. Deep learning has achieved a lot of success in computer vision, text, and speech processing, but has only recently been applied to time series forecasting. In this tutorial we show how and when to apply deep neural networks to time series forecasting. The tutorial will be delivered in Chinese and English.

Prasanth leads the product management team for AI Frameworks at Microsoft. He is one of the founding members of ONNX and is actively involved in the open source community.

Presentations

AI everywhere: Open and interoperable platform for AI with ONNX (40-minute session)

An open and interoperable ecosystem enables you to choose the framework that's right for you, train at scale, and deploy to cloud and edge. ONNX provides a common format supported by many popular frameworks and hardware accelerators. This session provides an introduction to ONNX and its core concepts. The session will be delivered in English and Chinese jointly.

Mark has had an interest in machine learning and artificial intelligence since doing a master's at the University of Toronto in the 80s. He currently works at IBM, where he is responsible for shepherding customers to a variety of database products, including IBM Integrated Analytics System, which includes a full-blown machine learning environment: DSX. Mark’s interests in machine learning include deep learning on structured data and NLP.

Presentations

Using deep learning and time-series forecasting to reduce transit delays (40-minute session)

Toronto is unique among North American cities for having a legacy streetcar network as an integral part of its transit system. This means streetcar delays are a major contributor to gridlock in the city. Using deep learning and time-series forecasting, we'll show how streetcar delays can be predicted... and prevented.

Sujatha Sagiraju is a Group Program Manager in the Azure Cloud & AI group. Her expertise is in building large-scale distributed systems. Her latest mission is accelerating and democratizing artificial intelligence via automated machine learning. She has been at Microsoft since 2001 in various roles, including developer, program manager, and capacity planner. Sujatha is also a diversity & inclusion champion in the Azure AI platform org and is passionate about recruiting, mentoring, and growing diverse talent.

Presentations

Democratizing & Accelerating AI through Automated Machine Learning (3-hour Tutorial)

Intelligent experiences powered by AI can seem like magic to users. Developing them, however, is cumbersome: it involves a series of sequential, interconnected decisions that are time-consuming. What if there were an automated service that identified the best machine learning pipelines for a given problem and data? Automated machine learning does exactly that!

Kaz Sato is a staff developer advocate on the Cloud Platform team at Google, where he leads the developer advocacy team for machine learning and data analytics products such as TensorFlow, the Vision API, and BigQuery. Kaz has been leading and supporting developer communities for Google Cloud for over seven years. He is a frequent speaker at conferences, including Google I/O 2016, Hadoop Summit 2016 San Jose, Strata + Hadoop World 2016, and Google Next 2015 NYC and Tel Aviv, and has hosted FPGA meetups since 2013.

Presentations

ML Ops and Kubeflow Pipeline (40-minute session)

Creating an ML model is just a starting point. To bring the technology into production service, you need to solve various real-world issues such as: building a data pipeline for continuous training, automated validation of the model, version control of the model, scalable serving infra, and ongoing operation of the ML infra with monitoring and alerting.

Alejandro Saucedo is the Chief Scientist at The Institute for Ethical AI & Machine Learning. With over 10 years of software development experience, Alejandro has held technical leadership positions across hyper-growth scale-ups and tech giants including Eigen Technologies, Bloomberg LP and Hack Partners. Alejandro has a strong track record building multiple departments of machine learning engineers from scratch, and leading the delivery of numerous large-scale machine learning systems across the financial, insurance, legal, transport, manufacturing and construction sectors (in Europe, US and Latin America).

Presentations

A practical guide towards explainability and bias evaluation in machine learning (3-hour Tutorial)

Undesired bias in machine learning has become a worrying topic due to numerous high-profile incidents. In this talk we demystify machine learning bias through a hands-on example. We'll be tasked with automating the loan approval process for a company, and we'll introduce key tools and techniques from the latest research that allow us to assess and mitigate undesired bias in our machine learning models.

Maulik Soneji is a Data Engineer at Gojek, where he works with different parts of the data pipelines for a hyper-growth startup. Outside of learning about mature data systems, he is interested in Elasticsearch, Go, and Kubernetes.

Presentations

Using ML for personalizing Food Recommendations (40-minute session)

The story of how we prototyped a search framework that personalizes restaurant search results, using ML to learn what constitutes a relevant restaurant given a user's purchasing history.

Guoqiong Song is a senior deep learning software engineer on the big data technology team at Intel. She has a PhD in atmospheric and oceanic sciences from UCLA, with a focus on numerical modeling and optimization. Her interest is in developing and optimizing distributed deep learning algorithms on Spark.

Presentations

Real-time product recommendations leveraging deep learning on Apache Spark in Office Depot (40-minute session)

This talk showcases how to build efficient recommender systems for the e-commerce industry using deep learning technologies.

Angus Taylor is a data scientist in the Cloud AI Group at Microsoft, where he builds data science solutions for external customers in the retail, energy, engineering, and package distribution sectors. He holds an MSc in AI from the University of Edinburgh.

Presentations

Deep Learning for Time Series Forecasting (3-hour Tutorial)

Almost every business today uses forecasting to make better decisions and allocate resources more effectively. Deep learning has achieved a lot of success in computer vision, text, and speech processing, but has only recently been applied to time series forecasting. In this tutorial we show how and when to apply deep neural networks to time series forecasting. The tutorial will be delivered in Chinese and English.

Jiao (Jennie) Wang is a software engineer on the big data technology team at Intel, where she works in the area of big data analytics. She is engaged in developing and optimizing a distributed deep learning framework on Apache Spark.

Presentations

Real-time product recommendations leveraging deep learning on Apache Spark in Office Depot (40-minute session)

This talk showcases how to build efficient recommender systems for the e-commerce industry using deep learning technologies.

Lu is a data scientist and big data engineer at Office Depot, where he works on machine learning and big data analytics. He is engaged in developing distributed machine learning applications and real-time web services for Office Depot's digital business platform.

Presentations

Real-time product recommendations leveraging deep learning on Apache Spark in Office Depot (40-minute session)

This talk showcases how to build efficient recommender systems for the e-commerce industry using deep learning technologies.

Tiezhen Wang is a senior software engineer at Google.

Presentations

Exciting new features in TensorFlow 2.0 (40-minute session)

TensorFlow 2.0 is a major milestone with a focus on ease of use. This talk gives an in-depth introduction to the exciting new features and best practices. Topics such as distribution strategies and edge deployment (TensorFlow Lite and TensorFlow.js) will also be covered.

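Yiheng Wang is a senior R&D engineer at Tencent Cloud, focused on distributed machine learning, especially building large-scale data analytics platforms on Apache Spark. He is also a major contributor to BigDL, a deep learning framework on Apache Spark. Yiheng previously worked at Intel and Morgan Stanley.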

Presentations

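Sparkling: One-stop machine learning on Apache Spark (40-minute session)

Putting machine learning projects into practice in an enterprise often involves building complex workflows, managing data, and integrating many tools. The task becomes more challenging as data volumes and team sizes grow. Apache Spark is a popular big data framework, widely used for analyzing and processing massive datasets. This session introduces our work on Tencent Cloud to build a one-stop machine learning platform for customers on top of Apache Spark. Topics include ingesting data from multiple sources, building complex data pipelines, understanding data through visualization, using popular machine learning frameworks through a pluggable mechanism, and deploying and monitoring models. We will also share the problems and challenges we encountered along the way. Attendees will learn how this kind of one-stop machine learning, tightly integrated with big data, lets users build and manage their machine learning projects more efficiently and so speeds up the adoption of machine learning in their business.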

Mingxi Wu is the vice president of engineering at TigerGraph, a Silicon Valley-based startup building a world-leading real-time graph database. Over his career, Mingxi has focused on database research and data management software. Previously, he worked in Microsoft’s SQL Server group, Oracle’s Relational Database Optimizer group, and Turn Inc.’s Big Data Management group. Lately, his interest has turned to building an easy-to-use and highly expressive graph query language. He has won research awards from the most prestigious publication venues in database and data mining, including SIGMOD, KDD, and VLDB and has authored five US patents with three more international patents pending. Mingxi holds a PhD from the University of Florida, specializing in both database and data mining.

Presentations

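Unsupervised learning on large-scale graphs: Case studies and an analysis of open source algorithms (40-minute session)

Unsupervised learning on graph data plays a broad and irreplaceable role in unlocking the economic value of big data. PageRank can surface important entities, community detection can find groups that share certain characteristics, and closeness centrality can automatically find individuals far removed from the group. All of these algorithms are unsupervised. We share concrete customer cases to demonstrate their value, and show how to apply these open source algorithms flexibly on big data.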

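Lei Xia is an AI technical architect at Intel China, serving in Intel's data center technical sales group. He focuses on providing technical advice and guidance for customers' innovation as they apply cutting-edge AI technologies, and on providing support for Intel products and technologies.
Lei joined Intel in 2000 and has served as network systems engineer, customer technical manager, channel technical director, cloud computing solution architect, and IoT end-to-end solution architect, supporting the continued technical innovation of China's information industry through the internet, data center, cloud computing, and IoT eras.
Lei holds a bachelor's degree in robotics engineering. Before joining Intel, he held various technology development and technical education positions in the government and education sectors, and he has extensive experience in software algorithms, automatic control, and engineering management.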

Presentations

Low precision inference on Intel Architecture (40-minute session)

Vector Neural Network Instructions (VNNI) is the new Intel instruction set for low-precision AI inference in the next-generation Xeon platform. This lecture introduces the features of VNNI and the Intel software tools that help developers use the new instruction set to accelerate inference with INT8.

Vincent Xie (谢巍盛) is the Chief Scientist and Director of China Telecom BestPay Co., Ltd. He builds the company’s Artificial Intelligence Group and leads the team to carry out research related to big data and A.I. Previously, he worked for Intel leading an engineering team working on machine learning- and big data-related open source technologies.

Presentations

How China Telecom combats financial frauds with Adversarial AutoEncoder? (40-minute session)

We exploit the strong representation capability of the Adversarial AutoEncoder (AAE) in our risk factor modeling to fight a special kind of financial fraud. It is one step in our long stack of unsupervised tasks, yet it has proved efficient and effective in practice.

Hui Xue is currently an associate researcher in the System Group at Microsoft Research Asia (MSRA). She obtained her master's degree in natural language processing from Peking University in July 2016. Her interests include automated machine learning (AutoML), deep learning, and natural language processing, especially their applications to chatbots.

https://www.microsoft.com/en-us/research/people/xuehui/?preview=true&preview_nonce=b031f0fc93

Presentations

Automated machine learning in practice and application 40分钟议题 (40-minute session)

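AI has advanced rapidly over the past few years, but practicing and applying machine learning still takes considerable manpower and time: for example, deciding how to do feature selection or how to design a neural network model suited to the task. Automated machine learning can help developers and machine learning practitioners shorten development cycles and improve efficiency. Our talk covers progress in automated machine learning; our open source automated machine learning library, Neural Network Intelligence; and how to use automated machine learning in products and applications to improve efficiency, save time, and shorten cycles. In the final part, we share success stories in automatic feature selection, automatic parameter tuning, and model architecture search.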

Season Yang is an analytics fellow in McKinsey & Company’s Risk Practice. Previously, Season was a data scientist in residence at the Data Incubator, where he also contributed to curriculum development and instruction, and worked at NASA’s Goddard space center, where he studied climate change models with data analysis. Season holds a double bachelor’s degree in applied mathematics and scientific computation and in economics from UC Davis, and a master’s in applied mathematics from Columbia, specializing in numerical computation.

Presentations

Deep Learning with TensorFlow 2天培训 (2-day Training)

The TensorFlow library provides for the use of computational graphs, with automatic parallelization across resources. This architecture is ideal for implementing neural networks. This training will introduce TensorFlow's capabilities in Python. It will move from building machine learning algorithms piece by piece to using the Keras API provided by TensorFlow with several hands-on applications.

Li Yuan (袁理) is project and product director at Shenzhen PerceptIn Technology Co., Ltd. (深圳普思英察科技有限公司).

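Li Yuan has more than 10 years of experience in the AI and financial IT industries. Yuan joined the HSBC Global Technology Centre in 2006. In 2013, as a senior technical architect and IT project manager for corporate credit risk in HSBC's risk department, Yuan led teams in India, China, and Hong Kong and coordinated teams in the US, UK, and France to support R&D upgrades of HSBC's core and risk systems, automation and agile transformation, and feasibility studies for cloud migration. Yuan joined PerceptIn (普思英察) in 2017 and has since been mainly responsible for delivering AI and autonomous vehicle products and projects, solution pre-research, and business model design.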

Presentations

How autonomous driving technology is applied to new-wave media (新潮传媒) and new retail 40分钟议题 (40-minute session)

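How to bring autonomous driving technology into production and combine it with new-wave media (新潮传媒) and new retail businesses: how the relevant technology is implemented, what the business model is, and how AI can improve the industry's efficiency.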

Henry Zeng is a principal program manager in the Cloud AI Group at Microsoft, where he works with engineering teams, partners, and customers to ensure the success of the ML platform. He has worked in AI and data for more than 10 years, spanning databases, NoSQL, the Hadoop ecosystem, machine learning, and deep learning. Prior to this role, he was the lead AI solution architect at Microsoft China, working with partners and customers to land AI solutions in manufacturing, retail, education, and public services with Microsoft AI offerings. Henry holds an MS in computer science from Wuhan University.

Presentations

AI everywhere: Open and interoperable platform for AI with ONNX 40分钟议题 (40-minute session)

An open and interoperable ecosystem enables you to choose the framework that's right for you, train at scale, and deploy to cloud and edge. ONNX provides a common format supported by many popular frameworks and hardware accelerators. This session provides an introduction to ONNX and its core concepts. The session will be delivered in English and Chinese jointly.

Deep Learning for Time Series Forecasting 3小时辅导课 (3-hour Tutorial)

Almost every business today uses forecasting to make better decisions and allocate resources more effectively. Deep learning has achieved a lot of success in computer vision, text, and speech processing, but has only recently been applied to time series forecasting. In this tutorial we show how and when to apply deep neural networks to time series forecasting. The tutorial will be delivered in Chinese and English.

Democratizing & Accelerating AI through Automated Machine Learning 3小时辅导课 (3-hour Tutorial)

Intelligent experiences powered by AI can seem like magic to users. Developing them, however, is cumbersome, involving a series of sequential, interconnected, and time-consuming decisions along the way. What if there were an automated service that identified the best machine learning pipelines for a given problem or dataset? Automated machine learning does exactly that.

Alina Zhang is a data scientist at Skylinerunners Corporation and a certified Google Cloud Professional Data Engineer. She has authored [articles](https://medium.com/@alina.li.zhang) on machine learning, exploratory data analysis, data visualization, and related topics.
Alina is driving Skylinerunners to provide small businesses with AI solutions. She applies machine learning models to user behavior analysis, recommendation systems, and time series forecasting.
Before joining Skylinerunners, Alina was a data scientist at Nobul, where she applied machine learning in the cloud to a variety of real estate problems, including property listing prediction, a real estate chatbot using natural language processing, and customer behavior clustering.
She previously worked for IBM as a software developer and WLM component owner for IBM DB2. Alina holds a master's degree in computer science from Western University, where her research focused on high-performance computing and the truncated Fourier transform.

Presentations

Using deep learning and time-series forecasting to reduce transit delays 40分钟议题 (40-minute session)

Toronto is unique among North American cities for having a legacy streetcar network as an integral part of its transit system. This means streetcar delays are a major contributor to gridlock in the city. Using deep learning and time-series forecasting, we'll show how streetcar delays can be predicted... and prevented.

Huaijun Liu (刘怀军)

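A researcher at Meituan and head of personalization technology for Meituan's food delivery business, responsible for personalized search, ranking, and recommendation. Previously built Tencent's first intelligent anti-spam system and intelligent question-answering system, and led query analysis for Soso (搜搜), the WeChat intelligent dialogue system, and the WeChat search algorithm team. Has filed more than 20 invention patents, most of which have been granted. Serves on the Social Media Processing committee of the Chinese Information Processing Society.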

Presentations

AI in food delivery personalization: Deployment and reflections 40分钟议题 (40-minute session)

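This session covers:
1. Food delivery personalization scenarios: personalized search and personalized recommendation
2. Forms of personalized products: merchants, items, set meals, and more
3. AI technologies applied in food delivery personalization: NLP, DNNs, image technology, and reinforcement learning
4. Given the characteristics of the food delivery business, the deployment, challenges, and lessons of several key AI technologies in personalization scenarios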

He currently works at Rakuten as a data engineer in charge of building the data science platform.

Presentations

Best practice of building data science platform in Rakuten 40分钟议题 (40-minute session)

Data Science Platform is a suite of tools for exploring data, training models, and running GPU/CPU compute jobs in an isolated container environment. It provides one-click creation of machine learning environments, a powerful job scheduler, and a flexible "function as a service" component. It runs on Kubernetes and supports on-premises, cloud, and hybrid environments.

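Currently responsible for building large-scale deep learning algorithm infrastructure on the PAI team in Alibaba's Computing Platform business unit, with a fairly deep understanding of the development, construction, and optimization of large-scale distributed machine learning and its application across business scenarios. Previously an architect in the advertising technology department at Qihoo 360 and technical lead for the performance advertising system at the Yahoo Beijing R&D Center.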

Presentations

PAI Tensor Accelerator and Optimizer---Yet Another Deep Learning Compiler 40分钟议题 (40-minute session)

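This talk presents the work of Alibaba's PAI computing platform team over the past year or more in deep learning compilers: PAI TAO (Tensor Accelerator and Optimizer). PAI-TAO uses general-purpose compiler optimization techniques to address performance optimization for training and inference across the diverse AI workloads hosted on the PAI platform. It achieves significant speedups ranging from 20% to 4x on some workloads and is essentially fully transparent at the user level, so it improves platform efficiency and performance while preserving users' existing habits. PAI-TAO already supports the daily training and inference needs of multiple internal Alibaba business scenarios, including search, recommendation, image, and text.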

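Boli Yang (杨博理) is chief quantitative scientist at the CreditEase Big Data Innovation Center (宜信大数据创新中心), responsible for quantitative investment strategy R&D, financial planning system construction, and exploring AI applications in wealth management on CreditEase's online wealth management platform. Yang holds a PhD from Huazhong University of Science and Technology, where Yang was also a postdoctoral researcher, was a jointly trained PhD student at the University of Cambridge, and was a visiting scholar at EMLYON Business School. Yang is the author of a book on developing medium- and low-frequency quantitative trading strategies (量化炼金术).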

Presentations

AI applications in online wealth management 40分钟议题 (40-minute session)

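AI is an indispensable part of online wealth management. In this talk, I break wealth management down into two areas, investing and achieving financial goals, and discuss how AI applies to each. For investing, variables with strong financial logic may be better suited to prediction with machine learning, while for asset price prediction, AI and big data techniques can be used to obtain more valuable information. For achieving financial goals, NLP-based semantic understanding and guided dialogue are key to understanding users, AI- and big data-based KYC is an effective tool for assessing a user's status, and an expert system that combines financial planning, investment, and actuarial knowledge is the core of customized planning.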

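Hao Wen (温浩) is a cofounder of CloudWalk Technology (云从科技). Wen received a bachelor's degree in electronic science and technology from the University of Science and Technology of China (USTC) in 2003 and was recommended for admission to a combined master's and PhD program at the Key Laboratory of Quantum Information, Chinese Academy of Sciences, at USTC, studying under Academician Guo Guangcan, chief scientist of the "Quantum Control" 973 program, and specializing in quantum communication devices and networks. Wen received a PhD in communication and information systems from USTC in 2008, joined the Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, in 2014, and cofounded CloudWalk with Dr. Zhou Xi in 2015.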

Presentations

Building a closed loop for AI and leading industrial transformation 40分钟议题 (40-minute session)

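An AI company should develop through progressively deeper layers, from academic research to industry validation, commercial deployment, industry platforms, and finally an intelligent ecosystem; these are the ideal development stages for an AI company. CloudWalk plans to build a closed loop of core technologies so that computers can better serve people, and to lower the barrier to entry for AI across the board, making "inclusive AI" possible.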

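Shuhao Wang (王书浩) is cofounder and chief technology officer of Thorough Images (透彻影像). He received his PhD from Tsinghua University, where he was also a postdoctoral researcher and assistant researcher at the Institute for Interdisciplinary Information Sciences. He has conducted AI research at Baidu, NovuMind (异构智能), and JD.com, and has published multiple papers at conferences including EuroSys and ECML.

Shuhao has many years of hands-on AI experience, has studied deep learning in depth, and has extensive experience deploying deep learning on large-scale clusters.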

Presentations

AI-assisted pathology image diagnosis: From methods to deployment 40分钟议题 (40-minute session)

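Pathology is the "gold standard" of medical diagnosis, and pathology reports are essential for clinicians deciding on further treatment strategies. It takes more than 10 years to train a pathologist who can issue reports independently. China currently has about 10,000 registered pathologists; by WHO requirements, the shortfall is 40,000 to 90,000. Using AI to assist pathologists in diagnosing samples can greatly improve diagnostic efficiency, reduce missed diagnoses, and improve diagnostic accuracy. Digitized pathology images reveal cell morphology in tissue, and at the highest scanning magnification the files reach gigabytes in size, a challenge that must be addressed at both the AI and systems engineering levels. In this talk, we start from how to build the AI system and present technical details from the development of a digestive tract pathology image assistance system by Thorough Images and the Chinese PLA General Hospital. We also share experience from deploying the diagnostic system and putting it into use.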

Dr. Yurong Chen is a Principal Research Scientist and Sr. Research Director at Intel Corporation, and Director of Cognitive Computing Lab at Intel Labs China. Currently, he’s responsible for driving cutting-edge Visual Cognition and Machine Learning research for Intel smart computing. He is also the co-owner of Intel Labs “Visual Understanding and Synthesis” program, driving research innovation in smart visual data processing technologies on Intel platforms across Intel Labs. He drove the research and development of Deep Learning (DL) based Visual Understanding (VU) and leading Face Analysis technologies to impact Intel architectures/platforms and delivered core technologies to help differentiate Intel products including Intel RealSense SDK, CV SDK, IoT video E2E analytics solutions and client apps. He led the team to win Intel China Award (Top team award of Intel China) 2016, Intel Labs Academic Awards (Top award of Intel labs) – Gordy Award 2016, 2015 and 2014 for outstanding research achievements on DL based VU, Multimodal Emotion Recognition and Advanced Visual Analytics. Dr. Chen joined Intel in 2004 after finishing his postdoctoral research in the Institute of Software, CAS. He received his Ph.D. degree from Tsinghua University in 2002. He has published over 50 technical papers, and holds 10+ issued/pending US/PCT patents and 30+ patent applications.

Presentations

Deep learning at the edge 40分钟议题 (40-minute session)

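Deep learning has achieved major breakthroughs in many fields, especially visual recognition and understanding, but it poses challenges in both training and deployment. This talk presents our cutting-edge research on addressing deep learning deployment challenges through efficient CNN algorithm design, leading DNN model compression techniques, and innovative deployment-time optimization of DNN network structures.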

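Dr. Wei Chen (陈薇) is chief scientist at Pailie Technology (排列科技), a specially appointed risk control expert at the Jiangxi Internet Finance Association, and dean of the Bojindai (博金贷) Fintech Research Institute.
Previously, Chen was chief data scientist at LendingClub (NYSE: LC), responsible for technical innovation in risk management. Chen pioneered the introduction of machine learning and text data mining systems into P2P lending risk analysis, with very good results and a greatly shortened R&D cycle, and led the research and development of nontraditional risk models and decision algorithms that raised the company's risk control well above that of traditional US banks. Before that, Chen was a principal credit analyst at PayPal (NASDAQ: PYPL), focusing on identifying and analyzing online transaction risk, particularly risk analysis and model design for bank transactions, and applying big data, AI, and machine learning to risk identification and decision-making in new ways. Chen holds a PhD from the Department of Computer Science at the University of Nebraska and a master's degree from the Department of Computer Engineering at Tsinghua University, was a member of the China Key Laboratory of Artificial Intelligence, has served as a reviewer for several academic journals, and has published dozens of papers.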

Presentations

Quantifying internet finance credit and anti-fraud risk control 2天培训 (2-day Training)

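Want to know how financial companies use big data and AI to profile individual behavior and detect fraudulent users? What does the quantitative analysis process behind internet finance look like? How is personal credit quantified through big data? What aspects of applying machine learning algorithms need attention in practice? How can graph analysis fuse multidimensional data to distinguish normal users from fraudulent ones? This course is based on a graduate course, "Quantitative Financial Credit and Risk Control Analysis," offered by the Institute for Interdisciplinary Information Sciences at Tsinghua University. It uses real LendingClub loan data as a case study to explain how some specific models are implemented.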

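A senior engineer at the China Life R&D Center who has worked on the development and management of big data projects since 2014. Began researching the construction and implementation of machine learning models in 2016 and has led several models into production.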

Presentations

Machine learning practice in insurance 40分钟议题 (40-minute session)

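We analyze the state of AI in the insurance industry and the characteristics of existing data, and evaluate mainstream tools, languages, and algorithms for building machine learning models. We summarize the full process of implementing an AI scenario for insurance with machine learning: from scenario discussion and data processing and extraction to model building, model evaluation, and production deployment. Using a real machine learning project as an example, we describe what each party contributes at each stage of the methodology and in what proportion, and discuss difficulties and attempted solutions around feature stability, class imbalance, parameter selection, and model interpretability. The talk offers reference and guidance for deploying machine learning projects in finance and other industries.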

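Ling Huang (黄铃) is the founder and CEO of AHI Fintech (慧安金科(北京)科技有限公司) and an adjunct professor at the Institute for Interdisciplinary Information Sciences at Tsinghua University. His technical background is in AI, information security, and financial risk control. He is one of the few top experts worldwide versed in both AI and computer security. He received a PhD in computer science from the University of California, Berkeley (2002-2007), where he studied under Anthony Joseph and Michael Jordan, researching machine learning algorithms and computer network modeling applications. He was a founding member and director of big data (2014-2016) at DataVisor, a well-known Silicon Valley anti-fraud company, where he led the company's machine learning, user behavior analysis, and credit analysis systems. He spent seven years (2007-2014) as a senior scientist at Intel Labs in the US, where he ran several joint projects with Intel McAfee applying AI to network and data security problems. He has nearly 15 years of research and development experience in AI, big data analytics, and fintech, has published nearly 50 papers at top international conferences, and has more than 5,000 total citations on Google Scholar.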

Presentations

Quantifying internet finance credit and anti-fraud risk control 2天培训 (2-day Training)

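Want to know how financial companies use big data and AI to profile individual behavior and detect fraudulent users? What does the quantitative analysis process behind internet finance look like? How is personal credit quantified through big data? What aspects of applying machine learning algorithms need attention in practice? How can graph analysis fuse multidimensional data to distinguish normal users from fraudulent ones? This course is based on a graduate course, "Quantitative Financial Credit and Risk Control Analysis," offered by the Institute for Interdisciplinary Information Sciences at Tsinghua University. It uses real LendingClub loan data as a case study to explain how some specific models are implemented.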

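Currently responsible for low-level core GPU optimization on Alibaba's PAI team. Previously conducted research on computer architecture at the Institute of Software, Chinese Academy of Sciences, with a fairly deep understanding of high-performance computing, microprocessor design, and heterogeneous computing. Has published multiple papers at top architecture and AI conferences, including PPoPP, Micro, and ACL.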

Presentations

PAI Tensor Accelerator and Optimizer---Yet Another Deep Learning Compiler 40分钟议题 (40-minute session)

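This talk presents the work of Alibaba's PAI computing platform team over the past year or more in deep learning compilers: PAI TAO (Tensor Accelerator and Optimizer). PAI-TAO uses general-purpose compiler optimization techniques to address performance optimization for training and inference across the diverse AI workloads hosted on the PAI platform. It achieves significant speedups ranging from 20% to 4x on some workloads and is essentially fully transparent at the user level, so it improves platform efficiency and performance while preserving users' existing habits. PAI-TAO already supports the daily training and inference needs of multiple internal Alibaba business scenarios, including search, recommendation, image, and text.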