Presented By
O’REILLY + INTEL AI

PUT AI TO WORK
June 18-21, 2019
Beijing, CN

Presentations

Ying Liu (Abakus 鲸算科技(Wecash闪银))
Abakus's AI debt collection platform provides a friendly and humane product solution designed for the organization's frontline live agents. An AI-friendly culture makes the organization's agent training smoother. Our experiments show that the performance of the collection assistants improved substantially.
Alejandro Saucedo (The Institute for Ethical Ai & Machine Learning)
Undesired bias in machine learning has become a worrying topic due to numerous high-profile incidents. In this talk we demystify machine learning bias through a hands-on example. We'll be tasked with automating the loan approval process for a company, and we introduce key tools and techniques from the latest research that allow us to assess and mitigate undesired bias in our machine learning models.
Sarah Aerni (Salesforce Einstein)
At Salesforce Einstein data science is an agile partner to over 100,000 customers. How do we achieve this scale? We share lessons learned in business, technology and process along the way. Via use cases, oft-missed foundational elements for deployment, and the evaluations that must happen along the way, we will share how to achieve and sustain models in production, and where to go from there.
Bas Geerdink (ING)
AI is at the core of ING’s business. We are a data-driven enterprise, with ‘analytics skills’ as a top strategic priority. We are investing in AI, big data, and analytics to improve business processes such as balance forecasting, fraud detection, and customer relationship management. In this talk, Bas will give an overview of the use cases and technology to inspire the audience!
刘先生 (美团)
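This session covers: 1. Personalization scenarios in food delivery: personalized search and personalized recommendation. 2. Forms of personalized products, including merchants, items, and set meals. 3. AI techniques applied to food delivery personalization, including NLP, DNNs, image technology, and reinforcement learning. 4. Given the characteristics of the food delivery business, the deployment of, challenges with, and reflections on several key AI techniques in personalization scenarios.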
Zhichao Li (Intel)
In this tutorial, we will show how to build and productionize deep learning applications for Big Data using Analytics Zoo (https://github.com/intel-analytics/analytics-zoo), a unified analytics + AI platform that seamlessly unites Spark, TensorFlow, Keras and BigDL programs into an integrated pipeline, with real-world use cases (such as JD.com, MLSListings, World Bank, Baosight, Midea/KUKA, etc.).
Yang Wang (Intel)
We will introduce Analytics Zoo, a unified analytics + AI platform for distributed TensorFlow, Keras and BigDL on Apache Spark, designed for production environments. It enables easy deployment, high performance, and efficient model serving for deep learning applications.
Chaoguang Li (Qiniu), Bin Fan (Alluxio)
The Atlab lab at Qiniu Cloud focuses on deep learning for computer vision. Our team has built AVA, a high-performance, cost-effective cloud-based training platform for deep learning. AVA deeply integrates an open source software stack that includes TensorFlow, Caffe, Alluxio, and KODO, our own cloud object storage.
安敖日奇朗 (Rakuten, Inc.), TzuLin Chin (Rakuten, Inc.)
Data Science Platform is a suite of tools for exploring data, training models, and running GPU/CPU compute jobs in an isolated container environment. It provides one-click creation of machine learning environments, a powerful job scheduler, and a flexible "function as a service" component. It runs on Kubernetes and supports on-premises and cloud environments, as well as hybrid mode.
Joseph Spisak (Facebook)
Learn how PyTorch 1.0 enables you to take state-of-the-art research and deploy it quickly at scale in areas from autonomous vehicles to medical imaging. We'll take a deep dive into the latest updates to the PyTorch framework, including TorchScript and the JIT compiler, deployment support, and the C++ interface. We will also cover how PyTorch 1.0 is used at Facebook to power AI across a variety of products.
Richard Liaw (UC Berkeley RISELab)
Ray is a general purpose framework for programming your cluster. We will lead a deep dive into Ray, walking you through its API and system architecture and sharing application examples, including several state-of-the-art AI algorithms.
Jialin Jiao (Pony.ai)
While deep learning has been at the center of AI with unprecedentedly strong results, the predictions of deep neural networks usually do not come with a reliable, well-calibrated confidence score. Wrong but confident predictions pose great threats to critical real-life applications, e.g. self-driving cars. This talk is a tutorial on and comparison of confidence estimation methods for deep neural networks.
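One common way to quantify how well-calibrated confidence scores are is the expected calibration error (ECE). The abstract does not say which metrics or methods the talk covers, so the following is only a minimal sketch under that assumption; the function name and the equal-width binning scheme are illustrative choices.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probabilities in [0, 1] for the chosen class.
    correct: booleans, True where the prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0
    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes to the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to how many predictions it holds.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

A model that reports 0.9 confidence but is right only half the time gets an ECE of about 0.4, while a model whose 0.8 confidence matches an 80% hit rate gets an ECE near zero.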
Rich Ott (The Data Incubator)
PyTorch is a machine learning library for Python that allows users to build deep neural networks with great flexibility. Its easy-to-use API and seamless use of GPUs make it a sought-after tool for deep learning. This course will introduce the PyTorch workflow and demonstrate how to use it. Students will be equipped with the knowledge to build deep learning models using real-world datasets.
Season Yang (McKinsey & Company)
The TensorFlow library provides for the use of computational graphs, with automatic parallelization across resources. This architecture is ideal for implementing neural networks. This training will introduce TensorFlow's capabilities in Python. It will move from building machine learning algorithms piece by piece to using the Keras API provided by TensorFlow with several hands-on applications.
Aileen Nielsen (Skillman Consulting)
Deep learning for time series analysis has made rapid progress in 2018 and 2019, with advances in the use of both convolutional and recurrent neural network architectures. The state of the art in deep forecasting will be summarized for 2018 and 2019, including use cases in both forecasting and generating time series.
Chris Butler (IPSoft)
Purpose, a well-defined problem, and trust from people are important factors to any system, especially those that employ AI. Chris Butler leads you through exercises that borrow from the principles of design thinking to help you create more impactful solutions and better team alignment.
Bichen Wu (UC Berkeley)
The success of deep neural networks is attributed to three factors: stronger computing capacity, more complex neural networks, and more data. These factors, however, are usually not available in edge applications such as autonomous driving, AR/VR, and IoT. In this talk we discuss how we apply AutoML, SW/HW codesign, and domain adaptation to solve these problems.
Tiezhen Wang (Google)
TensorFlow 2.0 is a major milestone with a focus on ease of use. This talk will give an in-depth introduction to the exciting new features and best practices. Topics such as distributed strategies and edge deployment (TensorFlow Lite and TensorFlow.js) will also be covered.
Tao Lu (Microsoft), Chenhui Hu (Microsoft)
Forecasting customer activities is one of the most important and common business problems. In the Microsoft Azure Identity team, we forecast customer behavior based on billions of user activities. We will share how we improved forecasting accuracy by 25% with dilated convolutional neural networks and cut development time by 80% with best practices for time series forecasting.
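To illustrate what "dilated" means here, the following is a minimal sketch of a one-dimensional causal dilated convolution in plain Python. It is not the speakers' model; the function names, the zero padding on the left, and the example dilation schedule are assumptions made for illustration.

```python
def dilated_causal_conv1d(series, kernel, dilation=1):
    """Compute output[t] = sum_k kernel[k] * series[t - k * dilation].

    Positions before the start of the series are treated as zero,
    so each output depends only on current and past values.
    """
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += weight * series[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of time steps seen by the top of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a kernel of size 2 and dilations 1, 2, 4, and 8, four stacked layers see 16 time steps, which is why dilation is a cheap way to cover long histories in a forecasting model.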
Ben Lorica (O'Reilly Media), Roger Chen (Computable), Jason (Jinquan) Dai (Intel)
Opening keynote remarks by Program Chairs Ben Lorica, Jason Dai, and Roger Chen
In this presentation we will share our experience of using AI on Spark for game playing.
David Maman (Binah.ai)
Zero-day attacks. IoT-based botnets. Cybercriminal AI v. cyberdefender AI. While these won’t be going away, they aren’t the biggest worry we have in cybercrime. Hacking humans is. The combination of mere minutes of video, signal processing, remote heart rate monitoring, AI, machine learning, and data science can identify a person’s health vulnerabilities, which evildoers can make worse.
YAN KE (上海扩博智能技术有限公司)
In this talk, we will share the successes and failures of creating an entirely autonomous drone inspection solution for turbine blades, powered by visual recognition, which increased efficiency tenfold.
Weisheng Xie (China Telecom BestPay Co., Ltd)
We exploit the strong representation capability of the adversarial autoencoder (AAE) to model risk factors in fighting a particular kind of financial fraud. It is only one step in our long stack of unsupervised tasks, yet it has proved efficient and effective in practice.
Vijay Agneeswaran (Publicis Sapient), Abhishek Kumar (Publicis Sapient)
We illustrate how capsule networks can be industrialized. First, we give an overview of capsule networks, how they help in handling spatial relationships between objects in an image, and how they can be applied to text analytics. Second, we show an implementation of recurrent capsule networks, which are useful in text analytics, especially for tasks such as summarization or classification.
Pete Warden (Google)
Keynote by Pete Warden
Ion Stoica (UC Berkeley)
Keynote with Ion Stoica
Yangqing Jia (Facebook)
Keynote with Yangqing Jia
Keynotes to come
Lei Xia (Intel)
Vector Neural Network Instructions (VNNI) is the new Intel instruction set for low-precision AI inference on the next-generation Xeon platform. This lecture introduces the features of VNNI and the Intel software tools that help developers use the new instruction set to accelerate INT8 inference.
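The arithmetic idea behind INT8 inference can be sketched in plain Python: floats are mapped to 8-bit integers with a scale factor, products are accumulated as integers, and the result is rescaled. This sketch does not model the VNNI instructions or any Intel tool; the symmetric per-tensor scheme and both function names are assumptions chosen for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers using a single scale for the whole list."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(a, b):
    """Approximate the float dot product of a and b with integer arithmetic."""
    qa, scale_a = quantize_symmetric(a)
    qb, scale_b = quantize_symmetric(b)
    # Multiply 8-bit values and accumulate in a wider integer.
    acc = sum(x * y for x, y in zip(qa, qb))
    # Rescale the integer accumulator back to the float domain.
    return acc * scale_a * scale_b
```

The result differs slightly from the exact float dot product because of rounding, which is the accuracy cost traded for cheaper integer arithmetic.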
Kaz Sato (Google)
Creating an ML model is just a starting point. To bring the technology into production service, you need to solve various real-world issues such as: building a data pipeline for continuous training, automated validation of the model, version control of the model, scalable serving infra, and ongoing operation of the ML infra with monitoring and alerting.
Prasanth Pulavarthi (Microsoft), Henry Zeng (Microsoft)
An open and interoperable ecosystem enables you to choose the framework that's right for you, train at scale, and deploy to cloud and edge. ONNX provides a common format supported by many popular frameworks and hardware accelerators. This session provides an introduction to ONNX and its core concepts. The session will be delivered in English and Chinese jointly.
杨军 (Alibaba), 龙国平 (Alibaba)
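This talk presents the progress made over the past year or so by Alibaba's computing platform PAI team in deep learning compilers: PAI-TAO (Tensor Accelerator and Optimizer). PAI-TAO uses general-purpose compiler optimization techniques to address the performance optimization needs of training and inference for the diverse AI workloads hosted on the PAI platform. It achieves significant speedups ranging from 20% to 4X on some workloads while remaining essentially fully transparent at the user level, so it markedly improves platform efficiency and performance while respecting users' existing habits. PAI-TAO already supports day-to-day training and inference for several of Alibaba's internal business scenarios, including search, recommendation, image, and text.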
Jesse Anderson (Big Data Institute)
Jesse Anderson leads a deep dive into Apache Kafka. You'll learn how Kafka works and how to create real-time systems with it. You'll also discover how to create consumers and publishers in Kafka and how to use Kafka Streams, Kafka Connect, and KSQL as you explore the Kafka ecosystem.
Zhenxiao Luo (Uber)
One of the distinct challenges for Uber is analyzing geospatial big data. Locations and trips provide insights that can improve business decisions and better serve users. Geospatial data analysis is particularly challenging, especially in a big data scenario. For these analytical requests, we must achieve efficiency, usability, and scalability in order to meet user needs and business requirements.
Guoqiong Song (Intel), Luyang Wang (Office Depot), Jennie Wang (Intel), Jing (Nicole) Kong (Office Depot)
We showcase how to build efficient recommender systems for the e-commerce industry using deep learning technologies.
Yiheng Wang (Tencent)
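Putting machine learning projects into production in an enterprise often involves building complex workflows, managing data, and integrating a variety of tools. The task becomes more challenging as data volumes grow and teams expand. Apache Spark is a popular big data framework in the industry and is widely used to analyze and process massive datasets. This session describes our work on Tencent Cloud to build a one-stop machine learning platform for customers based on Apache Spark. Topics include ingesting multiple data sources, building complex data pipelines, understanding data through visualization, using popular machine learning frameworks through a pluggable mechanism, and deploying and monitoring models. We also share the problems and challenges we met along the way. Attendees will learn how this one-stop approach, tightly integrated with big data, helps users build and manage machine learning projects more efficiently and so speeds up the adoption of machine learning in their business.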
Alex Ingerman (Google)
Federated Learning is the approach of training ML models across a fleet of participating devices, without collecting their data in a central location. Alex Ingerman introduces Federated Learning, compares the traditional and federated ML workflows, and explores the current and upcoming use cases for decentralized machine learning, with examples from Google's deployment of this technology.
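The idea of training without centralizing data can be sketched with federated averaging, a commonly cited approach in which each client trains locally and only model weights are combined. The abstract does not name the algorithm used at Google, so this is an assumption; the tiny linear model, the function names, and the hyperparameters are all hypothetical.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient descent steps for y = w * x + b on one client's data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of examples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train_federated(clients, rounds=50):
    """Each round: clients train locally, then the server averages the weights.

    Only weights leave a client; the raw (x, y) pairs never do.
    """
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights
```

For example, two clients holding disjoint samples of the line y = 2x + 1 recover weights close to (2, 1) even though neither sees the other's data.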
David Low (Pand.ai)
Transfer learning has proven to be a tremendous success in computer vision as a result of the ImageNet competition. In the past months, natural language processing has witnessed several breakthroughs with transfer learning, namely ELMo, Transformer, ULMFiT, and BERT. In this talk, David will showcase the use of transfer learning in an NLP application with SOTA accuracy.
Ben Lorica (O'Reilly Media), Jason (Jinquan) Dai (Intel), Roger Chen (Computable)
Opening keynote remarks by Program Chairs Ben Lorica, Jason Dai, and Roger Chen
Mark Ryan (IBM), Alina Li Zhang (Skylinerunners)
Toronto is unique among North American cities for having a legacy streetcar network as an integral part of its transit system. This means streetcar delays are a major contributor to gridlock in the city. Using deep learning and time-series forecasting, we'll show how streetcar delays can be predicted... and prevented.
Maulik Soneji (Go-jek), Jewel James (Go-jek)
The story of how we prototyped a search framework that personalizes restaurant search results by using ML to learn what constitutes a relevant restaurant given a user's purchasing history.
王书浩 (透彻影像)
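Pathology is the "gold standard" of medical diagnosis, and pathology reports are essential for clinicians deciding on further treatment strategies. Training a pathologist who can issue pathology reports independently takes more than 10 years. China currently has about 10,000 registered pathologists; by WHO requirements, the shortfall is 40,000 to 90,000. Using AI to assist pathologists in diagnosing samples can greatly improve diagnostic efficiency, reduce missed diagnoses, and improve diagnostic accuracy. Digitized pathology images reveal the cellular morphology of tissue, and at the highest scanning magnification file sizes reach the gigabyte range, a challenge that must be addressed at both the AI and the systems engineering level. In this talk, we start from how to build an AI system and present technical details from the development of a digestive tract pathology image assistance system by 透彻影像 and the Chinese PLA General Hospital. We also share experience from taking the diagnostic system from deployment to practical use.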
鞠芳 (中国人寿研发中心)
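We analyze the state of AI development in the insurance industry and the characteristics of existing data, and evaluate the mainstream tools, languages, and algorithms for building machine learning models. We summarize the end-to-end process of implementing an AI scenario in insurance with machine learning, from scenario discussion and data processing and extraction to model building, model evaluation, and model deployment. Using a real machine learning project as an example, we describe the work and relative share of each party at the different stages of the methodology, and discuss difficulties and attempted solutions in areas such as feature stability, class imbalance, parameter selection, and model interpretability. The talk offers reference and guidance for putting machine learning projects into production in finance and other industries.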
陈玉荣 (Intel)
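Deep learning has achieved major breakthroughs in many fields, especially visual recognition and understanding, but it faces challenges in both training and deployment. This talk presents our cutting-edge research on addressing deep learning deployment challenges through efficient CNN algorithm design, leading DNN model compression techniques, and innovative deployment-time optimization of DNN network structures.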
Yijing Chen (Microsoft), Dmitry Pechyoni (Microsoft), Angus Taylor (Microsoft), Vanja Paunic (Microsoft), Henry Zeng (Microsoft)
Almost every business today uses forecasting to make better decisions and allocate resources more effectively. Deep learning has achieved a lot of success in computer vision, text and speech processing, but has only recently been applied to time series forecasting. In this tutorial we show how and when to apply deep neural networks to time series forecasting. The tutorial will be delivered in Chinese and English.
李苍柏 (中国地质科学院矿产资源研究所)
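Mineral deposits are usually accompanied by geological, geophysical, geochemical, and remote sensing anomalies, so locations showing these anomalies often indicate the presence of a deposit. An important step in mineral prospecting is therefore to find anomalies in geological, geophysical, geochemical, and remote sensing data, integrate them, estimate the probability of mineralization in the area, and infer the location of target areas. Traditional methods, however, do not consider the correlation between points in space. The convolution and pooling operations in convolutional neural networks do take these correlations into account, but a convolutional neural network alone can only extract features and cannot delineate the region where an anomaly lies. We therefore introduce object detection algorithms to delineate anomalous regions.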
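Meet fellow attendees who are looking to connect at this AI conference from 8:00 to 8:30 in the morning. We will hold an informal speed networking event before Friday's keynotes. Be sure to bring your business cards.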
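Meet attendees who are looking to connect at this AI conference. An informal speed networking event will be held before Thursday's keynotes. Be sure to bring your business cards and enjoy the networking.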
温浩 (云从科技)
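An AI company should develop by going progressively deeper, from academic research, industry validation, and commercial deployment to industry platforms and an intelligent ecosystem; these are also the ideal development stages for an AI enterprise. 云从科技 plans to build a closed loop of core technologies so that computers serve people better. It will also lower the barrier to entry for AI across the board, making "inclusive AI" possible.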
杨博理 (宜信大数据创新中心)
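AI is an indispensable part of online wealth management. In this talk, I divide wealth management into two areas, investing and achieving financial goals, and discuss the application of AI in each. For investing, variables with strong financial logic may be better suited to prediction with machine learning, while for asset price prediction, AI and big data techniques can be used to obtain more valuable information. For achieving financial goals, NLP-based semantic understanding and guided dialogue are key to understanding users, KYC based on AI and big data is an effective tool for assessing a user's status, and an expert system that combines financial planning, investment, and actuarial knowledge is the core of customized planning.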
Hui Xue (微软亚洲研究院)
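AI has developed rapidly over the past few years, but practicing and applying machine learning takes considerable manpower and time, for example in selecting features or designing a neural network model suited to the task. Automated machine learning can help developers and machine learning practitioners shorten development cycles and improve efficiency. Our talk covers: progress in automated machine learning; our open source automated machine learning library, Neural Network Intelligence; and how to use automated machine learning to improve efficiency, save time, and shorten cycles in products and applications. In the final part, we share success stories in automated feature selection, automated parameter tuning, and model architecture search.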
Li Yuan (Perceptin 深圳普思英察科技有限公司)
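How to bring autonomous driving technology into production and combine it with media (新潮传媒) and new retail businesses: how the relevant technology is implemented, what the business model is, and how AI can improve the industry's efficiency.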
刘祁跃 (爱奇艺)
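Analyzing how exciting a video is helps select high-quality content, especially at the cold-start stage. An algorithmic understanding of highlights can also assist content creation, for example through assisted title generation, dynamic or highlight cover generation, and intelligent clip splitting. By analyzing multimodal content such as video, audio, and text, and by using user interaction data, we built a complete video highlight analysis system and deployed it in different long-form and short-form video business scenarios, where it clearly improved output quality and efficiency.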
Hongyu Cui (DataVisor)
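While AI empowers many industries, it is also exploited by the online underground economy, making attacks more automated, more covert, and harder to detect. DataVisor's research in internet anti-fraud shows the following trends in current attack patterns: attack methods are diverse and change quickly, attacks increasingly mimic normal users, and the main source of attacking accounts is gradually shifting from mass registration to account takeover (ATO). Traditional rule systems and supervised models depend heavily on known fraud cases and labeled data, so they often cannot respond in time to rapidly evolving attacks and have long been stuck in passive defense. DataVisor's unsupervised algorithms use global analysis and clustering in high-dimensional space to discover large-scale linked fraud rings automatically, without labels. Unsupervised algorithms show clear advantages in early warning and in detecting rapidly evolving fraud patterns.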
Sujatha Sagiraju (Microsoft), Henry Zeng (Microsoft)
Intelligent experiences powered by AI can seem like magic to users. Developing them, however, is cumbersome: it involves a series of sequential, interconnected decisions that are time consuming. What if there was an automated service that identifies the best machine learning pipelines for a given problem or dataset? Automated machine learning does exactly that!
Jike Chong (Tsinghua University | Acorns), 黄铃 (Tsinghua University), 陈薇 (排列科技)
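Would you like to know how financial companies use big data and AI to profile individual behavior and detect fraudulent users? What does the quantitative analysis process behind internet finance look like? How is personal credit quantified with big data? Which aspects of applying machine learning algorithms need attention in practice? How can graph analysis fuse multidimensional data to distinguish normal users from fraudulent ones? This tutorial is based on "Quantitative Financial Credit and Risk Control Analysis", a graduate course offered by the Institute for Interdisciplinary Information Sciences at Tsinghua University. It uses real lending data from LendingClub as a case study to explain the implementation of several specific models.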
Mingxi Wu (TigerGraph)
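Unsupervised learning on graph data plays a broad and irreplaceable role in unlocking the economic value of big data. PageRank can uncover important entities, community detection can find groups that share certain characteristics, and closeness centrality can automatically find individuals far from the group. All of these algorithms are unsupervised. We share specific customer cases that demonstrate their value, and show how to apply these open source algorithms flexibly on big data.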
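As an illustration of the first algorithm named above, here is a minimal PageRank sketch in plain Python using power iteration. It is not TigerGraph's implementation; the adjacency-dictionary input format, the damping factor of 0.85, and the fixed iteration count are assumptions made for this example.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Score nodes by link structure alone; no labels are needed.

    graph maps each node to the list of nodes it links to. Every link
    target must also appear as a key in graph.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

In a small graph where "a", "b", and "d" all link to "c" and "c" links back to "a", the node "c" ends up with the highest score, which matches the intuition that heavily referenced entities are the important ones.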