A recent news story illustrates the problem. In Ningbo, a “pedestrian” was flagged for crossing against a red light. In fact, the “pedestrian” was just an advertisement on a bus showing Mingzhu Dong, a famous entrepreneur in China. As this incident makes clear, AI technology still has many problems when it is applied in the real world. We can’t use AI simply because we want to. AI should form a closed loop of perception, cognition, and decision making. Only then is the technology advanced enough to form a set of valuable solutions. Hao Wen discusses how CloudWalk is building closed-loop technology with the ability to perceive, recognize, and make decisions. Artificial intelligence is the “lead goose”: AI solves the problem itself. How to improve its ability, how to solve the problem, and how to let it lead the development of technology are the main questions we need to think about now.
At present, when it comes to putting technology into applications, face recognition is the most widely used technology after speech recognition, and many face-recognition technologies have become the visual portal for human-computer interaction. In the visual recognition domain, there is also recognition of people by body pose and clothing, which is now widely adopted as well. Combining the human face with human body pose is a relatively better solution. Experiments at the Chinese Academy of Sciences analyzed the human face from various angles and under various light sources and then formed a set of structured data. The success rate of human body recognition is now better than 96.6%, which is the standard for commercial use of this technology. For example, if a girl is running in a park and a camera doesn’t capture her face, we can still identify her through body features. Because this application works across cameras and needs no face, it’s called “cross-camera tracking” technology.
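As a rough illustration of how identification by body features across cameras can work, the sketch below compares a feature vector seen on one camera with a small gallery of vectors from another camera, using cosine similarity. The three-number feature vectors, the identity labels, the `match_across_cameras` helper, and the 0.8 threshold are all hypothetical; the talk does not describe CloudWalk's actual method, and a real system would use learned embeddings from a neural network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_across_cameras(query, gallery, threshold=0.8):
    """Return (identity, score) for the most similar gallery entry.

    The identity is None when even the best score is below the threshold
    (an assumed, illustrative cutoff).
    """
    best_id, best_score = None, -1.0
    for identity, features in gallery.items():
        score = cosine_similarity(query, features)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Hypothetical body-feature vectors extracted on camera 1.
gallery = {
    "runner": [0.9, 0.1, 0.4],
    "cyclist": [0.1, 0.8, 0.2],
}

# A person seen on camera 2 with no visible face.
identity, score = match_across_cameras([0.85, 0.15, 0.45], gallery)
print(identity, round(score, 3))
```

With these made-up numbers the query is matched to the runner, while a vector unlike anything in the gallery returns no identity.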
It is widely recognized that the next generation of human-computer interaction is “face + body + voice + AI,” such as VR interaction. Besides visual recognition, CloudWalk has also conducted speech recognition research with the Joint Lab and the Chinese Academy of Sciences, achieving a very low error rate. At the same time, many decision-making models have been built, such as the two-tower neural network.
Across the five senses, CloudWalk has built a unified big data model on top of its perception technology, which lets robots learn portraits (user profiles), get strategic recommendations, and then act and provide feedback. For example, the “Mingzhu Dong” red-light incident came from an imperfect perception system. If the system had known that Mingzhu Dong was unlikely to appear in Ningbo, it could have avoided this error entirely. Of course, this is an extreme example.
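One way to read the point about prior knowledge is as a Bayes' rule update: a strong face match should count for little when the person is very unlikely to be at that location. The sketch below is only an illustration of that reasoning. The function names, the match rates, the prior values, and the 0.9 flagging threshold are assumptions, not figures from the talk.

```python
def posterior_present(prior, true_match_rate, false_match_rate):
    """P(person is really there | face matched), by Bayes' rule.

    prior            -- assumed probability the person is at this location
    true_match_rate  -- P(match | person really there)
    false_match_rate -- P(match | person not there), e.g. a poster or look-alike
    """
    evidence = true_match_rate * prior + false_match_rate * (1.0 - prior)
    if evidence == 0.0:
        return 0.0
    return true_match_rate * prior / evidence


def should_flag(posterior, threshold=0.9):
    """Flag a violation only when the posterior clears an assumed threshold."""
    return posterior >= threshold


# Hypothetical numbers: a local resident versus a celebrity from another city.
local = posterior_present(prior=0.5, true_match_rate=0.99, false_match_rate=0.01)
celebrity = posterior_present(prior=0.0001, true_match_rate=0.99, false_match_rate=0.01)

print(should_flag(local), should_flag(celebrity))
```

Under these assumed numbers the same match score leads to a flag for the local resident but not for the celebrity, because the low prior keeps the posterior under one percent.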
We have also developed industry products and solutions, implementing more than 50 use cases in the banking industry. For example, using perception technology, we built biometric recognition features such as face-recognition cash withdrawal at ATMs. But this is a relatively simple application. Banks are also very interested in forecasting cash reserves. For example, a CCB branch needs to predict the reserve for more than 1,000 ATMs. If forecasting can reduce the reserve by RMB 1 billion in one month, it will save millions of RMB in interest. This is a good example of a closed loop that runs from perception to decision making.
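The interest claim can be checked with simple arithmetic: cash that no longer sits idle in ATMs earns (or stops costing) interest for as long as it is freed. The sketch below assumes simple interest and a hypothetical 4% annual rate; the talk gives only the RMB 1 billion reduction and the one-month period, so the rate and the `interest_saved` helper are illustrative.

```python
def interest_saved(reserve_reduction_rmb, annual_rate, months):
    """Simple interest on cash freed from the reserve for the given period."""
    return reserve_reduction_rmb * annual_rate * months / 12.0


# RMB 1 billion freed for one month, at an assumed 4% annual rate.
saving = interest_saved(1_000_000_000, annual_rate=0.04, months=1)
print(f"RMB {saving:,.0f}")  # about RMB 3.33 million
```

At that assumed rate, a single month yields roughly RMB 3.3 million, which is consistent with the “millions of RMB” figure in the talk.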
For offline stores in the retail industry, we have a set of customer conversion rate tools for identification and decision making that answer questions such as how many people entered a store, how many times they entered, and what products they looked at. For example, stores are equipped with face-capture cameras, which can tell whether a captured face belongs to a VIP, a member, or a regular customer. The store’s system can then push advertisements based on the customer’s age and gender. At the shelf, sensor technology can capture a customer portrait and send it to a clerk’s terminal so that he or she can make relevant recommendations for the customer. In the final payment step, the customer pays directly by showing his or her face.
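The store questions above (how many people entered, how often, and how many bought) reduce to a few counts over entry events once each visitor has been recognized. The sketch below assumes a hypothetical event format of `(customer_id, purchased)` per store entry and defines conversion rate as buyers divided by unique visitors; neither the format nor the definition comes from the talk.

```python
from collections import Counter


def store_metrics(events):
    """Summarize store entries.

    events -- iterable of (customer_id, purchased) tuples, one per entry,
              where customer_id is whatever ID the recognition step assigns.
    """
    events = list(events)
    visits = Counter(customer_id for customer_id, _ in events)
    buyers = {customer_id for customer_id, purchased in events if purchased}
    unique_visitors = len(visits)
    conversion_rate = len(buyers) / unique_visitors if unique_visitors else 0.0
    return {
        "unique_visitors": unique_visitors,
        "total_visits": len(events),
        "visits_per_customer": dict(visits),
        "conversion_rate": conversion_rate,
    }


# Hypothetical day of entries: customer "a" came twice and bought once.
day = [("a", False), ("a", True), ("b", False), ("c", True)]
print(store_metrics(day))
```

Here three unique visitors made four visits and two of them bought something, so the conversion rate under this definition is two thirds.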
This is what AI technology can do for offline stores now, rather than building an unmanned supermarket, which does nothing to improve the business. Finally, CloudWalk believes that the AI-to-application process starts from academic research, which demonstrates the advancement of the technology, moves step by step to industry verification, then to actual industry application, and finally to an industry platform and an intelligent ecosystem.
Hao Wen is a cofounder, along with Xi Zhou, of CloudWalk Technology. Previously, he worked for the Chongqing Institute of Green Intelligent Technology of the Chinese Academy of Sciences. He holds a bachelor’s degree in electronic science and technology from the University of Science and Technology of China, where he was recommended for admission into the doctoral program at the university’s Key Laboratory of Quantum Information. He completed his PhD in communication and information systems under Guangcan Guo, chief scientist of the “Quantum Control” 973 project. His research interests include quantum communication devices and networking.
©2019, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.