How do top European R&D staff view the BAT smart speakers?

Since Amazon launched its Echo smart speaker more than two years ago, Google, Microsoft, Apple and other technology giants have poured into the market behind the industry leader, and their participation has helped turn the smart speaker into a widely known piece of smart hardware. Amazon Echo sales now exceed $1 billion.

The technology giants in the Chinese market are adding to the enthusiasm.

On July 5th, Alibaba Group released its first artificial intelligence smart speaker in Beijing; in the future the product will be tied in with e-commerce shopping. In the same period, the audio brand Himalayan FM, representing the audio content providers that usually act as third-party partners, launched its own Xiaoya AI speaker. Another e-commerce giant, Jingdong (JD.com), released the smart speaker "DingDong" in the hope of occupying the entrance to the smart home field, and Tencent executives publicly stated that Tencent's smart speaker, "Ears", will be released around August. Xiaomi, Baidu, Lenovo and others have also launched smart speaker products. At this point, nearly all of China's BAT giants (Baidu, Alibaba and Tencent) have entered the smart speaker field.

One reason the giants favor smart speakers is that their convenient voice interaction supports functions such as answering queries, purchasing goods and controlling home appliances. As a result, the smart speaker is regarded as the entry point to the smart home and as the carrier of the next generation of human-computer interaction.

This is certainly not just wishful thinking on the part of the world's leading technology companies. According to statistics, nearly 40 million users in the United States use a voice-operated speaker at least once a month. Market research firm Strategy Analytics pointed out that smart speakers shipped 5.9 million units worldwide in 2016, and that shipments will increase tenfold by 2022, with a market value of 5.5 billion US dollars.
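To put the shipment forecast in perspective, the short calculation below works out what a tenfold increase between 2016 and 2022 would mean. The 5.9 million figure and the tenfold multiple come from the text above; the assumption of steady compound growth over six years is ours, added purely for illustration.

```python
# Figures quoted in the article.
units_2016_millions = 5.9
growth_multiple = 10          # "tenfold" increase
years = 2022 - 2016           # six years between the two data points

# Projected shipments if the tenfold forecast holds.
units_2022_millions = units_2016_millions * growth_multiple

# Assumption (not from the article): growth compounds at a constant annual rate.
implied_annual_growth = growth_multiple ** (1 / years) - 1

print(f"Projected 2022 shipments: {units_2022_millions:.0f} million units")
print(f"Implied compound annual growth: {implied_annual_growth:.1%}")
```

Under that assumption, shipments would need to grow by a little under half every year to reach the forecast.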

Other third-party organizations predict that by 2018, 30% of human-computer interaction will take place through natural language, and 2018 is expected to be a critical year for the development of smart speakers.

Against the backdrop of the domestic smart speaker boom, FT sought out core R&D staff at two overseas companies that approach intelligent voice technology with different emphases, and talked with them about the speech recognition technology behind the hardware.

Data availability is the bottleneck of "prediction" in speech recognition

Over the past few years, Pawel has seen more and more intelligent voice products appear on the market. In his opinion, this is a signal. "A new transformation is coming, because a series of products are changing people's daily lives. Amazon Echo is a very good example: it makes people realize that being without a voice assistant is as inconvenient as living without a smartphone."

At Emotech, a technology-driven startup, the R&D team often tries to add human-like capabilities to smart speech. “We let the device better understand non-speech cues and respond more expressively. Simply put, we want to create a device that fits people's lifestyles.” Pawel explains that what most distinguishes Emotech from other startups is that it uses both hardware and software platforms to provide a personalized assistant solution, whereas most other startups focus on only one of the two.

In developing speech recognition for Emotech's Olly robot, the team first built a seed model from an open corpus and then iterated, collecting more closely matched in-domain data to make the model better suited to real acoustic environments. Beyond the hardware, Olly's R&D also covers a patented brain-like engine, a robot psychology architecture and intelligent speech recognition.

According to Pawel, in addition to semantic understanding and speech recognition, Olly can identify the user more accurately by combining voice and face recognition, and so better meet the user's needs. Olly can also detect changes in the user's mood and interact with the user emotionally through LED color, changes in shape and its own movements. This is what makes Olly most distinctive.

As a woman, Marily is adept at examining how intelligent speech recognition technology really is from the user's point of view.

Marily loves cooking, but it is hard to handle a mobile phone in the kitchen, so almost every day she uses her phone's voice transcription function to send messages to family and friends while she cooks. “Using it while cooking is really great, especially if you need a timer, or need a recipe based on the ingredients already in the refrigerator.” This is exactly where intelligent speech recognition is needed in everyday life.

For Marily, the fact that computers can turn voice into text is a wonderful thing. In fact, speech recognition technology appeared as early as the 1920s. At first, the technology could only recognize digits spoken by a particular person. By 1962, a system created by IBM could recognize 16 words. In the 1970s, speech recognition systems were able to distinguish the voices of different people, but the speaker needed to pause between words.

Marily said that today's speech recognition systems are based on hidden Markov models. The principle is to build a stochastic model from known sounds and then match unknown sounds against what a particular model would produce. In other words, speech recognition under this model lets the machine accurately "guess" what we are going to say. However, like other kinds of prediction, this requires analysis of a large amount of data to drive the system. "Data availability is the bottleneck of 'prediction' in speech recognition, but I believe that more and more people will come into contact with speech recognition applications in the future. If we can picture what is new in the future, by then we will really have achieved it."
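The idea of matching an unknown sound against models built from known sounds can be sketched with a toy example. The code below is a simplified illustration, not the method used by Google or Emotech: the two word models, their two states, and every probability are invented, and real systems work on continuous acoustic features rather than three quantized symbols. It scores an observation sequence under each word's hidden Markov model with the standard forward algorithm and picks the more likely word.

```python
# Toy illustration only: all models and probabilities below are hypothetical.

def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) computed with the HMM forward algorithm."""
    n_states = len(start)
    # alpha[s]: probability of the observations seen so far, ending in state s
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def recognize(obs, models):
    """Pick the word whose model gives the observations the highest likelihood."""
    scores = {word: forward_likelihood(obs, *params) for word, params in models.items()}
    return max(scores, key=scores.get), scores


# Each model: (start probabilities, transition matrix, emission matrix).
# Emission rows give the probability of symbols 0, 1 and 2 in that state.
MODELS = {
    "yes": (
        [1.0, 0.0],
        [[0.6, 0.4], [0.0, 1.0]],             # left-to-right transitions
        [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
    ),
    "no": (
        [1.0, 0.0],
        [[0.6, 0.4], [0.0, 1.0]],
        [[0.1, 0.7, 0.2], [0.2, 0.7, 0.1]],
    ),
}

unknown_sound = [0, 0, 2, 2]                  # a made-up quantized recording
best_word, scores = recognize(unknown_sound, MODELS)
print(best_word, scores)
```

The probabilities in such models have to be estimated from recorded speech, which is why the amount of available data limits how well the "guess" works.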

Nowadays, smart voice is almost everywhere in Marily's life. Before going out, she asks Siri whether she should bring an umbrella, uses Facebook's M to make plans and complete payments, and then tells Google Home to dim the lights in the living room and turn on the TV to watch "Game of Thrones". Alexa automatically helps her place orders on Amazon. "My favorite Google Home voice recognition feature is that it knows who is talking to it by matching the voice to an identity," Marily said.

Balancing R&D and commercialization: building technology that makes people's lives better

Pawel's Emotech and Marily's Google sit at opposite ends of the intelligent voice industry chain: one is a behemoth, the other a startup with distinctive R&D capabilities, and this gives the two of them different ideas about smart voice development. In Pawel's experience, large companies can provide a more focused work environment: "You are assigned a clearly defined problem, and you can spend a few months solving it." In a startup, by contrast, the goal may change dramatically. "The problems you face will change as your goals change, and they are usually beyond your comfort zone."

In Pawel's view, working at a startup is like an adventure, with many ups and downs along the way. “At Emotech, we often develop, integrate and test certain modules in a very tight timeframe. This process is not easy. The most important part is to learn to be flexible, persistent and patient.”

Despite her big-company background at Google, Marily recently made another attempt: “trying to build a startup with a hackathon project as the starting point.” Of these two identities, one leans toward academic-style research, whose purpose is to explore and verify hypotheses, while the other has to consider commercial and industrial applications; a startup needs to find a "home" and to be pursued with passion. Both appeal to Marily.

In fact, when facing a new field, both startups and technology giants inevitably go through trial and error, and commercialization and technology R&D constrain each other to some extent when they coexist. As an R&D staff member at a commercial company, Pawel takes a fairly objective view. He believes that all R&D costs money, and that a way must always be found to pay for it. This requires everyone involved in the company to find a real balance point. For Pawel, that balance lies in building technology that makes people's lives better.

Intelligent speech recognition is a new technology that fits this view. “To some extent this reinforces a complementary relationship. The more opportunities people have to use speech recognition, the better the product becomes, and the company keeps improving the product. As usage rises, the product collects more of the data it needs, which is an important way to improve it,” Pawel said.

Users “show” the next generation of technology or products, so tracking users is very meaningful

As frontline practitioners in academia and technology companies, Pawel and Marily often attend industry gatherings in the UK.

What impresses Marily most is that the brainstorming at these gatherings often brings new ideas and new products. “Sometimes these discussions lead directly to the formation of new product development teams. Google Street View and Facebook's video product were born this way.” Marily said that the last time she attended an industry gathering, she chatted with friends in the field about new technologies, publications, electronics and more. In her opinion, industries that sometimes seem unrelated can strengthen potential synergies for artificial intelligence or intelligent speech recognition.

This kind of brainstorming, which stems from different cultural backgrounds and ways of thinking, also happens within the Emotech team. The team is very diverse: its 30 employees speak 22 native languages, and unlike traditional technology companies staffed only by scientists and programmers, it includes musicians, gamers, psychologists and people from other backgrounds.

Talking about the company's future plans for smart voice, Pawel said that Emotech will work harder to improve its ASR (Automatic Speech Recognition) system so that it can cope with several people talking at the same time, or work in a very noisy environment such as a cocktail party. This will make some interesting applications possible. "Let the device collect multiple sound sources, or focus on just one talker and ignore the others. I hope to build more modes for conversational interfaces to use, because when a task is as complex as a dialogue interface, much information is hidden within the conversation." Interestingly, humans themselves are not good at listening to several people at the same time. Instead, in a noisy or reverberant environment, they pay attention to a single speaker.

For Marily, talk of the future first brings to mind the science fiction movies of the 1960s. "At the time, we felt that talking to a machine was completely fictional. Now what fiction described has become reality. We already have our own AI assistants. They may not look like robots, but they are already here, and we can use them as we see fit." Building on this, she believes that in the future people will be able to customize their own AI assistants, including their voice and appearance.

Marily boldly imagines a future in which her AI assistant is tailored to her personal life and adjusts automatically according to time and place. Such assistants will learn how to "understand" the needs of their owners. For example, you could make your own AI assistant resemble an actor you like very much. This is just one of countless examples of technology companies turning AI technology into reality.

The connection between personal assistants and AI speech recognition is that technology is changing the way humans and computers interact, and speech recognition makes humanized human-computer interaction possible. “I am very much looking forward to seeing where AI and speech recognition technology will go in the future. Anything is possible before I get my own C3PO.”

Marily said that she often finds "gaps" in technology and looks for innovative ways to fill them. She believes that users can always "show" the next generation of technology or products, so tracking users is a very interesting thing to do. This is also the source of inspiration for her startups. "Suppose we are users who especially like a product, but it obviously has some shortcomings. How can those shortcomings be improved? Rather than stand by the water longing for fish, it is better to go back and weave a net. Think about your own mission, then go and carry it out."

It can be seen that voice interaction is one of the most frequently used applications in the smart home field. Smart voice can be combined with home equipment such as TVs, audio systems, air conditioners, curtains, lamps and toys, and with the smart home control center, so that all functions are controlled through voice interaction from a single entry point. With the continued efforts of companies such as the giant Google and the startup Emotech, there is much to look forward to in smart voice interaction.

Guest introduction

Pawel Swietojanski (Pawel for short) is a speech recognition researcher on the team behind the intelligent home robot Olly at Emotech, a top European technology-driven artificial intelligence startup. In November 2016, Emotech's intelligent emotional robot Olly won four CES innovation awards. In August 2016, the project completed a total of $10 million in Series A financing. Intelligent speech recognition has always been one of the company's most important foundations.

Prior to joining Emotech, Pawel was a Ph.D. student at the University of Edinburgh's Centre for Speech Technology Research. He has published many articles on speech and language processing and has contributed to acoustic modeling for speech recognition. Two of his papers have won awards: a best paper award in spoken language technology from the Institute of Electrical and Electronics Engineers, and an IBM Research best student paper award in spoken language technology. He has also worked as an intern at Microsoft Corporation and has been invited to be a visiting researcher at the Japan Institute of Information and Communications Research.

Marily Nika (hereafter called Marily) holds a Ph.D. in Computer Science from Imperial College London, where her research sought to predict Internet phenomena by modeling virality. During her Ph.D., she worked as a data analyst at Google and Facebook. After graduation, she joined Google in Silicon Valley as an engineering project manager involved in the development and management of Google Assistant, Google Home and Data & Artificial Intelligence. She has spoken at TEDx three times and won the Influential Women Award in Science and Engineering in 2015. Imperial College awarded her a medal for her outstanding contribution to science and technology. She is also the first female geek to receive the Google Anita Borg Memorial Scholarship. Today, Marily is also the CEO of EdTech Ventures.
