The news about the release of the Polish version of Google Assistant turned out to be a little more important than we expected. Because Polish is one of the most difficult languages in the world and Poland is not a major market for Google, we had been placed further back in the technological queue and were at risk of missing a small revolution: voice control. I deliberately left out the noun here, because what can be controlled by voice? Well… anything that can be connected to the Internet and powered by Google services. We decided to explore this area and built a POC (Proof of Concept) solution that uses Google Assistant to support employee management processes in a company. Read on for the case study and the conclusions from the project.
Imagine entering your house and telling your phone, ‘Ok Google, I’m home!’ The lights and the TV turn on, and the microwave heats your dinner. These are things we can configure ourselves, but what if we want to apply the latest technology to more complex topics? For example, to an area dominated by humans, such as human resources management?
For one of our clients (a large international corporation), we completed a POC to check whether a voice assistant can be used in an office environment to optimise the company’s HR processes.
The POC was aimed at:
- Defining the technological limitations of the solution: for example, the list of supported devices (iOS and Android require two different technologies), the performance of the solution, and the cost implications of technological gaps. Namely, using the Google API is free, but since it does not cover all the needs, part of the work must be done by the developer.
- Defining other barriers, such as the technological maturity of the target group. For example, the majority of smartphone users do not know that they can use a voice assistant, let alone use it effectively. Humanity had to break the same kind of barriers when mouse-controlled computers and touch-controlled smartphones entered the market.
- Assessing the validity of the solution, i.e. whether it will bring measurable benefits or generate more problems for the organisation. Some organisations are aware of the need to test various solutions, as one of them might become the future standard. In this particular case, the client pointed to the growing popularity of smart speakers and voice assistants in private use.
In the first version, the solution was based on DialogFlow, which serves as the mechanism for categorising the sentences spoken by the user. The aim was to let the company’s employees talk to a virtual assistant, so that they could ask questions about employment, benefits, maternity leave or vacation. Voice recognition and transcription are handled by the voice recognition engine on Android and by Siri on iOS.
Pro tip: when it comes to speech recognition accuracy, Google’s tool remains the unquestionable leader, followed by Amazon Alexa and then Microsoft Cortana, which, despite its lack of smartphone support, performs better than Apple Siri in terms of the number of correct answers.
DialogFlow 1.0 from Google (we used the older version because 2.0 supports only Android) is responsible for the initial processing and categorisation of the user’s sentences, while our proprietary solution, hosted in the Amazon cloud, matches specific categories to the content available in the customer’s internal network. For example, we ask the phone a question and, in return, receive the specific paragraphs of the company regulations that cover the issue we are interested in.
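The second stage of that pipeline, matching a category to regulation paragraphs, could be sketched roughly as follows. This is only an illustration under stated assumptions: the category names, the `Paragraph` type, the `REGULATIONS` store with its placeholder texts, and the `answer` function are all hypothetical and do not come from the actual project. The category is assumed to arrive as a plain string from the categorisation step.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Paragraph:
    section: str  # section number in the regulations (hypothetical)
    text: str     # paragraph content (placeholder text only)


# Hypothetical stand-in for content held in the customer's internal network.
REGULATIONS: Dict[str, List[Paragraph]] = {
    "leave.vacation": [Paragraph("4.1", "Placeholder text about vacation leave.")],
    "leave.maternity": [Paragraph("4.3", "Placeholder text about maternity leave.")],
}


def paragraphs_for(category: str) -> List[Paragraph]:
    """Return the paragraphs mapped to a category, or an empty list."""
    return REGULATIONS.get(category, [])


def answer(category: str) -> str:
    """Build the reply for a category produced by the categorisation step."""
    found = paragraphs_for(category)
    if not found:
        return "Sorry, I could not find anything on that topic."
    return "\n".join(f"{p.section}: {p.text}" for p in found)
```

Keeping the mapping in a separate layer like this means the regulations content can change without touching the categorisation step.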
The project led us to the following conclusions:
- we are dealing with an entirely new method of controlling software, which brings completely new UX challenges: how well users know the limitations of the technology seriously affects whether, and to what extent, they can take advantage of the solution,
- as it turns out, speech recognition systems are sensitive to the speaker’s accent; for example, speakers with Irish or Scottish accents were not understood by an assistant tuned to standard (London) British English,
- because DialogFlow can store the context of the conversation thread, it enables in-depth analysis of the processed document,
- the DialogFlow recognition and categorisation model is based on keywords, which limits functionality quite severely: keywords must be managed consciously and economically to avoid the traps of ambiguity. I know it sounds complicated, and it really is. For example, suppose we ask Google Assistant about our leave. There are several types of leave (vacation, maternity, on demand, etc.), we can talk about leave in terms of days off, equivalents or rights, and this information is spread over several places in the company’s regulations. We need to give Google the required context. If we say “Ok Google, how many days of my leave do I have left, I want to go on holiday,” Google will be able to indicate the number of vacation days we are entitled to, or even suggest information on the company’s vacation benefits.
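The ambiguity trap from the last point can be illustrated with a toy keyword matcher. This is a minimal sketch and not DialogFlow’s actual algorithm; the category names, the keyword sets and the `categorise` function are all hypothetical. It scores each category by how many of its keywords occur in the sentence and returns every category that ties for the best score.

```python
import re
from typing import Dict, List, Set

# Hypothetical keyword sets; "leave" is deliberately shared to show ambiguity.
KEYWORDS: Dict[str, Set[str]] = {
    "leave.vacation": {"leave", "vacation", "holiday"},
    "leave.maternity": {"leave", "maternity"},
    "leave.on_demand": {"leave", "demand"},
}


def tokenize(utterance: str) -> Set[str]:
    """Lower-case the sentence and split it into a set of words."""
    return set(re.findall(r"[a-z]+", utterance.lower()))


def categorise(utterance: str) -> List[str]:
    """Return the best-scoring categories; more than one means ambiguity."""
    words = tokenize(utterance)
    scores = {cat: len(words & kws) for cat, kws in KEYWORDS.items()}
    best = max(scores.values())
    if best == 0:
        return []  # no keyword matched at all
    return sorted(cat for cat, score in scores.items() if score == best)


# "leave" alone matches all three categories equally.
print(categorise("How many days of leave do I have left?"))
# Adding "holiday" gives the vacation category a higher score.
print(categorise("How many days of my leave do I have left, I want to go on holiday"))
```

In this sketch the first question returns all three leave categories, while the second, which adds the word "holiday", narrows the result to the vacation category alone. That is the kind of extra context the user has to supply.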
Keep in mind that the main players, i.e. Google, Apple, Microsoft and Amazon, are constantly improving their products. The problems described above may therefore be history in just a month.
I hope you find this material useful. If you want to implement Google Assistant in your solution or organisation and thus optimise your company’s business processes, please contact us.