“Assistant” robot chats in Khmer

The man says “chhum reap sour” (hello) and the machine responds hello. Then he says “chhum reap lear” (goodbye) and the robot responds “lear heoy” (goodbye). When the man says “Cambodia”, the robot says “I love Cambodia”.

The conversation took place between one of the robot’s student creators and the robot itself, a small box-shaped machine with the Cambodian national flag printed on it and the words “National Polytechnic Institute of Cambodia” (NPIC) on its 3.2-inch screen.

“We want to create an automatic Khmer speech recognition system that can transform spoken Khmer into a self-generated written Khmer transcript without using the Internet,” Ny Virbora, or Bora, a senior student working in a team with three other members, told The Post.

Since they are studying software programming related to artificial intelligence (AI), their professor recommended that the four of them – Cheat Chea, Sokheang Ching, Ny Piseth and Ny Virbora – work together to create a robot that recognizes Khmer speech.

‘Assistant’ gets entry-level job

The robot – named “Assistant” – is only 22cm tall and starts operating automatically when powered on. It can communicate in Khmer through a Khmer automatic speech recognition system built on the Carnegie Mellon University (CMU) Sphinx speech recognition system and trained on a limited data set.

“The robot can transcribe speech into words and can answer some specific questions from us. So we question the robot in Khmer and it responds in Khmer with on-screen text,” says Bora, who studies electronic engineering.
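The question-and-answer behavior Bora describes can be pictured as a lookup from a recognized phrase to a fixed reply. The sketch below is only an illustration under that assumption and is not the team's code: the `RESPONSES` table and the `normalize` and `respond` helpers are hypothetical names, and the entries are simply the romanized exchanges quoted at the top of this article.

```python
# Hypothetical reply table, keyed by the romanized phrases quoted in the article.
# The real robot displays Khmer script; plain strings stand in for that here.
RESPONSES = {
    "chhum reap sour": "hello",
    "chhum reap lear": "lear heoy",
    "cambodia": "I love Cambodia",
}


def normalize(transcript: str) -> str:
    """Lowercase the transcript and collapse repeated whitespace."""
    return " ".join(transcript.lower().split())


def respond(transcript: str, default: str = "") -> str:
    """Return the stored reply for a recognized phrase, or `default` if unknown."""
    return RESPONSES.get(normalize(transcript), default)


if __name__ == "__main__":
    for heard in ["chhum reap sour", "Cambodia", "something else"]:
        print(repr(heard), "->", repr(respond(heard)))
```

A fixed table like this only covers phrases it has been given, which fits the article's point that the robot's answers depend on the data it was built with.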

The team plans to market the bots to private companies or corporate clients for customer service purposes, but they also believe it could be developed for the education sector, which was the group’s initial focus.

“We can use the automaton at the kindergarten and primary levels, where it can handle some inquiries and display images and videos to students,” Bora said.

The Assistant uses noise reduction technology to help it understand what is said to it. Its limited dataset includes 85 speakers and 157 selected Khmer words, which it uses to create sentences when responding to people.

To assess speech recognition accuracy, 100 Khmer transcriptions were randomly created from the training dictionary and used to calculate word and sentence error rate, according to the research team.

Ny Virbora, lead student of the four-person team who created a robot capable of transforming Khmer speech into Khmer writing displayed on its screen without using the internet. Heng Chivoan

“Khmer speech recognition reached 89.91% word recognition accuracy and 90.02% sentence recognition accuracy,” Bora said.
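The article does not say exactly how the team computed these figures. A common way to obtain a word error rate is to count the word-level edits (substitutions, insertions and deletions) needed to turn the recognizer's output into the reference transcript and divide by the number of reference words. The sentence error rate is the share of sentences that are not recognized exactly, and accuracy is often reported as one minus the error rate. The sketch below assumes that convention; the function names and the sample pairs are illustrative, not the team's data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (word_error_rate, sentence_error_rate) for (reference, hypothesis) pairs."""
    word_errors = 0
    total_words = 0
    wrong_sentences = 0
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        word_errors += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
        wrong_sentences += ref_words != hyp_words
    return word_errors / total_words, wrong_sentences / len(pairs)


if __name__ == "__main__":
    # Made-up sample pairs using phrases quoted in the article.
    sample = [
        ("chhum reap sour", "chhum reap sour"),
        ("chhum reap lear", "chhum reap sour"),
        ("lear heoy", "lear"),
    ]
    wer, ser = error_rates(sample)
    print(f"word accuracy: {1 - wer:.2%}, sentence accuracy: {1 - ser:.2%}")
```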

He said the ability to respond to inquiries depends on the database used to train the Assistant. Training in this context refers to using a set of software tools and scripts to build acoustic models, which, given enough acoustic data, produce a speech recognition model for any language that can work with the CMU Sphinx speech recognition system.

“This automaton is not so different from a toddler. To teach young children to recognize a word, we must first show them the word and say it to them several times. But the machine is different from a human in that if only one person speaks to it, it will recognize only that person. So we need to capture more data for it to effectively recognize a wide range of human voices,” Bora said.

“The voices of over 80 people were used to train this robot to recognize various accents. If you train it with the accent of one person, of course the automaton only recognizes that person,” said Srun Channareth, a professor in the master of engineering program of the Department of Electronics at the National Polytechnic Institute of Cambodia, who led the “Assistant” project.

‘Assistant’ finds success at work

Although the Assistant knows only 157 words, Nareth said, its accuracy rate with them is about 90%.

“This model is considered our first success because we can be sure that it will do things right. The tested accuracy for transcribing speech to text is 90%, and the accuracy on the set of words is 89%, because sentences made up of these words can sometimes be confusing,” Bora said. He cited Microsoft research findings, which estimate that a robot with 90% accuracy can be used in the field.

His team is now starting to recruit volunteers to help feed more data into the robot to improve its use of words and phrases.

“As for the number of words, it depends on where or what industry the robot will be used in,” Bora said. “If we use it for the education sector, we will see what words should be used by looking at the lessons in the textbooks. We only include Khmer words, as they will be translated into sentences.

“The particularity of our robot is that it works offline without using the Internet. Working offline means it works faster and it’s local computing that processes the data in the robot without having to connect to a server,” said Bora, 22.

The next step is to create a larger database, as the Assistant now stores a limited set of data and increasing its database size will take time.

“From the beginning of the project to the first phase of the robot, our team spent more than a year, because we needed time to do a lot of research on artificial intelligence (AI), even if the design of the robot itself was effortless,” Bora said.

In terms of hardware, the team only had to spend around $200 since the robot’s body was 3D printed by the university’s own 3D printer.

‘Assistant’ gets a promotion

“Soon we will create a new robot model and develop new software that acts as an assistant for use by businesses and restaurants as well as customer service points,” Bora said. “The new one will be smaller and more portable, but with an expanded screen size of up to 7 inches.”

The robot can act as a source of information for customers without them having to rely on human services or contact. For example, at the airport, passengers could ask the robot to show them scheduled flights from Phnom Penh to Japan. Once the Khmer language recognition system is fully developed, its results can also be translated into other languages.

For the robot to work properly, Bora added, there are still two big steps. First, update the software to make the system work faster, and second, increase its ability to recognize words.

“We’re also going to equip it with a camera so it has the ability to recognize people and be able to call them directly by name,” Nareth said. “We have a lot of plans, but what we lack is support for new research.”

‘Assistant’ returns to school

Bora told The Post that it’s very difficult to train robots to recognize words because the first step is to create a database that includes both voice and text data.

“In the coding system, we cannot write directly in Khmer because the computer does not know Khmer script,” he said. “So we use the Latin alphabet to represent Khmer, in Unicode format.”

The ‘Assistant’ robot can communicate in Khmer through the Khmer automatic speech recognition system using Carnegie Mellon University’s Sphinx software and a limited data set. PROVIDED

It took the team nearly six months to create their initial database as they needed to collect the data and create a list of words. But for the new words added, Bora says they wrote software that makes data collection faster and less time-consuming.

“Before, there were 10 of us collecting data and it took us up to two days, but then with the software I wrote, I spent a little over an hour on it and we collected 10 to 20 words,” he said.

The team grapples with hardware and software issues and bridges the gap between theory and practice. They can design robot bodies, but it can be tedious to produce real-world parts from computer-aided designs.

“We have to study the theories and lessons related to the development of AI and speech recognition, but when we put all these theories into practice, obviously the practice is not at all the same as the theory, so we need more time to experiment,” Bora said.

Developing software that supports Khmer takes a long time to get started, even though theory books and software support are now available. But the team can’t rely entirely on software from other countries because they want what they do to be accessible to Cambodians who only know the Khmer language.

“Whether there is a sponsor or not,” he said, “it is important that we focus on promoting strategies for new research. Although we cannot afford to buy supplies and materials, the school can help some. In addition, teachers or professors can also help find additional funding.”

‘Assistant’ explores new careers

Professor Channareth said that if a company develops a sufficiently advanced robot that can help communicate with customers or answer questions, it could use the robot to replace human workers and thus save money and increase efficiency.

Industries from telephone companies to banks, restaurants and cafes to customer service providers can benefit from this approach and already do in some parts of the world.

“When we walk into a bank now they have a machine that asks customers to press the number and choose the type of service they need and then it puts them in line to speak with a human teller to receive that service. What if the first machine they interacted with could simply provide the service?

“However, our primary focus is on educational apps that will be used by donors. By working with charities, students can strengthen their programming skills and write code and learn electronic engineering skills while effectively contributing to donor businesses,” he said.