Transcription is a demanding task for humans, even for professionals. It is harder still for a machine, given the nuances, errors, and interruptions that occur when a person speaks alone or with someone else. Nevertheless, Microsoft has recently unveiled an AI transcription system that it says outperforms professional human transcriptionists.
In a recently released report, Microsoft said its new AI system beat professional human transcriptionists. The company said it did not discover or develop a new algorithm; it only fine-tuned existing AI architecture.
The showdown was quite simple. Microsoft hired a professional human transcriptionist to type out the audio, and a second person then listened and corrected any errors made by the first. The scores were based on a standardized test for transcriptionists, on which the humans had error rates of 5.9 percent and 11.3 percent.
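The report as summarized here does not say how the error rate is calculated. Assuming it is the word error rate (WER) commonly used to score speech recognition, a minimal sketch of the calculation might look like the following. The function name and the sample sentences are illustrative only and do not come from Microsoft's test.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` against `reference`.

    Assumption: WER = (substitutions + deletions + insertions) / reference words,
    computed as a word-level edit distance. Punctuation is not stripped here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word and one missing word out of six.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.1%}")  # prints "33.3%"
```

Under this definition, an error rate of 5.9 percent would mean roughly six word-level mistakes for every hundred words in the reference transcript.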
Microsoft's AI system went through the same process, minus the second person to check for errors. The computer first had to be trained on 2,000 hours of human speech. The system scored error rates of 5.9 percent and 11.1 percent. The difference is small, but it shows a lot of promise. The company said it will take the system to the next level by testing its speech recognition in settings with a lot of background noise, such as a party or a street with cars speeding by and honking.
This step advances Microsoft's vision of machines that go beyond transcription and that humans can converse with. Satya Nadella, Microsoft's CEO, said this is in line with what he has described as the future of the company: artificial intelligence, with conversation as its cornerstone.