AI Transcription software

Updated: August 01, 2023

AI transcription software uses artificial intelligence and machine learning algorithms to automatically convert audio and video recordings into written text. These tools transcribe speech from sources such as interviews, meetings, webinars, and podcasts, in real time or near real time. Because the underlying models are trained with machine learning, they can be improved over time to handle different accents, languages, and specialized terminology. By transcribing large volumes of audio quickly and cost-effectively, AI transcription software saves time for businesses, researchers, journalists, and content creators. Automated transcription also makes audio content more accessible, searchable, and easier to analyze, which changes how organizations handle and use their recorded data.


2022. Otter.ai challenger Airgram raises $10M to transcribe and time your video calls



Many tools compete to make video calls run more smoothly. Some of them, such as the voice transcription service Otter.ai, saw a surge in demand during the COVID-19 pandemic and attracted substantial investment. Now a newcomer named Airgram has entered the competition. In addition to transcribing Zoom, Google Meet, and Microsoft Teams calls into shareable and editable text, Airgram aims to help users keep meetings efficient. It lets users project their meeting agenda onto the screen and provides a timer that reminds them not to exceed the allotted time.


2021. Microsoft is acquiring Nuance for $19.7B



Microsoft has announced plans to acquire Nuance Communications, a leader in speech-to-text software, for $19.7 billion. The move aims to strengthen Microsoft's foothold in the healthcare industry, where Nuance has had considerable success in recent years. Microsoft had previously introduced Microsoft Cloud for Healthcare, and the acquisition is a way to accelerate its growth in this vertical. Nuance's healthcare products include Dragon Ambient eXperience, Dragon Medical One, and PowerScribe One for radiology reporting. Its broader portfolio also includes Dragon Dictate, a speech-to-text (dictation) product for consumers and businesses whose origins trace back to the early 1990s.


2021. Microsoft launches transcription and translation app for in-person meetings



While various real-time transcription apps exist for mobile devices today, such as Otter.ai or Google's Recorder app for Pixel devices, Microsoft's new Group Transcribe app revolutionizes the transcription process during meetings by introducing a collaborative approach. In this innovative solution, all participants simultaneously record the meeting on their individual devices, resulting in enhanced accuracy. Additionally, the app offers real-time translation for languages spoken in over 80 different regions. Powered by advanced AI speech and language technology, the app achieves improved transcription precision and speaker attribution by analyzing each phone's microphone input to determine speaker volume.
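One way to picture volume-based speaker attribution is the sketch below. It is an assumption made for illustration, not Microsoft's actual algorithm: for each time window, every phone contributes the audio samples it captured, and the speech is attributed to the owner of the phone with the highest RMS level. The function names and the data layout are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of one window of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def attribute_speaker(window_by_device):
    """Return the device owner whose microphone was loudest in this window.

    window_by_device maps an owner name to the samples that owner's phone
    captured during the same time window (hypothetical data layout).
    """
    return max(window_by_device, key=lambda owner: rms(window_by_device[owner]))


window = {
    "Ana": [0.40, -0.50, 0.45, -0.35],  # Ana is speaking: her phone is loudest
    "Ben": [0.10, -0.12, 0.08, -0.09],
    "Chi": [0.05, -0.04, 0.06, -0.05],
}
speaker = attribute_speaker(window)
```

A real system would also need to synchronize the recordings and cope with phones of differing microphone sensitivity; the sketch ignores both.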


2021. Meeting transcription service Otter.ai raises $50M



In the past year, the voice transcription startup Otter.ai has demonstrated its commitment to the future of remote work by integrating its product with popular meeting applications such as Zoom and Google Meet. This strategic move has resulted in the company securing a significant $50 million in Series B funding. Otter.ai has gained traction among various businesses, including professional services, pharmaceutical companies, financial services, and multinational corporations where employees operate across different time zones. Looking ahead, Otter.ai aims to enhance its value proposition for corporate use cases by expanding its services beyond meeting transcripts and venturing into what it terms "conversation intelligence." Otter.ai's service provides a convenient means of recording meetings, whether in-person using a mobile app or online through its integrations with well-known web conferencing platforms.


2020. Microsoft brings transcriptions to Word



Microsoft has launched Transcribe in Word, a new transcription service for Microsoft 365 subscribers. It lets users transcribe both live and pre-recorded speech and edit the result directly within Word. The service is currently available only in the online version of Word, only in English, and only for paid Microsoft 365 accounts, with plans to expand to other platforms in the future. Word is also getting new dictation features that allow users to format and edit their text by voice. With these capabilities, Microsoft is competing against startups like Otter and against Google's Recorder app, each of which has its own strengths and weaknesses.


2020. Huddl.ai wants to bring more intelligence to online meetings



Huddl.ai is an innovative startup aiming to incorporate artificial intelligence into online meeting technology. By leveraging AI tools, Huddl.ai transcribes meetings, extracts important points, and assists users in comprehending the essence of lengthy sessions without the need to extensively review notes. Unlike existing solutions such as Zoom, Cisco WebEx, Google Meet, and Microsoft Teams, which merely provide a link to a cloud room for participants to join, Huddl.ai seeks to introduce a more organized approach to the entire meeting process.


2019. GoToMeeting improved AI-transcription in videoconferencing software


LogMeIn has unveiled the latest version of GoToMeeting, with an emphasis on a user-friendly, intuitive experience and on giving IT departments more control over deployment, management, and security. The release includes a range of updates aimed at making the collaboration platform easier to use for both IT and end users before, during, and after meetings. These updates include a completely redesigned video-first interface that is consistent across all devices, improved audio quality, real-time note-taking, and AI-powered transcription. Hosts can now create multiple personalized meeting rooms with custom branding, so teams can collaborate instantly whenever needed. GoToMeeting has also updated its calendar plugins and integrations, which work with Office 365, Outlook, G Suite Calendar, Salesforce, and more, and it continues to support integrations with tools like Slack and Zoho.


2018. Microsoft added AI transcription to OneDrive and SharePoint



Microsoft is introducing video and audio transcription capabilities to OneDrive for Business and SharePoint, enhancing the accessibility of various digital content for users. This new feature utilizes AI technology from Microsoft Stream (previously known as Office 365 Video) to automatically generate a complete transcript of dialogues when viewing videos or listening to audio files. The resulting text data will be stored in the Microsoft Cloud, offering cost-effectiveness and enhanced security compared to relying on external transcription tools. Office 365 subscribers can expect to access this new service later in the year.


2018. Google Voice version for enterprise came to G Suite



Google is introducing an enterprise version of its Google Voice service exclusively for G Suite users. Google Voice has traditionally been popular among consumers, offering numerous advantages beyond a regular phone number, and the enterprise version aims to give companies similar functionality. This includes AI-powered features such as voicemail transcription, which employees may already be using outside company guidelines. Administrators can provision and port phone numbers, access detailed reports, and configure call routing. They can also assign phone numbers to departments or employees, giving them a universal number that isn't tied to a specific device, so that people are easier to reach. The enterprise version also includes spam filtering, which helps deal with the influx of unwanted automated calls.


2018. GoToMeeting added AI transcription, Amazon Alexa integration



LogMeIn has recently implemented updates to its GoToMeeting video and audio conference platform, introducing several new features. These enhancements encompass a text chat function, an AI transcription service, and integration with Amazon's voice assistant, Alexa. The Business Messaging feature facilitates one-to-one or group chats among employees and external clients through the GoToMeeting desktop application or a standalone mobile app. With just a single click, users can seamlessly transition from a message thread to a video or audio conference. Additionally, the Smart Meeting Assistant transcribes meeting audio and securely stores the text in the cloud, enabling easy sharing afterwards. This advancement eliminates the need for attendees to worry about note-taking and allows them to concentrate fully on engaging in discussions with colleagues.


2017. Box applied AI to content management



Box has recently introduced Skills and its accompanying SDK, known as Skills Kit. These new offerings empower organizations and developers to extract valuable insights from their extensive content repositories within Box datasets, utilizing machine learning techniques to unlock the inherent commercial value of their content. Box is currently showcasing three initial Box Skills, leveraging machine learning tools from Google Cloud and Microsoft Azure to address common business use cases. These include image recognition, which involves detecting objects and concepts within image files, performing optical character recognition (OCR) to extract text, and automatically assigning keyword labels to images for efficient metadata creation. Additionally, Box Skills encompass audio transcription and analysis, enabling the creation and indexing of text transcripts from audio files for seamless search and manipulation across various use cases. Lastly, video indexing employs text transcription, topic detection and indexing, and facial recognition to analyze video files comprehensively.


2017. GoDaddy launched business phone system SmartLine



Domain and hosting provider GoDaddy is expanding its services to include voice and telephone capabilities through a new app called SmartLine. This app aims to address a common requirement for small businesses—using their smartphones for both personal and work calls while preserving privacy. SmartLine enables users to create a secondary number that seamlessly connects to their iOS or Android devices. Notably, the mobile app allows users to fully configure SmartLine without the need to visit a website. Additional features include the ability to establish business hours, ensuring that calls outside of those hours are directed to voicemail, as well as the option to receive voicemail transcriptions. GoDaddy intends to differentiate itself by introducing further functionalities, including vanity numbers, toll-free numbers, and the ability to share a single phone number among multiple employees' cell phones. The unlimited calling and texting plan for SmartLine is priced at $9.99 per month.


2010. Twilio created Google Voice for web-applications



Virtual PBX systems emerged long ago and have been gradually taking market share from traditional PBXs. This kind of service is even available to consumers: Google Voice, for example, is a personal virtual PBX that handles incoming calls, SMS, and voicemail. For business, however, the main drawback of such services is that they are designed for interaction with people, not with applications. To improve business performance, each call should at least be logged in some application (e.g. a CRM system) and, better still, should trigger automated operations (for example, opening the caller's customer card or resolving the customer's problem through an interactive voice menu). For this reason, voice API services that make it easy to integrate telephony into web applications have recently appeared. Twilio is one of them.
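
The call-handling flow described above can be sketched roughly as follows. This is a minimal illustration, not Twilio's API: the `CallEvent` fields, the in-memory `CRM` class, and `handle_incoming_call` are all hypothetical names, and the returned XML is only modeled on the `<Response>`/`<Say>` style of markup that such voice services use to script a call.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class CallEvent:
    """Hypothetical payload a voice API might send to a web application."""
    call_id: str
    caller: str


@dataclass
class CRM:
    """Toy in-memory CRM: known customers plus a call log."""
    customers: dict = field(default_factory=dict)  # phone number -> name
    call_log: list = field(default_factory=list)   # (call_id, caller) tuples

    def log_call(self, event: CallEvent) -> None:
        self.call_log.append((event.call_id, event.caller))


def handle_incoming_call(event: CallEvent, crm: CRM) -> str:
    """Log the call, then return XML instructions for the voice service."""
    crm.log_call(event)                     # every call is at least logged
    name = crm.customers.get(event.caller)  # look up the caller's card
    if name is not None:
        greeting = f"Welcome back, {name}."
    else:
        # Unknown caller: fall back to a simple voice menu prompt.
        greeting = "Welcome. Press 1 for sales or 2 for support."
    return f"<Response><Say>{escape(greeting)}</Say></Response>"


crm = CRM(customers={"+15550100": "Alice"})
reply = handle_incoming_call(CallEvent("c1", "+15550100"), crm)
```

In a real deployment the handler would sit behind an HTTP endpoint that the voice service calls for each incoming call; the sketch leaves out the web framework to stay self-contained.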