Researchers begin by submitting a new project request form. Once the request is approved, we send instructions for securely sending audio files to the ATS Manager. We recommend that researchers start with one or two files and evaluate whether the resulting transcripts will meet their needs.
The ATS Manager then submits these audio files for transcription (see "About the Technology," below).
Written transcripts are securely returned to researchers by the ATS Manager. Most transcripts are in Word document format, with time stamps, speaker recognition, and color-coding based on word confidence scores to aid in human review. For some languages, though, we can only provide transcripts in plain text format that may or may not include speaker recognition (see "Language support," below).
Our service is based primarily on Amazon Transcribe; we also use Google Cloud Speech-to-Text to provide support for additional languages. We use these services through existing IU contracts and have obtained special provisional approval from IU's data stewards for our service to be used with research data and PHI. We created this service so that researchers can use these tools without having to obtain, manage, and seek approval for their own cloud computing accounts through IU's contracts, and without having to work directly with output in JSON format.
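To give a sense of what the JSON output mentioned above involves, here is a minimal Python sketch that reads an Amazon Transcribe-style result and assigns each spoken word a confidence band, which is the kind of information the color-coding in our transcripts is based on. It is an illustration only, not the code our service runs: the function names, the sample data, and the 0.90 and 0.50 thresholds are all assumptions made for this example.

```python
import json

# Hypothetical cutoffs for illustration; the service's actual
# color-coding thresholds are not documented here.
HIGH_CONFIDENCE = 0.90
LOW_CONFIDENCE = 0.50


def confidence_band(score):
    """Map a word confidence score (0.0 to 1.0) to a review band."""
    if score >= HIGH_CONFIDENCE:
        return "high"
    if score >= LOW_CONFIDENCE:
        return "medium"
    return "low"


def words_for_review(transcribe_json):
    """Return (start_time, word, confidence, band) for each spoken word."""
    data = json.loads(transcribe_json)
    rows = []
    for item in data["results"]["items"]:
        # Punctuation items have no timestamps, so skip them.
        if item["type"] != "pronunciation":
            continue
        best = item["alternatives"][0]
        score = float(best["confidence"])  # stored as a string in the JSON
        start = float(item["start_time"])
        rows.append((start, best["content"], score, confidence_band(score)))
    return rows


# A small made-up sample in the shape of Amazon Transcribe output.
SAMPLE = json.dumps({
    "results": {
        "transcripts": [{"transcript": "Hello, wrld"}],
        "items": [
            {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
             "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
            {"type": "punctuation",
             "alternatives": [{"confidence": "0.0", "content": ","}]},
            {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
             "alternatives": [{"confidence": "0.42", "content": "wrld"}]},
        ],
    }
})
```

Calling `words_for_review(SAMPLE)` yields one row per spoken word. A reviewer would typically focus on the rows in the "low" band, since those are the words most likely to have been transcribed incorrectly.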
Amazon Transcribe currently supports a range of languages; these are listed in a drop-down menu in our project request form, and a full, up-to-date list from Amazon is also available on their website.
If a language is not supported by Amazon Transcribe, we can sometimes provide transcripts through Google Cloud. View a list of languages supported by Google Cloud Speech-to-Text on Google's website; note that for languages without support for speaker diarization, we are not able to provide transcripts that include speaker recognition.
Both Amazon and Google are increasing their language support; check the links above for the most current information.
We have generated several English-language sample transcripts that are available to view. We are grateful to the IU Bicentennial Oral History Project and the IU University Archives for providing several of the audio files used to generate these transcripts. You can listen to hundreds of oral histories from the project, and read their transcripts, at oralhistory.iu.edu.
Willis, Martha, June 21, 2010. Indiana University Bicentennial Oral History Project, IU Libraries University Archives, Bloomington.
Listen to the interview
View raw transcript produced by ATS
IU researchers have access to other automated transcription services that are approved for use with research data, including Microsoft Transcribe, Microsoft Teams transcription, and Zoom closed captioning. Read about how to use these services in the SecureMyResearch Cookbook Recipe, "Generate transcripts for study participant interviews."