Converting or Transcribing audio to text using C# and .NET System.Speech

Recently, I had a project where I needed to convert some audio to text. Finding the code took more googling than I expected, so I put together a small project that demonstrates its usage and makes it easier for others to find.

This code uses the .NET System.Speech.Recognition namespace and shows how to transcribe audio in C# from either a microphone or a previously recorded .wav file.

The code can be divided into 2 main parts:

Step 1: Configuring the SpeechRecognitionEngine

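// Create the engine and listen on the default microphone.
_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
// A DictationGrammar enables free-form dictation.
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
// Keep recognizing until stopped, rather than stopping after one phrase.
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);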

At this point the engine is ready to transcribe audio from the microphone. To actually get at the results, though, you need to handle some of its events.

Step 2: Handling the SpeechRecognitionEngine Events

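// Detach first so the handlers are never subscribed twice.
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);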

private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // real-time results from the engine
    string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // final answer from the engine
    string finalAnswer = e.Result.Text;
}

That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use _speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile); instead of _speechRecognitionEngine.SetInputToDefaultAudioDevice();.
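
To see the two steps working together, here is a minimal console sketch. It is not the attached project. It assumes Windows, a reference to the System.Speech assembly, and an installed speech recognizer. The class name TranscribeDemo and the optional command-line argument for the .wav path are made up for illustration.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

// Hypothetical demo class; not part of the attached project.
class TranscribeDemo
{
    static void Main(string[] args)
    {
        // Assumption: an optional first argument is the path to a .wav file.
        bool useWaveFile = args.Length > 0;

        using (var engine = new SpeechRecognitionEngine())
        using (var done = new ManualResetEvent(false))
        {
            engine.LoadGrammar(new DictationGrammar());

            // Print each final result as the engine produces it.
            engine.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);

            // Raised when the asynchronous operation finishes, for example
            // after the file ends or after RecognizeAsyncStop().
            engine.RecognizeCompleted += (s, e) => done.Set();

            if (useWaveFile)
                engine.SetInputToWaveFile(args[0]);
            else
                engine.SetInputToDefaultAudioDevice();

            engine.RecognizeAsync(RecognizeMode.Multiple);

            if (!useWaveFile)
            {
                Console.WriteLine("Listening. Press Enter to stop.");
                Console.ReadLine();
                engine.RecognizeAsyncStop();
            }

            // Wait for the engine to finish before disposing it.
            done.WaitOne();
        }
    }
}
```

Run it with no arguments to transcribe from the microphone, or pass the path to a .wav file to transcribe that file instead.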

There are a bunch of different options in these classes and they are worth exploring in more detail. This covers the bare essentials for a prototype. I have attached a full example and encapsulation here.
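
As one starting point for that exploration, the sketch below touches a few of the other members: the Confidence score on each result, the Alternates list, and the SpeechRecognitionRejected event. The class name RecognitionOptionsSketch and the 0.6 threshold are assumptions chosen for illustration, not recommended values.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical helper; not part of the attached project.
static class RecognitionOptionsSketch
{
    // Assumed cut-off for illustration only; tune it for your audio.
    const float MinConfidence = 0.6f;

    public static void Attach(SpeechRecognitionEngine engine)
    {
        // Ask the engine to keep up to three alternate phrases per result.
        engine.MaxAlternates = 3;

        engine.SpeechRecognized += (s, e) =>
        {
            if (e.Result.Confidence < MinConfidence)
            {
                Console.WriteLine("Low confidence ({0:F2}): {1}",
                    e.Result.Confidence, e.Result.Text);
                return;
            }

            Console.WriteLine(e.Result.Text);

            // Other phrases the engine considered for the same audio.
            foreach (RecognizedPhrase alternate in e.Result.Alternates)
            {
                Console.WriteLine("  alternate: {0} ({1:F2})",
                    alternate.Text, alternate.Confidence);
            }
        };

        // Raised when the engine hears speech but cannot match it.
        engine.SpeechRecognitionRejected += (s, e) =>
            Console.WriteLine("Speech was heard but not recognized.");
    }
}
```

Call RecognitionOptionsSketch.Attach(engine) before starting recognition so the handlers are in place when the first result arrives.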

13 Comments

  1. April 28, 2012    

    Awesome! I was looking for exactly this and having the code project with it made it even sweeter! Have you planned on doing anything more with the speech recognition? I’m just starting out using it and C# for that matter (coming from Java) for home automation. Would be interested in any other projects related to speech recognition. Thanks again!

    • May 6, 2012    

      Hey Nicholas,

      I really do not have any more plans for working with speech recognition. I found the Microsoft solution and a Google solution. I would like to get my hands on the Dragon NaturallySpeaking SDK and see how good that is, but I have no plans in the immediate future for working on it.

      Good luck with your home automation and thanks for responding to my blog!

      • Maham
        July 5, 2014    

        Hi Micheal, thanks for the help. I am currently working on a speech recognition project in which I have to convert video files into text. I tried to convert an audio file (.mp3) into text with your code, but it is not working properly. Could you please tell me the code so I can proceed with my work? Please reply as soon as possible.

    • Samia
      July 5, 2014    

      Hey, can you please send me the code? I have also been looking for this audio-to-text conversion in C# with Visual Studio. I have tried the code above, but I am getting so many errors. Please help; I will be very grateful to you.

  2. May 18, 2012    

    Hey can you guide to build and debug this project.

  3. September 25, 2012    
  4. John Nelson
    September 29, 2012    

    What was the Google solution?

  5. Julian
    December 3, 2012    

    Like the article. Do you know if its possible to have a voice recognition system which identifies a certain word or words when someone speaks it or them and then converts it to text with a time when the word was spoken? Can it be incorporated into an ipad or tablet? If I have a sales person and I want them to emphasise the brand name for example, it will display as text every time they mentioned the brand name.

  6. Ram Raksha Mishra
    January 7, 2013    

    hey it is awesome but there are some problem with this code
    1.Not converting long audio file(more than 1 min) in proper format
    2.converted txt not match with audio voice
    Please give me some solutions as soon as possible
    thanks

  7. jhonas
    February 3, 2013    

    Where could I read to actually build a command line application that would take microphone input or a wav file and provide me with a cout of the text in visual studio 2010

  8. Govind Dhawale
    June 27, 2013    

    Thanks

  9. willian
    July 26, 2013    

    I would like more explanation. Would there be a way for me to get in touch with you?

  10. June 9, 2014    

    I want to get involved in tts and stt functioning appl production. Because in special environment of which deaf person recognize the message and normal person’s environment requires such app in order to communicate each other. Well, sometimes I thought only text can solve the problem. But the trend is going to where apple understand and transfer the sound or text into counter format of message ie text, sound. If normal person like to speak out then the apple should be adjusted according to its trend. Likewise if deaf person like to send text message then appl shoud do so. Well all of these things depend on the this era’s trend. Anyway I want to get some help in terms of compiler and SDK. I am in need of detailed link and kind explanation as I am novice to this STT, TTS area. God bless you all!!! What if I download from this link? visualstudio
