Problem with my SpeechRecognition plugin

Steegy
Posts: 3452
Joined: Sat Nov 05, 2005 7:17 pm
Location: Belgium
Contact:

Post by Steegy » Sun Dec 11, 2005 10:02 am

Hello

I'm having some problems with a C#/.NET speech recognition plugin I'm trying to make for MM.

In the class, I declare:

Code: Select all

private SongsDB.ISDBApplication myMMApp = new SongsDB.SDBApplicationClass();
(SongsDB.ISDBApplication could also be SongsDB.SDBApplication, couldn't it?)

OK. When MM is already running, nothing special happens, but when it isn't running, it starts up on this line:

Code: Select all

SongsDB.SDBArtists Artists = myMMApp.Player.CurrentSongList.Artists;
That's exactly as it should be.

The full code to add artists to Speech Grammar:

Code: Select all

// Rebuild the grammar rule from the artists in the Now Playing list
ruleListItems.Clear();
SongsDB.SDBArtists Artists = myMMApp.Player.CurrentSongList.Artists;
for (int i = 0; i < Artists.Count; i++) {
    word = Artists.get_Item(i).Name;
    Console.WriteLine(">" + i + ">" + word);
    // Remember the name under the same index that is passed to the grammar
    this.Items[i] = word;
    ruleListItems.InitialState.AddWordTransition(null, word,
        " ", SpeechGrammarWordType.SGLexical, word, i, ref propValue, 1F);
}
The problem is that myMMApp.Player.CurrentSongList.Artists doesn't return all entries in the Now Playing list. One of them is skipped.
The output is:
>0>DJ Rave
>1>Dream Your Dream
>2>Fischerspooner
>3>Dune
>4>Jean Michel Jarre
>5>Daan

Where it should be
>0>DJ Rave
>1>Dream Your Dream
>2>Fischerspooner
>3>Flip Kowlier
>4>Dune
>5>Jean Michel Jarre
>6>Daan
looking at MM's Now Playing list. Probably MM doesn't like the artist Flip Kowlier and pretends that he isn't in the Artists collection. I don't like this kind of artificial intelligence. It's too subjective! :)

When it then recognises Fischerspooner, it plays that one. When it recognises Dune, it plays Flip Kowlier.

(Flip Kowlier is the only artist I'm having problems with for now)
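
A possible way to make this kind of index mismatch harmless, sketched in JavaScript rather than C# so it can be run on its own: keep the artist name with each grammar entry and resolve the recognised id through that same list, rather than assuming index i means the same thing in two different collections. This is only an illustration under assumptions: buildGrammarEntries and resolveArtist are made-up names, and plain arrays stand in for the MM collections and the SAPI rule.

```javascript
// Hypothetical sketch: a plain array stands in for MM's Artists collection.
// Build one grammar entry per distinct artist name, keeping the name with its id.
function buildGrammarEntries(artistNames) {
    var entries = [];
    var seen = {};
    for (var i = 0; i < artistNames.length; i++) {
        var name = artistNames[i];
        var key = "k:" + name.toLowerCase();   // prefix avoids clashes with built-in property names
        if (seen[key]) continue;                // skip duplicate names
        seen[key] = true;
        entries.push({ id: entries.length, phrase: name });
    }
    return entries;
}

// Resolve a recognised id back to the artist name through the same entries.
function resolveArtist(entries, recognisedId) {
    for (var i = 0; i < entries.length; i++) {
        if (entries[i].id === recognisedId) return entries[i].phrase;
    }
    return null;   // unknown id: play nothing rather than the wrong artist
}
```

With the seven artists from the Now Playing list above, id 4 resolves to Dune, because the lookup goes through the same entries that were used to build the grammar.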

Another problem is that MM sometimes doesn't stay open, or it goes to 100% CPU.

Any help would be appreciated.

Cheers
Steegy
Extensions: ExternalTools, ExtractFields, SongPreviewer, LinkedTracks, CleanImport, and some other scripts (Need Help with Addons > List of All Scripts).

Post by Steegy » Sun Dec 11, 2005 11:13 am

It works again (for now).
Possibly it was restarting MM that solved the issue with Flip Kowlier.

But the program is still not very stable.

Cheers
Steegy

trixmoto
Posts: 10024
Joined: Fri Aug 26, 2005 3:28 am
Location: Hull, UK
Contact:

Post by trixmoto » Sun Dec 11, 2005 2:13 pm

I'm very interested in this speech recognition idea. Please keep us posted! :) If you need a tester, I am willing and able.
Download my scripts at my own MediaMonkey fansite.
All the code for my website and scripts is safely backed up immediately and for free using Dropbox.

Post by Steegy » Mon Dec 12, 2005 5:28 am

Thanks for the interest.

But the application I'm making is actually (for now) just a "proof-of-concept", a "try-out" to see how easy it is to make a plugin for MM that would make it speech-aware.

The idea came from "I have the most powerful speech control interface". I saw some simple SR applications in the past and thought I should give it a try. Plus, I want it to be free (the most important component in it, Microsoft SAPI, is also free).

For now, I have an application that recognises the artist name you speak and then plays all of that artist's tracks.
I don't have a good microphone, so I use Microsoft's TTS (TextToSpeech) to simulate a voice command.
However, recognition results are quite poor, maybe because there are so many artist names, maybe because of the TTS, maybe because of my SR settings, maybe because of the SR engine. Who can say...

Cheers
Steegy

MCSmarties
Posts: 248
Joined: Tue Dec 06, 2005 8:01 pm

OK, this is probably science fiction...

Post by MCSmarties » Sat Apr 08, 2006 1:16 pm

But in theory, could such speech recognition be used to automatically transcribe lyrics (and save them to the tag) while a song is playing?

Of course, I realize that existing speech recognition software isn't powerful enough for that... yet.

But maybe a few years down the road...?
Now THAT would be cool!! :D

Post by Steegy » Sat Apr 08, 2006 4:16 pm

In theory, yes.
And maybe some powerful and very expensive speech recognition software can do that already (when you speak the text, not sing it...). But certainly not the free Microsoft Speech Recognition...

Cheers
Steegy

psyXonova
Posts: 785
Joined: Fri May 20, 2005 3:57 am
Location: Nicosia, Cyprus
Contact:

Post by psyXonova » Mon Apr 10, 2006 4:22 am

Assuming that the speech recognition engine is the same one used in Windows and MS Office, there are two very important issues you should consider, which may explain the low recognition rate (of course these should apply to all recognition engines, but I am familiar only with the MS one):
  1. Buy a decent microphone. It doesn't have to be an expensive one, but in any case you should use the microphone with headphones, since problems occur when the microphone captures the sound of your speakers.
    This is of course a major problem for all speech-aware applications, but it is certainly worse when it comes to music. Imagine trying to say "Fischerspooner" while a track is playing loudly from your speakers.
  2. Train the engine. Without proper training, results will be poor no matter what. The engine included in MS Office provides a training tool; use it to see how it works. The more you train the engine on your accent, the more reliable it will be.
Also, using the artificial voice of TextToSpeech as a voice command is not a good idea. SR engines use natural voice patterns and are not designed to accept commands from an artificial voice (machines can communicate with each other in much faster and more reliable ways :lol: :lol: :lol: ). I suggest trying your own voice.

In any case, I am also very interested in this idea... please keep us updated.

DiddeLeeDoo
Posts: 1017
Joined: Wed Mar 01, 2006 1:09 am
Location: In a jungle down under
Contact:

Post by DiddeLeeDoo » Wed Apr 19, 2006 4:40 am

Speech is interesting stuff, and I could not find much about it here in this forum.

Just for the fun of it I would love to just have MM say what's playing now.

Speak (Iter.Song & " by " & Iter.Artist )
sort of thing.
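
Putting the phrase together is the easy part. A minimal sketch, assuming the title and artist arrive as plain strings (buildAnnouncement is a made-up helper, not an MM or SAPI function):

```javascript
// Hypothetical helper: build the sentence a TTS voice would be asked to speak.
function buildAnnouncement(title, artist) {
    // Trim with a regex so the sketch also runs in older script hosts
    var t = (title || "").replace(/^\s+|\s+$/g, "");
    var a = (artist || "").replace(/^\s+|\s+$/g, "");
    if (!t) return "";        // no title: nothing sensible to say
    if (!a) return t;         // unknown artist: speak the title only
    return t + " by " + a;
}
```

Returning an empty string for a missing title lets the caller skip speaking altogether instead of announcing just " by ".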

I found a snippet that is supposed to act as a function.

Code: Select all

// Speak the given phrase, returns when speaking finishes
function wshSpeak(phrase) {
   var vt = WScript.CreateObject("Speech.VoiceText");
   vt.Register("", WScript.ScriptName);
   vt.Speak(phrase, 1);
   while ( vt.IsSpeaking ) WScript.Sleep(100);
}
and, combined with JavaScript (JScript), it is supposed to work.

Save the above as wshTTS.js and use it like

Code: Select all

<job>
  <script language="JScript" src="wshTTS.js" />

  <script language="VBScript">
    wshSpeak "This is Sapi talking from a WSF file"
  </script>
</job>
Ref: http://www.generation5.org/content/2001/tts_wsh.asp

I've tried various things, but so far I've been unsuccessful. Still reading...

Sorry if this is totally off topic in this thread. Since I'm in study mode, I didn't feel like starting a new topic just for this little thing.

Post by Steegy » Wed Apr 19, 2006 6:32 am

Hello

What you are suggesting can easily be done, but you need to have a TextToSpeech engine installed. SAPI5 is installed by default on Windows XP systems, so let's use that one (it's the best free one).

Please bear in mind that a computer voice like SAPI's can be very irritating, especially when it is used alongside nice music.

Usage:
ttsSpeak YourTextToSpeak, VoiceNumber, Asynchronous
- VoiceNumber is the number of the installed voice on your computer (most have only one). Use 0 if you want to use the default voice.
- Asynchronous should be set to False, unless you are calling the method from another method that takes a long time to complete.

Code: Select all

ttsSpeak "Hello, this is just a little test.", 0, False

Sub ttsSpeak(Text, VoiceNumber, Asynchronous)
    Dim tts, speechFlag

    ' Errors (e.g. no SAPI5 installed) are checked manually below
    On Error Resume Next

    Set tts = Nothing
    Set tts = CreateObject("Sapi.SpVoice")

    If Not tts Is Nothing Then
        ' VoiceNumber is 1-based; 0 keeps the default voice
        If (VoiceNumber - 1) < tts.GetVoices.Count Then
            Err.Clear
            If VoiceNumber <> 0 Then Set tts.Voice = tts.GetVoices.Item(VoiceNumber - 1)
            If Err.Number = 0 Then
                ' Speak flags: 1 = asynchronous, 0 = default (synchronous)
                If Asynchronous Then
                    speechFlag = 1
                Else
                    speechFlag = 0
                End If
                tts.Speak Text, speechFlag
            End If
        End If
    End If

End Sub
BTW: Because TextToSpeech and speech recognition are about the same subject (SAPI), this certainly isn't off topic.

Similar functions for SAPI4 also exist, but that is an old thing. Consider using SAPI5 instead (for Windows XP users this is very easy, as it is already installed by default).

BTW2: This sample of code is very similar to what you would write in the major programming languages (C++, Java, C#, VB, ...).
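
To illustrate, here is the VoiceNumber rule from ttsSpeak restated as a small stand-alone JavaScript function. It is only a sketch: resolveVoiceIndex is a made-up name, no SAPI object is involved, and the explicit check for negative numbers is an addition (the Sub above relies on the error check for that case).

```javascript
// Hypothetical restatement of the VoiceNumber rule as a pure function.
// Returns -1 for "keep the default voice", a zero-based index for a valid
// voice number, or null when the number matches no installed voice.
function resolveVoiceIndex(voiceNumber, voiceCount) {
    if (voiceNumber === 0) return -1;                          // 0 means default voice
    if (voiceNumber < 0 || voiceNumber > voiceCount) return null;
    return voiceNumber - 1;                                    // 1-based to 0-based
}
```

A caller would then only set the voice when the result is 0 or higher, and skip speaking when it is null.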

Cheers
Steegy

Post by DiddeLeeDoo » Wed Apr 19, 2006 7:18 am

This is so much fun!! :D Here I have been sitting for many many hours, reading, searching, trying, then, what if....and so on.. :o ..

I think I'm hooked to scripting. What a fine hobby..

You're right though, that Microsoft Sam does not really go that well with music. Maybe with another voice... :wink:

Post by DiddeLeeDoo » Wed Apr 19, 2006 12:28 pm

Since it is SAPI related, I just wanted to share how your code magic has been implemented as a non-default option in my first 'baby', AutoRateSongs.

I went ahead to get a better voice for SAPI5 and it's not that bad really. I enjoy this use of SAPI5 in MediaMonkey.

http://www.mediamonkey.com/forum/viewto ... &start=113

I just wish I could do something useful for you Steegy.

Morten
Posts: 1092
Joined: Thu Aug 11, 2005 11:31 am
Location: Norway

Post by Morten » Fri Apr 13, 2007 10:39 am

Ok, so Windows Vista has support for Speech Recognition. I haven't tried it with MediaMonkey yet, but does anyone know if MediaMonkey supports this technology?
Best regards,
Morten

Post by Steegy » Fri Apr 13, 2007 10:51 am

Well, Windows XP and below also support speech recognition, but you have to download some extra files (SAPI) from the internet first. I guess Windows Vista already has these files in the standard installation. Maybe Vista introduces a newer version of SAPI, because the last version I tried (version 5) was good in general, but not good enough to use with things like MediaMonkey, where a lot of artist names and such resemble each other too much (and they're sometimes in a non-English language).
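
A rough way to see how many names in a library resemble each other is to compare their spellings. The sketch below uses the classic Levenshtein edit distance as a stand-in; this is an assumption for illustration only, since spelling distance is just a crude proxy for how alike two names sound to a recogniser, and editDistance and confusablePairs are made-up names.

```javascript
// Hypothetical sketch: Levenshtein edit distance between two strings.
function editDistance(a, b) {
    var prev = [], curr, i, j;
    for (j = 0; j <= b.length; j++) prev[j] = j;
    for (i = 1; i <= a.length; i++) {
        curr = [i];
        for (j = 1; j <= b.length; j++) {
            var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
            curr[j] = Math.min(prev[j] + 1,          // deletion
                               curr[j - 1] + 1,      // insertion
                               prev[j - 1] + cost);  // substitution
        }
        prev = curr;
    }
    return prev[b.length];
}

// List pairs of names whose lower-cased spellings differ by at most maxDistance edits.
function confusablePairs(names, maxDistance) {
    var pairs = [];
    for (var i = 0; i < names.length; i++) {
        for (var j = i + 1; j < names.length; j++) {
            var d = editDistance(names[i].toLowerCase(), names[j].toLowerCase());
            if (d <= maxDistance) pairs.push([names[i], names[j]]);
        }
    }
    return pairs;
}
```

Names flagged this way could be given longer or more distinctive spoken phrases in the grammar, although whether that actually helps the recogniser would have to be tested.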

Addition: from what I found on the internet, Speech Recognition in Vista (unfortunately) doesn't seem better than SAPI5.

Post by Morten » Fri Apr 13, 2007 11:06 am

I guess you've seen the wrong videos then;

http://youtube.com/watch?v=zgJyqvcAXe0
Best regards,
Morten

Post by DiddeLeeDoo » Fri Apr 13, 2007 11:39 am

This is more like the way I have experienced voice recognition.
http://www.youtube.com/watch?v=2Y_Jp6PxsSQ

:)