MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13)


Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.53 - Find & Fix Similar Spelling (7/4/13

Post by Scottes »

dtsig wrote:Would it be possible to get more information. For example, when looking at matches for artists there are many many artist with very similar names ... possibly just gender changes. If you included maybe the album name with the artist name it might go a way to be sure we are changing the right things.
Right-click on a name and choose More Info to get a list of all the songs associated with that name, with numerous fields shown for each song. Right-click on any name and choose More Info (All) - or just hit F11 - to get the info for all the names listed. Warning - these choices will take a few seconds, or many seconds if there are a lot of songs to show. The resulting list contains *every* song associated with that name, so choosing More Info on a popular artist name can take a few moments.

Also, if you're unsure of the correct spelling for a name, right-click and choose Google Search which will... yeah, do a Google search on that name.


Check those out and let me know if they will do what you need. I've thought about showing more info every time, but it would make each and every search take longer rather than just the times you need it.

One question I have in this area... When choosing More Info, is showing *every* song good, or should I limit it to 10 songs to speed things up? This would speed things up if the list were large. But what if those 10 songs don't show enough info?

If you have any desires or suggestions, I'm all ears.
zuilserip
Posts: 34
Joined: Wed Feb 22, 2012 8:00 pm

Re: MonkeyMatch 0.5.53 - Find & Fix Similar Spelling (7/4/13

Post by zuilserip »

Scottes wrote:Beta 0.5.53 - July 4, 2013
Download Here
Thanks Scottes - I spent some time yesterday using this and it is a lot more stable. No crashes or hangs.

I've been able to fix a bunch of inconsistencies on my DB and will continue using it this coming weekend to finish the job. Will let you know if I run into any issues, but it looks great so far!
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.53 - Find & Fix Similar Spelling (7/4/13

Post by Scottes »

Beta 0.5.56 - July 7, 2013
Download Here

HUGE performance improvement. If you use Blacklisting, the improvement is REALLY, REALLY HUGE.


I spent a good deal of time over the holiday weekend doing some benchmarking. I pulled apart the function that gets the match pairs, and extracted the 10 or 11 little things it does, then ran each one of them 320,000,000 times and measured how long each one took. I then explored ways to make each step a little faster, if at all possible.

One little bit took over 17 seconds to run 320 million times. I got it down to 1.2 seconds.
Another little bit took 3 seconds, but I got it down to 2.4 seconds.
Another little bit took 1.6 seconds, but I got it down to 1.2 seconds. Hey, every little bit counts.
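
For anyone who wants to try the same kind of per-step timing, here is a minimal sketch in Python. It is not MonkeyMatch's code: the two normalization functions and the `time_steps` helper are made-up stand-ins for "one little bit" of a matching function, and the repeat count is kept far below 320 million so it finishes quickly.

```python
import timeit


def normalize_lower(name: str) -> str:
    # Hypothetical step: lower-case a name before comparing it.
    return name.lower()


def normalize_casefold(name: str) -> str:
    # Hypothetical alternative implementation of the same step.
    return name.casefold()


def time_steps(steps, arg, repeats=100_000):
    """Return {function name: seconds taken for `repeats` calls}."""
    results = {}
    for step in steps:
        results[step.__name__] = timeit.timeit(lambda: step(arg), number=repeats)
    return results


if __name__ == "__main__":
    timings = time_steps([normalize_lower, normalize_casefold], "Various Artists")
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.3f} s")
```

Timing each candidate in isolation, with the same input and the same repeat count, is what makes the before/after numbers comparable.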

But I found that my method of checking the Blacklist took a huge chunk of time - 281 seconds. As I explored it, I realized a few ways to get that time down to far less than a second. Basically, I made an assumption that was WAY OFF. The testing showed me how far off I was, and some more time showed me the best way to deal with it.
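
The post does not say what the wrong assumption was or what the fix looked like. One common way a pair lookup goes from minutes to well under a second is replacing a linear scan with a hash-based set, so here is a sketch of that idea only, in Python, with hypothetical names (`pair_key`, `build_blacklist`, `is_blacklisted`).

```python
def pair_key(a: str, b: str) -> frozenset:
    # Order-independent key, so (a, b) and (b, a) count as the same pair.
    return frozenset((a, b))


def build_blacklist(pairs):
    # Build the lookup structure once, e.g. when the blacklist file is loaded.
    return {pair_key(a, b) for a, b in pairs}


def is_blacklisted_slow(pairs, a, b):
    # Linear scan: every check walks the whole list of pairs.
    return any((a == x and b == y) or (a == y and b == x) for x, y in pairs)


def is_blacklisted(blacklist, a, b):
    # Hash lookup: cost does not grow with the number of blacklisted pairs.
    return pair_key(a, b) in blacklist


if __name__ == "__main__":
    pairs = [("The Beatles", "The Beetles"), ("ELO", "ELP")]
    blacklist = build_blacklist(pairs)
    print(is_blacklisted(blacklist, "ELP", "ELO"))
    print(is_blacklisted(blacklist, "ELO", "Beatles"))
```

With thousands of blacklisted pairs and millions of candidate comparisons, the scan repeats the same work on every check, while the set does it once up front.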


Once I made the changes, I ran some tests against my database using the old version and the new version. I ran my 18,000 unique songs with my Blacklist which contains 3,200 pairs of names.
Old Version: 7 minutes, 51 seconds
New Version: 14 seconds

I was a bit flabbergasted, so I ran the tests again.
Old Version: 8 minutes, 3 seconds
New Version: 13 seconds

As another test, I ran both programs after deleting my Blacklist file.
Old Version: 21 seconds
New Version: 8 seconds


So even without a Blacklist, the new version is more than 2.5 times faster (21 seconds down to 8). With a large database and a large Blacklist, you could see a simply tremendous improvement in performance. It all depends on the size of your database and the number of Blacklist entries, but everyone should see a significant improvement at the very least.

Download 0.5.56 Here
zuilserip
Posts: 34
Joined: Wed Feb 22, 2012 8:00 pm

Re: MonkeyMatch 0.5.53 - Find & Fix Similar Spelling (7/4/13

Post by zuilserip »

I've spent more time using the new version and it seems very fast for me, certainly fast enough. Thanks!

A quick question - the program identified a capitalization inconsistency between 'Various Artists' and 'various artists' - so this is great, and I had it fix it.

Now, there were probably a couple of songs tagged as 'various artists' and some 5-10K 'Various Artists'. But instead of just changing the couple of inconsistent songs, the program took a very long time and seemed to go through each of the many thousand songs that were fine to begin with, and did not need to be changed. Avoiding this unnecessary step might be a relatively straight-forward optimization.
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.53 - Find & Fix Similar Spelling (7/4/13

Post by Scottes »

zuilserip wrote:...the program took a very long time and seemed to go through each of the many thousand songs that were fine to begin with, and did not need to be changed.
Ooooh, yeah. The problem is that MonkeyMatch is case-sensitive, but the SQL calls to the MediaMonkey database are not, so asking MediaMonkey for "various artists" will return any such names, as well as "VArioUS ArTistS" and anything in between. Hmmm. I don't think this can be changed, AFAIK. MonkeyMatch should not be changing them, but it will iterate over them.
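
To illustrate the mismatch, here is a small sketch using Python's built-in sqlite3 module. The `songs` table is a hypothetical stand-in, not the real MediaMonkey schema, and `COLLATE NOCASE` is only used to imitate a case-insensitive lookup. The query returns every case variant, and a case-sensitive caller then has to filter the rows itself.

```python
import sqlite3


def make_demo_db():
    # Hypothetical one-table schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (id INTEGER PRIMARY KEY, artist TEXT)")
    conn.executemany(
        "INSERT INTO songs (artist) VALUES (?)",
        [("Various Artists",), ("various artists",),
         ("VArioUS ArTistS",), ("Someone Else",)],
    )
    return conn


def songs_by_artist(conn, name, case_sensitive=True):
    # The query itself ignores case, like the lookups described above.
    rows = conn.execute(
        "SELECT id, artist FROM songs WHERE artist = ? COLLATE NOCASE ORDER BY id",
        (name,),
    ).fetchall()
    if case_sensitive:
        # Keep only exact-case matches, so correct rows are skipped early.
        rows = [row for row in rows if row[1] == name]
    return rows


if __name__ == "__main__":
    conn = make_demo_db()
    print(songs_by_artist(conn, "various artists", case_sensitive=False))
    print(songs_by_artist(conn, "various artists"))
```

Filtering right after the query means only the rows that actually carry the wrong spelling go on to the slower update step.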

I'll do some investigation. That is a huge time-waster.


But, while this is pertinent, should MonkeyMatch be case-sensitive? Most everything else - if not *everything* else - around MediaMonkey seems to be case-insensitive. One of the main reasons I wrote this was to ensure that I could match duplicate songs (Artist + Song Title), but the excellent Duplicate Find And Fix quickly matched "elp - a time and a place" to "ELP - A Time And A Place".

And there are better tools for many scenarios, like the Title Case script, or the RegExp script, or just doing it by hand sometimes.

Should I rip out the case sensitivity, which would make it a lot faster?
Or make it an option? (Which will make it a lot faster sometimes, or a tiny bit slower the other times.)
zuilserip
Posts: 34
Joined: Wed Feb 22, 2012 8:00 pm

Re: MonkeyMatch 0.5.53 - Find & Fix Similar Spelling (7/4/13

Post by zuilserip »

Scottes wrote: But, while this is pertinent, should MonkeyMatch be case-sensitive? Most everything else - if not *everything* else - around MediaMonkey seems to be case-insensitive. One of the main reasons I wrote this was to ensure that I could match duplicate songs (Artist + Song Title), but the excellent Duplicate Find And Fix quickly matched "elp - a time and a place" to "ELP - A Time And A Place".

And there are better tools for many scenarios, like the Title Case script, or the RegExp script, or just doing it by hand sometimes.

Should I rip out the case sensitivity, which would make it a lot faster?
Or make it an option? (Which will make it a lot faster sometimes, or a tiny bit slower the other times.)
My personal vote would be not to rip out case sensitivity. Or at least make it an option. To be honest, the time is not such a big deal for me. It is a lot faster than many other scripts that need to query some web service for each song.

For me, I love to be able to run MonkeyMatch every time I add a new batch of songs onto my library to ensure I haven't introduced any goofy new spelling, or spelling variants to my existing databases.

In fact, in my ideal world, I'd even like to be able to create a file with pre-defined matches to 'standardize' names from the huge variation of odd spellings and alternate spellings we see in the wild onto something consistent. So matching pairs (or tuples, really) like: (Kid Jonny Lang, Johnny Lang -> Jonny Lang), (Dexy's Midnight Runners, Dexy Midnight Runner, Dexys -> Dexys Midnight Runners), (Beatles, Beetles, The Beetles -> The Beatles), etc.
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by Scottes »

Yeah, an option for case sensitivity would be the right thing.


And yes, that's exactly the way that I use it - now that I have everything in my database cleaned up. I am re-ripping my CD collection to FLAC, and MonkeyMatch lets me make sure that the new rips exactly match my 10-year-old set of MP3s with years of data and info. I rip a CD to FLAC, run MonkeyMatch, then Duplicate Find and Fix to find and copy all the info from MP3 to FLAC. Eventually I'll have all brand-new FLACs with all the years of edits and info from my MP3s.


As for that list, if you could somehow produce a list - just a small one to get me started - then I'll look into it. I have to let my brain spin on it for a while - how to work it into the program, how to work it into the process, how to update it, maybe how to get a community list going so nobody has to type in a thousand names, how to aggregate/collate/update that list... This feature could be a separate program, but MonkeyMatch already has a great engine for such things.
zuilserip
Posts: 34
Joined: Wed Feb 22, 2012 8:00 pm

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by zuilserip »

Scottes wrote: As for that list, if you could somehow produce a list - just a small one to get me started - then I'll look into it. I have to let my brain spin on it for a while - how to work it into the program, how to work it into the process, how to update it, maybe how to get a community list going so nobody has to type in a thousand names, how to aggregate/collate/update that list... This feature could be a separate program, but MonkeyMatch already has a great engine for such things.
Hi Scott - sure, see below for two possible ways to build this list

As far as building the list, we could use something like the 'alias' names from MusicBrainz (e.g., http://musicbrainz.org/artist/6a60adeb- ... a5/aliases ), or one of the other lists out there that collect such aliases/alternate spellings (e.g., http://www.metal-archives.com/todo/alt-spelling )

* Idea 1 - More straightforward, narrower scope (could work for artist, album artist and composer fields):

// This is a comment
// Spacing, tabs, new lines are completely irrelevant
>"Wolfgang Amadeus Mozart"		= "Mozart", "W.A. Mozart", "Mozart, Wolfgang",  "Wolfgang A. Mozart"
>"The Rolling Stones"				= "Rolling Stones", "Rolling Stones, The", "The Stones", "Stones"
>"Emerson, Lake & Palmer"		= "ELP", "Emerson Lake and Palmer", "Emerson Lake & Palmer",
							   "Emerson, Lake and Palmer", "Emerson, Lake, and Palmer", "Emerson, Lake, & Palmer"
>"¡Cubanismo!"					= "Cubanismo", "Cubanismo!", "!Cubanismo!", "Jesus Alemany",
							   "Jesus Alemany's Cubanisimo"
>"Dexys Midnight Runners"			= "Dexy Midnight Runner", "Dexys", "Dexy's Midnight Runner", 
							   "Dexie's Midnight Runners", "Dexy Midnight Runners", "Dexy's Midnight Runners",
							   "Dexy's Mindnight Runners", "Dexys Midnight Runners & Kevin Rowland",
							   "Kevin Rowland & Dexys Midnight Runners"
>"The Beatles"					= "Beatles", "Beatles, The", "The Beetles", "Beetles", "Betles", "Fab Four"
>"Electric Light Orchestra"			= "The Electric Light Orchestra", "ELO", "E.L.O", "E.L.O."
>"Jonny Lang"					= "Kid Jonny Lang", "Johnny Lang", "Lang, Jonny", "Johny Lang"
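
A file in the Idea 1 layout above could be read with something like the following Python sketch. It assumes only what the sample shows: `//` comment lines, entries starting with `>`, the first quoted name being the preferred spelling, and continuation lines belonging to the previous entry. The function name `parse_alias_list` is made up for illustration.

```python
import re

# Names are always double-quoted in the sample, so commas inside a name
# ("Rolling Stones, The") are handled by matching whole quoted strings.
QUOTED = re.compile(r'"([^"]*)"')


def parse_alias_list(text):
    """Return {alias: preferred name} for text in the Idea 1 layout."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # blank line or comment
        if line.startswith(">"):
            current = []  # a new entry begins
            entries.append(current)
        if current is not None:
            current.extend(QUOTED.findall(line))

    mapping = {}
    for names in entries:
        if not names:
            continue
        preferred, aliases = names[0], names[1:]
        for alias in aliases:
            mapping[alias] = preferred
    return mapping


if __name__ == "__main__":
    sample = '''
    // This is a comment
    >"The Beatles" = "Beatles", "Beatles, The",
                     "Fab Four"
    >"Jonny Lang"  = "Kid Jonny Lang", "Johnny Lang"
    '''
    for alias, preferred in parse_alias_list(sample).items():
        print(f"{alias} -> {preferred}")
```

Storing the result as alias -> preferred name makes the lookup during a scan a single dictionary access per name.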

* Idea 2 - More complex, but much more flexible and powerful. Extends concept of mapping/standardizing beyond artist fields.

#IF <artist, album artist, composer> <casesensitive=OFF>
>"Wolfgang Amadeus Mozart"		= "Mozart", "W.A. Mozart", "Mozart, Wolfgang",  "Wolfgang A. Mozart"
>"The Rolling Stones"				= "Rolling Stones", "Rolling Stones, The", "The Stones", "Stones"
>"Emerson, Lake & Palmer"		= "ELP", "Emerson Lake and Palmer", "Emerson Lake & Palmer",
							   "Emerson, Lake and Palmer", "Emerson, Lake, and Palmer", "Emerson, Lake, & Palmer"
#ENDIF

#IF <genre> <casesensitive=OFF>
>"Rock & Roll"					= "Rock", "Rock n' Roll", "Rock n Roll", "Rock and Roll", "General Rock"
#ENDIF

#IF <mood> <casesensitive=OFF>
>"happy"						= "joyful", "cheerful", "fun", "giddy"
>"sad"						= "morose", "brooding", "wistful", "bittersweet"
#ENDIF

#IF <quality> <casesensitive=OFF>
>"very good"					= "perfect", "excellent", "great"
>"poor"						= "garbage", "terrible", "awful"
#ENDIF

#IF <track#> <casesensitive=OFF>
 >"1"	= "01"
 >"2"	= "02"
 >"3"	= "03"
 >"4"	= "04"
...
Last edited by zuilserip on Sat Jul 13, 2013 11:28 am, edited 1 time in total.
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by Scottes »

Sweet lists! That will help a lot.
prinzgilden
Posts: 1
Joined: Mon Jan 04, 2010 2:27 pm

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by prinzgilden »

When I start the program, I get an error:
"System.ArgumentOutOfRangeException: The specified row index is outside the defined range.
Parameter name: rowIndex"

After that I can still run the program, and if I click Find Matches it starts working. But then I get the next error: "System.NullReferenceException: Object reference not set to an instance of an object."

Windows 7 Enterprise 32bit Service Pack1
MediaMonkey 4.1.0.1646

My database only contains 278 files.

Can someone help?

Thanks
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by Scottes »

Are you running the very latest version, posted in the first post?

I've fixed 2 or 3 of those "row index" errors, but every one has been due to some entry in the database that I just did not expect. Can you upload your database somewhere, and PM me a link so that I can download it? Then I will run it here and I can probably fix the bug pretty quickly.
trixmoto
Posts: 10024
Joined: Fri Aug 26, 2005 3:28 am
Location: Hull, UK

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by trixmoto »

I love this, excellent little application, does a great job!
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by Scottes »

trixmoto wrote:I love this, excellent little application, does a great job!
Aw, shucks. <blush> Thanks!


Now if I could only figure out the bug that is plaguing prinzgilden... Argh!


Well, it took a bit, but I finally did fix that error, thanks to plenty of help from prinzgilden. What a nasty little bug that was.
Last edited by Scottes on Sat Aug 03, 2013 3:46 pm, edited 1 time in total.
hintergrundrauschen
Posts: 211
Joined: Sat Mar 29, 2008 6:20 pm

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by hintergrundrauschen »

First, thanks for continuously working on MonkeyMatch.

I still get the "System.ArgumentOutOfRangeException" when starting up the application.

Running Artists/Find Matches, at SubSet Names counting 13401, I will get a "System.NullReferenceException":

System.NullReferenceException: Object reference not set to an instance of an object.
   at MonkeyMatch.Form1.GetMatches(List`1 SubSet, List`1 SuperSet)
   at MonkeyMatch.Form1.btnFindMatches_Click(Object sender, EventArgs e)
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
Running for Albums, I will get the error at 3761.

I cannot exit the application afterwards:

System.NullReferenceException: Object reference not set to an instance of an object.
   at MonkeyMatch.Form1.SaveBlackListToFile()
   at MonkeyMatch.Form1.ApplicationExit()
   at MonkeyMatch.Form1.Form1_FormClosing(Object sender, FormClosingEventArgs e)
   at System.Windows.Forms.Form.OnFormClosing(FormClosingEventArgs e)
   at System.Windows.Forms.Form.WmClose(Message& m)
   at System.Windows.Forms.Form.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
Claude
Scottes
Posts: 150
Joined: Sat Mar 21, 2009 6:51 am

Re: MonkeyMatch 0.5.56 - Find & Fix Similar Spelling (7/7/13

Post by Scottes »

hintergrundrauschen wrote: I still get the "System.ArgumentOutOfRangeException" when starting up the application.
Please give this version a try. It's a debug version that fixed prinzgilden's Out Of Range error. I just have not had time to remove the debug code and release it.
http://www.itsanadventure.com/MonkeyMat ... _0576b.zip
Please let me know if it works or not.

If you still get an error, stop as soon as you get the error and do not click anything. Send me the last 20 lines of the Actions.log file at that point, since it will pinpoint the error location for me.