Yeah, 2.5.5 was finalized and all the work is being done on 3.0 now, but I ran across this, and since 3.0 is still ALPHA, I haven't tested it there. So maybe someone with 3.0 could check whether it still behaves the same as 2.5.5.
Anyway, even with the option to write in UTF-16 selected, at least for me, it's still writing in ANSI. There's not much more to say; if you need more detail, let me know what you need to know and I'll get back to you.
[2.5.5.996] ID3v2.3 Unicode (UTF-16) Problem
- Posts: 2283
- Joined: Tue Aug 29, 2006 1:09 pm
- Location: Kansas City, Missouri, United States
Edit: same here, I get the same thing (ANSI tags when UTF-16 is set in the options).
I have not checked MM 3.0 to see if it behaves the same way.
New script:
Last.FM Node Now with DJ Mode!
Last.fm + MediaMonkey = Scrobbler DJ!
Tag with MusicBrainz ~ Get Album Art!
Tweak the Monkey! ~ My Scripts Page


It probably should have been documented somewhere (I'm not sure if it is), but it isn't a bug - the idea is that if you store a string that consists only of standard ASCII characters, UTF-16 isn't necessary and so ANSI is used. Maybe a special option, some 'Mixed' mode could be introduced to make it clearer.
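The rule described above can be sketched in a few lines of Python. This is only an illustration, not MediaMonkey's actual code: the function name `choose_text_encoding` is hypothetical, and the sketch assumes the ID3v2.3 convention where a text frame body starts with an encoding byte (0x00 for ISO-8859-1, 0x01 for UTF-16 with a BOM).

```python
import codecs
from typing import Tuple


def choose_text_encoding(text: str) -> Tuple[int, bytes]:
    """Return (encoding_byte, payload) for an ID3v2.3-style text frame.

    Hypothetical helper: mirrors the "UTF-16 only when needed" rule.
    """
    if text.isascii():
        # Pure ASCII fits in the single-byte encoding, so no UTF-16 is used.
        return 0x00, text.encode("latin-1")
    # Anything else is written as UTF-16 with an explicit little-endian BOM.
    return 0x01, codecs.BOM_UTF16_LE + text.encode("utf-16-le")


print(choose_text_encoding("Abbey Road"))  # single-byte payload, encoding byte 0
print(choose_text_encoding("\u2122"))      # UTF-16 payload, encoding byte 1
```

So with the UTF-16 option selected, a tag containing only ASCII text would still come out as single-byte text, which matches what was reported at the top of this thread.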
Jiri
jiri wrote: It probably should have been documented somewhere (I'm not sure if it is), but it isn't a bug - the idea is that if you store a string that consists only of standard ASCII characters, UTF-16 isn't necessary and so ANSI is used. Maybe a special option, some 'Mixed' mode could be introduced to make it clearer.
Jiri
This makes sense and is probably a good idea for extra compatibility.
A useful place for this tidbit of info would be the mouse-over help, indicating that UTF-16 will only be used when necessary.
Also, the option text could be changed to indicate that UTF-16 is *ALLOWED* to be used, rather than that it *WILL* be used.
Proposed:
ID3v2 text encoding: Ansi + Unicode UTF-16 (only when needed)
Re:
Teknojnky wrote: Proposed:
ID3v2 text encoding: Ansi + Unicode UTF-16 (only when needed)
This is a very reasonable clarification which, 8 years later, has still not been implemented in MM v. 4.1.7.1741.
I am dealing with this problem while trying to support import of iTunes playlists in the iPlaylist Importer plugin. Filenames from iTunes, encoded in UTF-8, are imported into MM and do not play, because these UTF-8 characters are always interpreted as Ansi and I cannot force them to be interpreted the way iTunes interprets them.
The unpredictability of MM is really a problem here, especially when some glyphs exist in both the Ansi and Unicode encodings.
One case in point (among many others), just as an example:
I name an mp3 file, in Windows 7:
™.mp3
I import it into iTunes and export the XML playlist including this file.
iTunes <Location> Path (in UTF-8) to this file is:
file://localhost/E:/Docs/My Projects/Music/_Software/Music - Library Mgmt/MediaMonkey/iPlaylist Importer/Testing/â„¢.mp3
In a hex editor, â„¢ is the byte sequence E2 84 A2, which is the UTF-8 encoding of the trademark symbol (U+2122).
The problem is that, in Windows 7, ™.mp3 is not Unicode: it uses the ANSI (Windows-1252) code page, in which the trademark character is represented by 0x99.
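The byte values above are easy to check. Here is a small Python sketch (the variable names are just for illustration) that encodes the trademark sign both ways and shows how the three UTF-8 bytes turn into three separate characters when they are read as Windows-1252:

```python
name = "\u2122"  # TRADE MARK SIGN

utf8_bytes = name.encode("utf-8")      # what the iTunes XML contains
cp1252_bytes = name.encode("cp1252")   # the single-byte Windows-1252 form

# Reading the UTF-8 bytes with the wrong code page yields three characters:
# U+00E2 (a with circumflex), U+201E (low double quote), U+00A2 (cent sign).
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex())                   # e284a2
print(cp1252_bytes.hex())                 # 99
print(misread == "\u00e2\u201e\u00a2")    # True
```

The same glyph therefore has two different byte representations, and a comparison that treats both as the same code page will not match them.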
MM always interprets these cases as Ansi, so I cannot import an iTunes playlist that includes a UTF-8 track: searching the MM library will never find a UTF-8 encoding if the same glyph also occurs in Ansi (or perhaps in Windows-1252, which, by the way, is not the same as Ansi).
Can anyone clarify?
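For what it's worth, when a string has already been misread this way, the usual workaround is to reverse the mistake: re-encode the text as Windows-1252 to get the original bytes back, then decode those bytes as UTF-8. Below is a minimal Python sketch of that idea. The function name `repair_mojibake` is hypothetical and nothing here is a MediaMonkey or iTunes API; it also assumes the wrong code page really was Windows-1252.

```python
def repair_mojibake(text: str) -> str:
    """Undo a 'UTF-8 bytes read as Windows-1252' mistake when possible.

    Hypothetical helper: if the round trip fails, the text was probably
    not mangled in this particular way, so it is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


mangled = "\u00e2\u201e\u00a2.mp3"   # the three-character form of the name
print(repair_mojibake(mangled) == "\u2122.mp3")       # True
print(repair_mojibake("\u2122.mp3") == "\u2122.mp3")  # True (left unchanged)
```

A plugin could try the repaired name as a second candidate when the first library lookup finds nothing. Note that Python's `cp1252` codec leaves a few byte values (such as 0x81 and 0x9D) undefined, so not every mangled string can be recovered this way.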
Here are a couple of good technical references about this problem:
http://www.joelonsoftware.com/articles/Unicode.html
http://www.i18nqa.com/debug/utf8-debug.html
Re: [2.5.5.996] ID3v2.3 Unicode (UTF-16) Problem
From what I see, you have a case of a BOM character being imported at the beginning of the XML.
Have you tried removing the BOM from the beginning and then importing into MMW?
XML natively supports UTF-8, so a BOM character is not needed, unlike in the case of M3U/M3U8.
I have Cyrillic filenames and MMW imports them without problems from the iTunes Library XML.
Best regards,
Peke
MediaMonkey Team lead QA/Tech Support guru
Admin of Free MediaMonkey addon Site HappyMonkeying



How to attach PICTURE/SCREENSHOTS to forum posts
Re: [2.5.5.996] ID3v2.3 Unicode (UTF-16) Problem
Peke wrote: From what I see you have a case of BOM character import at the beginning of XML.
Have you tried to remove BOM from the beginning and then import in MMW?
I apologize. My post here was off-topic, so I think you have misunderstood my problem: I don't have any problem with MMW, only with the iPlaylist Importer plugin that I'm working on.
I am not aware of any BOM in my input.
The XML I'm importing is just the standard iTunes XML, that begins with "<?xml", and doesn't appear to have a BOM character, which, as I understand it, would be 0xEF 0xBB 0xBF in UTF-8.
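Checking for a BOM can be done on the raw bytes rather than by eye. A short Python sketch, assuming the XML has been read in binary mode (the helper names are hypothetical):

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the raw bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


sample = b"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
print(has_utf8_bom(sample))                       # False
print(has_utf8_bom(codecs.BOM_UTF8 + sample))     # True
```

To test a real export, the first bytes could be read with something like `open(path, "rb").read(3)`, where `path` is a placeholder for the location of the XML file.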
[The reason I have posted here, in addition to discussing with trixmoto, is that trixmoto doesn't know how to fix the problem and I haven't found a better place to ask the question yet. Perhaps I should have started a new thread, but in any case, here we are. Thanks for listening.]