Dearest audiophile siblings,
It is finally time to post on what I personally find one of the biggest bugbears of computer audio: managing metadata, or 'tagging'. I have seen many posts along the way lamenting the difficulty of tagging audio files and the inordinate amount of time it takes, but very little discussion of proposed solutions or recommendations, or pointers to articles that offer any. As my library grows, it grows messier, and I am now pretty much at the point of no return. If I can't find a better way of doing it now, the backlog of badly tagged files will take more time to clean up than I will ever have, at least until somebody invents a system or tool waaaay more powerful than the ones I am currently aware of. All in all, it's enough to drive one back to analog!
So what am I talking about? Well, I see two main types of problem:
- Data structure problems, where the fields available to store metadata are inappropriate or their purpose is ambiguous, and
- Data problems, where the purpose of the field is probably clear but exactly how to fill it in is not.
Many of the structural problems I have found relate principally to classical, but affect most other genres to some degree or other. Here are some of my favourites:
- Artist. Who or what is the Artist? In the rock world the answer is pretty straightforward: it is the band or singer performing the song, e.g. 'The Beatles'. However, consider the case of a classical violin concerto. Is the Artist the soloist, the conductor, the orchestra or all three? It is even worse with opera, where you have a number of principal soloists, a choir (or two), an orchestra and a conductor. In many cases (and in my experience both Gracenote and AMG are equally guilty) I have found the name of the composer in the Artist field, e.g. Album: Fidelio, Artist: Ludwig van Beethoven!? Doesn't anybody moderate this stuff?
- Album. Easy enough in the rock world, 'Revolver', but what about classical? Here an Album is (mostly) defined, at a minimum, by the combination of Composer(s), Work(s), Performer(s) and Conductor(s). But most of these data are captured in other fields, so does one simply duplicate them in the Album field? Then consider the case of the 'Gouldberg' Variations, of which I have at least ten different recordings of about seven or eight different performances. Better add Year and possibly also Venue to the list!
- Track. Again, simple in the rock world, 'Here, There and Everywhere', but switch to classical and things get complicated. 'Aria' is fine, but on its own it doesn't tell you much. I mean, there are two Arias in the Goldberg Variations alone, not to mention all the thousands of other Arias in other works. The question here is how much of the information from Album to duplicate in each Track. The problem is that the display fields for track information on most players are quite limited in size, so the actual track-specific information quickly gets pushed out of view.
- Genre. The twin problems here are that, depending on the level of granularity one goes to, genres are not mutually exclusive, and that much music crosses several genres anyway. It is as irritating to find Norah Jones in Folk, Vocal and Blues as it is to have to decide when Coldplay made the transition from Alternative to Rock. Is the Weihnachtsoratorium Classical, Choral or Seasonal? And how on earth does one classify Myriam Alter??
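To make the structural problems above a bit more concrete (one Artist field for many roles, an Album title that duplicates other fields, Track titles that lose their context), here is a rough sketch in Python of a record with one field per role, where the album and track labels are derived rather than typed in by hand. The `Recording` class, its field names and the `track_label` helper are all made up for illustration; they do not correspond to any real tag format or tagging tool, and the 40-character display width is just an assumed number.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Recording:
    """Hypothetical record: one field per role instead of a single 'Artist'."""
    composer: str
    work: str
    soloists: List[str] = field(default_factory=list)
    ensemble: Optional[str] = None
    conductor: Optional[str] = None
    year: Optional[int] = None

    def performers(self) -> List[str]:
        """Everyone who might otherwise be squeezed into 'Artist', in a fixed order."""
        names = list(self.soloists)
        if self.ensemble:
            names.append(self.ensemble)
        if self.conductor:
            names.append(self.conductor)
        return names

    def album_label(self) -> str:
        """Build the album label from the other fields instead of retyping them."""
        label = f"{self.composer}: {self.work}"
        people = self.performers()
        if people:
            label += " (" + ", ".join(people)
            label += f", {self.year})" if self.year else ")"
        elif self.year:
            label += f" ({self.year})"
        return label


def track_label(rec: Recording, title: str, width: int = 40) -> str:
    """Put the track-specific part first, so a narrow display cuts off
    the shared context rather than the title itself."""
    full = f"{title} - {rec.work}"
    if len(full) <= width:
        return full
    return full[: width - 3] + "..."
```

With this, `Recording("J.S. Bach", "Goldberg Variations", soloists=["Glenn Gould"], year=1955).album_label()` gives `J.S. Bach: Goldberg Variations (Glenn Gould, 1955)`, and changing only the year to 1981 gives a different label, so two performances of the same work no longer collide. It does nothing for the Genre problem, which needs more than one value per field rather than more fields.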
Data problems, on the other hand, are less confined to classical works, although classical is once again hardest hit. Apart from the usual issues of typos, spelling errors (especially in names) and errors related to non-English characters, here are some of the main ones:
- Names. There are just so many ways of representing them and seemingly no standards or conventions. Mozart? W.A. Mozart or Mozart, W.A.? Wolfgang Amadeus Mozart? Wolfgang Amadeus Mozart (1756-1791)? etc. Okay, it is no big deal to have to look for Beethoven under 'L' for Ludwig (and Ludvig of course, groan) but it is surely irritating when you finally find F