What We Did Learn (Sort of) from the MP3Tunes Decision

In our last post, we looked at what the MP3Tunes decision didn’t tell us–that it didn’t put the music industry’s best argument to the test.  We looked at the contours of this “nuclear option,” including the elephant-in-the-room possibility that the music industry could once again go after individual consumers.  In this post, we’ll look at what the MP3Tunes decision did tell us–sort of–about the legality of music-locker services.

DMCA Safe-Harbor Protection Redux

As with Grooveshark, music-locker services have two ends.  In one end, the user uploads song files.*  Out the other end, the service “streams”** the song files to the user’s various devices.  Unlike Grooveshark, however, the subscribers can enjoy only their own music (whereas Grooveshark users could enjoy everyone else’s music, too).  This might (or might not) make a big legal difference.

*  Apple’s service will have a major difference.  With Apple’s “Match” service, if you purchased your music through Apple’s iTunes store, you won’t need to upload the song file at all–you can “stream” Apple’s own copy of the song file.  Apple has obtained licenses from the rights holders to do this.

**  I put stream in quotation marks because (as we discussed with Grooveshark), there are two types of streaming: progressive downloading and true streaming.  We aren’t entirely clear about what sort of streaming the music lockers will use.  From the point of view of the user’s experience, there isn’t much difference, but the difference might have legal consequences.  See the relevant Grooveshark post for more.

The first (uploading) end implicates the reproduction right (again, just as with Grooveshark).  The subscribers make copies of their songs and place those copies in the remote “music locker” (i.e., somewhere in the “cloud”).  It is here that the music companies might unleash the “nuclear option” that we discussed last time.  If these copies aren’t a fair use (under the “space shifting” rubric), then the subscribers are direct infringers.  The music-locker providers might have a DMCA defense against a claim for secondary liability, but the subscribers don’t.

The second (streaming) end implicates the public performance right (again, just as with Grooveshark).  Recall that streaming is generally regarded as a public performance (but more on this in a minute).  In this case, the subscribers are off the hook because they’re not causing a public performance to occur–the music-locker providers are.  The music-locker providers (in contrast to Grooveshark) probably have a viable DMCA safe-harbor defense–something the MP3Tunes decision recognized.

Recall that there are four requirements for DMCA safe-harbor protection:  (1) having and implementing a repeat-infringer policy, (2) responding “expeditiously” to proper DMCA takedown notices, (3) acting to remove infringing content upon gaining either actual or “red flag” knowledge, and (4) not obtaining a direct financial benefit from the infringing activity.  As with Grooveshark, we can safely assume that the first two requirements are being complied with.

As we mentioned last time, whether the music-locker services stumble over the next two requirements (as Grooveshark might) could depend on whether the uploads themselves are a fair use.  If they’re not, then EVERYTHING in the music lockers will be infringing, and if EVERYTHING in the system is infringing, the music-locker providers could be found to have “red-flag” knowledge of infringing activity.  Further, the infringing activity could be found to be the “draw” for subscribers to pay money to use the site.

This might seem like a very incongruous result.  After all, with Grooveshark, the subscribers are listening to music to which they have no rights.  With music-lockers, the whole point is that the subscribers are listening to music that they putatively have a right to.  Incongruous, unfair even, and perhaps that’ll help push a court away from finding liability.  But, as I may have mentioned previously, copyright can be a highly technical law whose detailed regulations don’t always match up with our moral intuition.

But, wait, asked the music companies, isn’t there an even more fundamental question?  Recall that the relevant DMCA safe-harbor provision only applies to acts of infringement “by reason of” users’ storage of infringing material on the provider’s website.  When users upload music files, that infringing act is definitely eligible for safe-harbor protection.  But isn’t streaming an entirely different infringing act, implicating an entirely different exclusive right?

Courts have actually heard this argument several times before, and they have uniformly held that the playback of stored content is sufficiently connected to the act of storing the content to be eligible for DMCA safe-harbor protection.  Courts reason that playback is often the only practical way to access the content.*  The point of storing content on sites like YouTube is so that it can be accessed (and not merely stored), so a mechanism for playing back has to be brought into the DMCA safe harbor.  The mere fact that accessing the content happens to implicate a different exclusive right shouldn’t affect the safe-harbor.

*  The alternative would be downloading the entire file, which is often impractical, and represents an even worse act of infringement.

So the DMCA safe-harbor analysis is actually pretty easy, though not quite as simple as the MP3Tunes decision made it seem.  If the uploading to the music lockers is legal, then the music-locker providers are fine.  If not, they might have the same problems that Grooveshark has.

But we won’t stop there.  We’ve been assuming that MP3Tunes’ streaming constituted a public performance of the songs, but the court actually held that it wasn’t.  Although the court spent perhaps half a paragraph on this analysis, it’s actually a fascinating (and mind-bending) issue.  Let’s dig in.

How Many “Masters” Do You Serve?

What made Grooveshark such an open-and-shut case for public performance was that Grooveshark streamed the same music file to multiple subscribers.  I.e., Grooveshark had a “single master” for all performances of a given song.  But if each subscriber had his or her own special copy of the music file (and assuming that special copy is itself legal), it’s not a public performance.  Yes, it’s the same song, but it’s a different file.*  Yes, it’s incredibly inefficient, but copyright law sometimes forces you to do crazy things.

*  This, at least, is the consensus view of the Cablevision decision, as applied to computer files.

In their brief in the MP3Tunes case, the music companies argued that MP3Tunes used a single-master system, even though it made its subscribers upload their own content (which the music companies called a “charade”).  Once a user uploaded “Here Comes the Sun,” subsequent uploads of “Here Comes the Sun” would be discarded, and the initial copy of “Here Comes the Sun” would continue to be the “single master” from which all streaming would be derived.  The music companies, however, didn’t explain the technical procedure by which this was accomplished.  But MP3Tunes did in its brief–and the issue wasn’t as simple as the music companies thought.

MP3Tunes stores its customers’ music files using “content-addressable storage” (CAS), which has been commonly used for several years now.  It’s best suited for the storage of data that doesn’t change much.  When your computer stores data to a hard drive (or similar storage device), it breaks the file up into data blocks, then writes the data blocks onto a physical location on the hard drive.  (If you had a small enough needle and some kind of super-power that sees data in magnetic storage, you could actually point to an individual block of data on the surface of the hard disk.)

Most of the data on your home or work computer is stored on your hard drive using location-addressable storage (“LAS”).  After your computer is done writing the data block, it remembers (1) where it put the block, and (2) what file the block belongs to (and the order that the various blocks go in making the file).*  When you ask your computer to open a file, the computer checks the index for the data blocks that comprise the file, looks up their locations, then reads the data blocks into its RAM.

*  If your hard drive is highly fragmented, data blocks comprising the same file might be spread pretty far apart, which makes the task of retrieving the file take longer.  This used to be a terrible problem when hard disks were less capacious and data got crowded.
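To make the location-addressable idea concrete, here is a toy sketch in Python.  It is only an illustration under my own assumptions: the class name, the four-byte block size, and the use of a list position to stand in for a physical spot on the disk are all made up for the example, and real file systems are far more involved.

```python
# Toy model of location-addressable storage (LAS). All names are illustrative.
BLOCK_SIZE = 4  # bytes per block; real file systems use much larger blocks


class LocationStore:
    def __init__(self):
        self.disk = []    # a position in this list stands in for a physical location
        self.index = {}   # file name -> ordered list of block locations

    def write(self, name, data):
        locations = []
        for i in range(0, len(data), BLOCK_SIZE):
            self.disk.append(data[i:i + BLOCK_SIZE])  # always a fresh location
            locations.append(len(self.disk) - 1)
        self.index[name] = locations

    def read(self, name):
        # Look up the locations in the index, then fetch the blocks in order.
        return b"".join(self.disk[loc] for loc in self.index[name])


las = LocationStore()
las.write("song_a", b"ABCDEFGH")
las.write("song_b", b"ABCDEFGH")  # identical content, written a second time
```

In this sketch the second write takes up two new locations even though its content is identical to the first, because the store keys everything on where a block sits rather than on what it contains.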

CAS works differently.  As it writes the data blocks, it assigns each a unique number based on the order of the bits of information (i.e., the 0’s and 1’s) that make up the block–i.e., its content.  (Thus, content refers to the bits of data that comprise the block, not to something a human being could actually understand.)  The computer runs the 0’s and 1’s through an algorithm and spits out a much more manageable number that, in essence, represents all of those 0’s and 1’s and is, for practical purposes, unique to all those 0’s and 1’s.  This is known as a hash number (which has lots of other uses).  Thus, when you ask your computer to open a file, it doesn’t have to look up location.  It just searches by hash number, which is much faster.  The downside is that if you ever alter a file (e.g., by editing it), the hash number of some or all of the data blocks will change because the content has changed.  That’s why it’s good for long-term storage.
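Here is a minimal sketch of what a “hash number” looks like in practice.  I am assuming SHA-256 from Python’s standard library purely for illustration; the briefs don’t say which algorithm MP3Tunes actually used, and the function name and sample bytes are mine.

```python
import hashlib


def block_hash(block: bytes) -> str:
    """Return a fixed-length identifier derived only from the block's bits."""
    return hashlib.sha256(block).hexdigest()


original = b"0101010101010101"
same_bits = b"0101010101010101"   # a separate copy with identical content
edited = b"0101010101010111"      # one character changed

same = block_hash(original) == block_hash(same_bits)   # identical content, identical hash
changed = block_hash(original) != block_hash(edited)   # any edit yields a different hash
```

The identifier depends only on the content, so two identical blocks get the same identifier no matter where they came from, while even a tiny edit produces a different one.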

A side-effect of CAS is that it only stores a data block with a particular hash number once.*  That’s because content is key: blocks with exactly the same content are, from CAS’s point of view, the same data block.  By contrast, such blocks in a LAS system are regarded as different data blocks because location is key, and the blocks are written into different locations on the hard drive.  According to MP3Tunes’s brief–and unrebutted by the music companies–different music files for the same song will not yield precisely identical data blocks.  (MP3Tunes didn’t explain why.)  It further proffered evidence that there was only 5% overlap of the data blocks.

*  The MP3Tunes court called this a “standard compression algorithm,” but I don’t think that’s quite right.  Yes, it does (or might) save space, but it’s only appreciable if the files share a large number of data blocks.  Further, CAS’s primary function (as I understand it) is to increase speed; the space-savings is only incidental to that primary function.
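One way to picture “overlap of the data blocks” is to hash every block of two files and count how many hashes they have in common.  The sketch below is my own assumption about how such a figure could be computed (shared blocks divided by all distinct blocks); the briefs don’t explain how the 5% number was measured, and the sample bytes, block size, and function names are invented.

```python
import hashlib

BLOCK_SIZE = 4  # illustrative; real systems use much larger blocks


def block_hashes(data: bytes) -> set:
    """Split data into fixed-size blocks and return the set of their hashes."""
    return {
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    }


def overlap(a: bytes, b: bytes) -> float:
    """Fraction of the distinct blocks across both files that the files share."""
    ha, hb = block_hashes(a), block_hashes(b)
    return len(ha & hb) / len(ha | hb)


# Two made-up "rips" of the same song: one shared block, three that differ.
rip_one = b"HEADaaaabbbbcccc"
rip_two = b"HEADxxxxyyyyzzzz"

low_overlap = overlap(rip_one, rip_two)    # 1 shared block out of 7 distinct
full_overlap = overlap(rip_one, rip_one)   # a byte-identical file shares everything
```

On a measure like this, a store that keeps each distinct block only once saves space in proportion to the overlap, which is why low overlap means little de-duplication.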

Under these facts, the court had little trouble holding that MP3Tunes’ use of CAS isn’t the same thing as using a single master for each song.  But what if there were 100% overlap of the data blocks (i.e., an effective de-duplicating of music files)?  Consider two files–one called, say, “Resume” and the other called “Resume Copy”–that are comprised of the exact same* data blocks:  are they still two different works for copyright purposes?  When your computer goes to write “Resume Copy” to its hard drive, it sees that it already has those data blocks written (thanks to the wonders of CAS), so instead of re-writing those data blocks elsewhere, it simply indexes those pre-existing data blocks to the new file.  When it is asked to call up “Resume Copy” later, it looks up “Resume Copy” in the index and copies the associated data blocks into a file called “Resume Copy.”  Two different files, one set of data blocks.

*  Honestly, it doesn’t have to be 100% overlap.  Violation of the reproduction right occurs with a showing of “substantial similarity.”  So, perhaps 75% overlap would be similar enough?
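The “Resume” and “Resume Copy” scenario can be sketched as a toy content-addressable store.  Again, this is only an illustration under my own assumptions (the class name, SHA-256, the four-byte blocks, and the sample bytes are all invented), not a description of MP3Tunes’ actual system.

```python
import hashlib

BLOCK_SIZE = 4  # illustrative; real systems use much larger blocks


class ContentStore:
    def __init__(self):
        self.blocks = {}  # hash -> block bytes; each distinct block is kept once
        self.index = {}   # file name -> ordered list of block hashes

    def write(self, name, data):
        hashes = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            h = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(h, block)  # does nothing if the block is already stored
            hashes.append(h)
        self.index[name] = hashes

    def read(self, name):
        # Reassemble the file from whichever stored blocks its index entry names.
        return b"".join(self.blocks[h] for h in self.index[name])


cas = ContentStore()
cas.write("Resume", b"JaneDoe,Esq.")
cas.write("Resume Copy", b"JaneDoe,Esq.")  # same content under a second name
```

After both writes, the index holds two separate file entries, but the store holds only one set of three blocks, and reading either name reassembles the same bytes.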

If we consider a “file” to be its constituent data blocks in their “resting state” (i.e., sitting in storage) then “Resume” and “Resume Copy” derive from the same “single master.”  If we consider a file to be a uniquely identifiable set of data blocks in its “active state” (i.e., called up by the computer and in use), then “Resume” and “Resume Copy” are distinct.  The Copyright Act of 1976 is more flexible than it is normally given credit for–given the massive technological changes since the 1970’s*, the Act sort of still works–but I think it’s simply not prepared to answer what is really a kind of existential question:  is a computer file (such as a song file) nothing more than the sum of its components, or is it something more?

*  Actually, we’re lucky that it was passed in the 1970’s when digital information was at least well-known and the Act could make some provisions for it.  It’s just that it couldn’t possibly have predicted the extent to which digital information would take over areas of copyright that had traditionally been held in analog form.

My own preference would be that CAS be treated just like LAS, that we look at the level of the file and not the data block.  My reasoning would be that the file is the smallest unit of digital information that the human mind can readily comprehend.  Data blocks are, after all, just an arbitrary slicing up of a computer file.  Until a machine puts them together in the right order, they have no meaning–not even highly abstracted meaning–in the human universe.*

*  I don’t think Grooveshark would benefit legally from using CAS because its users still wouldn’t be listening to their own personally identifiable copy of the music file.

OK, we’re done talking about the music-locker services, such as Amazon, Google and Apple.  Next time, we’ll look at the webcasting services, such as Pandora and Turntable.fm.

Thanks for reading!