Will the Future Bring Even More Important Copyright Issues Than The Ones Raised by Online File-Swapping?
By JULIE HILDEN
Tuesday, May 24, 2005
The issue of online file-sharing - or file-stealing, depending on your point of view - has dominated discussions of Internet copyright law thus far, and rightly so. As I discussed in a recent column, the stakes are very high - and the Supreme Court is primed to finally address this issue in the MGM v. Grokster case.
It's well-established that copying files in order to swap them - and swapping typically involves some activity that counts as copying - is a copyright violation. But the questions of whether sites hosting file swapping are liable for vicarious or contributory infringement - and if so, when - remain largely unanswered.
However the Supreme Court opts to address these questions, the real question may be whether the Court's decision is enforceable. The combination of Internet anonymity and the option of locating servers offshore may raise fatal obstacles to effective enforcement of any anti-file-swapping ruling.
Indeed, when I recently attended an industry conference, film and television company leaders shrugged their shoulders when the file-sharing issue came up. Either they haven't found a strategy to confront this issue, or they've decided it's wiser to keep it to themselves. But if disclosure threatens the strategy, it may not be much of a strategy at all: To be hackproof, a strategy ought to continue to be effective even after it's disclosed.
In this context, it seems almost unthinkable that there could be an Internet copyright issue more important than the ones file-swapping raises.
Yet, as I will explain in this column, it's very possible that equally important - if not more important - Internet copyright issues are on the horizon. Moreover, these issues may relate not only to how we get our entertainment, but also to how we get our news.
The issues are as simple and fundamental as they are troubling: Exactly how much content may be copied on the Internet - and of what kind - before copyright is infringed? And more deeply, when is content "copied" in the first place when it comes to the Internet? Does the fact that the copying is done via a machine editor - not a human editor - make a difference?
An Ingenious Futurist Scenario Shows Why the Question Is Important
I was prompted to address these questions after reading a very clever set of predictions for the future of Internet media written by Robin Sloan and Matt Thompson.
I encourage readers to watch the entire eight-minute Flash movie, but in this column, I'll focus on a few particular predictions Sloan and Thompson make. To summarize these predictions, I'll be quoting from Robin Good's English transcript of the movie:
The year 2007 sees the advent of a service - the Microsoft-owned, Friendster-derived Newsbotster - that "ranks and sorts news, based on what each user's friends and colleagues are reading and viewing and … allows everyone to comment on what they see."
But as of 2008, Newsbotster has a competitor: Googlezon, formed by the merger of Google and Amazon. To form this titan, Google supplies (among other things) "unparalleled search technology," while Amazon supplies "the social recommendation engine and its huge commercial infrastructure."
(The "social recommendation" engine to which Sloan and Thompson refer amounts, I believe, to a number of features of Amazon: Its reader reviews; its system by which others can rate reader reviews as to "helpfulness," which yields rankings of top reviewers; and its own personalized recommendations, extrapolated from buyers' viewing and buying histories with the site.)
Googlezon uses this combined "detailed knowledge of every user's social network, demographics, consumption habits and interests to provide total customization of content -- and advertising." (Presumably, this "content" includes content from the superior future version of the current day, real-life Google News.)
In 2010, Googlezon wins its fight with Newsbotster by inventing a clever new technique that further tailors content to the user: "Googlezon's computers construct news stories dynamically, stripping sentences and facts from all content sources and recombining them. The computer writes a [personalized] news story for every user."
So to give an example - mine, not Sloan and Thompson's - suppose the knowledge that Googlezon uses to customize content indicates that a particular user is very interested in international news. In putting together a news story even on a domestic happening, Googlezon could emphasize the international aspects - stripping from other sites (including blogs), say, only five sentences on the domestic happening, and twenty sentences on its international implications, to make a story.
Returning to Sloan and Thompson's predictions, in 2011, the New York Times and other media whose content is not customized to the user go to the Supreme Court, "claiming that [Googlezon's] fact-stripping robots are a violation of copyright law."
But - according to Sloan and Thompson - the old media lose. (As a result, they disappear - relegated to the status of newsletters for the elite and the elderly!)
One legal issue here is obvious: the issue of "fact-stripping robots" and copyright infringement, which Sloan and Thompson predict the Supreme Court will resolve in 2011.
But as I will explain, these predictions raise a host of other legal issues as well. None of them is easy.
The First Issue: Do Search Engines Infringe Copyright?
Let's start with the copyright issues raised by the real-life Google News - which is a component of the hypothetical Googlezon. (In a prior column, I explained why Google News enjoys legal protection against defamation - but did not discuss its protection, if any, against copyright liability.)
When Google displays news items -- in the form of search results containing some text from a given site, plus a link to that site -- does it infringe their copyrights?
First, do the links infringe copyright? Probably not. As I pointed out in an earlier column, the legal status of linking isn't settled - but ought to be. When it is settled, however, it seems very likely courts will see links as being mere pointers -- much like non-copyright-infringing citations. A nonfiction book's reference section is hardly a host of copyright infringement; neither are links.
But what about the material that accompanies the links: chunks of text taken from the site itself? (In this sense, Google acts not only as a pointer, but also as a frame.) Possibly, this material is not literally copied - I don't know the technical specifics. But a court might find that display of material is, here, the equivalent of copying. And if so, another question will arise: How much verbatim copying is too much, in the online context?
How Much Verbatim Copying Is Too Much?
That brings us to our core issue: Would Googlezon, indeed, prevail in the hypothetical 2011 Supreme Court case challenging its "fact-stripping robots," which create personalized news stories by grabbing a sentence here, and a sentence there?
I think it would, but it wouldn't be a slam-dunk case of non-infringement.
Offline, as I explained in a prior column, the rule of thumb for many practicing copyright lawyers is that, if a work copies only a small fraction of another work, it's probably safe. For instance, say you reprint one page of a two-hundred-page novel. Most would say it's not copyright infringement, even if that page is copied verbatim.
There are some problems with this approach even in the offline world - as I noted in my earlier column. But in the online world, it appears to give Googlezon a license to do exactly what it is doing.
Under this rule of thumb, if a given story is assembled by taking snippets from, say, fifty sources, then it probably isn't infringing the copyrights of any one of them - even if the snippets are taken verbatim.
Granted, this can be a problem of journalistic ethics when actual people do it: It's plagiarism, if it isn't accompanied by citations. (Indeed, journalists have gotten in trouble for precisely this problem: Copying snippets of others' work verbatim, or nearly so, without according them credit.)
But ethics aren't law; journalists who plagiarize are typically fired, not sued. And it's hard to blame a robot - or "bot" for short - for failing to have its own creative thoughts and phrasings. (Maybe in the twenty-second century, we'll have that kind of bot accountability, but not now.)
And, more importantly, Googlezon doesn't even have to run afoul of journalistic ethics here: Its fact-stripping robots could easily (and automatically) provide citations to all their sources - each accessible by a link. Thus, rather than being an engine of plagiarism, Googlezon's fact-stripping bots might be better seen as an engine of compilation.
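To make the idea of a citing compiler concrete, here is a minimal sketch in Python. It is purely hypothetical: it does not describe how Google, Amazon, or any real service works, and every name in it (Snippet, compile_story, the topic labels, the example.com links) is an assumption introduced for illustration. It takes a small quota of sentences per topic and attaches a numbered citation, with a link, to every sentence it borrows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    text: str    # one sentence taken from a source
    topic: str   # hypothetical label, e.g. "domestic" or "international"
    source: str  # hypothetical publication name
    url: str     # hypothetical link back to the source


def compile_story(snippets, quotas):
    """Take up to quotas[topic] snippets per topic, citing each one."""
    remaining = dict(quotas)
    citations = []  # ordered (source, url) pairs, one entry per source
    body = []
    for s in snippets:
        if remaining.get(s.topic, 0) <= 0:
            continue  # quota for this topic is used up (or topic unwanted)
        remaining[s.topic] -= 1
        key = (s.source, s.url)
        if key not in citations:
            citations.append(key)
        body.append(f"{s.text} [{citations.index(key) + 1}]")
    notes = [f"[{i}] {src}, {url}" for i, (src, url) in enumerate(citations, 1)]
    return " ".join(body), notes


# Illustrative input: a reader whose profile favors international coverage.
snippets = [
    Snippet("The bill passed on Tuesday.", "domestic", "Paper A", "https://example.com/a"),
    Snippet("Markets abroad reacted at once.", "international", "Blog B", "https://example.com/b"),
    Snippet("Lawmakers debated for hours.", "domestic", "Paper A", "https://example.com/a"),
    Snippet("Allies issued a joint statement.", "international", "Paper A", "https://example.com/a"),
]
story, notes = compile_story(snippets, {"domestic": 1, "international": 2})
print(story)
print("\n".join(notes))
```

Every borrowed sentence carries a marker pointing to its source, which is what separates this kind of compilation from uncredited copying. Whether the borrowing is also lawful is, of course, the separate question this column is concerned with.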
Making compilations like this illegal, as copyright infringement, would challenge the status of a lot of traditional research - such as virtually any doctoral thesis, nonfiction book, academic paper, and on and on. For this reason, I agree with Sloan and Thompson that the Supreme Court would likely rule for Googlezon - not "old media" - in the hypothetical 2011 case.
But it's also possible the Court - or, ultimately, Congress, in the wake of the Court's decision - would rework copyright in a way that better fits the Internet.
Copyright is meant, in large part, to protect the market for a given work, and thus to protect incentives to create new works. Yet allowing people to read (for free) a fact-stripping bot's compilation of news might undermine the market for newspapers and their online outposts. And that may lead newspapers to fight back in Congress for a broader version of copyright that would end, or limit, the reign of fact-stripping bots.