When the House of Commons Culture, Media and Sport Committee published its report on the Creative Economy, headlines focused on the report’s call for more action from Google against piracy. Bill Rosenblatt, president of Giant Steps Media Technology Strategies and editor of copyrightandtechnology.com, examines what tech companies can do and what inspires them to do it.
Among the most interesting remarks in the Creative Economy Report of the Culture, Media and Sport Committee of Parliament, published in September, is the Committee’s expression of skepticism that it is “beyond the wit of the engineers employed by Google and others to demote and, ideally, remove copyright infringing material from search engine results.” Google’s assertion that it cannot do this is disingenuous, and it is a distraction from the real issues underlying the disputes between the technology and media industries over online copyright infringement. Thus it bears some examination.
At a general level, any statement that a major tech industry player makes that such-and-such is “impossible” is inherently suspicious. Silicon Valley has a rich history of ignoring claims of “impossible” and producing one stunning innovation after another… if the business motivations are right. Today’s tech companies often heed Voltaire’s advice that the perfect is the enemy of the good, and come up with solutions to “impossible” problems that work “well enough” instead of perfectly. Google’s amazing achievements in personalized ad targeting are a great example of this. Such targeting was a pipe dream of the advertising industry before Google even existed; now it is reality and of great benefit to advertisers even though it’s not perfect.
Therefore, for a supremely innovative company like Google, a more accurate statement is either “impossible now but we’re working on it” or “impossible now and we’re choosing not to work on it.”
The tech industry’s response to the issue of online copyright infringement is clearly the latter, even though tech companies don’t say it outright. And this isn’t just a hypothetical; it is reality today. Google has shown itself to be perfectly capable of addressing infringement when it suits its business purposes. Two examples should demonstrate this.
First is Google’s Content ID system for YouTube. This is a system for recognizing content in files that users upload to the site. It is based on a technology known as fingerprinting, which is a sophisticated pattern-matching technique that identifies files as containing the same content if they look or sound the same, even if they are not bit-for-bit identical. A fingerprinting system maintains a huge database of numerical content descriptions (“fingerprints”). When a user tries to upload a file to a service like YouTube (or DailyMotion, SoundCloud, or various other sites that use the technology), a fingerprinting algorithm checks the file for a match against its database. If it finds a match, it can take action, such as blocking the upload.
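The matching idea described above can be illustrated with a deliberately simplified sketch. This is not Content ID's algorithm, which is not described here; it is a toy fingerprint, assuming an audio signal given as a list of samples, that records only whether loudness rises or falls from one block of samples to the next. The names fingerprint, hamming and find_match, and the block count and distance threshold, are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional, Sequence


def fingerprint(samples: Sequence[float], blocks: int = 8) -> int:
    """Toy fingerprint: one bit per adjacent pair of blocks,
    set to 1 when average energy rises from one block to the next."""
    size = len(samples) // blocks
    if size == 0:
        raise ValueError("signal is too short for the requested block count")
    energy = [
        sum(s * s for s in samples[i * size:(i + 1) * size]) / size
        for i in range(blocks)
    ]
    bits = 0
    for prev, cur in zip(energy, energy[1:]):
        bits = (bits << 1) | (1 if cur > prev else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_match(upload: Sequence[float],
               database: Dict[str, int],
               max_distance: int = 1) -> Optional[str]:
    """Return the title of the closest known work, or None if nothing
    in the database is within max_distance bits of the upload."""
    fp = fingerprint(upload)
    best_title, best_distance = None, max_distance + 1
    for title, known in database.items():
        distance = hamming(fp, known)
        if distance < best_distance:
            best_title, best_distance = title, distance
    return best_title
```

Because the fingerprint depends on the shape of the signal rather than its exact bytes, a copy played back at half the volume produces the same fingerprint as the original even though the two files differ. A real system would use far richer features and a much larger, indexed database.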
YouTube’s fingerprinting system was originally set up so that it blocked files whose fingerprints matched the database, and it only examined music. After Google acquired YouTube in 2006, it created new fingerprinting technology, called Content ID, which also handles video and is otherwise improved. Why did it invest in this technology? Because it had figured out a way to monetize the process.
Now YouTube gives copyright owners a choice of actions to take when Content ID finds a match. Instead of blocking the file upload, a copyright owner can opt to allow the upload, let Google show ads when a user plays the video, and receive a share of the ad revenue. This serves to increase three things at once: “name brand” content on YouTube that attracts users, ad revenue for Google, and revenue for copyright owners. It’s a win-win-win. As a result, for example, the vast majority of major-label music content is monetized through ads rather than blocked, and YouTube is one of the most popular streaming music sites in the world.
The other example is Google’s database of takedown notices that it receives for websites found in its search results that host illegal copyrighted material. Under United States law, a copyright owner that sees its content on an online service can send the service a takedown notice, and if the service removes the content promptly, it avoids copyright liability.
Google makes takedown notice data publicly available as part of the Google Transparency Report. The Transparency Report can be searched to find the number of takedown notices a given Internet domain receives per month. The dichotomy between legitimate and serious pirate sites is dramatic: websites that host large amounts of infringing material attract hundreds of thousands of takedown notices per month, while even the biggest legitimate mainstream domains – Facebook, Yahoo, Tumblr, SoundCloud, etc. – only attract a few thousand at most.
Google could readily institute a policy of not indexing sites that exceed a threshold number of takedown notices received per month. But it chooses not to because of the lost ad revenue, instead opting to cover itself under rhetorical fig leaves such as “false positives” and “takedown notice abuse,” and the fact that some links on pirate sites are to non-infringing material (many of which are likely put there just to throw off the Transparency Report rankings). The threshold could be set so that false positives are very rare (thereby letting small-scale pirate sites through); and Google could stop responding to takedown notices from those who file false claims repeatedly. I am not suggesting that false positives and abusive takedown notices are not problems; I am merely suggesting that it is a matter of priorities that Google does not put out in the open in public policy deliberations.
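The threshold policy suggested above is simple enough to sketch in a few lines. The following is a hypothetical illustration only: the Notice record, the function name, the sender blocklist and the threshold value are all assumptions, and nothing here reflects how Google actually stores or processes takedown notices.

```python
from collections import Counter
from typing import FrozenSet, Iterable, NamedTuple, Set


class Notice(NamedTuple):
    """One takedown notice received in a given month (hypothetical shape)."""
    domain: str   # site the notice complains about
    sender: str   # party that filed the notice


def domains_over_threshold(notices: Iterable[Notice],
                           threshold: int,
                           distrusted_senders: FrozenSet[str] = frozenset()
                           ) -> Set[str]:
    """Return domains whose monthly notice count exceeds the threshold,
    ignoring notices from senders known to file false claims repeatedly."""
    counts = Counter(
        n.domain for n in notices if n.sender not in distrusted_senders
    )
    return {domain for domain, count in counts.items() if count > threshold}
```

Setting the threshold high keeps false positives rare at the cost of letting small-scale pirate sites through, and the distrusted-sender set corresponds to no longer responding to those who abuse the notice process.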
It’s the Money
Of course, no scheme for foiling online copyright infringement is foolproof; nor does anyone involved in these debates expect it to be. It would simply help advance the dialog if the parties dispensed with posturing and started discussing the real issue: money. It takes money to invent and deploy technologies that address piracy, money that the media industry in most cases will not pay. Conversely, blocking content and websites from Google’s services means less traffic and therefore lost revenue. That’s the basis on which negotiations ought to take place if these disputes are to be resolved.
Bill Rosenblatt will be discussing these issues at the Copyright and Technology London 2013 conference on Thursday 17 October in London. Speakers will include Richard Hooper CBE and several others who gave evidence before the Culture, Media and Sport Committee.
This article gives the views of the author, and does not represent the position of the LSE Media Policy Project blog, nor of the London School of Economics.