Playing With Numbers: Why SOPA Still Won’t Solve Internet Piracy


During a recent congressional hearing on H.R. 3261, the “Stop Online Piracy Act,” Representative Conyers stated, “25% of Internet traffic is copyright infringement.” This dramatic figure is meant to justify SOPA as the solution to ending online piracy. But before we take this figure at face value, much less decide whether SOPA is the right solution, it’s important to know where this number came from.

First of all, quantifying the amount of infringing content on the Internet is a difficult task. According to a 2010 GAO report, data on illicit activities such as counterfeiting and piracy is difficult to obtain for both governments and industries. The estimate that 25% of Internet traffic is copyright infringement most likely came from a report that NBC Universal commissioned from Envisional, a “brand protection” company.

The Envisional report estimates that 23.76% of global Internet traffic is copyright infringement. Envisional arrived at this figure by first analyzing four studies released by network monitoring companies. Those four network reports estimate what the Internet is being used for as a whole, with nearly a third of all traffic consisting of web browsing (visits to sites such as Google, Yahoo and Facebook). Another third consists of video streaming, followed by BitTorrent at roughly 18%, with cyberlocker sites, gaming and non-BitTorrent peer-to-peer networks sharing the final 22%.

Envisional’s next step was to estimate the amount of infringing content found within the following categories: BitTorrent, cyberlocker sites, video streaming sites, and Usenet. For example, in order to estimate how much of BitTorrent traffic was considered infringing, Envisional sampled PublicBT, the largest tracker in the world. Envisional gathered data on PublicBT by selecting the 10,000 most popular shared files (out of 2.7 million) over the course of a day. From this one-day sample, they concluded that 63.7% of content managed by PublicBT was copyright infringement. A similar sort of analysis led them to find that 98.8% of files on eDonkey and 94.2% on Gnutella were infringing. For selected cyberlocker sites like Rapidshare, Envisional looked for publicly available links into cyberlockers and found that 73.15% of those shared links were infringing. The study also estimated that 4.7% of streaming data on select sites such as ThePirateBay and Isohunt was infringing.

The final step in Envisional’s method was to apply its estimated proportions of infringement to the estimates of Internet usage, and voila: they concluded that 23.76% of Internet traffic consists of infringing content. Broken down, that figure attributes roughly 11.4% of all Internet traffic to infringement via BitTorrent, 5.8% to other peer-to-peer networks, 5.1% to cyberlocker sites, and just 1.4% to video streaming. Moving past the shortcomings of this methodology (like the small sample size, or the fact that a majority of files shared within cyberlocker sites are not public), readers have to ask: Is SOPA a good solution for this problem?
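The arithmetic behind that final step can be sketched in a few lines. This is only an illustration using the figures quoted above, not Envisional’s actual calculation: the variable and function names are made up for the example, and the 18% BitTorrent share is the rounded value given earlier, so the result lands slightly off the report’s 11.4%.

```python
# Illustrative sketch only: names are hypothetical, figures are those quoted in this post.

def infringing_share_of_traffic(traffic_share, infringing_rate):
    """Share of ALL Internet traffic that is infringing traffic in one category."""
    return traffic_share * infringing_rate

# BitTorrent: roughly 18% of all traffic (rounded), 63.7% of sampled content infringing.
bittorrent = infringing_share_of_traffic(0.18, 0.637)
print(f"BitTorrent infringement: {bittorrent:.1%} of all traffic")

# Per-category results as reported, in percent of all Internet traffic.
reported_breakdown = {
    "BitTorrent": 11.4,
    "other peer-to-peer": 5.8,
    "cyberlockers": 5.1,
    "video streaming": 1.4,
}
total = sum(reported_breakdown.values())
print(f"Sum of reported categories: {total:.1f}% (report headline: 23.76%)")
```

The rounded category figures add up to about 23.7%, in line with the 23.76% headline. Every term in that sum inherits whatever error sits in the sampled infringement rate it was multiplied by.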

But what do these numbers mean for SOPA? After all, they’re being pushed to support passage of the bill. The study seems to indicate that the biggest component of online infringement is based on peer-to-peer networks, while SOPA supporters are continually raising the examples of streaming sites and websites. Several measures are already in effect to target peer-to-peer file sharing, like the “Copyright Alert System” that the content industry rolled out to great fanfare earlier this year. One of SOPA’s main targets, however, would be video streaming, where infringement accounts for just 1.4% of all Internet traffic according to Envisional. To put this figure into perspective, the most recent Sandvine estimates place legal video streaming via Netflix at 23.3% of all Internet traffic. Is it really worth all of the risks to the Internet to target infringement that makes up less than 2% of Internet traffic? With SOPA, Congress would essentially be using a sledgehammer to kill a fly.

SOPA’s damage would spread to many unintended targets. Websites such as Google, Yahoo, Facebook, Twitter, Tumblr, and Reddit, all of which have expressed opposition to SOPA, run the risk of being starved of funds and later shut down. Congress would be risking these businesses and the integrity of the Internet, with little effect on actual piracy rates.

What would? One recent study points to the answer: collapsing release windows and offering affordable legal online alternatives to infringement. Netflix actually pulls people (23% of all Internet traffic!) away from seeking illegal content by providing a viable alternative, and it makes money for artists and for itself while doing so.

Ultimately, the majority of Internet traffic consists of lawful use of content. It is important to accept that Internet piracy will never be completely eliminated. Congress can do better than accepting the first overreaching solution that industry sells it, along with the statistics that come with it.
