Lorenzo Law Firm is “Working to Protect your Business, Ideas, and Property on the Web.” Copyright 2016, all rights reserved Lorenzo Law Firm, P.A.
Monday, September 19, 2016
Website Crawling and Data Scraping Thoughts
Website crawling and data scraping have burdened the growth of e-commerce, as website owners witness their data being scraped, and the legal questions have lingered. Crawling and scraping have become too much of the norm among those using web content for business, research, or marketing purposes. The common theme is that website scraping is used by those who seek a shortcut to catch up to their competition, who seek to emulate their competition, or who seek to extract information that would otherwise take too much time to gather. Crawling can be useful for enhancing search relevance, indexing, and accuracy. The software used is not unique. It can be automated to extract information much as search engines do, with the additional step of converting the data into a form that is useful within a database. The data being sought can be extracted from many types of sources. A new business hoping to start on an equal footing, for example, could seek to get its data from booking websites, Yelp, eBay, or even a directory. Scrapers can also target a specific business they wish to emulate. The purposes for which website scraping is pursued give “big data” gathering a new and unsavory image.
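To make the mechanics concrete, the short Python sketch below illustrates the two steps described above: extracting structured fields from a page and converting them into database rows. It is only a minimal illustration under stated assumptions, not a description of any particular scraper. The page fragment, the `listing`, `name`, and `price` class names, and the `ListingParser` and `scrape_to_database` helpers are all hypothetical, and the HTML is an inline sample rather than content fetched from a real website.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a public listing page.
SAMPLE_HTML = """
<ul>
  <li class="listing"><span class="name">Blue Widget</span><span class="price">19.99</span></li>
  <li class="listing"><span class="name">Red Widget</span><span class="price">24.50</span></li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects (name, price) pairs from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished (name, price) tuples
        self._field = None    # which field the current text belongs to
        self._current = {}    # fields gathered for the listing in progress

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            # A listing is complete: convert it into a database-ready row.
            self.rows.append(
                (self._current["name"], float(self._current["price"]))
            )
            self._current = {}


def scrape_to_database(html):
    """Parse the page text and load the extracted rows into SQLite."""
    parser = ListingParser()
    parser.feed(html)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (name TEXT, price REAL)")
    conn.executemany("INSERT INTO listings VALUES (?, ?)", parser.rows)
    conn.commit()
    return conn


conn = scrape_to_database(SAMPLE_HTML)
rows = conn.execute(
    "SELECT name, price FROM listings ORDER BY price"
).fetchall()
print(rows)
```

Even this toy version shows why the practice raises the questions discussed here: once the content sits in a table, it can be queried, combined, and republished with very little effort.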
In addition to concerns over a website’s terms of service (ToS), there is the copyright infringement issue raised by scraping website data and content. The ultimate question is which theory provides the best argument. The Copyright Act seeks to protect expression, whether it is in a visibly readable form or in a digital form on a server. The Copyright Act may not, however, be effective in addressing or preempting the use that the website owner wants to stop. For instance, if the crawling and scraping are not done for commercial purposes, the Copyright Act may not yield the leverage necessary. Yet Facebook’s case against Power.com, which rested in part on the Copyright Act, was effective in that the defendant was aggregating Facebook’s data onto another site in violation of Facebook’s terms. The U.S. District Court for the Northern District of California denied the defendant’s motion to dismiss, determining that scraping involves the copying that Facebook explicitly restricts in its ToS.
Aside from the copyright infringement issues, there is the view that scraping or crawling a website against the owner’s ToS is tantamount to unauthorized access or to exceeding the permitted use of the website and its content. That view relies on the Computer Fraud and Abuse Act (CFAA), which addresses both unauthorized access to a computer system and exceeding the scope of use that is permitted. Under this approach, the use of the website must have exceeded what was authorized, and the website must have carried an express and clear statement of what use or activity was prohibited with respect to its content and data. Coupled with this consideration is the often-articulated defensive crutch of ‘fair use’. Yet scraping website content does not inherently qualify for the ‘fair use’ defense.
Furthermore, web crawling and scraping raise the question of whether damages exist if website content and data are treated as ‘chattel’. In its case against Bidder’s Edge, eBay argued that its platform content and data were chattel on which Bidder’s Edge had trespassed. eBay also argued that the defendant’s acts interrupted eBay’s operation. However, the effectiveness of the argument depends on the existence of damages. Without damages, the argument withers, and courts do not see trespass to chattels as a workable argument against website scraping and crawling. Another frequently used argument against web crawling and scraping rests on the Digital Millennium Copyright Act (“DMCA”), which prohibits circumventing technological measures that control access to copyrighted works and can thereby limit reliance on fair use. What is interesting is the actual bypassing that takes place to circumvent a website’s measures to restrict web crawling and scraping. The DMCA provides a means of enforcing the copyright in a website’s digital content.
The complexity created by the use of bots is evident, even if its full extent remains elusive. Also evident is that the fair use defense, along with the absence of damages and the potential absence of consent and constructive knowledge, will continue as points of contention as website owners oppose web scrapers. The legal issues thus far have crossed from intellectual property and contract concerns to unauthorized access to a network or computer system, raising the specter of continued legal disputes over website scraping.