Automated Content Access Protocol (ACAP) was an open standard that enabled website owners to express the terms under which crawlers (also known as robots or spiders) and other automated agents were allowed to access and use their website content.
ACAP was devised by publishers, in collaboration with search engine operators and other web content aggregators, with the aim of transforming the creation, dissemination, use, and protection of copyright-protected content on the World Wide Web.
The protocol was intended to become a universal permissions protocol on the Internet: a fully open, non-proprietary standard. ACAP was established as a joint initiative of the European Publishers Council, the World Association of Newspapers, and the International Publishers Association, and is now maintained by the International Press Telecommunications Council (IPTC).
The idea behind ACAP is to allow automated processes, such as search-engine web crawling, to comply with publishers' policies without the need for human interpretation of legal terms. ACAP was proposed in 2006 by a group of publishing companies as a supplement to the aging Robots Exclusion Protocol.
In November 2007 ACAP announced that the first version of the standard was ready. No non-ACAP members, whether publishers or search engines, have adopted it so far, and a Google spokesman appeared to have ruled out adoption. In March 2008, Google's CEO Eric Schmidt stated that "at present it does not fit with the way our systems operate". No progress has been announced since those remarks, and Google, along with Yahoo! and MSN, has since reaffirmed its commitment to the use of robots.txt and sitemaps.
The World Association of Newspapers is developing ACAP as machine-readable rights information that search engines would use to determine what content they can and cannot index.
ACAP has been met with some skepticism, with some arguing that it is unnecessary because the Robots Exclusion Protocol already exists for managing search engine access to websites. Others share ACAP's view that the Robots Exclusion Protocol is no longer sufficient: ACAP argues that the protocol was devised at a time when both search engines and online publishing were in their infancy, and that it is insufficiently nuanced to support today's far more sophisticated business models of search and online publishing.
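To illustrate the coarse allow/deny model that ACAP's backers consider insufficient, the Robots Exclusion Protocol can be exercised with Python's standard-library parser. The rules and URLs below are invented for the example; note that CPython's parser applies the first matching rule, so the more specific `Allow` line is listed first:

```python
# Sketch of the classic Robots Exclusion Protocol (robots.txt) in action,
# using Python's built-in urllib.robotparser. Paths/URLs are hypothetical.
from urllib import robotparser

rules = """
User-agent: *
Allow: /private/press/
Disallow: /private/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# The protocol only answers one question: may this agent fetch this URL?
print(rp.can_fetch("*", "https://example.com/private/report.html"))         # False
print(rp.can_fetch("*", "https://example.com/private/press/release.html"))  # True
print(rp.can_fetch("*", "https://example.com/index.html"))                  # True
```

The binary fetch/no-fetch answer is the crux of the criticism quoted above: robots.txt cannot express how content may be used after it is fetched, which is precisely the gap ACAP set out to fill.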
ACAP was developed by a group of publishing companies, and critics contend that it was designed with little regard for implementation challenges; supporting it would likely impose a significant burden on any major search engine that chose to adopt it. ACAP broadly expands on the Robots Exclusion Protocol, introducing many new facets and syntactic elements that allow site operators to express indexing instructions with far greater precision and granularity. The new features are almost all geared toward articulating various kinds of restrictions on content use.
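ACAP's first version expressed these extensions as additional, prefixed fields alongside ordinary robots.txt directives. The fragment below is only a sketch of that style: the paths are invented, and the exact field names and their semantics should be taken from the ACAP 1.0 specification rather than from this illustration.

```
# Classic Robots Exclusion Protocol directives:
User-agent: *
Disallow: /archive/

# ACAP-style extensions (illustrative sketch, not a verified excerpt of
# the ACAP 1.0 field set): separate permissions for crawling vs. indexing
ACAP-crawler: *
ACAP-allow-crawl: /news/
ACAP-disallow-crawl: /archive/
ACAP-disallow-index: /news/drafts/
```

The point of the extra granularity is that a publisher can, for example, permit a crawler to fetch a page while still restricting how the fetched content is indexed or reused, which plain robots.txt cannot express.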