Publishers want more control over content indexed by search engines

The Automated Content Access Protocol (ACAP) proposal was designed by a consortium of publishers to give them greater control over how search engines index their content.

An excerpt from AP:

The Automated Content Access Protocol proposal, unveiled Thursday by a consortium of publishers at the global headquarters of The Associated Press, seeks to have those extra commands — and more — apply across the board.

With the ACAP commands, sites could try to limit how long search engines retain copies in their indexes, for instance, or tell the crawler not to follow any of the links that appear within a Web page.

Web administrators use the robots.txt file to tell search engine crawlers and indexers which portions of a Web site may be indexed. This unofficial standard was developed in 1994, and most engines have complied with it.
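As a rough illustration of how a well-behaved crawler consults robots.txt, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the example.com URLs, the "ExampleBot" user agent and the can_index helper are all made up for this example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an illustrative site (an assumption,
# not taken from any real publisher).
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
Allow: /news/
"""


def can_index(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/news/today.html", "/archive/2007.html"):
        url = "https://example.com" + path
        print(path, can_index("ExampleBot", url))
```

Nothing in the file forces a crawler to run a check like this. Honoring the rules is up to the crawler.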

ACAP adds commands to robots.txt for finer control, such as a time limit after which content should be removed from a search engine's index.
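To make the time-limit idea concrete, the sketch below shows how an indexer might honor a retention period. The directive name "X-Retain-Days" is invented purely for illustration and is not actual ACAP syntax; the two helper functions are likewise hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# HYPOTHETICAL directive name, invented for this example. It is NOT ACAP syntax.
RETENTION_DIRECTIVE = "x-retain-days"


def parse_retention_days(robots_txt: str) -> Optional[int]:
    """Return the first retention period found, or None if absent or invalid."""
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == RETENTION_DIRECTIVE:
            try:
                days = int(value.strip())
            except ValueError:
                return None
            return days if days >= 0 else None
    return None


def should_purge(indexed_at: datetime, now: datetime,
                 retention_days: Optional[int]) -> bool:
    """True when the indexed copy is older than the publisher's limit."""
    if retention_days is None:
        return False  # no limit stated, so keep the copy
    return now - indexed_at > timedelta(days=retention_days)


if __name__ == "__main__":
    rules = "User-agent: *\nX-Retain-Days: 30\n"
    limit = parse_retention_days(rules)
    indexed = datetime(2007, 11, 1, tzinfo=timezone.utc)
    print(should_purge(indexed, datetime(2007, 12, 15, tzinfo=timezone.utc), limit))
```

A search engine that does not recognize the directive would simply ignore the line, which is why voluntary adoption matters so much for a proposal like this.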

The problem is that no major search engine backs the new proposal. Compliance is also voluntary.

Without the major engines on board, will the proposal lead to a compliance war?

More information:

News sites seek new controls over search engines (PC Pro)

Publishers punt new web crawler blocking standards (Register)
