Google Inc.'s Internet-leading search engine on Thursday will begin serving up the entire contents of books and government documents that aren't entangled in a copyright battle over how much material can be scanned and indexed from five major libraries.
The list of Google's so-called "public domain" works, volumes no longer protected by copyright, includes Henry James novels, Civil War histories, Congressional acts and biographies of wealthy New Yorkers.
Google said the material, available at http://www.print.google.com, represents the first large batch of public domain books and documents to be indexed in its search engine since the Mountain View-based company announced an ambitious library-scanning project late last year.
The program is designed to make more library material available through a few clicks of a computer mouse and attract more people to click on the highly profitable ads that Google displays on its Web site.
During the next several years, Google wants to create digital versions of millions of books stacked in the New York Public Library and four university libraries: those of Stanford University, Harvard University, the University of Michigan and Oxford University.
Google declined to disclose how many books have been scanned from the libraries so far. The project is expected to require years to complete.
But a bitter copyright dispute is threatening to crimp Google's plans. The Authors Guild and five major publishers are suing to prevent Google from scanning copyrighted material in the libraries without explicit permission. Because it plans to show only snippets from copyrighted books, Google argues its scanning project constitutes "fair use" of the material.
Google postponed the scanning of copyrighted books in August to give writers and publishers more time to opt out of the program. The scanning of copyrighted material resumed this week, with an emphasis on books no longer in print.