October 26, 2005 1:06 PM PDT

An open-source rival to Google's book project

SAN FRANCISCO--When it comes to digitizing books, two stories appear to be unfolding: One is about open source, and the other, Google.

Or so it seemed at a party held by the Internet Archive on Tuesday evening, when the nonprofit foundation and a parade of partners, including the Smithsonian Institution, Hewlett-Packard, Yahoo and Microsoft's MSN, rallied around a collective open-source initiative to digitize all the world's books and make them universally available.

Google was conspicuously absent from the cadre of partners, given that the search behemoth has a high-profile project of its own to scan library books and add them to its searchable index.

Some supporters of the Internet Archive, a San Francisco-based nonprofit, took the opportunity to criticize such private ventures.

"We want to digitize all human knowledge...and we can't risk having it privatized," said Doron Weber, an executive of the Alfred P. Sloan Foundation, a philanthropic organization that has contributed more than $3 million to the Internet Archive since 2003. Citing the importance of an open library for educational purposes, he called on private companies to "rein in their impulses" while urging libraries to "embrace the future."

Still, a Google executive in attendance downplayed the perceived rivalry.

"I think (the project) is great," said Alexander Macgillivray, Google's senior product counsel, following a presentation on the book-scanning effort. "It's a shame it's being portrayed as a battle between the two projects because the efforts are complementary."

Digitizing books has become a focus in recent years as people try to make otherwise analog information available on the Internet. Academic research, music from classical to pop and video are all being digitized, and now books are in technology's path.

Google put its own far-reaching digitization project in the spotlight 10 months ago, when it announced partnerships with Harvard University, Stanford University and others to digitize collections of copyrighted and out-of-copyright books. In 2004, Amazon.com also opened up a digital book collection on its Web site and announced its efforts to scan popular works in partnership with publishers. Amazon visitors can "search inside the book" as a result.

Still, making the millions of books in the world available online is a Herculean task. Issues of publisher copyrights, data storage and backup, and labor costs have yet to be hashed out. It would take 6 petabytes to digitally store just 1 million books, according to the Internet Archive. By comparison, Google reportedly has stored nearly 10 billion Web documents, requiring between 1.7 and 5 petabytes of storage.

One thorny issue has already reached the courts. Google faces lawsuits from publishers and authors that claim it is violating their copyrights and overstepping the boundaries of fair use laws. Google has made scanning books an "opt out" program for publishers, meaning they must actively tell the search company not to scan their books to stay out of the company's Web index.

The Internet Archive plans to scan only books that are in the public domain and those whose copyright holders have given the green light for scanning.

Though it has been working on the effort for years, the Internet Archive recently jump-started the work by introducing the Open Content Alliance. Members include Adobe Systems, Columbia University, the European Archive, the Biodiversity Heritage Library and Smithsonian Institution Libraries.

Yahoo and MSN Search are also notable members, given their investments in Web search and in driving traffic to their proprietary services. The two companies touted the openness of the project Tuesday night, but their allegiance to the open-source project surely is a strategic counterbalance to Google's project. In the end, the open-source library will also be searchable using MSN Search and Yahoo.

Their support means donating money. MSN Search, for example, has committed approximately $5 million to ensure 150,000 books are scanned and added to the collection over the next year.

Last week, the Internet Archive launched Open Library, a Web site that will eventually house all the world's books, according to the nonprofit. It now demonstrates the project with 15 digitized works. The Web site's interface is modeled after that of the British Library in the United Kingdom.

The foundation will digitize 18,000 works of fiction that are no longer bound by copyright, chosen from the University of California archive project.

For now, people can download 15 demonstration books from the Open Library site and print them for free at home. Visitors can

CONTINUED: ...
8 comments
What about the Gutenberg project?
by powerclam October 26, 2005 9:57 PM PDT
http://www.promo.net/pg/list.html
Seems like ANY discussion of this sort of project MUST at least MENTION Project Gutenberg.
THOUSANDS of past-copyright books scanned in an ongoing project that was essentially open-source before open-source had a name.
Why doesn't the article make even a passing mention?
Gutenberg, gutenberg, gutenberg
by ciropabon October 27, 2005 1:04 AM PDT
What a shame! You did not do your homework... What about spending 10 minutes looking for e-books? You can get 16.000 books from Gutenberg through P2P, RSS or you can download them to your PDA. You can even get DVD or CD images for the entire catalog (the million dollar DVD). You can check if the book you are transcribing is in the public domain. And this time the volunteer work is, without doubt, better than having a guy flipping pages in a voting booth contraption, because WE READ AND PROOFREAD the books. But no, quality problems are only for Wikipedia articles, I guess. Well, I can understand you: the project has been around only for 34 years... it is not news: for news, you have Google, google, google. Gurgle.
gutenberg.org
by spytrdr October 27, 2005 10:03 PM PDT
I agree with the previous comments.
It's very cheap reporting to not even MENTION Project Gutenberg as the grand daddy of all these new book-digitizing projects that are just warming up the scanners.
There's also SunSite, onlinebooks.library.upenn.edu and many others.
And then there's the coolest project of all:
UNILIBRARY (com/net/org)
Bookmobile connectivity
by finlandforum June 21, 2007 1:15 PM PDT
Yes, the bookmobile is driving proof that universal access is possible today. But there is a problem. And its name is Internet connection.
http://www.highspeedsat.com/bookmobilesinstalls.htm