Nook, Project Gutenberg, and Distributed Proofreading

I got a Barnes and Noble nook, which is an ebook reader, for Christmas.  So far, I’ve only loaded it up with free ebooks from Project Gutenberg, a project whose goal is to digitize out-of-copyright books and make them freely available online.  I’ve downloaded Kafka’s Metamorphosis, Thoreau’s Walden, a book of essays by Ralph Waldo Emerson, and Upton Sinclair’s The Jungle.  All of these books were published before 1923, and so there are editions of them that are out of copyright.  The basic process is that these books are scanned, OCR’d, proofread, put into an acceptable format, and then published online.  However, only editions from before 1923 can be used in this way; even though we all know Shakespeare’s basic texts are out of copyright, you can’t freely digitize an edition that was published in 2005, because that edition isn’t out of copyright.  And so the goal of Project Gutenberg is to make as many of these books as possible freely available online before their sources become too difficult to find.

Their system of proofreading is also very interesting, and it is something I have started participating in.  They have a distributed proofreading system where each person can proofread a page or two, or as many as they want, so that the burden is spread out.  Using this system of volunteers, they are able to publish between 100 and 300 books per month, which is really good seeing as there seems to be no money put into the proofreading process at all.  People find books that are out of copyright, scan them themselves, and upload them to the site, where the volunteers then proofread them, put them into the proper format, and publish them on the website.  This website is a great success of free culture, and it is the sort of thing that should happen to works once they come out of copyright and into the public domain: they should be made available to the public at large.

Something I’ll probably address in an RAO blog post soon is whether archives are making their digital works available in ebook formats.  The Kindle and the nook can read PDFs (kinda), but are the repositories putting works out there in mobi/epub format, and do they specifically mention ebooks on their websites?  I know the Internet Archive has ebook formats, but I find their search system fairly difficult to work with.  It will be interesting to look at this further.
