Archives vendor spreads fear, uncertainty, and doubt about free software

For those of you who don’t know, I am a member of the free/open source software community.  I have used versions of Linux on my personal computer for four years now.  I run this server and this installation of WordPress on a Debian GNU/Linux installation on a computer that sits in my living room.  I contribute bug reports and post on various fora to help both developers and users.
I am, obviously, also a member of the archival community, and I got an email yesterday that really made me angry.  It was from an archival vendor, Eloquent Systems Inc., that seems to provide software that fills the same role as Archon.  In trying to convince me to purchase their system, they painted the entire free software community with a single stroke, saying that free software is basically unusable in a production environment.  They obviously have a right to compare themselves to other products and to state why they believe their product is better than the competition.  However, I feel that trashing an entire community is just ridiculous.  I’ve attached their email to me, along with my response, beneath the fold.

David Price responds about PAHR

I emailed my representative, David Price, about the PAHR bill before the House a couple of months ago, and he finally responded to me today.  He basically says that many of the records are held at the state level, the NHPRC gives out grants to states, and he likes the NHPRC, but he won’t commit to supporting PAHR.  He wants some sort of new state-based program, and he’ll “consider it carefully” when it comes up for a vote.
It seems like what PAHR is going for is a state-based solution: the states have to pay a 50% cost share to get money from this new grant funding agency.  I guess what he wants is for the states to pay for 100% of it and for NHPRC to be the only national funding agency.  With all the money thrown around in the past couple of years, what’s another $50 million?  That is far less than one-tenth of one percent of the national budget for last year.  I’m glad we’ve gotten to 50 co-sponsors, and maybe we’ll soon have some debate and even a vote on it.
Full email from David Price after the fold.

Appropriating the word "archive"

Gmail uses the word “archive” to describe the action of moving mail out of your inbox and into the “All Mail” folder.  Barnes and Noble, on their new nook ebook reader, uses the word “archive” to actually mean delete!  When you click on the “Archive” button, it deletes the file off of your nook, although it is still available to redownload from their website.
The technology world has appropriated the word “archive” and made it mean “hidden from view, but still accessible.”  Gmail is telling people that they don’t have to delete any email anymore: just keep it all, and archive it.
Archivists don’t have the resources to keep everything.  In the digital world, the phrase that is commonly used is “disk is cheap.”  However, the time of archivists isn’t cheap, and they don’t have a lot of it to give.  Processing backlogs are already big, and even the advent of “More Product, Less Process” style processing hasn’t cut into backlogs that much.
But what about digital files?  Surely we can keep all of those, since we don’t have to worry about physical space nearly as much.  However, describing these files properly is just as important as describing physical, paper files.  We have a Google site search on the Southern Historical Collection’s website; it is not that sophisticated a search system, but it is a popular one.  It basically does a full text search of all of our finding aids, so you get a lot of hits that aren’t the best collections for the topic for which you are searching.  Trying to do a similar sort of thing, full text searching of all of our digital files, would lead to an explosion of that problem.  It would be very difficult to actually find the materials for which you are looking.
Is there a way to get the word back?  I don’t think we can just abandon it, since it is a fundamental word of our profession.  The general public probably doesn’t need to know everything about archival theory, but they should probably know that we’re not just “the people who keep everything.”  I guess what we need is a sentence or two, an elevator speech, about who we are and what we do.  I think there might have been an SAA project about this, but we need to come up with something that’s easy to understand, short, and lacking in jargon, and that truly explains what we do to people who don’t know.

The word "archive"

There is a problem of language in the archival world.  And that first sentence has already run upon it, and it is both an internal community problem and an external, general problem.  We have lost control of the term “archive;” or, perhaps, we’ve never had control of it in the first place.
There are two different aspects of the archival community itself: the manuscripts/personal papers meaning of the term and the records management meaning of the term.  Both of them get lumped under the term archival, which causes some confusion.  I’ve been developing a series of tutorials for the Southern Historical Collection, and in trying to explain concepts and build “archival intelligence,” we’ve run into a problem of defining what an “archive” is.  A description of an archive as a place where manuscripts and personal papers are held doesn’t really describe the form and function of University Archives.
I may be late to this, but it seems like this quote, “Archival is an approach rather than a quality,” is how we should really begin to describe what makes an archival repository actually archival.  Archival materials are not manuscripts, books, diaries, photographs, etc.  They are not business records or records that an organization is required to keep by law.  Archival materials are manuscripts, books, diaries, photographs, business records, or records that an organization is required to keep by law when these materials are selected, arranged, described, housed, and preserved in an archival repository.
This is not even to mention the appropriation of the word “archive” by people in general.   But I’ll talk a little bit about that in another post.

Sorry if I spammed your RSS feed

Hey everybody who’s on my RSS feed (all 4 of you), I’m sorry if I spammed your feed with all of my posts being reposted.  I just moved my WordPress to a new server, which should be faster and better.  This will also let me get updates from WordPress faster, since I’m now running Debian Testing instead of Ubuntu for my server.
All in all, it should be a better experience for all of us. Thanks!

Teas I've been drinking recently

I’ve been drinking two teas recently, both from Essencha back in Cincinnati.  The first is White Peony, also known as Pai Mu Tan or Bai Mu Dan.  It’s a classic Chinese white tea, and it’s one I’ve had many times before.  The White Peony from Essencha is noticeably better than the same kind of tea that’s available at Southern Season, which is here in Chapel Hill.  It has a very mild flavor, and that’s what I like about it.  It can be steeped up to four or five times.
The other tea that I’ve been drinking is Moroccan Mint.  It is a combination of green tea and mint leaves, similar to the kind of mint tea that they serve in Morocco.  It, obviously, has a minty punch and is a good way to wake up in the morning, or whenever you’re feeling sleepy.

Master's paper 2.0

So, for a variety of reasons, I’ve decided to go in a new direction with my master’s paper.  The reasons include the fact that my IRB application was pretty terrible and the second version of it wouldn’t be done until too late; the class I was going to experiment on didn’t have to use the SHC, which my paper would have needed them to do; and the videos weren’t going to be done in time to show them to the class anyway.  Also, my master’s paper advisor is leaving to get a new job at Barnard College in NYC.
And so, instead, I’m doing a website content analysis of various archival websites to see if they are trying to teach people archival intelligence in their online user education.  So I’m going to do things like counting terms and then assigning them to the various aspects of archival intelligence, and see what aspects of archival intelligence the archival community is actually teaching.  It should be good, and actually achievable.  It will still be a lot of work, as a master’s paper should be, but I think it’s a level of work that I can actually get done in the time that I have left here at UNC.
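To give a rough idea of what "counting terms and then assigning them to aspects" could look like in practice, here is a minimal Python sketch.  It is not the actual coding scheme for the paper: the aspect labels, the term lists, and the function names `count_terms` and `aspect_totals` are all hypothetical placeholders, and the matching is deliberately naive (case-insensitive, whole words only, so plurals like "collections" are not counted as "collection").

```python
import re
from collections import Counter

# Hypothetical coding scheme: aspect label -> terms to look for.
# Both the labels and the terms are illustrative assumptions only.
ASPECT_TERMS = {
    "archival_terminology": ["finding aid", "collection", "provenance"],
    "search_strategies": ["search", "ask an archivist", "reference"],
    "using_sources": ["primary source", "citation", "surrogate"],
}


def count_terms(text, aspect_terms=ASPECT_TERMS):
    """Count whole-word, case-insensitive term matches, grouped by aspect."""
    lowered = text.lower()
    results = {}
    for aspect, terms in aspect_terms.items():
        counts = Counter()
        for term in terms:
            # \b keeps "search" from matching inside "research"
            pattern = r"\b" + re.escape(term) + r"\b"
            hits = len(re.findall(pattern, lowered))
            if hits:
                counts[term] = hits
        results[aspect] = counts
    return results


def aspect_totals(results):
    """Collapse per-term counts into one total per aspect."""
    return {aspect: sum(counts.values()) for aspect, counts in results.items()}


if __name__ == "__main__":
    # Made-up page text standing in for an archive's user education page.
    sample = (
        "Use a finding aid to search the collection. "
        "Each finding aid describes one collection. "
        "Cite every primary source."
    )
    print(aspect_totals(count_terms(sample)))
```

Running this over the text of each website and comparing the totals per aspect would give a first, crude picture of which aspects get the most attention; a real analysis would presumably need human coding on top of raw counts.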

Browser ennui, part 2

So, in my followup to my first Browser ennui post, I have decided to stay with Firefox, albeit upgrading to 3.6 beta 5, and keeping 3.5.6 on my computer as well.  3.6b5 is faster than Firefox 3.5.6, and only occasionally crashes (usually on sites that have a lot of Flash animations on them).  It doesn’t have the problem with Gmail that Chromium has right now, and Chromium has also become somewhat crashypants, so it no longer has that advantage over Firefox for me.  I’m excited for the final release of Firefox 3.6 and where Mozilla takes the Firefox browser from here… now, if only Zotero would update their plugin (which currently only works with FF 3.5).

Nook, Project Gutenberg, and Distributed Proofreading

I got a Barnes and Noble nook, which is an ebook reader, for Christmas.  So far, I’ve only loaded it up with free ebooks from Project Gutenberg.  Project Gutenberg is a project whose goal is to digitize out-of-copyright books and make them freely available online.  So far, I’ve downloaded Kafka’s Metamorphosis, Thoreau’s Walden, a book of essays by Ralph Waldo Emerson, and Upton Sinclair’s The Jungle.  All of these books were published before 1923, and so there are editions of those books that are out of copyright.  The basic process is that these books are scanned, OCR’d, proofread, put into an acceptable format, and then published online.  However, only editions from before 1923 can be used in this way; even though we all know Shakespeare’s basic texts are out of copyright, you can’t freely digitize a version of them that was published in 2005, because that version isn’t out of copyright.  And so the goal of Project Gutenberg is to make as many of these books as possible freely available online before their sources become too difficult to find.
Their system of proofreading is also very interesting, and something which I have started participating in.  They have a distributed proofreading system where people can each proofread a page or two or as many as they want, so that the burden can be spread out.  Using this system of volunteers, they are able to publish between 100 and 300 books per month, which is really good seeing as there seems to be no money put into the proofreading process at all.  People find books that are out of copyright, scan them themselves, and upload them to the site, where the volunteers then proofread them, put them into the proper format, and then publish them on the website.  This website is a great success of free culture, and the sort of thing that should happen to works once they come out of copyright and into the public domain: they should be made available to the public at large.
Something which I’ll probably address in an RAO blog post soon is: are archives making their digital works available in ebook formats?  The Kindle and the nook can read PDFs (kinda), but are the repositories putting works out there in mobi/epub format, and do they specifically mention ebooks on their websites?  I know the Internet Archive has ebook formats, but I find their search system fairly difficult to work with.  It will be interesting to look at this further.

Browser ennui

I’ve had a problem deciding what web browser I should be using.  I’ve been using Firefox for a long time, but for some reason the current versions on Ubuntu like to freeze a lot, causing me to have to force quit and then start it up again.  It’s also kinda slow, both in opening the program itself and in opening new webpages.  I’ve also been using Chromium, which is what Google Chrome is based on; however, it’s pretty new and not solidly mature yet.  Also, the development version of it has been suffering from some bugs recently.  So now I’m writing this from Opera, which is the third of the web browsers that I have been considering.  It seems fine so far, but it isn’t open source, which is kind of not good.  Also, I don’t really like the way it does extensions… I would rather they just be in the browser bar rather than separate applications that hang out on their own.
Any suggestions for which browser I should go with, or new browsers to try?  In the end, it doesn’t really matter; all of them deliver the same thing: the internet.