Things I might have tweeted, part 5: Rights, reproductions, and reality #marac

Sorry that I didn’t post from the last session yesterday; my phone died. But I may post a real blog entry about it later. Anyway, here is Rights, Reproductions, and Reality.
There is a Google group about this topic: marac rights and reproductions. The panel wants a discussion forum for when we aren’t at MARAC.
Roundtable format. The people here are Wendy Hurlock-Baker, Archives of American Art; Daisy Njoku, Smithsonian; Karma Foley, freelance archivist; Jenny Ferretti, Maryland Hist. Soc.; and Robin Pike, Catholic Uni.
Not going to talk about copyright, except for one thing. Sound recordings. Check out recordingcopyright.org. They are pushing for all sound recordings to be put under federal, not state law. You can see my previous post about this issue for more information.
AAA has a fee, similar to their web use fee, for digital applications.
Giving permission to use the digital file, through property rights, but most places are putting the burden of copyright permission on the researcher. Wording “the onus of third party rights is on the applicant.”
If you start trying to help them too much with finding copyright holders, then some of the responsibility comes back on you.
Ebooks are considered different editions, so if your forms say one time, non exclusive use, you can charge them twice.
Put rights and citation metadata into your files. You can do this with Photoshop.
Converting from TIFF or JPEG to GIF strips out some metadata.
MHS charges fees based on file size, so higher dpi gets a higher fee.
Tax exempt does not mean nonprofit. Tax exempt places, like some university presses, still make a lot of money. And archives charge a lot less than stock photo places like Getty.

Things I might have tweeted, part 4: new developments in describing av materials #marac

Jane Otto, Rutgers:
Av description in a digital repository.
New Jersey Digital Highway is the digital repository for all of NJ. OpenWMS is the platform that they use. Web based and open source. Uses the METS metadata schema.
Descriptive metadata is MODS, which is MARC-lite, but extended by the institution.
Build content by targeting faculty whose products are not typically accepted for publication: datasets, video, audio, etc.
Works with faculty to develop custom metadata fields to create searchability within small collections in the digital repository. They also add fields for specific classes and specific assignments within a class. Sounds like a great service, but a lot of work.
Megan McShea, Archives of American Art
The problem is that there are no published standards for describing av materials at the non-item level.
Archivists often describe av by format instead of by topic, which is not helpful.
Since there are only item level guidelines, many av records are over described.
Poor av description is creating a hidden backlog within processed collections.
Used DACS as a basis for guidelines for processing av material. Like paper records, av should be kept in their intellectual place, not lumped into a series at the end.
IASA gives decent list of standardized format names.
Margaret Kruesi, Library of Congress
DACS-SR, Companion content standard for describing sound recordings (although this is unrelated to the current revision of DACS)
What are archival sound recordings? Unpublished, ethnographic, broadcast, one of a kind, rare published materials.
Researchers want much more data from catalogers.
Want to create a standard that can be used by non-specialists. Drafts of additions to current DACS sections are mostly complete.

Things I might have tweeted, part 3: New tools to address electronic records challenges #marac

Peter Bajcsy, National Center for Supercomputing Applications:
What do you do if you come across an unsupported file format? Move the file, get a new format from the creator, or buy the software.
Presidential erecords: Reagan, 200k; Clinton, 33 million; GWB, 300 million.
Almost 80k different file name extensions.
We need to figure out how to automate and scale systems dealing with erecords.
His preferred solution: cloud services. (I have questions about this, but hopefully he will address them).
Conversion Software Registry tells you how to convert from one file type to another.
Polyglot is their software to convert files, available in the cloud and as a download for your repository.
Universal content viewer is part of this; this software tries to display content of any format.
Content based file comparison: compare two files and evaluate information loss over multiple metrics, not just checksums or something similar.
With this metric, you can tell how much data you will lose due to conversion.
They are providing a prototype of Polyglot for free, but it can only deal with open formats. You have to buy the system and licenses for proprietary formats to convert them.
Universal viewer, if possible for all formats, could lessen the need for DIPs.
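To make the checksum versus content comparison point concrete, here is a minimal Python sketch. It is not how the NCSA tools work (the talk did not go into their metrics); the sample strings and the choice of difflib as a stand-in "metric" are my own assumptions. A checksum can only say "same" or "different," while a content-based measure gives some idea of how much changed.

```python
import hashlib
from difflib import SequenceMatcher


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def text_similarity(before: str, after: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two strings.

    This is a stand-in metric chosen for illustration only; a real
    comparison would use measures suited to each kind of content.
    """
    return SequenceMatcher(None, before, after).ratio()


# Hypothetical text extracted from a file before and after conversion.
original = "Meeting minutes, 12 March: budget approved."
converted = "Meeting minutes, 12 March: budget approved"  # trailing period lost

# The checksums differ, but that says nothing about how much was lost.
same_bits = checksum(original.encode("utf-8")) == checksum(converted.encode("utf-8"))

# The similarity ratio shows that the loss is very small.
similarity = text_similarity(original, converted)

print("identical checksums:", same_bits)
print("content similarity:", round(similarity, 3))
```

The design point is simply that the second number is a matter of degree, which is what you need when deciding whether a conversion is acceptable.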
William Underwood, Georgia Tech:
Tools for file format id
Unix file command is the most widely used, but it is limited.
Created a file signature database to extend the file command.
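For anyone who has not looked at how signature-based identification works, here is a rough Python sketch of the idea. This is not Underwood's database or the file command's actual logic; the tiny table below is my own, using a handful of well-known magic numbers, and the function names are made up for illustration.

```python
# A very small, illustrative signature table: leading bytes -> format name.
# Real tools use far larger databases and more than just leading bytes.
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"GIF87a": "GIF image",
    b"GIF89a": "GIF image",
    b"\xff\xd8\xff": "JPEG image",
    b"PK\x03\x04": "ZIP archive (also the container for OOXML, ODF, EPUB)",
}


def identify(header: bytes) -> str:
    """Guess a format from the first bytes of a file."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first few bytes of a file on disk and identify it."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


print(identify(b"%PDF-1.4\n"))
print(identify(b"GIF89a\x01\x00"))
print(identify(b"plain text with no signature"))
```

The ZIP entry shows why a short signature alone is limited: several very different formats share the same leading bytes, so a fuller database has to look deeper into the file.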
Automatic markup of emails with xml allows for searching and organization. Would be useful for legacy finding aid conversion, transcriptions, mass digitization, etc.
Manually created grammars for 14 categories of documents and used those grammars to apply the xml in each document.
Not practical if we have to create grammars for every type of document, but they are working on creating automatic grammar generators.
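As a rough illustration of what "marking up email with xml" can look like, here is a small Python sketch. It is an assumption on my part, not the Georgia Tech approach: instead of a hand-written grammar, it leans on the standard library's email parser, and the element names and sample message are invented.

```python
import xml.etree.ElementTree as ET
from email import message_from_string


def email_to_xml(raw: str) -> str:
    """Wrap the main parts of a plain-text email in simple XML elements."""
    message = message_from_string(raw)
    root = ET.Element("email")
    # Only mark up headers that are actually present in the message.
    for field in ("From", "To", "Date", "Subject"):
        value = message.get(field)
        if value is not None:
            ET.SubElement(root, field.lower()).text = value
    # Assumes a simple, non-multipart message whose payload is a string.
    ET.SubElement(root, "body").text = message.get_payload()
    return ET.tostring(root, encoding="unicode")


# Hypothetical message for demonstration.
raw_message = (
    "From: director@example.org\n"
    "To: archives@example.org\n"
    "Subject: Budget\n"
    "\n"
    "Approved.\n"
)

marked_up = email_to_xml(raw_message)
print(marked_up)
```

Once the fields are tagged, searching by sender or subject becomes a matter of querying elements rather than scanning free text, which is the payoff the talk was pointing at.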
Maria Esteva, Texas Advanced Computing Center:
Mapping archival processing to visualization
Analyze large and multivariable collections for archival processing. Lots of different file formats.
Trying to extract as much information as possible from the records themselves to help create finding aids.
Visualization of archival collections: a visual representation of archival collections. Colors and size used to represent size and formats in different parts of collections. A non-textual way to see collections.
Visualization of file types can also help with processing of large collections of electronic records. Can show arrangement as well as preservation risk.
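The data behind a picture like that can be quite simple. Here is a minimal Python sketch, under my own assumptions, of tallying file formats for each top-level folder of a collection; those counts are the kind of numbers a visualization could then map to color and size. The function name and the sample paths are hypothetical, and this is not the TACC software.

```python
from collections import Counter
from pathlib import PurePosixPath


def format_profile(paths):
    """Count file extensions within each top-level folder.

    Returns a dict mapping folder name -> Counter of extensions.
    """
    profile = {}
    for raw_path in paths:
        path = PurePosixPath(raw_path)
        # Files that sit directly at the top are grouped under "(root)".
        folder = path.parts[0] if len(path.parts) > 1 else "(root)"
        extension = path.suffix.lower() or "(none)"
        profile.setdefault(folder, Counter())[extension] += 1
    return profile


# Hypothetical file listing from an accession of electronic records.
listing = [
    "letters/1998/smith.doc",
    "letters/1999/jones.DOC",
    "letters/1999/notes.txt",
    "photos/reunion.tif",
    "readme",
]

profile = format_profile(listing)
for folder, counts in sorted(profile.items()):
    print(folder, dict(counts))
```

A folder dominated by one obsolete extension would stand out right away, which is roughly the preservation risk view described above.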
The demonstrations were pretty cool, and I can’t wait to get my hands on these various pieces of software and play around. My one critique is that the presenters made it a little difficult to find out where we could get more information about their projects. Hopefully with some google-fu, I can figure out more.

Things I might have tweeted, part 2: Rand Jimerson at #marac

The plenary speaker for MARAC is Rand Jimerson, author of Archives Power. Here is what I might have tweeted:
Students from the University of Kentucky are at my table. They like what MARAC does more than their own organization.
Archival Ethics and the Call for Justice is the title of Rand’s talk.
Justice requires resistance to censorship, sanitization, and hiding of documents.
We are always making choices; choosing to save some things means that we aren’t saving other things. We sometimes reinforce the majority view.
Our neutrality is a lie that is built into our profession.
Neutrality reinforces the current power structures.
Archives have always been a battleground of conflicting ideologies.
We can be objective, but not neutral.
Objectivity does not preclude advocacy, but puts responsible limits upon it.
Archives are on the front line of public controversy and have a responsibility to the community through the records.
9 keys for archivists to respond to the call for justice.
Ensuring diversity in the archival record. One of the three greatest challenges to the profession.
Welcoming the stranger into the archive. Records are witnesses to those who are voiceless.
Recognize that selecting and appraising should be based on clear and open policies.
Listening for oral testimony and going out and creating oral histories.
Make archival description sensitive to the representation of those described.
Providing inclusive reference and access, within the bounds of laws and cultural representations.
Embracing new technologies, web 2.0 and digital records.
Supporting open government and public accountability.
Public advocacy, both in archival terms and in the general public interest. Become whistleblowers if records are threatened.
(end of the 9 keys)
The result of these actions will keep archives by, for, and of the people.
Social responsibility is one of SAA’s 11 core duties in a current draft statement.
We must overcome our biases and embrace social responsibility.
Archivists must not support or seem to support elites at the expense of the people.

Things I might have tweeted, part 1

The hotel in which MARAC is being held is in Alexandria, Va., and happens to be made of lead. Or, at least, that is how it seems, because if you take more than 2 steps away from a window, all connections drop out. When I go outside, my connection is just fine. So, I am going to try an experiment: writing what I might have tweeted into WordPress, and posting it when I get internet. It is not the same experience as live tweeting, but hopefully it will give you at least some insight into what has been happening at MARAC Spring 2011.

A journey through copyright for sound recordings

Recently I was looking for some music to use for a project a friend of mine is working on. I wanted to either use something public domain or something using a Creative Commons license, and not even try to navigate the waters of “fair use.” Librarians and archivists generally know the scope of US Copyright law. We know that all published works that came out before 1923 are in the public domain. We also know the scope of copyright for new works: 70 years after the death of the author, or, for a work made for hire, 95 years from publication or 120 years from creation, whichever is shorter.
However, sound recordings are much more complicated. Federal copyright law did not cover sound recordings until February 15, 1972. When that law was passed, the federal government left the task of determining the copyright status of pre-1972 works to the states. Also in that law, they said that federal copyright law would supersede whatever the states decided on February 15, 2047. (The Sonny Bono Copyright Term Extension Act of 1998 postponed this date until February 15, 2067.)
Wikipedia claims that this means that works created before 1972 are not under copyright. I was able to find folk recordings, including Gene Autry, bluegrass recordings, and a whole host of other things from the 1920s, 30s, and before. However, I did not find anything by any prominent bands. This is most likely due to a ruling by the New York Court of Appeals, which is the highest court for the state of New York. In 2005, Capitol Records sued Naxos, a record distributor from the United Kingdom. Naxos had been digitizing old records from the 1930s and 40s, putting them onto CDs, and selling them in both the UK and the US. The copyright on these records had expired in the UK, and so it was legal for them to sell these CDs there. Some of the records that they digitized were owned by Capitol Records, who claimed that they retained common law copyright protections over those recordings, even though federal copyright law did not apply.
The New York Court of Appeals agreed. In their opinion, they said that “The musical recordings at issue in this case, created before February 15, 1972, are therefore entitled to copyright protection under New York common law until the effective date of federal preemption—February 15, 2067.” While this is currently only law in New York and other states are free to enforce copyright differently, I have the gut feeling that this will be the standard applied everywhere. Wikipedia is getting away with it currently because either no one has noticed or because the owners of the copyright don’t care enough to sue. However, I bet Wikipedia would lose if they were sued. As a result of this case, it seems as though no sound recording can be reliably claimed to be in the public domain until February 15, 2067.
So instead of trying to navigate the world of “public domain” sound recordings, I turned to another source of freely available music: Creative Commons licensed music. There are websites out there that allow artists to post their music under any of the Creative Commons suite of licenses. For example, I got an album by the band Walker Fields from the website Jamendo; I’ve never really explored that website too much, but I’m definitely going to do so more in the future.
I know that the Southern Folklife Collection at the University of North Carolina has multiple streaming stations available, which stream bluegrass, country, and other old-time music from their collection. I also know that the Wax Cylinder Project at UC Santa Barbara has digitized wax cylinder recordings and made them available under Creative Commons licenses, but these are only two examples.
So my question to you is, if you have sound recordings in your archive, what is your policy for making them available and what is your takedown policy?

A sidelong look into the world of special collections dealers

First of all, if you’re not listening to This American Life, whether on the radio or in podcast form, you should start. But this past week’s episode, entitled Original Recipe, gives us a sidelong glimpse into the mindset of special collections dealers. As should become very obvious, this is about a unique situation that happens very rarely. But I think that, throughout the course of the story, John Reznikoff says things that give you insight into the mindset of dealers more broadly.
Briefly, Reznikoff is a document expert, handwriting expert, and big money dealer of artifacts and documents. In 1993, he was befriended by a man who told him that he had documents proving that John F. Kennedy paid off Marilyn Monroe and that he had ties to the mafia. These documents were verified as true by other experts and sold. They, of course, turned out to be forgeries.
But the part that interests me is more in the set up to the story, before Reznikoff is duped. When he gets a collection of items, whether it be artifacts or documents, his goal is to make as much money as he can off it. For example, he sold President Obama’s first car, a Jeep, but was allowed to strip many of the original parts out first, which he then sold as well. This mindset, applied to cars, is one thing; but when applied to special collections material, it becomes much more of a problem. The items with which Reznikoff is dealing, the artifacts and documents of the rich and the famous, can be sold as individual items because there are people out there willing to pay thousands of dollars to seem closer to someone famous. But through the power of the Internet, more people think that they can make money by chopping up collections and selling them piece by piece. We, as archivists, need to reach out to amateur dealers and try to get them to at least understand where we are coming from and why keeping collections as a whole can be important.
The other half of the episode of This American Life is about the search for the original recipe for Coca-Cola. They talk, briefly, to Coke’s corporate archivist, asking him questions about the recipe for Coke that was found in their archives. I know that they have trade secrets to protect, but their archivist seemed to be more focused on obfuscating information rather than providing access. That may be a byproduct of the mythology that Coke has built around the original recipe, but it still seemed off coming from an archivist.
If you have listened to the episode, what do you think?

Debian releases Squeeze!

For those of you who don’t know, I have run some distribution of Linux on my laptop since 2006 (Ubuntu, Debian, Fedora, Ubuntu, Fedora, and currently Debian, in that order) and I currently run Debian GNU/Linux on the server that runs this blog. Most Linux distributions release every six months to get the latest and greatest software out there to the masses, whereas Debian releases every 18-24 months and is meant to be a rock solid platform for you to build upon. Debian is also completely run by volunteers and has been for 17 years, proving itself to be a rock of the Linux community. I may experiment with other Linux distributions, but I always seem to come back to Debian. On Sunday, Debian released their 6.0 release, nicknamed Squeeze. It shows that a group of dedicated volunteers, now numbering around 1000, can really accomplish something remarkable.
Not only do I use Debian personally, but I know that my library’s IT staff uses a Debian derivative (Ubuntu) to run its servers. That means that all of the great services that the SCRC offers, such as our collections database, our digital archive, our public wiki, and our soon to be released Omeka site all run on free software created by volunteers. That’s not to mention that our new library website is going to be built on Drupal 7 and our new OPAC will have VuFind on top of it. Commercial vendors that provide library services, such as OCLC, Sirsi, and others can provide good services but their prices start to add up. Without free software, libraries and archives wouldn’t be able to provide nearly as many of the great services that we give to our patrons. With one of the rocks of the free software world coming out with a new release, it’s a good time to remember how much special collections and archives rely on our IT staff and the solutions they can provide to do remarkable things.

New Year’s Resolutions

I’m not usually one for New Year’s Resolutions, since I end up not keeping them. But I wanted to put out some professional resolutions and hopefully by putting them out in public, I’ll be more likely to keep to them.

  1. Become a better archivist: This has been my first few months as a professional archivist.  I’ve done some good things, but I’ve also made some stupid mistakes.  I know that I can do better, and I will do better.
  2. Learn PHP: For years now, I’ve wanted to learn some sort of scripting/programming language (I’ve even wanted to try and become a Debian Developer).  I taught myself HTML back in 1998 and have picked up some CSS along the way, but I haven’t been able to do more complex stuff than that on my own.  The three main content management systems (WordPress, Drupal, and Joomla!) are all written in PHP, Mediawiki is written in PHP, Archon is written in PHP, and the new AT/Archon platform may be written in PHP.  I want to know enough PHP that I can go into their code and not feel completely lost.  Maybe someday I’ll even write a plugin or a theme or something.  So if anyone has any good resources for learning PHP, I’d be much obliged.
  3. Attend conference(s): I’ve now been to two professional conferences (2007 MARAC and 2010 SNCA), but I’ve never really been a part of them.  In 2007, I helped usher people into Swem for the evening reception and went to one panel discussion the next day, but I had other things to do.  In 2010, I was part of a panel that discussed collaborative processing techniques, but I was unable to stay around for the rest of the conference (I was proposing to my then-girlfriend the next day).  This year, I want to at least go to the spring MARAC conference and perhaps make it to SAA.
  4. Start a podcast: This one is more of a wish than a resolution, but it’s something I’ve been thinking about.  I’ve wanted to do a podcast about archival topics for a while now, but I’m refusing to let myself unless I know I have enough topics to last me at least a few shows.  If I don’t have enough to say to regularly post on this blog, will I really have enough to sustain a podcast? Or will starting a podcast prod me into actually working on it? Who knows. If I end up doing it, it will probably be in the shortcast format, meaning a short, 10-ish minute show that is merely the start of the conversation.  I do have some ideas smoldering, so perhaps this will eventually happen.
  5. Blog more often: Every blogger’s resolution.

Those are the five resolutions that I’ve been pondering.  As long as I succeed with #1, I’d be fine with the others falling by the wayside.  What about you all? Any archival resolutions from the crowd?

Facebook’s Profile Download

As you may have heard, Facebook has been planning to allow all of its users to download all of their information.  At least for my account, that promise has now become a reality.  Not only is this good for people concerned with , but this is also good for archivists.  Now that people have control over getting their data out of Facebook, they can choose to donate it as a part of their personal papers, just like how someone would donate their letters or diaries.
So I decided to download my own Facebook profile, to see how long it took and in what formats the data was presented.  The first step was to go to Account Settings; there is now a link there called “Download Your Information.”  When you click on it, a popup tells you that you’re going to have to verify your information before the download can begin.
[Screenshot: first screen of the Facebook profile download]
The “Pending” button appears when you click to start the download of your information.  For my profile, it took somewhere between 20 minutes and an hour; I left to do some cleaning for a while.  When it has finished, Facebook sends you an email.  You then re-enter your password and download the .zip file.  Mine was 44 MB.
[Screenshots: the photos folder and the html folder in the Facebook download]
This is what you find once you unzip the file.  The photos folder holds, obviously, all of the photographs, arranged by the album in which you put them.  The names of the photographs are the names that you give them in Facebook, not the original names, but that’s not a big concern.  There are html files for each of your albums, your list of friends, your wall, messages, etc. However, it’s through the “index.html” file that you navigate through most of your information.
What you see is basically a stripped down version of your profile.  It has all the posts on my wall going back to August 2006, even though I joined Facebook in October 2004.  Looking at my posts from around that time, it seems like that was when the first “New Facebook” was created.  However, my received and sent messages go all the way back to 2004.  There is also a list of all your current friends as just a static text list.  Under the events tab, it shows all of the events that you have said yes, maybe, or no to, but it doesn’t say which response you gave to each individual event.
As you can see, this information is easily downloadable and easily able to be browsed.  For all the flak that Facebook has taken for privacy concerns over the years, this is a step in the right direction.  As an archivist, it also makes me happy that people now have the ability to download, preserve, and even donate this part of their digital life.  Google has already started to document how their users can get their data out of their various services with the Data Liberation Front.  I hope that other social networks will follow suit and allow their users to start having more control over the information that they put in.
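If a donor handed over one of these downloads, a first step might be a quick inventory of the .zip file before unpacking it. Here is a minimal Python sketch under my own assumptions: the function name is made up, and the demonstration builds a tiny stand-in archive in memory. The photos and html folder names and index.html come from my download; the other file names are invented and the real layout may differ.

```python
import io
import zipfile
from collections import Counter


def inventory(zip_source):
    """Count the files under each top-level folder of a zip archive."""
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip folder entries, count only files
            name = info.filename
            top = name.split("/", 1)[0] if "/" in name else "(top level)"
            counts[top] += 1
    return counts


# Build a small stand-in archive in memory (hypothetical contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("index.html", "<html></html>")
    demo.writestr("html/wall.html", "<html></html>")
    demo.writestr("html/friends.html", "<html></html>")
    demo.writestr("photos/Album One/beach.jpg", b"")
buffer.seek(0)

counts = inventory(buffer)
for folder, total in sorted(counts.items()):
    print(folder, total)
```

To run it against a real download, you would pass the path to the .zip file to inventory() instead of the in-memory buffer.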