Libraries and Metastuff: December 2008

Thursday, December 18, 2008

DONE

I finished all my final projects, received all my grades, and am now officially done with my MLIS!

It is sort of a strange feeling to be done with school and have a couple days of free time before I head back to NH for Christmas. I am not used to this. I have been watching a lot of neglected Criterion Collection DVDs that had been sitting on my shelves and setting up a new LibraryThing account (my old one was made specifically for a school project and I wanted to start fresh with a lifetime account). I think I also need to start a separate blog for music/comics/film related stuff.

I have more items to post from library school. Basically, I want to use this as a portfolio (if need be) as well as a general LIS, Metadata, and Cataloging blog. I would have done this earlier but this week my desktop PC suddenly stopped auto-updating and then crashed. I was able to bring it back from the brink long enough to pull most of my documents off of it and put them on my MacBook. The whole thing is sort of a bummer because I am usually really good about archiving my work but I kept putting it off after my external hard drive broke and I switched to my first Mac.

Lesson learned: ARCHIVE YOUR WORK!!!

Anyway, problem solved for now and I picked up a 320g digital passport because I am sick of extra power cords and wanted something portable. So far so good...

As it gets closer to the end of 2008 it is hard not to reflect on the casualties of the past year. 2008 has been a really hard year for my electronics, which is strange because I NEVER break or lose anything.

R.I.P.

Cell Phone (It either broke at the effed up show or when I was texting in the rain outside the club.)
Digital Camera (I dropped it on the floor.)
Laptop (It is doing better but is sort of old and had some overheating problems.)
External Hard Drive (It fell off my pc onto the rug, which is about a foot. Pretty lame.)
(almost) iPod Touch (I dropped it in the middle of the street. The protector did its job but it was still pretty horrifying.)
TV (It was old and had been moved a bunch of times. It survived college and countless hours of Mario Kart. It still stunk.)
VCR (I actually think it can still be fixed and just has a minor tracking problem. Either that or it just hates having to play terrible Italian horror films from the 1970s all the time.)
Final Fantasy III (I hit a glitch 25 hours in and had to return it for a new copy. That's what I get for playing Final Fantasy games?)

Saturday, December 13, 2008

Rice-Aron Zine Collection (my process)

When I arrived in Vermont and began working with The Marlboro College Rice-Aron Library’s zine collection (library website is HERE. LibraryThing page is HERE.) I started by putting all of the zines in a pile, making a little bit of a mess. It was a great and rare luxury reserved for small informal libraries that are essentially closed (save for a few lone researchers making use of the library’s Rudyard Kipling archive, a couple straggling students, and people from the music camp that occupies the campus during the summer looking for DVDs to watch) and for small collections being built up from scratch. However, I later found that this activity is basically the same when dealing with cataloging or reorganizing any library collection. The only difference is that I did not have any records to work with, no one really knew what was in the collection, and I was able to come up with a way to organize it. I didn't have to fit it within Dewey and the rest of the collection.

I started by sorting out what I already knew from the pile of zines. I have been reading and collecting zines for almost ten years now and I am really familiar with the format and its variations. I knew that the Rice-Aron collection contained a number of zines by Marlboro students and faculty, so I separated all those out. This was easily done because most of the names were familiar to me. There are only 315 students at Marlboro College so it is pretty easy to know most of them by name, even after you graduate. I also knew that most zines fell into three genres (personal zines, D.I.Y./how to zines, and political zines) so I also separated those out. Political and D.I.Y. zines often have really clear and precise titles so this was easy to do. I am most familiar with personal zines so much of the collection I had actually read or knew enough about to identify. Whenever I saw a zine in a different language I also separated it out into a smaller pile. The same was true of zines that needed to be repaired (I spent a great deal of time restapling, taping, and repairing zines throughout the internship).

This left me with a much smaller pile of zines that still needed to be filed and the following established subjects/genres: Personal Zines, Political Zines, D.I.Y. Zines, Zines in Other Languages, and Marlboro Zines.

As I sorted through the unfiled zines I began noticing patterns that revealed the rest of the subjects/genres the collection would need.

For example, many of the political zines were actually reprinted writings from famous activists or theorists (Emma Goldman, for example) and these fit well with the history zines I was finding. Similarly, the zines about health and sexual health seemed to go well with some D.I.Y. zines dealing with comparable topics. Through this process of reevaluating the collection and reclassifying zines I ended up with the following genres/subjects: (Sexual) Health, Animal Rights and Bikes, Art and Comics, Fiction and Poetry, and Prison Zines.

When trying to figure out how the zine collection should be physically organized I considered my zine collection and the other zine libraries and shops I had previously visited. I knew that zine collections that are shelved like books easily end up really disorganized and are hard to browse. Zines also tend to get damaged when stored this way because they are often frail to begin with. The same is true of magazine racks, which tend to work a little bit better for browsing but can be physically hard on the zines. I keep my zines in magazine storage boxes and thought that they might be good for this collection. I liked that the zines could be moved, shelved, and browsed easily with minimal handling. I also thought that, since the collection was going to be maintained primarily by student volunteers and student workers, I needed a system that required the least effort to maintain. Boxes are pretty low effort and it doesn't take much to match a sticker on a zine to a sticker on a box.

After organizing the collection into the genres/subjects mentioned above, I began scanning their covers. After browsing other zine collections on LibraryThing I noticed that not many of the zines had cover images and that those that did either had poor scans or slightly different covers (something that is not particularly uncommon for zines). I really wanted to have decent scans so the library could use the images again, if the need arose. I plan on uploading all of these images along with images of my extensive punk and hardcore patch collection soon. I'll post a link when it is done.

Most of the zines that I tagged were personal zines because they typically had vague titles. Many of the other zines had titles that were so specific that additional tags were unnecessary and would not aid in searching the collection. For example, if a zine was tagged "Political" and its title was "Anarchism," there was really no reason to then tag it with "Anarchism." This was also partially done to save time because cataloging, scanning, and physically organizing 300+ zines and then designing their containers and the collection's signage was pretty time consuming (especially when cataloging and doing other jobs around the library). In retrospect I am not sure this was the best idea and I probably would have made a bigger effort to tag these items had there been more time. I also decided not to tag certain items because most users browsed the zine collection, it wasn't a research collection, and because the covers were scanned. From all of this information people would likely have enough information to decide whether they wanted the item or not.

I always added Series Information and issue numbers to the Common Knowledge section of LibraryThing. I always used the word "number," even if the actual zine referred to them as something else (like "issue"). I did this for consistency, but I would not have made this decision after this last semester's advanced cataloging courses.

I also added any other information I knew about the zines or that was readily apparent (important places, notes about additional artwork by someone else, etc.). I occasionally came across different printings with slight differences that I always included notes about. Sometimes I knew about these differences beforehand but sometimes I looked them up online. This is one area where I wish I had been more careful about noting where the information came from, although this might be counter to the whole idea behind LibraryThing's Common Knowledge section. However, I thought this information was important because sometimes there were slight unnoted content differences and, for someone dealing with zines, a silk screened cover is much different than a photocopied one. I added book descriptions from Microcosm Publishing or other websites (personal or publishing) that were mentioned in the zine.

For the author I would use whatever name I could find, even if it was clearly fake.

The format for the publication information was as follows:

Low Hug Productions, Champaign, IL (Self Published) Zine, 53 Pages

If the publisher was just a business name for the same person who wrote the zine (you could tell by the address or the web page listed) I added "(Self Published)". If the publisher was a different person or organization then I left it out.

If there was no publisher information (it was presumably self published) but the zine was well established and listed in the mailing address (as in "contact Doris at this address" vs. "contact Cindy Crabb (the person who actually writes Doris) at this address") then I put that in lieu of a publisher.

Doris, Ashville NC (Self Published) Zine, 32 pages

Then whatever publication location information I could find and the number of pages. I counted pages when reasonable to do so. Sometimes, however, I took the page numbers from a publisher website. I wouldn’t do that now, at least not without putting this information in brackets. I also would more clearly define "self published." I think the distinction is important but remains hazy in my records. I also think that I tried to cram too much information into this field.
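For anyone who would rather generate this field than type it 300+ times, here is a minimal Python sketch of the pattern described above. The function name and parameters are my own invention for illustration (LibraryThing has nothing to do with this code), and "Some Press" and "Anytown, VT" are made-up placeholder values.

```python
def publication_field(publisher, place, pages, self_published=False):
    """Build a publication string like the examples above.

    publisher: business name, or the zine's own name when it stands in
               for a publisher
    place:     whatever publication location could be found
    pages:     page count (counted by hand when reasonable)
    """
    # Join only the pieces that are actually known.
    field = ", ".join(part for part in (publisher, place) if part)
    if self_published:
        field += " (Self Published)"
    return f"{field} Zine, {pages} Pages"


# Reproduces the first example record above.
print(publication_field("Low Hug Productions", "Champaign, IL", 53,
                        self_published=True))

# Hypothetical record where the publisher is a separate organization,
# so "(Self Published)" is left out.
print(publication_field("Some Press", "Anytown, VT", 40))
```

Keeping the self-published flag as its own argument at least forces a decision per record, which is the part I said remained hazy.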

For the title information I used this format:

title: number (I always called it "number" for clarity and consistency, a mistake I would not make today) "subtitle" (if there was one)

Sometimes zine writers put out split zines that are either smaller/condensed issues bound together or collaborative zines. I used a slash (/) between the zine titles whenever this came up. For example:

Invincible Summer: Number 11 / Clutch: Number 17: Fifth Annual Split Zine by Nicole Georges (2006)
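The title pattern, including the slash for split zines, can be sketched in a few lines of Python. This is only an illustration of the convention described above: the function names are hypothetical, and I am assuming the shared subtitle follows the last issue after a colon, as it does in the example.

```python
def issue_title(title, number):
    """Format one zine issue, always using the word "Number"."""
    return f"{title}: Number {number}"


def split_title(issues, subtitle=None):
    """Join the issues of a split zine with " / ".

    issues:   list of (title, number) pairs, in the order they appear
    subtitle: optional shared subtitle, appended after a colon
    """
    joined = " / ".join(issue_title(title, number) for title, number in issues)
    return f"{joined}: {subtitle}" if subtitle else joined


# Reproduces the split zine example above.
print(split_title(
    [("Invincible Summer", 11), ("Clutch", 17)],
    "Fifth Annual Split Zine by Nicole Georges (2006)",
))
```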

I ran into a lot of problems when dealing with reprinted political zines where several publishers, distributors, printers, and writers were named (presumably because the zine was claimed, reprinted, and sold by a number of different political groups that changed names and split off from one another throughout the 1990s). When in doubt, I listed them all.

For smaller collections I really think LibraryThing can work well. In terms of cataloging time it is faster than cataloging the items using MARC, building a database, or creating an online interface for items cataloged in Dublin Core. It is also more versatile than a simple webpage. However, it does take getting used to and can be difficult for users to navigate at first.

It is also unfortunate that having a LibraryThing account doesn't really increase the options available to users. Users should be able to review or assign tags to items not in their personal collections.

One of the strengths of LibraryThing is that users can easily browse and make connections with similar collections. It is often hard to find other collections containing zines or other ephemeral print items and I suspect that this would be a valuable feature for students doing serious research with zines.

I think LibraryThing (despite its faults) was a good choice for this collection and is a great way to maintain and promote collections in a smaller library. It also is an easy way to increase access to a collection that might otherwise just exist in lists on the library's webpage.

Metadata in Context (Pitt's DRL)

This is an assignment completed for part of my MLIS coursework. The goal was to look at a digital collection, read its documentation, study its use of metadata, dissect its workflow, and speak to a metadata librarian at the institution. I chose The University of Pittsburgh's Digital Research Library (D-Scribe) partially for the sake of convenience but also because doing so gave me the opportunity to regularly speak with a metadata librarian and visit the DRL to get a better sense of the process.

The University of Pittsburgh's Digital Research Library

(see "Documentation" in lower right hand corner)

According to the DRL's Mission Statement: The ULS proactively identifies and evaluates resources for conversion into electronic format to be hosted and served by the ULS.

According to the DRL’s "Guidelines for working with the DRL to create Image collections" document: The DRL supports the teaching and research mission of the university through the creation and maintenance of web-accessible digital research collections.

Collection Content:

The collections primarily include digitized historical content with a focus on Pennsylvania and Pittsburgh. They are mostly photographs, texts, books, postcards, maps, and finding aids. They are browsable and searchable HERE.

The metadata schemes that are used include:

MARC (Machine-Readable Cataloging) for the text collections.

MARCXML (Machine-Readable Cataloging using an Extensible Markup Language schema) also for the text collections.

TEI Metadata (Text Encoding Initiative) also for the text collections.

EAD (Encoded Archival Description) for archival finding aids. The DRL uses EAD encoded in XML.

Dublin Core for image collections.

MARC records for the cataloged items come in. They are then turned into MARCXML. They are then converted into a TEI header, which is pretty similar to MARC and used for print collections and can also hold structural metadata. Metadata is usually extracted from existing records but, when records do not exist and the contributing agency does not have the means to catalog them, a TEI record is created from scratch.
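To give a rough idea of what the MARCXML to TEI step involves, here is a small Python sketch using only the standard library. This is not the DRL's software (I never saw their conversion code). The sample record is made up, and I am assuming the simplest possible mapping: MARC 245 $a to the TEI title and 100 $a to the TEI author.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML records.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Made-up record, just enough fields to show the idea.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Example, Author</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or ""."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text.strip() if node is not None and node.text else ""


def marcxml_to_tei_header(marcxml):
    """Build a bare-bones teiHeader element from one MARCXML record."""
    record = ET.fromstring(marcxml)
    header = ET.Element("teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")
    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = subfield(record, "245", "a")
    ET.SubElement(title_stmt, "author").text = subfield(record, "100", "a")
    return header


print(ET.tostring(marcxml_to_tei_header(SAMPLE_MARCXML), encoding="unicode"))
```

A real header would carry far more (publication statement, source description, and so on), but the shape of the conversion is the same: find a MARC field, drop its value into the matching TEI element.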

Dublin Core is pretty much just used for the image collections with a few local modifications. DC is also required for the DRL's participation in the Open Archives Initiative, which ultimately increases access to the collection.

Dublin Core is also central to the DRL's participation in the Open Archives Initiative (OAI). The OAI metadata protocol provides a standard for sharing information about digital objects so that diverse collections from multiple institutions can be searched together, thus increasing awareness, use, and communication about digital resources in the academic community. The OAI metadata protocol requires use of the Dublin core standard to describe digital objects.

"Guidelines for Working with the DRL to Create Image Collections"

The final product for textual items is the TEI including bibliographic info, structural data, links to images, and OCR text. The final product for images is the Dublin Core record and the link to the corresponding image.

Controlled Vocabularies:

MARC records for print material always have LCSHs assigned to them. The same is true for all items coming from the Archives Service Center, and finding aids also have LCSH assigned to them by the ASC or some other archivist. Whenever possible the DRL tries to have the content provider assign LCSH, as it is sort of a "preferred standard." However, sometimes collections coming in to be digitized have general LCSHs to describe the collection and then locally created keywords or subject headings to describe the items in the collection.

Types of Metadata:

The structural metadata is primarily for books in the print collections; it becomes part of the TEI metadata and is saved as XML. It is also primarily generated using software developed in-house.

Descriptive metadata is used for pretty much all collections, either DC or TEI.

Administrative metadata isn't really used all that much. Currently administrative metadata is only included in certain fields of the DC and TEI records. Apparently the DRL is working on including MIX (Metadata for Images in XML).

The Basic Text Workflow Seems to be:

- get a list of items to be scanned with ID numbers from the books' barcodes and MARC records.

- MARC is converted to MARCXML, then MARCXML is converted to TEI.

- the books are sent to the DRL (they are tracked on the computer system through each step of this process).

- the books are processed and sorted depending on scanning methods and materials.

- the books are scanned, first in grey scale for text and then in color for images.

- structural metadata is created using in-house software.

- item is reviewed as part of quality control.

- final XML is created from TEI header, structural metadata for the body, links to the corresponding images, and OCR data.

- update the index for collections and update the online collections.

- return deliverables (I know this at least consists of spreadsheets of URLs and bar codes) to Technical Services.
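Since each book is tracked through every step, the workflow above is essentially an ordered list of stages. Here is a tiny Python sketch of that idea. The stage names are my own shorthand for the steps listed above, and this has nothing to do with the DRL's actual tracking system.

```python
# Stages in the order listed above (labels are my own shorthand).
STAGES = [
    "item list compiled",
    "MARC converted to TEI",
    "books sent to DRL",
    "books processed and sorted",
    "books scanned",
    "structural metadata created",
    "quality control",
    "final XML created",
    "index and online collections updated",
    "deliverables returned",
]


def next_stage(current):
    """Return the stage that follows `current`, or None at the end.

    Raises ValueError if `current` is not a known stage.
    """
    position = STAGES.index(current)
    if position == len(STAGES) - 1:
        return None
    return STAGES[position + 1]


print(next_stage("books scanned"))
```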

Dublin Core Cataloging (Coleman's article/workform)

I am going to use Anita S. Coleman's "From Cataloging to Metadata: Dublin Core Records for the Library Catalog" and a Dublin Core workform provided by a professor to catalog a digital and physical item (this blog and this thing I just got in the mail from a friend a few minutes ago with some records I traded for). Pretty informal exercise but it should be fun.

THIS BLOG

Title: Libraries and Metastuff

Creator: Tyler M

Subject (using natural language, not LCSH): Library School. LibraryThing. Marlboro College. Metadata. Photos. Vermont. CONTENTdm. DVD Aficionado. DVDs. Digital Collections. Dublin Core. History. Internship. Interview. MARC. NH. Presentation. Punk. The Elvis Room. Zine Libraries. Zines.
(Coleman's document says to take key words from the document and tagging makes doing this easy. However, there are certainly some subject terms and phrases that I might apply or key words/phrases I would take from the website that are not in this list of tags. This is one of those places where the restrictions of a controlled vocabulary or local standards are actually helpful. It is also interesting how this becomes an issue of subjectivity and objectivity.)

Description: A blog where Tyler M posts writing and photos primarily about Library and Information Science, Web 2.0, Metadata, and Cataloging.

Publisher: Blogger

Contributor:

Date: 2008-11-17
(First post, since this is usually the date of creation)

Type: Text
(used THIS to fill in Type. What a limited selection...)

Format: Text/html
(It is actually XHTML, but that didn't seem to be an option in the Internet Media Types list Dublin Core refers to.)

Identifier: http://librariesandmetastuff.blogspot.com/

Source:

Language: en
(From ISO639, which I was referred to and don't really know much about.)

Relation:
(This is a really interesting field with a lot of potential, but not really applicable to this item.)

Coverage:

Rights: Accessible freely

Audience:
(None listed, and this is used mostly for children's materials.)
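A record like this one can also be embedded in a web page as meta tags, which is the form metadata extraction tools tend to produce. Below is a hedged Python sketch that turns the filled-in fields of the blog record into that form. The function name is made up, and I lowercased the format value to the usual media type spelling.

```python
from html import escape

# The filled-in fields from the blog record above, in record order.
BLOG_RECORD = [
    ("title", "Libraries and Metastuff"),
    ("creator", "Tyler M"),
    ("publisher", "Blogger"),
    ("contributor", ""),
    ("date", "2008-11-17"),
    ("type", "Text"),
    ("format", "text/html"),
    ("identifier", "http://librariesandmetastuff.blogspot.com/"),
    ("language", "en"),
    ("rights", "Accessible freely"),
]


def dc_meta_tags(record):
    """Render (element, value) pairs as HTML link/meta tags.

    Empty fields are skipped, since an empty tag describes nothing.
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record:
        if value:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


print(dc_meta_tags(BLOG_RECORD))
```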

PATCH

Title: The Official Seal of the Bureau for Paranormal Research and Defense

Creator:

Subject (using natural language, not LCSH): Patches. The Bureau for Paranormal Research and Defense. Hellboy.

Description: A patch with the official seal of The Bureau for Paranormal Research and Defense embroidered on it. It has a small beer stain on it, belongs to Tyler M, and was a gift.

Publisher: Dark Horse Comics, Inc.

Contributor:

Date: 2008-12-13
(Date received.)

Type: Physical object.
(used THIS to fill in Type. What a limited selection...)

Format.Extent: 3" x 3"

Identifier:

Source:

Language: en
(From ISO639, which I was referred to and don't really know much about.)

Relation:
(this is a really interesting field with a lot of potential, but not really applicable to this item.)

Coverage:

Rights: TM 1998 Michael Mignola. Accessible by permission of the owner, Tyler M.

Audience:
(None listed, and this is used mostly for children's materials.)

MARC and Dublin Core

This is a short assignment I did for the Independent Study of Metadata that was part of my MLIS coursework. There is nothing earth shattering below but it's sort of interesting to see records broken down like this. I pretty much just write in those thin sharpies so it may be hard to read, but if you are familiar with MARC then it doesn't really matter. The numbers correspond to where information is located in each record.

Everything that's in the Dublin Core record is also found in the MARC record. The most notable (and obvious) difference is that Dublin Core is readable and much more intelligible than MARC's cryptic numeric and fixed fields. Even qualified Dublin Core is pretty easy to figure out. However, MARC includes quite a bit of information not found in the Dublin Core record. Most of this relates to holdings information and the descriptive information found in the fixed fields. The Dublin Core record also doesn't include some of the numbers automatically generated by WorldCat. It seems like Dublin Core is not as concerned with describing the book's format (probably a poor choice of words), which is what quite a few of the fixed fields deal with. It also doesn't have a bibliography note. It also doesn't seem to be as concerned with the source of information for the catalog record. The cataloging source and modifying agency are absent, same with the local holdings information, where it was cataloged, etc.

While Dublin Core might be missing some information, none of it seems all that essential. The information in the fixed field is almost never put to any great use. As far as I know OPACs don't allow users to narrow search results based on any of the information found in the fixed fields. You can't simply call up a list of English biographies that include illustrations, are held in Massachusetts libraries, and were published in 1973. So, really, nothing all that useful is lost for the user.

The major downside to Dublin Core would have to be that it is not nearly as standardized and records are not as easily shared because it is pretty customizable. The main thing lost in Dublin Core is the precision, but it can be applied to a wider range of resources (physical or digital) more easily (although MARC has been adapted and simplified to work for archivists and to create simple crosswalks from MARC to DC).
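A crosswalk is really just a lookup table, which also makes it easy to see what gets left behind. Here is a deliberately simplified Python sketch. The mapping covers only a handful of MARC tags, ignores subfields and indicators, and is not the official Library of Congress crosswalk. The sample field values are invented.

```python
# Simplified tag-to-element mapping (a small, illustrative subset).
CROSSWALK = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "520": "description",
    "650": "subject",
}


def marc_to_dc(fields):
    """Map (tag, value) pairs to Dublin Core elements.

    Returns (dc, dropped): dc is a dict of element -> list of values,
    dropped is the list of tags that had no Dublin Core home.
    """
    dc = {}
    dropped = []
    for tag, value in fields:
        element = CROSSWALK.get(tag)
        if element is None:
            dropped.append(tag)
            continue
        dc.setdefault(element, []).append(value)
    return dc, dropped


# Invented record: the 008 fixed field and 040 cataloging source are lost.
record = [
    ("008", "fixed-field data"),
    ("040", "cataloging source"),
    ("100", "Example, Author"),
    ("245", "An example title"),
    ("650", "Zines"),
    ("650", "Libraries"),
]
dc, dropped = marc_to_dc(record)
print(dc)
print(dropped)
```

The dropped list is the interesting part: it is exactly the fixed field and cataloging source information discussed above.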

Metadata Extraction Tools

For this assignment I also played around with DC Dot, a simple metadata extractor that can be found at http://www.ukoln.ac.uk/metadata/dcdoc/. It was up and running when I first played with it but it seems to be down right now. I'm not sure what the deal is. I inputted Pitt's Library and Information Science program's website (http://www.ischool.pitt.edu/) and this is the information that I got:

<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/">
<meta name="DC.format" content="">
<meta name="DC.format" content="13347 bytes">
<meta name="DC.identifier" scheme="DCTERMS.URI" content="http://www.ischool.pitt.edu">

I am not sure where this information comes from other than the URL, which I inputted. I assume the format information is from the program loading the site and then sending back how many bytes of information the page contained. The general idea of metadata extraction is interesting but I wonder how useful it really is at this point (or at least this particular tool).

I entered several different URLs and couldn't get it to provide any more information. I wonder how much more information could be taken from other portions of the webpage. Could it just take from plain text? Could it judge the text size to get a title? Could it use the title on the top bar? Could it look at all the pages on a website, rather than just individual pages? Right now this just doesn't seem that useful. However, being able to extract metadata from webpages would obviously increase findability in the long run and would certainly be a better approach than expecting web designers (or just anyone making a webpage) to encode DC in their HTML. Someone will eventually have to organize this mess.
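Some of these questions are easy enough to poke at. As a rough sketch (and definitely not how DC Dot works internally, which I don't know), the Python below pulls the title from the "top bar" and any description meta tag out of a page, and counts its bytes. The sample page is invented and the DC element names I map to are my own choice.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collect the <title> text and any named <meta> tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract(html_text):
    """Return a dict of guessed Dublin Core values for one page."""
    parser = MetadataExtractor()
    parser.feed(html_text)
    parser.close()
    found = {"DC.format": f"{len(html_text.encode('utf-8'))} bytes"}
    if parser.title.strip():
        found["DC.title"] = parser.title.strip()
    if "description" in parser.meta:
        found["DC.description"] = parser.meta["description"]
    return found


# Invented page for illustration.
SAMPLE_PAGE = (
    "<html><head><title>Example Page</title>"
    '<meta name="description" content="A sample page.">'
    "</head><body><h1>Hello</h1></body></html>"
)
print(extract(SAMPLE_PAGE))
```

Guessing a title from text size or crawling a whole site would take a lot more than this, which is probably why simple tools stop at the page header.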

"A barrier to electronic resource cataloging is that many library professionals and information specialists continue to believe that cataloging web resources is a waste of time; it is better to make web pages (essentially webliographies or lists) because many of the web resources are too ephemeral to be included in the library catalog. However, new tools such as URL link checkers make the maintenance of metadata for web resources much simpler. It is more efficient to have users start with the library catalog as a single gateway to the universe of knowledge, no matter the format or type of information sought."

Anita S. Coleman's "From Cataloging to Metadata: Dublin Core Records for the Library Catalog"

Also, a simple URL checker to play with can be found HERE.
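A link checker of the kind Coleman mentions is not much code. Here is a minimal Python sketch using only the standard library. The function name and the `opener` parameter are my own (the parameter exists so the check can be tried out without touching the network), and this is not the tool linked above.

```python
import urllib.error
import urllib.request


def check_link(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url`, or None if unreachable.

    `opener` defaults to urllib's urlopen; pass a stand-in to test
    without making real requests.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and so on.
        return None


def dead_links(urls, opener=urllib.request.urlopen):
    """Return the URLs that did not come back with a 200 status."""
    return [url for url in urls if check_link(url, opener=opener) != 200]


# Usage (makes a real request, so it is left commented out):
# print(check_link("http://librariesandmetastuff.blogspot.com/"))
```

Run over the identifier field of every record in a catalog, something like `dead_links` would give a list of records needing attention, which is the maintenance Coleman is talking about.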

Friday, December 12, 2008

The Rice-Aron Library Zine Collection (2.0)

I'm sure I will post this link again when I post more about the process I went through in dealing with the zines but if you want to check out The Rice-Aron Library Zine Collection I built on LibraryThing you can go HERE. Oh, the above photo is the collection. It has only been there for about a year and was built from donations and funding from Town Meeting (Marlboro College is a democratic community and people can bring projects to be funded to be voted on by the community) so right now it consists of about 400+ zines. I believe that it has since grown but currently does not fall within the library's collection development budget and is entirely maintained by students (this had to be a consideration in my planning). The items that did not get cataloged included the run of Punk Planet (because it was decided that it would take more time than it was worth and could remain browsable), some more ephemeral items (music heavy review zines), and the health/sexuality collection (because it was promoted and listed elsewhere on the website, I believe). In total I scanned and cataloged 335 zines.

Top: Punk Planet, Ephemeral Music Zines, Sexual Health (all 3 are browsable, rather than cataloged). Middle: Personal zines (and a really great collection). Bottom: Poetry and Fiction zines and art and comic zines.

NOTE: Each zine was stamped "DO NOT REMOVE FROM LIBRARY" and had a sticker. The stickers corresponded to the subject heading the zine was tagged with in LibraryThing (tags with a "*" were the subject headings, although the fact they were used to maintain shelf order and browsability at the shelf level made them actually more like classification numbers. Oh well.) The stickers also corresponded to the box where it should be shelved. All students had to do was match stickers. Simple.

Top: Collection Rules and Shelving Instructions. Bottom: How to Use LibraryThing. I made these signs. They looked better in person.

LibraryThing Instructions (again) and To Be Repaired Box.

Top: Political and History zines. Middle: Foreign Language zines, zines related to Marlboro College (like the zine Meg Mott, my Political Theory professor, wrote throughout the 90s. Awesome!), and Bikes/Animal Rights zines. Bottom: Prison zines (about prisons, about political prisoners, or by prisoners. maybe also some history zines about radical prison groups) and D.I.Y. zines.

MORE TO COME...

Marlboro College Rice-Aron Library Summer Internship 2K8

This summer I returned to Marlboro College to complete a short cataloging internship where I pretty much got to take over their new zine collection, come up with a way to organize it, and make records accessible online somewhere other than in the OPAC. There was some talk in the library about working with LibraryThing for Libraries sometime in the future (god knows the OPAC could use the face lift) and the library also seemed to be getting into integrating web 2.0/social networking, so I opted to catalog the zines using LibraryThing. It made more sense than working with a spreadsheet, building a database, building a webpage from scratch that fit with the school's standard layout (or in the program they use to manage their website), or any of the other options that had been thrown out. In the next day or so (most likely tomorrow) I will post about working with LibraryThing as a sort of additional online space for a small academic library with a limited web presence, cataloging ephemeral materials using LibraryThing, and some of the issues that arose during the process.

In the meantime I am going to post some quick photos from that trip because, well, I want to.

This is where I left from in Pittsburgh. Doesn't that look nice?

Here's the white vibe. Packed and ready for the 12 hour drive to NH and 2.5 to VT.

Here is the little house I shared with some people I went to college with while working at Marlboro College. They liked to talk about the ecological disasters, hating Bush, comic books, and guns. I counted seven spray painted Anarchy logos while I was in Brattleboro. Awesome...

Here is the little room I stayed in. I brought too much stuff with me but it is sort of hard to plan for a month. I am not sure I watched a single DVD I brought with me.

Here is the front of the Rice-Aron Library (The Rice portion, I believe). That balcony on the second floor is pretty much for smoking and drinking while writing papers in the library late at night (The Rice-Aron Library is open 24 hours a day).

Here is the front of the Aron portion of the library. I worked in this section. Actually, I also had some Political Theory courses in the classroom on the third floor (top window on the right). Looks nice, doesn't it?

MORE TO FOLLOW...

CONTENTdm

I did some group work this past semester on CONTENTdm and thought I would post a little write up about the experience. Pieces of this were taken from a larger group final paper on CONTENTdm, Dublin Core, and VRA Core. There will be more of these types of things coming in the near future.

HOW DOES CONTENTdm WORK?

Only administrative accounts are able to create new collections in CONTENTdm. Similarly, only CONTENTdm administrators can add, delete, and modify user accounts, index metadata, make items searchable within the collection, implement and manage controlled vocabularies, and approve items and projects into the collection. One or two positions in larger organizations would be responsible for managing workflow and providing quality control for content and metadata. The content and metadata themselves would be provided by other catalogers or metadata librarians with standard CONTENTdm user accounts.

Standard accounts allow users to upload content, provide metadata, and edit some search, display, and formatting options for collections. These features will be further explained below and are mostly concerned with rights management and structural metadata. Users with standard accounts upload content into projects, where they are stored and organized until they receive approval from the organization’s CONTENTdm administrators and are added to the collection. These items are sent to a queue in the administrative side of the CONTENTdm acquisition station for the metadata to be checked for accuracy, completeness, and quality.

Content and metadata records can be uploaded in bulk to make larger projects more efficient and so that digitization tasks can be divided throughout a whole department. Once items are uploaded into a project, the metadata can be assigned item by item or applied to multiple items using spreadsheet-like features. This series of screens would seem familiar to anyone accustomed to editing information about songs and albums in iTunes. The spreadsheet view also looks a great deal like the social networking website LibraryThing and the two work quite similarly. Users only need to double-click on boxes to complete or edit metadata fields and there is an array of shortcuts to aid this process across multiple items.
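The idea of applying one value to many items at once can be sketched in a few lines of Python. This is only an illustration of the spreadsheet-style "fill down" idea, not how CONTENTdm implements it. The `fill_field` function, the field names, and the sample records are my own assumptions.

```python
def fill_field(records, field_name, value, overwrite=False):
    """Apply one value to a field across many records, like filling a column.

    Existing non-empty values are kept unless overwrite is True.
    Returns the number of records that were changed.
    """
    changed = 0
    for record in records:
        if overwrite or not record.get(field_name):
            record[field_name] = value
            changed += 1
    return changed


# Hypothetical records, for illustration only.
records = [
    {"title": "Library front", "rights": ""},
    {"title": "Library balcony", "rights": "Public domain"},
    {"title": "Third floor classroom"},
]
changed = fill_field(records, "rights", "Copyright held by the library")
```

Here the two records with no rights statement are filled in and the one that already has a value is left alone, which is roughly the behavior I would want from a bulk edit.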

Controlled vocabularies help maintain standards and consistency and allow for faster metadata entry. If the CONTENTdm collection has a controlled vocabulary associated with it then the users cannot add terms to their metadata records not already contained within that vocabulary. This vocabulary can be pre-established, like the Library of Congress Thesaurus for Graphic Materials, or it can be built from scratch and imported into CONTENTdm. A vocabulary needs to be a list of approved terms saved as a simple text document with only one term per line. However, a pre-established vocabulary can also be modified within CONTENTdm. Terms can be added and removed to suit a particular organization’s information needs. Users can submit terms for approval by the administrator as they upload content and records to their project. Controlled vocabularies can also be built automatically from records that have already been created and indexed. This vocabulary can then be applied across collections and shared with other institutions using CONTENTdm.
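Since the vocabulary file is just plain text with one term per line, the basic check is easy to sketch in Python. Again, this is my own illustration and not anything from CONTENTdm itself. The function names and the sample terms are hypothetical, and I am assuming that terms must match exactly.

```python
def load_vocabulary(text):
    """Parse a vocabulary saved as plain text, one approved term per line."""
    terms = set()
    for line in text.splitlines():
        term = line.strip()
        if term:                      # skip blank lines
            terms.add(term)
    return terms


def check_terms(record_terms, vocabulary):
    """Split a record's terms into accepted ones and ones needing approval."""
    accepted = [t for t in record_terms if t in vocabulary]
    pending = [t for t in record_terms if t not in vocabulary]
    return accepted, pending


# Hypothetical vocabulary file contents, for illustration only.
VOCAB_TEXT = """Libraries
Reading rooms
Balconies
"""

vocabulary = load_vocabulary(VOCAB_TEXT)
accepted, pending = check_terms(["Libraries", "Smoking lounges"], vocabulary)
```

In this sketch, anything that lands in `pending` stands in for a term a user would have to submit to the administrator before it could be used in a record.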

CONTENTdm works with any file format, including audio and video and other file types corresponding to plugins and applications working with the user’s web browser. Options available for audio and video collections include the ability to divide files into segments and to assign metadata, structural and descriptive, to each, while still allowing them to be retrieved as a whole. CONTENTdm will assign images to these file types or provide the option to upload a representative image.

In image-based collections there are three ways to indicate copyright information directly on items within the collection. The first is banding, which adds a color band and text to the bottom of images. The second is branding, which puts a small image or logo in the lower right hand corner of all images in the collection. The third is watermarking, which embeds an image into the center of each image in the collection to indicate ownership and copyright. The original image is retained within the system and the watermarked version is displayed for the user.

When working with text-based images, like PDFs or scanned documents, collections can be indexed by the item or by the page. Full text is automatically extracted from born-digital PDFs and included in the full text metadata field in the item’s record. A thumbnail image is also automatically generated to represent the document in the collection. If the PDF or text-based image is not born digital then the item can be run through optical character recognition software and then that text can be included in the full text field for the item.

MY/THE GROUP'S EXPERIENCE WITH CONTENTdm

According to the documentation provided by OCLC, our academic CONTENTdm demo setup was supposed to already contain collections built specifically for several different metadata standards (varieties of Dublin Core and VRA) and designed for different types of content. However, this wasn’t the case and we were unable to reconfigure the existing collections to suit our needs or to create new collections that all users in our group could access.

Many of the collections that we found already contained a great deal of content but we were unable to access much of it. Most of what we could access could not be deleted or modified. Our CONTENTdm group continually had issues with access, which eventually led to the early termination of the project in favor of a research-based project on metadata, Dublin Core, VRA, and CONTENTdm. It was assumed that there was simply a miscommunication between professors and OCLC because the problems our group encountered were clearly unexpected.

We found that a project of this nature required a full time tech support person to put together. As group leader and faculty liaison, this became my role. While some issues were eventually sorted out (everyone was finally able to install the acquisition station, set up accounts, log in, begin uploading items, and start testing CONTENTdm), more arose. The fact that we were unable to continue with the project as initially planned was partially because of the complications involved in providing virtual technical support to students collaborating from across the country.

Some of the specific problems encountered by the group included: projects disappearing after being created; members only being able to access one collection; members being unable to view what had been uploaded, approved and indexed; members having access to entirely different collections; and members having issues running the Acquisition Station on non-Windows machines. This last issue of only providing software for PCs seemed unfortunate considering the growing number of people switching from PC to Linux and Apple, especially those dealing with the manipulation of large image files. Perhaps these developments are still to come.

Despite claims of being scalable, flexible, and customizable, the main issues we encountered with CONTENTdm concerned the way that it seems to only allow for specific workflows. This could be because the group was only working with an academic demo or it could simply be the nature of turnkey library software. However, the group found that CONTENTdm worked best when digitization labor was divided into specific compartmentalized tasks and roles. These roles end up being manifested in CONTENTdm in a rigid hierarchy of access and privileges that could easily become a hindrance for certain projects (namely smaller projects with only a few librarians involved or projects with a workflow based on collaboration and with more fluid or modular roles). From our experience it did not appear that CONTENTdm was particularly adaptable.

We were never able to find a way to get around the division between administrative and standard user roles and privileges. That being said, we were able to use the administrative password to fill all roles, uploading and approving. It is possible to create multiple administrative passwords from the server side of things (which we did not have access to in this demo version), and it is also conceivable that a smaller library would have one or two administrators covering the entire workflow. However, it does appear that this setup would be less than seamless and might become a bit tiresome down the line.

It would have been great to see how CONTENTdm works with OCLC Connexion and WorldCat, how it works on a consortial level, and to get a better sense of its use as an institutional repository, as these areas seem as though they could be its strengths. While several members of the group expressed some pretty strong negative feelings about CONTENTdm, I would have to say that it is difficult to get a real sense of how it would work for different types of projects from a demo version. It certainly seems to have its weaknesses (I have doubts about its adaptability and it seems much less intuitive and user-friendly than OCLC’s advertising would have us believe), but it might very well be the best option available to libraries in need of a turnkey solution requiring minimum in-house technical support.
