Finding Corporate Knowledge: Three Case Studies

October 21, 2008

Deb Hunt presented this session. The first two case studies were for corporate environments (an environmental engineering firm and an architectural/design firm), and the third was for a non-profit (The Exploratorium). WiFi went up and down throughout the session.

Universal challenges:

  • Too many information silos
  • Dirty data
  • No metadata, classification or taxonomy in place
  • Differing needs of different groups
  • Multiple physical locations with differing protocols for storing information
  • Loss of intellectual capital when people leave
  • [One other thing I didn't capture in time, but the presentation is available online]

The environmental firm had a traditional library of externally published documents (both print and digital) and five offices in California, with plans to expand to Nevada and Oregon. Security of proprietary information was key.

Deb interviewed the staff to understand the information-seeking behaviors of different groups and levels and what issues those groups and levels had, then presented at the annual all-staff meeting. She only spoke for a few minutes; the majority of the presentation was staff people talking. Two people had created their own databases, yet nobody else knew about them. One person had a LexisNexis subscription which nobody else knew about.

She used a number of sources to research appropriate systems (Capterra’s Library Automation Software Finder, Marshall Breeding’s Library Technology Guides, an article in the October issue of Computers in Libraries) and created an RFI and spreadsheet for vendors to fill out. She narrowed the candidates down from 12 to 3; only one candidate allowed the company to retain proprietary documents in its system. The firm is using a cataloger to catalog print and digital items (with items prioritized), has instituted a Dumpster Day, and is redesigning the intranet and marketing it to staff. The firm will continue to train staff, market the portal and get buy-in for contributing documents to be catalogued. Education is still needed (not everything is digital).

The architectural/design firm has six U.S. offices and one in Asia. The staff is young and uses Google to find images and information (despite the firm’s already owning similar information). There were no information professionals on staff, lots of silos and a badly designed intranet which nobody used, and each office had its own culture for sharing and maintaining documents.

The CEO wanted a simple “Google-like solution” and hated the end-user search in Canto Cumulus’ image database. He had a very naive view of searching. She got advice not to use Google Search Appliance (not cheap, hard to get support) and brought that to the CEO. She knew she was looking for enterprise search solutions, and identified 43 possibilities from Capterra’s main enterprise software directory and input from colleagues. After sending an RFI to the 43 vendors, she narrowed the set to 19, and then to 8 solutions in 3 tiers. The number one choice was a partial solution, as it only handled project management, but it did have the architectural/design focus.

Deb recommended they hire a librarian/information professional. The firm wanted to hire another IT person, but she talked them out of it. She wrote a job description and had it posted to multiple job sites, and the firm hired a recent graduate of San Jose State University’s SLIS program who had worked part-time at an architectural firm for 10 years. The firm is on the verge of implementing a solution, despite the CEO initially not having much faith in finding a good solution the first go-round.

They are looking at open-source content management systems and portals.

The Exploratorium was an early adopter of the Web and has 577,000 in-person visitors/year. They began looking into knowledge management in 2003. Challenges included content in multiple formats (including print, images in both digital and print, Hi8 video, VHS, U-Matic, audio cassettes, audio reel-to-reel), some of which no longer had players in production. Content was also in multiple places. The digital asset archive has a clunky end-user search, but staff use Canto Cumulus (both a client and an internal database). Staffing is 1.8 FTEs in media archiving, 0.45 FTEs in knowledge management and 2.75 FTEs in the Learning Commons. The Exploratorium is still struggling with enterprise search, but the intranet is now the main source for information internally and is well-marketed.

Key takeaway - there is no one-size-fits-all solution.

10/25/08 UPDATE: Deb has her presentation up (direct link to PowerPoint).


Danny Sullivan’s Keynote

October 21, 2008

As he did last Internet Librarian, Danny Sullivan talked about trends in search and what he saw as future developments. Before that, Cindy Shamel presented the AIIP Technology Award to 10KWizard (and Marie Varelas from my local chapter of SLA accepted the award).

Danny sees Google as ruling the search roost for at least the next five years, and being the dominant tech company generally. He talked about the transition of tech leadership from Microsoft to Google, similar to the transition from IBM to Microsoft. Danny gave the example of Cuil.com as a failed challenger, along with Powerset and Microsoft itself.

Google has a 60%-70% share in the U.S., and it’s higher in many other countries. An 80-90% share of website traffic comes from Google.

But there are some areas other search engines handle better, and they might get more attention.

  • Summize searches Twitter in real-time.
  • Urbanspoon is a mobile search engine which randomly selects restaurants near you (using cellphone tower triangulation). A similar tool is Chowhound.
  • For events, Eventful and Upcoming (I wondered about microformats and whether Google would be able to drive adoption).
  • Yelp offers local reviews - Google Maps is trying to grow the equivalent.
  • Trulia and Zillow for real estate information.
  • Kayak is a multi-site travel search engine.
  • Farecast offers predictions of fare prices.
  • Craigslist for buying/selling items, personals. Google tries to compete with Google Base, but it’s not there yet: it looks pretty, but the information isn’t there.

And there are even more options out there. It’s hard to remember all the smaller search engines and what they do.

Yahoo! innovates (mobile search, BOSS, SearchMonkey) but its future is uncertain.

Microsoft is focused more on ads than search, is exclusively focused on consumer search and has major branding problems. There is some good stuff, but will people notice and will Microsoft search grow?

Google’s master plan? Look at Danny’s article on the Google hive mind. Google might become more focused due to the current economic problems, and of course there will be ads everywhere.

Other Google search initiatives:

  • Google Video has been a video meta-search for about a year.
  • Universal search mixing of results continues.
  • Google Trends continues to grow and provides data on website traffic now.
  • Community editing on maps is growing, but so is map spam.
  • Google Blog Search clusters stories and identifies top headlines.

Google is customizing searches based on geographic location, previous queries, Web history. It’s done some of this for a while, but it’s being more explicit about this and will likely be doing more customization going forward.

Mobile and vertical search do offer new opportunities for search, but generally you’ll have Google dominance in search with Microsoft a very distant number two.


Improving Navigation and Findability

October 20, 2008

Tom Reamy, who presented on folksonomies and tagging last year, led this session. Since it’s in the San Carlos Ballroom, back to no Wi-Fi access. Then again, from what I’m hearing, even if there’s Internet access in a given room it’s spotty at best.

His talk is about integrating semantics, taxonomy and faceted navigation, with a look at what works and what doesn’t for media sites.

Facet - Orthogonal dimension of metadata
Taxonomy - Aboutness of documents
Ontology - Relationship between entities and facts
Software - Text analytics and auto-categorization
People - Tagging and evaluation of tags, fine-tuning rules and taxonomies, social tagging and suggestions

Facets are metadata attributes (e.g., people, location) and are not identical to categories (which are limited in number and involve aboutness). By orthogonal Reamy means mutual exclusivity: an event is not a person, is not a document, is not a place. Facets have a variety of units and structures, and are designed to be used in combination.

Faceted navigation is more intuitive to end-users and allows for dynamic selection of categories (rather than forcing the user to go down a single path to find information). It involves fewer elements (4 facets of 10 nodes each can yield the equivalent of a 10,000-node taxonomy), and it’s flexible and easier to maintain.
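To make the arithmetic concrete, here is a minimal Python sketch. The facet names and values are made up for illustration and are not from Reamy's slides; the point is only that an editor maintains 4 × 10 = 40 nodes while users can reach 10^4 = 10,000 combinations.

```python
from itertools import product

# Hypothetical facets, each with 10 values (names are illustrative only).
facets = {
    "people": [f"person_{i}" for i in range(10)],
    "location": [f"place_{i}" for i in range(10)],
    "date": [f"year_{i}" for i in range(10)],
    "type": [f"type_{i}" for i in range(10)],
}

# Nodes someone has to maintain: one per facet value.
maintained_nodes = sum(len(values) for values in facets.values())

# Combinations a user can reach by picking one value from each facet.
combinations = list(product(*facets.values()))

print(maintained_nodes)   # 40
print(len(combinations))  # 10000
```

A single-path taxonomy covering the same ground would need a node for every one of those combinations, which is where the maintenance savings come from.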

Taxonomies deal with semantics (meaning, aboutness) and documents and are complementary to facets. Taxonomies support multiple meanings and purposes, and can be relatively small if combined with facets. Formal taxonomies (is-a-kind-of, is-a-part-of) work better than broader classifications.

Ontologies deal with relationships between entities - e.g., Vice Presidents have employees and bosses. They can be represented with XML, RDF, OWL, inference rules.

Best approach is dynamic search and browse. Reamy gave the panda, monkey, banana example (which two terms go together best) which I seem to recall Dave Snowden using at KM World last year.

Sample sites:

  • Wine shop (pure facets - uses sliders, which is increasingly common)
  • Search engine (Source facet - news, video, pictures, Web - with a few selection filters)
  • CNNMoney.com (Source facet again, doesn’t allow for detailed drilldown, too many sponsored results)
  • Search engine (includes Source and Date facets)
  • New York Times (uses semantic technologies to suggest search terms and to offer related stories, has a Source facet identifying section of the ‘newspaper,’ but semantic technologies have issues - Obituaries is one section Obama appears in)
  • Forbes.com (chaotic interface)
  • Factiva (true faceted navigation - uses Source facet but has sub-categories, graph showing number of documents in a given date range, clusters of co-occurring terms displayed in a tag cloud, multiple traditional facets such as company but also a Subject facet)
  • Financial Times (limited number of results, but auto-summarization. Standard facets but also taxonomic elements - Topics)

Common themes:

  • Balance of commerce and information
  • Basic facets - Source and Type
  • Standard (People, Companies, Place, Industry)
  • Interactive interface (sliders, date ranges)
  • Keywords vs. simple taxonomy
  • Tag clouds/clusters - how genuinely useful are they? Reamy hints the answer is not very, but doesn’t go into details.

Common issues:

  • Advertiser dominance
  • Auto-ads
  • Non-orthogonal facets (Topics and Issues for one client)
  • One or two filters (don’t provide enough intersections)
  • Semantic component is still the hardest
  • Good information architecture - Summary or full facet display? Simplicity vs. research power

Design issues:

What is the right combination of elements? What is the right balance of elements (Are all facets treated equally?) When should elements be combined - before or after search?

Tools and approach:

Text analytics software extracts entities and noun phrases from a document or set of documents, and can be designed to feed facets, signature, ontologies.

Auto-categorization software feeds subject facets, can be ‘taught’ using training sets, a set of terms (with literal strings, stemming and a dictionary of related terms), simple rules such as position in text, saved search queries, Boolean. Advanced features can include fact extraction, sentiment analysis.

Entity extraction can be dictionary-based or rules-based. A collection of entities can be the aboutness of a document.
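As a rough illustration of the dictionary-based approach, here is a small Python sketch. The dictionary contents, the facet labels and the `extract_entities` function are all hypothetical; commercial text analytics tools are far more sophisticated (stemming, disambiguation, rules).

```python
import re

# Hypothetical dictionary mapping known entity names to facet types.
ENTITY_DICTIONARY = {
    "Google": "Company",
    "Microsoft": "Company",
    "Monterey": "Place",
    "Danny Sullivan": "Person",
}

def extract_entities(text):
    """Return sorted (entity, facet) pairs for dictionary entries found in text."""
    found = set()
    for name, facet in ENTITY_DICTIONARY.items():
        # Word boundaries avoid matching inside longer words.
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.add((name, facet))
    return sorted(found)

sample = "Danny Sullivan spoke in Monterey about Google."
entities = extract_entities(sample)
print(entities)
```

The extracted pairs could then be stored as facet values for the document, which is one way a collection of entities ends up describing what a document is about.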

Documents are more complicated than products! Facets can’t be an add-on. There has been some progress on semantics. Future of search will be smarter ways to refine results, not better relevance.

When do you add metadata and how? Depends on the environment. Using content management in the enterprise one can balance taggers, results from software and company policies. Software can suggest categorization and facet values. Relevance is best based on ontologies.

10/27/08 UPDATE: Tom’s presentation is up.


Practical Guide to a User-Focused Digital Library

October 20, 2008

Sophia Guevara and Qin Zhu presented this session. As Jane Dysart had announced before the keynote, WiFi was available in every room except the San Carlos Ballroom (where the keynote was). So back to liveblogging.

There are many definitions of digital libraries, so they suggested starting with an understanding of a given library’s users, an understanding of what the goals are of the organization and of the users and how the library could align its services with those goals, and an understanding of digital content and the digital collection.

Physical and space constraints, changing user information needs and changing information behavior may all drive digital libraries. Digital content may include e-journals, e-books, e-reference, image collections, digital audio and video and electronic databases.

The lifecycle of digital content, as expressed in a 2005 article written by Tamar Sadeh and Mark Ellingsen (available as a PDF):

  • Discovery
  • Trial
  • Selection
  • Acquisition
  • Access
  • Renewal or cancellation

Questions to ask include bibliographic details, terms of access and pricing. Analysis of content - granularity, dates of coverage, subject focus and overlap with existing sources.

Content analysis resources:

Assuming everything checks out, user feedback is positive and a vendor is found that can provide an option for filling a given information need, some questions to consider:

  • Can content be purchased a la carte or only as part of a package?
  • Can a better deal be negotiated by agreeing to a multi-year purchase?
  • What kind of access is available (per-seat, per-site, etc.)?
  • Try to negotiate out auto-renewal clauses

Licensing 101:

  • Contract template available at Yale’s LibLicense site
  • Connect with procurement and legal departments and agree on what will and will not be acceptable
  • Understand access and pricing options
  • Be flexible

Resources:

  • LibLicense site
  • two other things I didn’t catch because the presenters went too fast

How will you provide access to users? Be prepared to complete routine maintenance and encourage feedback from users.

There are different methods for content authentication - IP authentication, URL referral, username and password. There may be a single sign-on for your institution, the Athens access management system or a proxy server.

You might want to provide several different information access points - A to Z lists, pathfinders, subject guides, journal/e-journal lists, your library catalog, RSS feeds, etc.

Is content searchable at the top level and as full-text? Is it findable at the publication level? How will you integrate it and aggregate it with content from different providers? Journal lists? Library catalogs? Federated search or alerts? RSS feeds?

Electronic Resource Management Systems, SUSHI or COUNTER can help collect usage statistics. Measurements include site visits, page visits, time spent on site, downloads, etc. Study your log files, understand user behavior and how they access content.

Criteria for deciding whether or not you will renew a resource:

  • High cost per use ratio? If so, does another vendor offer the resource at a better cost/use ratio?
  • What value is coming from price increases?
  • How frequently do users have problems with the product and how does the vendor respond to problems?
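The cost-per-use comparison in the first criterion is simple division. Here is a minimal Python sketch; the dollar amounts and usage counts are invented for illustration, and `cost_per_use` is a hypothetical helper, not something the presenters showed.

```python
def cost_per_use(annual_cost, uses):
    """Return cost per use, or None when there was no recorded use."""
    if uses == 0:
        return None
    return annual_cost / uses

# Hypothetical figures for the same resource from two vendors.
current = cost_per_use(12000.0, 800)
alternative = cost_per_use(9000.0, 800)

print(current)      # 15.0
print(alternative)  # 11.25
```

The usage count would come from whatever statistics you collect (COUNTER reports, log files), so the ratio is only as good as those numbers.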

Questions:

  • At what point do you decide keeping print resources is a better deal than trying to choose between different electronic resources with multiple interfaces?
  • What do you do if you don’t have the final say on purchase decisions in your organization?

Apologies if I missed anything; the speakers went very fast. Hopefully at least some of the missing content will be available whenever the slides are posted.


Howard Rheingold Keynote

October 20, 2008

Howard Rheingold spoke on “Communities & Communication in a Social World.” Lawson and I got there early enough to grab seats near the front. Equally importantly, the seats were by a power strip.

I took a photo of Howard’s massively-stickered laptop before things got started - Lawson was smart and actually went up to the podium to take his photo.

Through black magic he also took a photo of me which I actually like.

Howard sees the increasing importance of cooperative arrangements and complex interdependencies. He talked about some of the things which led him to this conclusion. Several examples were from Smart Mobs, but he also talked about the Spanish protests against their government blaming the Madrid bombings on ETA and the Penguins’ Revolution in Chile. But there were also negative examples - the SMS-organized protests in Syria and Egypt against Denmark because of the Mohammed cartoons, the racist riots in Australia, the riots in Nigeria around the Miss World contest.

Howard traced the development of social forms based on collective action.

  • Prehistoric hunting strategies (family groups and allies)
  • The development of cities
  • Alphabetic writing (limited to an elite)
  • Development of the printing press in Europe (mass audiences)

This is still going on - new kinds of cooperation and wealth creation (e.g., IBM’s embrace of open source, Eli Lilly’s opening up of research problems to the wider community, Google opening up its ad network, Amazon.com opening up its API). Howard thinks this will definitely spread beyond the computer industry.

ThinkCycle, a nonprofit means of tapping the world’s design student community, came up with a vastly improved hydration system. SETI@Home is the most powerful computer in the world (40 teraflops). There have been emergent collective responses to disaster: the Asian tsunami blog for organizing aid and the Katrina people finder wiki. In the effort to find Jim Gray’s sailboat using photographs from Google and NASA, Microsoft and Amazon.com put up sections of the images on Amazon.com’s Mechanical Turk, and 12,000 volunteers looked through them.

Barriers to participation continue to be lowered. Technologies which enable cooperative and sharing economies are easy to use, enable connections and group formation, are open, are self-instructing and leverage individual self-interest.

Participatory media are social media which open up knowledge and wealth creation. There is nothing innate in knowing how to apply skills to education, politics, etc. Education must recognize a new way of learning and teaching, not just tack on new skills to an existing curriculum. Howard and others are developing curricula, syllabi and teaching notes as well as a social media classroom and a community of practitioners.

Don’t try to keep up with the technologies, keep up with the literacies.

http://socialmediaclassroom.com, or contact Howard at his first name at his last name dot com.


Open Source CMS for Libraries

October 20, 2008

Karen Coombs and Jason Griffey (standing in for Amanda Hollister who had bronchitis) presented on Open Source CMS for Libraries in the same room as Web Services for Libraries. Since I’m thinking about setting up a website or blog at stevenkaye.info and at some point will be working with Alfresco, I figured the workshop would be good experience to learn about setting up CMSes generally. About twenty people showed up. Karen noted that the slides and handouts would be available online.

Karen led with the background of herself, Jason and Amanda (different sizes of libraries and websites), then followed with a definition of content management systems (CMS) and why people would want to use content management systems. Karen and Jason covered Drupal, Joomla and WordPress. Amanda noted that she had created a backup of her site and promised she would give us her wireless login and let us ravage, er, modify her site.

Jason presented on Joomla. He gave examples of libraries and companies using Joomla and requirements for installing and running Joomla, then showed some screenshots. He discussed the organization of the Cortland Free Library site and organizing content in Joomla generally. In Joomla, sections are broad topics and categories are narrower topics (articles can only belong to one category). You can mix and match front-end organization while keeping a consistent organization on the back-end.

Next up were:

  • adding and editing content
  • working with templates
  • modifying pages
  • extensions
  • menus (Drupal and Joomla require content before building menus for that content)
  • managing modules (widgets)
  • favorite extensions

Joomla does not have a review process (i.e., in the default install items are either published or not published, no draft option). To have different page layouts within a site, you have to shift-click (otherwise Joomla will deselect everything except the item you’ve clicked on). Other issues with Joomla:

  • High learning curve
  • Inspired hatred at first
  • Many extensions are not updated to 1.5.x
  • Many templates are not updated to 1.5.x

Strengths of Joomla:

  • Exceptional flexibility
  • Good community support
  • Inspired love at the end

At the time Joomla was chosen (about two and a half years ago), WordPress did not have much content management functionality and Drupal templating was much more complex. Jason’s university as a whole is moving to Drupal, so Jason now has more support and Joomla will be abandoned in the next 12-18 months. Amanda is on the fence about Drupal vs. Joomla – she’s a one-person operation.

Karen is the Drupal and WordPress guru. She talked about Drupal last.

As Jason did for Joomla, Karen gave examples of libraries using WordPress and covered the requirements for installing and running it. WordPress is a blogging tool with CMS functionality – pages are outside the normal blog post sequence. Pages can be hierarchical and use different templates. By going to Settings → Reading → Front page, administrators can set a home page rather than the default of blog posts in reverse chronological order.

Theme questions (Theme editor under Design → Theme Editor):

  • What kind of banner?
  • Where should site navigation appear?
  • Number of columns?
  • Fixed or scalable layout?

The theme should be widget-enabled. Themes are customizable using widgets (Design → Widgets) and template & conditional tags. Administrators can define widgets as only displaying on specific pages. Conditional tags are functions – for example, a tag can test whether a specific page is the front page, whether or not a specific page template is being used, etc. Administrators can add custom fields to pages.

WordPress’ strengths:

  • Easy to use
  • Many plugins available
  • Easy to create new themes or modify existing themes
  • Large user base

WordPress issues:

  • Can’t easily create custom content types
  • Lacks flexibility to deal with complex types of objects with different types of fields
  • Customizing display of groups of pages or posts requires knowledge of PHP

For Drupal, Karen strongly recommended using PHP 5 or higher. She went through how to make a given page the home page and where to get themes, as well as modules to add (The same thing as components in Joomla – differing terminology for different CMSes). Blocks are the same as modules in Joomla or widgets in WordPress. All types of content are referred to as nodes, and can be pages, stories (informational content which changes, such as blog posts or press releases) as well as custom types such as events, images and links. URL Path Settings lets you create friendly URLs as well as grouping related content.

Administrators can create custom blocks from Administer → Site Building → Blocks → Add Block. Views are content filters based on defined criteria – think of them as easier-to-use SQL queries.

Menus can be primary links, secondary links (less-important content such as legal disclaimers and the like), navigation (This refers to Drupal administration – do not edit this menu) or custom menus. Administrators can create taxonomies consisting of categories, tags or both (hierarchical or free-form) with different taxonomies for different content types. Feed aggregator lets administrators create a block for each feed and allows administrators to embed feeds in nodes. Administrators must give Anonymous users permission to access feeds for them to be visible.

Image Gallery uses the Image module. Another useful module is CCK, which stands for Content Construction Kit. CCK lets administrators add fields to content types and to control the type of field added. Those fields can then be used in views, and administrators must grant Anonymous users permission to access those fields. Karen created Date and Time fields for the Events form and other fields for the Links form.

The explanation of Views was a bit rushed. Types of Views are page, block and feed. Views can display nodes or feeds, and include limiting criteria and sorting criteria.

Drupal’s strengths were its exceptional flexibility, the ease with which administrators can create new content types and a substantial user base (especially in libraries). However, Karen found the creation of dates in CCK buggy and problematic. Drupal also has a steep learning curve, and there are not as many modules and filters developed as one might like (especially true for library-related modules and filters).

Generally, Karen’s take on which CMS to choose:

  • WordPress for smaller sites, as it’s easy to get working
  • Joomla is better for medium-sized sites of average complexity
  • Drupal is best for multiple sites or sites with a high degree of complexity

She finished with a quick demo of the custom CMS that was developed for her library, which had drag-and-drop functionality and the ability to edit directly from within a page rather than switching between administration sections.

Questions:

  • Does Joomla have a staging environment?
  • Does Joomla support a review process?
  • How would you handle legacy systems migration to Joomla?
  • Is Gallery in WordPress an extension or an add-in?
  • Could you use the hosted version of WordPress as a CMS?
  • How would you handle a list of databases in Drupal?
  • Are you using Drupal for your intranet as well as for the library site?
  • Who developed the custom CMS University of Houston uses?

10/25/08 UPDATE: Karen put the slides and handout from the session up.


Web Services for Libraries

October 20, 2008

First, a mistake from the last entry. The “heavy on larger-scale IT projects” session I was fearing was Project Management in Practice, which is why I took Practical Project Management instead.

Today, I started taking photos I could use for the San Andreas SLA blog, which mostly consist of signs outside workshop rooms and not great photos of projected screens so far.

Web Services for Libraries was presented by Jason Clark and Karen Coombs, and was conveniently two doors down from Academic Libraries 2.0 so I could heckle Lawson and run away. If I were to do such a thing. Which would be wrong. Terribly, terribly wrong.

I was especially excited about this session to see if there were any lessons we could take away for code4you, as well as anything I could use for my employer.

Jason passed out a handout with lots of tips, links to data sources, code samples and sample request URLs. It’s also available on the Web at http://www.lib.montana.edu/~jason/talks.php. The session was absolutely packed - from what one attendee told me, preconference sessions are increasingly popular at Internet Librarian. 38 people attended. I’m trying to give a sense of what was talked about without going into excruciating detail, but these things are always difficult to judge. Jason and Karen, if there are any bits you’d like me to take down let me know.

Ideally, Jason would have liked a lab environment where we could work on code in real time. Most of the attendees had worked with XML, far fewer with JavaScript and the DOM, and only a handful with PHP.

Dorothea might be happy to know that sample queries were provided for the Open Archives Initiative and there were mentions of OpenDOAR and Repository66. Karen in particular was very passionate about repository issues and using APIs to deposit copies in multiple places (LOCKSS).

The first part of the presentation was about defining terms - web services, structured data, etc. Web services allow libraries to provide enhanced services to users (e.g., providing suggested tags from Flickr in their catalogues) without having to maintain additional resources themselves. Jason and Karen have mostly worked with consuming as opposed to providing and urge people to start with that to understand the structure and conventions of APIs.

ProgrammableWeb has an API directory on their site. Take the time to read the terms of service of a given API - linkbacks required, allowable number of requests, etc. Caching data is one way of getting around throttling, if it’s allowed. Don’t be afraid to ask questions of API providers.
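The caching idea can be sketched in a few lines of Python. This is my own illustration rather than anything shown in the workshop: `CachedFetcher`, `fake_fetch` and the example.org URL are hypothetical, and a real version would wrap an actual HTTP call and respect whatever the API's terms of service allow.

```python
import time

class CachedFetcher:
    """Cache responses per URL so repeated requests do not hit the API."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.time):
        self._fetch = fetch      # function that takes a URL and returns data
        self._ttl = ttl_seconds  # how long a cached response stays fresh
        self._clock = clock      # injectable so expiry can be tested
        self._cache = {}         # url -> (timestamp, data)

    def get(self, url):
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no request made
        data = self._fetch(url)
        self._cache[url] = (now, data)
        return data

# Stand-in for a real HTTP request, so the sketch runs offline.
calls = []

def fake_fetch(url):
    calls.append(url)
    return f"response for {url}"

fetcher = CachedFetcher(fake_fetch, ttl_seconds=60)
fetcher.get("http://example.org/api?q=libraries")
fetcher.get("http://example.org/api?q=libraries")
print(len(calls))  # 1
```

The second call is served from the cache, so only one request counts against the provider's limit.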

Types of web services - Jason and Karen favor REST over SOAP (complex, little uptake) and XML-RPC (primarily an updating protocol, little recent adoption).

Many REST-based APIs are not “pure” REST, in that they have query strings and such (verbs), because pure REST APIs can be difficult to construct.

Data from web services can be formatted in XML, JSON or HTML (e.g., getting HTML back to include in your website for a widget).

Jason showed TerraPod, which leverages Blip.tv’s API to let local library users post videos through the TerrapodUser account (which go through an approval process). Blip.tv was chosen because it had better video quality and less conspicuous branding, and it lets you post to the Internet Archive.

After a fifteen-minute break, the second half focused on examples of Web services in libraries and bibliographic services.

Since the WorldCat Search API was released fairly recently, we dove into that. It defines service levels, search formats and response formats.

Jason noted that Atom is gaining acceptance (in the context of OpenSearch returning results in RSS or in Atom).

OCLC has provided several URI evaluators which test requests sent to and responses received from the WorldCat Search Web Service. Karen highly recommended starting with OpenSearch as a search format rather than SRU, and not dealing with MARCXML if you don’t have to. SRU offers results in either Dublin Core or MARCXML, and allows you to limit holdings to a specific library using the library’s OCLC symbol, limit holdings by zip code, etc.

WorldCat Registry Search allows you to search for member libraries by a number of criteria - so a library could link back to its holdings for a given item.

xISBN identifies related ISBNs for a given book (different editions and formats).

WorldCat Identities links Library of Congress name authorities to information about their works (number of works, works by and about, genres, subject headings, alternate names, etc.). The Virtual International Authority File combines the name authorities of three national libraries (France, Germany, U.S.).

Other APIs Karen talked about:

  • LibraryThing’s APIs (including ThingISBN, JSON Books API)
  • Google Book APIs

Jason gave a number of workshop examples from Google, Amazon, Flickr and WorldCat and showed the Code & Files portion of his site (http://www.lib.montana.edu/~jason/files.php). He demonstrated not only how to get the data, but how to parse and display it. Then Karen finished up with a WorldCat Search API (OpenSearch) example. One thing to note is that you may (always?) have to define the namespaces you’re using. The examples shown combined JavaScript and PHP (for forms, loops, defining variables based on nodes, displaying results), but Karen did give a brief HTML example. She suggests working on each piece of a given script and getting it right before moving to the next one, and Jason reiterated starting with consuming data.
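To show what the namespace point looks like in practice, here is a small Python sketch (the workshop examples were in JavaScript and PHP, so this is my own translation of the idea, not their code). The XML is a made-up Atom-style response, not real WorldCat output; the only real detail is the Atom namespace URI.

```python
import xml.etree.ElementTree as ET

# Made-up Atom-style search response for illustration.
ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample search results</title>
  <entry>
    <title>Smart Mobs</title>
    <author><name>Rheingold, Howard</name></author>
  </entry>
  <entry>
    <title>The Virtual Community</title>
    <author><name>Rheingold, Howard</name></author>
  </entry>
</feed>"""

# Map a prefix of our choosing to the namespace used in the document.
NAMESPACES = {"atom": "http://www.w3.org/2005/Atom"}

root = ET.fromstring(ATOM_SAMPLE)

# Without the namespace, the lookup silently finds nothing.
unqualified = root.findall("entry")

# With the namespace defined, the entries are found.
titles = [
    entry.findtext("atom:title", namespaces=NAMESPACES)
    for entry in root.findall("atom:entry", NAMESPACES)
]

print(len(unqualified))  # 0
print(titles)            # ['Smart Mobs', 'The Virtual Community']
```

The empty result from the unqualified lookup is the kind of silent failure that makes it worth getting each piece of a script right before moving on.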

Questions from the audience at various times through the workshop:

  • When would you want JSON vs. XML?
  • Does the provider of the API determine the structure for XML, JSON, etc.?
  • Are terms of service for APIs long, in legalese, etc.?
  • Can you use the del.icio.us API to pull items without the del.icio.us bookmark applet?
  • What is Atom?
  • Can you use the WorldCat API to retrieve recently-added publications in a specific field from a specific library?
  • Does Flickr have a URI builder page for using their API?
  • Do you have to get an Amazon.com developer key?
  • How would I use these in libraries?
  • Are there any forums, listservs, chat rooms etc. for APIs?


Down in Monterey

October 19, 2008

I must apologize - this will not be liveblogging (at least, not this first entry). I’d forgotten the joys of WiFi and power outlet access at Internet Librarian, though I will have access to the Internet cafe for bloggers when that opens. I’m on the MSI Wind, so I’m trying the ScribeFire addon for Firefox as a blog editor for now.

Got in a few minutes late, walked to the hotel, dropped off my things and made my way to the conference center to pick up registration stuff. No free WiFi, and I neglected to print out the WiFi access suggestions from the wiki, which meant I couldn’t check to see if anyone had responded to my FriendFeed post about meeting up for lunch. So I went to Round Table Pizza, which was nearby and even had WiFi, but unfortunately it was secured.

About 1 PM (30 minutes before the workshops) they turned on the escalators. I was already upstairs, listening to Stephen Abram hold court. I thought I recognized Karen Schneider, Steve Cohen and Michael Sauers, but I’m not positive on that.

My first session was Practical Project Management, presented by Kathleen Cameron, Sadie Honey and Leslie Wolf. I was a bit nervous about this one - after all, I’m one of only two researchers, and the description sounded focused on larger scale IT projects. But it was a great, jargon-free discussion of basic project management concepts and how to train others on them. They even have a wiki with project management resources, including templates for project plans and descriptions.

Next up, Web Services for Libraries and Open Source CMS for Libraries.


Hanging it up

October 7, 2008

As my posting’s become ever more erratic, and the few posts I have done come from product announcements or other sources, I’ve decided it’s time to hang it up. I’ll be blogging from Internet Librarian, to vex Walt with my pernicious liveblogging if nothing else ;) , but after that, curtains.

Anyone want to take over the blog, make something interesting of it? You can either email me from the About page or comment here, and using my completely arbitrary criteria for deciding who gets it I’ll hand it over along with domain renewal for the next year.


No, really, this time the standard will get adopted!

September 23, 2008

Catching up on industry news, I found an interesting post on CMS Watch’s blog by Kas Thomas (and if you’re not following CMS Watch’s blog or subscribing to their monthly newsletter and you’re interested in content management systems, I highly recommend it).

EMC, IBM and Microsoft are pushing a standard for platform-independent repository exchange, CMIS (Content Management Interoperability Services) as an OASIS (Organization for the Advancement of Structured Information Standards) spec. Alfresco, OpenText, Oracle and SAP are also involved. As the post points out, this isn’t the first time there’s been a lot of hype about a proposed standard, but it will be interesting to watch. It will also be interesting to see how this plays with the vendors of blogging software.

Any reactions from content management pros reading this?