[il2007] Meta

October 31, 2007

I’ll be posting notes on the remaining sessions I attended tomorrow night. Tomorrow has the advantage of my being able to download pictures from my roommate Lee’s camera, which will hopefully make some of the posts more interesting. Also, maybe more presentations will be up by then.

Tonight I’m attending a final dinearound dinner on the topic of social computing - I’m curious what this actually means.

Tomorrow, I check out and see the sights of downtown Monterey while waiting for the bus back to San Jose, which will get there about 5:30 P.M. Lee says he should be able to pick me up.

Technorati Tags:


[il2007] Folksonomies and Tagging: Libraries & the Hive Mind

October 31, 2007

Tom Reamy from the Knowledge Architecture Professional Services (KAPS) Group led this session. No connection to the science fiction/fantasy author that I know of.

Themes

  • People categorizing their stuff on their own terms (Matt Haughey, Metafilter) - plus or minus
  • Obligatory snarky Andrew Keen quote in reaction to unnamed Web 2.0 evangelist
  • Yeats, “Second Coming”
  • Key is social mechanism for seeing other tags

Essentials of folksonomies

  • Advantages - very simple to use, lower cost of categorization (spread over large population), flexible (can respond quickly to changes), higher relevance because in user’s own terms, supports discovery (serendipitous browsing), object-neutral (can tag bookmarks, documents and photos), better than no tags at all?, gets people excited about metadata
  • Disadvantages - don’t work well for finding, no structure or conceptual relationships, lack of scale, highly personal (or popularity wins), errors
  • Dangers - the ‘unwisdom of crowds,’ the tyranny of the majority (popularity of tags beats quality, narrowing of choices, losing content), belief that hierarchy and taxonomy not needed

Will social networking make better folksonomies? Tom’s research suggests not:

  • A few tags dominate del.icio.us
  • Quality is not popularity
  • Most non-techies don’t tag, let alone re-tag
  • Despite following NISO guidelines, folksonomies do not seem to work to let other people find things
  • Regular people and infrequent users don’t do this stuff

Flickr facets analysis

  • Faceted navigation is extremely powerful and easy to use
  • 90% of Flickr content can be described by six basic facets - place, events, date, things/animals, color
  • Subject matter is less than 1% of tags
  • Works on lower level scales
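To make the faceted navigation idea a bit more concrete, here's a minimal Python sketch. The photo records and their values are made up for illustration; only the facet names echo the ones from the talk, and this is in no way how Flickr actually stores anything.

```python
# Hypothetical photo records; each facet holds a single value for simplicity.
photos = [
    {"id": 1, "place": "monterey", "event": "conference", "thing": "otter", "color": "brown"},
    {"id": 2, "place": "monterey", "event": "vacation", "thing": "boat", "color": "blue"},
    {"id": 3, "place": "san jose", "event": "conference", "thing": "laptop", "color": "silver"},
]

def filter_by_facets(items, **selected):
    """Keep only items matching every selected facet value."""
    return [item for item in items
            if all(item.get(facet) == value for facet, value in selected.items())]

def facet_counts(items, facet):
    """Count how many items carry each value of a facet (this drives the navigation menu)."""
    counts = {}
    for item in items:
        value = item.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts

in_monterey = filter_by_facets(photos, place="monterey")
print([p["id"] for p in in_monterey])      # ids of the photos tagged with that place
print(facet_counts(in_monterey, "event"))  # remaining choices for the event facet
```

The point of the second function is that each selection narrows the set and also tells the user which values are still worth clicking on.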

Del.icio.us tags analysis

  • Tags are subject matter, not facets
  • High level topics
  • Related terms are grouped by popularity, not by conceptual relations (that is, people who tagged something X also tagged it Y)
  • One functional facet turned up - howto, tutorial, toread, todo
  • Dominated by computer terms - searching on the design tag got 1 million computer-related results versus 3,909 on interior design
  • Again, the tyranny of the majority
  • Top 25 tags over time - basically the same set, with an order shift
  • Del.icio.us findability problems: too many hits, and no handling of plurals (singular preferred) or stemming (searching on blog turns up 1.7 million items, searching on blogs turns up 516,340; the contrast is more extreme for other terms)
  • Personal tags - not good for findability (e.g., cool, fun, funny) but interesting for social research
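The plural problem is easy to sketch in Python. This is a deliberately crude illustration, assuming a strip-the-trailing-s rule as a stand-in for real stemming; the function names are mine, and the two counts are the ones quoted in the notes above.

```python
from collections import Counter

def normalize_tag(tag):
    """Lowercase a tag and strip a trailing 's' (a crude stand-in for real stemming)."""
    tag = tag.strip().lower()
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag

def merge_counts(raw_counts):
    """Sum the counts of tags that normalize to the same form."""
    merged = Counter()
    for tag, count in raw_counts.items():
        merged[normalize_tag(tag)] += count
    return merged

# Counts quoted in the session notes above.
raw = {"blog": 1_700_000, "blogs": 516_340}
merged = merge_counts(raw)
print(merged["blog"])  # the two variants collapse into one total
```

A rule this naive will mangle words like "news", which is part of why a real stemmer or a controlled vocabulary is harder than it looks.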

So, how do we improve folksonomies as finding tools?

  • Add automatic facets to Flickr (one-time cost, some monitoring required)
  • Cluster tags (entity extraction, populate facets and subjects, types of relationships)
  • Add a broad general taxonomy of popular tags (tags as natural categories*, build up and down), start and evolve a simple two level taxonomy
  • Evolve the quality of tags and the emerging structure of tags (preferred term is the popular term, add mechanisms to rank tags, taggers and categories)
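As a rough starting point for clustering, here's a Python sketch of the co-occurrence grouping mentioned earlier (people who tagged something X also tagged it Y). The bookmark data is hypothetical and the function names are my own; real clustering or entity extraction would go well beyond pair counting.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bookmarks: each one is the set of tags applied to it.
bookmarks = [
    {"design", "css", "web"},
    {"design", "css"},
    {"design", "interior"},
    {"css", "web"},
]

def cooccurrence(tag_sets):
    """Count how often each unordered pair of tags appears on the same item."""
    pairs = Counter()
    for tags in tag_sets:
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs

def related(tag, pairs):
    """List the tags seen alongside `tag`, most frequent first."""
    scores = Counter()
    for (a, b), n in pairs.items():
        if a == tag:
            scores[b] += n
        elif b == tag:
            scores[a] += n
    return scores.most_common()

pairs = cooccurrence(bookmarks)
print(related("design", pairs))
```

Note that the result is still popularity-driven: the computer-related neighbor of "design" outranks "interior" simply because more items carry it, which is the tyranny-of-the-majority point again.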

Folksonomies and libraries:

  • Library catalog (he said library catalogs didn’t include tags, as I recall)
  • Internal service (I think - my scrawl was hard for even me to read)
  • Enterprise (KM) contributor

University of Pennsylvania is experimenting - PennTags

LibraryThing:

  • High level concepts, either too general or too specific
  • Variety of terms is an issue (cognitive science required 40 tags for an adequate coverage of the subject)
  • Strange tags (he cited “book” as a tag, less than helpful, or “my living room”)
  • Facets and topics
  • Redundant/inconsistent tags

It was about here that Tom realized he had way more material than time to present it - and as a result, he raced through the remaining slides and I’m not sure what he meant by much of this. I’ll add a link to his slides once they’re up.

What won’t work:

  • Trying to “improve” users
  • Social networking
  • Either/or approach - folksonomies OR LCSH (Library of Congress Subject Headings)

What might work:

  • Environment and dynamic social rules
  • Integrate evolving social (something) structure - people, technology, policies, procedures with feedback incorporated
  • Interpenetration of opposites (top-down, bottom-up and all over)
  • Reduce the folk aspects and add more -onomy (Wikipedia has 2,000 editors)
  • Increase the folk aspects - add discussion and context around tags

New library now?

  • Central (something) feedback system
  • Communities of practice
  • Technology - enterprise content management/knowledge management platforms for the enterprise, metadata, policies, tag clouds, more sophisticated displays
  • Ranking - either implicit or explicit, but rank tags, taggers, categories

He recommended starting with a formal taxonomy (I believe that’s right)

Library 2.0 is about social collaboration, not tagging

Folksonomies need to evolve

Tension between ease of use and findability

90% hype

Not much time for questions afterwards - I passed on my business card and asked if I could e-mail followup questions, and he said that would be fine. It turned out I was sitting next to a guy from CiteULike, and he noted he’d found only a 10% overlap between tags and formal subject descriptors.

I have lots of questions and pushback on points, but I’m interested in what the audience out there thinks.



[il2007] Closing Keynote - Gaming, Learning, & the Information World

October 31, 2007

Elizabeth Lawley presented on this, and as she notes on her blog Information Today took her joke that she’d appear in costume seriously and advertised that fact. So, here’s what she looks like (apologies for the blurriness of photos):

[Photos: IMG_0051.JPG, IMG_0048.JPG, IMG_0049.JPG]

That’s her character Maleficent’s outfit in World of Warcraft (WoW).

The focus of her talk was on ways to take the engaging elements of gaming and bring them to everyday life (and to point out the already existing parallels).

What makes games engaging isn’t especially surprising:

  • Understanding
  • Accomplishment
  • Progress
  • Acquisition
  • Communication

Actually, that might have been the important things to get in the first five minutes, now that I think of it. The important elements she talked about in terms of game design were:

  • Collecting (i.e., getting stuff)
  • Points (i.e., keeping score)
  • Feedback
  • Communications exchanges
  • Customization

She compared the first five minutes of World of Warcraft (building a character, the introduction which explains the backstory, nice design bits like exclamation points appearing over people’s heads or making sure you’ll survive the first combat) with Second Life (movement is unclear, no “goal” per se, no rewards or sense of accomplishment). And to be fair, she did point out there’s a lot of neat stuff in Second Life.

She pointed out that the lines between online games and offline social interaction are blurring - WoW guilds consisting solely of people who’ve met in real life, Joi Ito’s line about WoW as the new golf.

Liz showed Nick Yee’s 5 stages of MMO players and compared them to everyday jobs.

She also pointed to things like Budweiser’s Passively Massive Multiplayer Online Game, which turns web browsing into a game with quests and prizes and such; something developed for the Social Computing Symposium, which encouraged people to learn each other’s names and faces by rewarding them with contact info and points; and Chore Wars, which turns household chores into a roleplaying game.

I have some issues with the notion of making the corporate world even more like a game, and I would have liked to see some discussion of cooperative games, how to build humane games into household and work activities. Possibly I am turning into a mushy-headed feel-good type after two months in California, though.



[il2007] Alternative & Customized SEs

October 31, 2007

Mary Ellen Bates led this session.

Beyond the big 3.5

Exalead:

  • Great advanced search tools (including words starting with, phonetic spelling, approximate spelling, adjacent words goes up to 15 words apart, logical expression, regular expression for internal truncation and other uses)
  • Excellent tool for complex searching
  • Customizable front page; add shortcuts to other finding tools
  • Smaller index than Google or Yahoo

Clusty:

  • Meta-search engine (can give number of results by source and by site)
  • Clusters on the fly
  • Few advanced features
  • Can add topic clouds to your site from Clusty Cloud Creator (cloud.clusty.com)

Search engine beta sites:

Intelways:

  • Aggregates search engines and finding tools by category (general, images, video, news, social, files, reference, academic, business, tech, shop)
  • Allows for execution of same search in multiple search engines
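Running the same search in several engines mostly comes down to building one URL per engine. Here's a small Python sketch of that; the URL patterns and parameter names are my assumptions for illustration and may not match what any of these aggregators actually do.

```python
from urllib.parse import urlencode

# Assumed URL patterns for illustration; real engines may differ or change.
ENGINES = {
    "google": ("https://www.google.com/search", "q"),
    "yahoo": ("https://search.yahoo.com/search", "p"),
}

def build_search_urls(query, engines=ENGINES):
    """Return one search URL per engine for the same query string."""
    return {name: f"{base}?{urlencode({param: query})}"
            for name, (base, param) in engines.items()}

urls = build_search_urls("social computing")
for name, url in urls.items():
    print(name, url)
```

Using `urlencode` rather than string concatenation takes care of spaces and punctuation in the query.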

SRCHR:

  • Like Intelways, aggregates search engines and finding tools by category
  • Creates RSS feeds for search results
  • Mouse over search result for a preview

Scandoo:

  • Searches Google, Yahoo, MSN, Ask
  • Checks for malware, phishing, offensive content in real time (green/yellow/red)
  • Powered by corporate web security company

Slideshow

  • Images from Google, Yahoo or Live
  • Presents slideshow of images based on searching 3 search engines, can view images one at a time

FindSounds

  • Can identify similar sounds based on wave pattern
  • Categorization by file format, # of channels (mono/stereo), resolution, sample rate
  • E-mail this sound link

Custom search engines

  • Filter search, limit to certain domains or sites, or tweak the relevance ranking
  • Can put institution’s imprimatur on search results
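The "limit to certain domains" part can be sketched in a few lines of Python. This is only an illustration of the filtering idea, assuming you already have a list of result URLs; the sample results and domain list are hypothetical, and the hosted builders below do this on their own servers.

```python
from urllib.parse import urlparse

def in_allowed_domains(url, allowed):
    """True if the URL's host is one of the allowed domains or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)

def filter_results(urls, allowed):
    """Keep only results hosted on the allowed domains, preserving order."""
    return [u for u in urls if in_allowed_domains(u, allowed)]

# Hypothetical result list and list of trusted domains.
results = [
    "https://www.loc.gov/catalog",
    "https://example.com/spam",
    "https://library.example.edu/guides",
]
print(filter_results(results, ["loc.gov", "example.edu"]))
```

Matching on the host with a leading dot avoids letting a lookalike such as "evilloc.gov" slip through a plain suffix check.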

Yahoo Search Builder

  • Limit by domain, append key search terms, exclude sites or terms
  • Shows tag cloud of search terms
  • Very customizable (can include logo)
  • builder.search.yahoo.com

Google Co-op

Swicki

  • Collaborative approach to filtering
  • You supply initial key words
  • Learns from clickthroughs on search result pages and modifies relevance rankings
  • eurekster.com
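Here's a toy Python sketch of what learning from clickthroughs could look like: each click nudges a result's score upward before re-sorting. The scoring formula, the weight and the sample data are all assumptions of mine, not a description of how Swicki works.

```python
def rerank(results, clicks, weight=10):
    """Re-sort results by base score plus a bonus per recorded clickthrough."""
    def adjusted(item):
        url, base_score = item
        return base_score + weight * clicks.get(url, 0)
    return sorted(results, key=adjusted, reverse=True)

# Hypothetical (url, base relevance score) pairs and click counts.
results = [("a.example", 90), ("b.example", 80), ("c.example", 50)]
clicks = {"c.example": 5}

for url, score in rerank(results, clicks):
    print(url, score)
```

With no click data the original order is left alone, so the community's behavior only matters once people start using the thing.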

Rollyo

  • Maximum of 25 web sites
  • Best for getting list of authoritative sites
  • Can see others’ search rolls, such as KM-related topics at snurl.com/16x8o
  • rollyo.com

Gigablast

  • Fairly rudimentary
  • Specify domains to search (effective limit of 200 domains, search slows down after 50-75)
  • www.gigablast.com/cts.html

Questions:

  • Is there a directory of customized search engines?



[il2007] What’s Hot with RSS!

October 31, 2007

Steven Cohen, who gave this blog a pointer when I was just starting out, led this session.

Presentation is here.

He started with a brief history of RSS.

He’s really excited about Google Reader:

  • Lots of keyboard commands
  • E-mail from reader
  • Mobile version
  • Personalized trends (reading, subscriptions, etc.)
  • Subscription numbers are way off
  • Can search subscriptions (shared items, starred items, specific blogs, etc.)
  • Offline reading with Google Gears
  • Sharing of items (social bookmarks, can be subscribed to as an RSS feed)

Talked about Tumblr (can redirect to domains)

Vista has RSS on the desktop (as News), RSS built into Internet Explorer

Showed LibWorm custom search engine for librarian blogs

Inconsistencies in news alerts and news feeds

TechMeme (tech blog aggregator)

Page2RSS (creates RSS feeds for web pages without them)

OpenCongress (aggregates official government data, news and blog coverage)

Justia Dockets (searches Federal District Court filings, can specify court, party, lawsuit type, etc. and includes RSS feeds)

Justia Case Alerts (searches Federal District Court opinions and orders)

EBSCO (create alert for this search = RSS feed)

Library catalogs
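Since so many of these tools boil down to "here's an RSS feed of your search," a short Python sketch of reading one may help. The feed below is a tiny hand-written RSS 2.0 document with made-up titles and links; the function name is mine, and a real reader would also need to fetch the feed and cope with malformed XML.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document; titles and links are made up.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example alerts</title>
    <link>https://example.org/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>First result</title>
      <link>https://example.org/1</link>
    </item>
    <item>
      <title>Second result</title>
      <link>https://example.org/2</link>
    </item>
  </channel>
</rss>"""

def read_items(rss_text):
    """Return (title, link) pairs for every item in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in read_items(FEED):
    print(title, link)
```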

His top 10 favorite tools:



[il2007] Content Management Systems (CMSs)

October 31, 2007

Ruth Kneale, whom I’d met previously at the LSW meetup and one of the dinearounds, was one of the presenters. The other two, Amy Radermacher and May Chang, are university librarians at Concordia University and the University of Maryland respectively.

Ruth’s presentation (which she will post) was called “From Static to Dynamic: Choosing and Implementing a CMS.” She quickly went through the definition of a CMS, and explained why it was needed for her institution:

  • To avoid one person being the sole person who can administer and edit a site (bottleneck)
  • Ease of administration
  • Increasing team collaboration
  • Improving functionality
  • Improved website presentation (standard look-and-feel across site)

Her starting point was that they had no money (which made open-source the way to go) and any solution had to be LAMP (Linux, Apache, MySQL and PHP).

Must-haves:

  • Audit trail
  • Content approval
  • WYSIWYG editor
  • Granular privileges
  • Friendly URLs
  • Versioning
  • Content reuse
  • CGI support

Should-haves:

  • Sandbox
  • Online administration
  • Inline administration (administration from within the page being edited)
  • Mass uploading
  • Site map/index

Nice-to-haves

  • Contact management
  • Drag-and-drop
  • Photo gallery
  • Events
  • Calendar
  • Web statistics
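One way to turn the three tiers above into a comparison is a weighted checklist. The Python sketch below is my own illustration, not Ruth's spreadsheet: the weights, the rule that a missing must-have excludes a candidate, and the candidate names are all assumptions.

```python
# Weights are an assumption for illustration: must-haves count most.
WEIGHTS = {"must": 3, "should": 2, "nice": 1}

# A few of the requirements from the lists above, tagged by priority.
REQUIREMENTS = {
    "audit trail": "must",
    "versioning": "must",
    "sandbox": "should",
    "photo gallery": "nice",
}

def score(features):
    """Return None if any must-have is missing, else the weighted feature total."""
    missing_must = [r for r, p in REQUIREMENTS.items()
                    if p == "must" and r not in features]
    if missing_must:
        return None  # excluded outright
    return sum(WEIGHTS[p] for r, p in REQUIREMENTS.items() if r in features)

# Hypothetical candidates; not the systems from the talk.
candidates = {
    "candidate_a": {"audit trail", "versioning", "sandbox"},
    "candidate_b": {"audit trail", "photo gallery"},
}
for name, features in candidates.items():
    print(name, score(features))
```

Treating must-haves as a gate rather than just a heavy weight keeps a pile of nice-to-haves from rescuing a system that lacks something essential.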

Her first decision was whether to go with a content management system or a wiki. Sources she used to help make her decision:

She narrowed the field down to 9 Web content management systems and 9 wikis, and created a frighteningly-detailed comparison spreadsheet. Reasons for excluding candidates:



  • Missing functionality (in some cases, missing functionality that was claimed to exist)
  • Hidden costs (e.g., charging based on the number of viewers of content)
  • Currency of support
  • Required database underpinnings
  • Ease of installation (or lack of same)
  • Ease of use (or lack of same)
  • Evaluation on OpenSourceCMS.com

The candidate evaluations included local testing with user feedback, and resulted in a final field of Drupal, MediaWiki, TWiki and WebGUI.

Local users created and edited content and gave feedback. They really wanted a familiar look-and-feel, as well as some terminology changes. Wikis were seen as too much work.

Ruth’s next steps are to expand functionality, import the existing site (which is apparently a painful manual process), train more staff on how to use the CMS (again, the bottleneck thing) and migrate everything to a new server by 2008. The whole evaluation process is documented on her blog, and she’s going to put up her presentation on SlideShare.

Amy and May presented in quick succession, and in contrast to Ruth they were not initially involved in the selection of a CMS. As an aside, I loved this, and wish more panels were group panels with people discussing their different experiences.

Amy’s goals were to have a more uniform website, to improve ease of use for multiple non-technical users and to increase the number of editors while maintaining design consistency. The IT department, which led the CMS implementation, did not realize how dynamic the library’s website was compared to other university sites (always changing, growing, adding tools).

She had issues with access to source code and control over the design, links (all the university’s links were dumped into a single folder, which made library website performance slow and hurt findability), the site structure and the pace of testing and development (deploy times were automated, needed permission to test).

However, the outcome was improved campus communication, especially once IT saw all the problems, and the library is now involved with the implementation of the university portal.

May’s story was similar - the library has to go through campus IT for development, but is generally involved except for the implementation of a CMS. The University Relations Office saw the content management system as a marketing tool, whereas the library and other units saw it as a service point. The key success factors May saw, as someone who’d been involved with multiple implementations in multiple places, were:

  • Early involvement in CMS evaluation, selection and implementation and user feedback
  • Negotiating flexibility
  • Developing in-house expertise
  • Communication
  • Being clear about the drivers of a CMS

One point she made was that you’re not limited to using a Big Name package like Collage, Joomla or Mambo. You can work with file directories, XHTML, CSS and includes and roll your own.
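For a sense of how small "roll your own" can be, here's a Python sketch of an include mechanism: shared fragments get dropped into pages wherever a marker appears. The marker syntax, fragment names and HTML are invented for the example; this is not what May described in any detail, and server-side includes would do the same job without custom code.

```python
import re

# Hypothetical shared fragments that every page pulls in.
INCLUDES = {
    "header": "<div id=\"header\">Library</div>",
    "footer": "<div id=\"footer\">Contact us</div>",
}

INCLUDE_PATTERN = re.compile(r"<!--\s*include\s+(\w+)\s*-->")

def render(page, includes=INCLUDES):
    """Replace each <!-- include NAME --> marker with the named fragment."""
    def substitute(match):
        name = match.group(1)
        if name not in includes:
            raise KeyError(f"unknown include: {name}")
        return includes[name]
    return INCLUDE_PATTERN.sub(substitute, page)

page = "<!-- include header --><p>Hours: 9-5</p><!-- include footer -->"
print(render(page))
```

Editing the header fragment once then changes every page, which covers the standard look-and-feel goal without a full CMS.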

I really enjoyed this session, and not just because Ruth bribed us all with candy. The speakers were engaging and honest, did not simply read from their notes and had a good back-and-forth with the audience (which included someone who apparently had strong opinions on Joomla).



[il2007] Keynote by Danny Sullivan

October 31, 2007

I came in late, due to a wrenched back and the resulting lack of sleep. So I sat in the annex next to the San Carlos Ballroom which had a projection from the Ballroom. And missed breakfast, damn it. But on the bright side, Danny ran over.

The keynote was an overview of trends in the search engine space. He talked about federated search coming to the Web - but not under that name (meta search, universal search, 3D/morph, answers, shortcuts depending on the company).

Google Universal Search (May 2007) - relevancy of each silo assessed and measured against others. Added local search results (10 results, some web search results removed for other resources - reducing importance of web searches). Can watch video right on search results page. Still refining placement of search results.

Ask 3D & “Morph” (June 2007) - 3 pane design, with third pane identifying relevant non-web sources (blogs, dictionary, news, news images, video, Wikipedia, etc.). Why have a separate news images section? Just have news at the top with images included.

Microsoft Live (September 2007) - Pushing Microsoft Answers (entertainment/celebrities, health, local, shopping). Smart Motion Video Previews. First company with dedicated health search engine. Shortcuts at top of page are old-school to Danny’s mind.

Yahoo (October 2007) - Pushing Yahoo Shortcuts (business, events, health, movies, music, restaurants, shopping, sports, travel). Customized information depending on what you search (search on Barry Bonds, get stats, search on destination, get popular tourist sites).

Blended search overview:

  • Metaphor/presentation of blended search results still being worked out.
  • Ask 3D didn’t generate a boost, but the toolbar download (which gives free avatars) and iWon ramp-up (giving away prizes) did.
  • Live saw gains through “Search Club” (they started giving away prizes to people who used the search).

New link analysis abilities are coming (personalization of results, specialty search like OriginSearch.com and Scirus.com doing well, crawlers may take a new lead)

Personalized & social search reshape results based on what you or people associated with you do:

  • Google moves pages up, down, in or out of the top 10 based on your personal preferences. Rewards ego searches. Google influencers if you’re logged in and if you use the Toolbar - Personalized Home Page content, Google Bookmarks, search history (clicks), web history (visits).
  • Social search - Eurekster experimented with friend clicks reshaping results in 2004, Yahoo My Web promised to let us tag and use a network to reshape search results, but that got dropped. Eurekster has moved on to “swickis” (small group of people working on similar things). But what about Facebook?
  • Social graph/social network data is potentially useful, in that you can monitor links in a more trusted environment and reshape results based on what your friends seem to like (but issue with what is a “friend” in a social network).
  • Do you need to consider what you’ll share with friends? Does Facebook work better on an aggregate level? What’s the underlying platform (and someone else would likely run the search engine)?
  • Will Facebook do search? Plenty of people search engines out there already, events search?
  • Search versus discovery (search is on-demand, discovery is less-specific, things like StumbleUpon or Digg, iGoogle related magic tabs). Maybe this is Facebook’s niche - discovery?

Natural language search

  • Powerset isn’t out
  • Hakia.com is more interesting for how it clusters results based on natural language analysis

Human refinement

  • Mahalo.com feels cluttered at times but nice to see someone trying.
  • We’ll see what happens with Search Wikia.

Overall:

  • Verticals grow
  • Personalized search faces privacy issues but will survive and be helpful
  • Perhaps social search will play a role
  • Lots of room to grow - recent report claimed 7 out of 10 Americans experience “search engine fatigue” (but Danny has some issues with how they measured things and with the stats - he mentioned a past claim of “search rage” after 12 minutes of search)

Then he talked about Search Engine Land, SearchCap, Daily SearchCast (news via podcast) and Sphinn



[il2007] Gadgets, Gadgets and Gaming

October 31, 2007

We got back late from the dinearound, so I can only report on the last bit of the session. Not only have I let all of you down, but my fellow Steve. Erik Boekesteijn and Jaap van der Geer of the Delft Public Library presented bits of the documentary they’re assembling, collecting best practices of libraries across the United States. Which is much less dry than I make it sound, my favorite part being the two librarians singing about the glories of open-source ILS (integrated library systems). I have no idea what Sirsi did wrong, but “It’s not Sirsi” will haunt me to my dying day. They had a film crew at the session, looming over hapless interviewees with a boom mike and lighting.

And no, I won’t tell you about the dinearound. All I can say is, thank you Laura, for not taking a picture of me with a bib. And thank you, Ruth, for finding a place with a nice waitress.



[il2007] Developing a Taxonomy

October 30, 2007

This session was led by Kathryn Breininger and Mary Whittaker, librarians with Boeing. They have additional materials available, which is good as there’s no way I can keep up. I’ll add the link or links once they’re up.

Taxonomy is a controlled vocabulary with broader/narrower relationships and a browsable hierarchical structure. It may include equivalent relationships.
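That definition maps fairly directly onto a small data structure. Here's a minimal Python sketch of it; the class, its method names and the sample terms are hypothetical and not taken from the presenters' materials.

```python
class Taxonomy:
    """Controlled vocabulary with broader/narrower links and optional synonyms."""

    def __init__(self):
        self.broader = {}      # term -> its parent term (None for top level)
        self.equivalents = {}  # non-preferred term -> preferred term

    def add(self, term, broader=None):
        self.broader[term] = broader

    def add_equivalent(self, variant, preferred):
        self.equivalents[variant] = preferred

    def preferred(self, term):
        """Map a synonym to its preferred term; preferred terms map to themselves."""
        return self.equivalents.get(term, term)

    def narrower(self, term):
        """Direct children of a term, for browsing down the hierarchy."""
        return sorted(t for t, parent in self.broader.items() if parent == term)

    def path(self, term):
        """Terms from the top of the hierarchy down to the given term."""
        term = self.preferred(term)
        chain = []
        while term is not None:
            chain.append(term)
            term = self.broader[term]
        return list(reversed(chain))

# Hypothetical terms for illustration only.
tax = Taxonomy()
tax.add("aircraft")
tax.add("fixed-wing aircraft", broader="aircraft")
tax.add("rotorcraft", broader="aircraft")
tax.add_equivalent("helicopters", "rotorcraft")

print(tax.narrower("aircraft"))
print(tax.path("helicopters"))
```

The equivalence table is what separates this from a plain folder tree: a search on the variant term still lands on the preferred one.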

Considerations

  • Don’t duplicate an existing vocabulary
  • Construction methods (committee, empirical, machine-assisted)
  • Top-down better for new taxonomies, bottom-up for adding terms to existing taxonomy
  • Dimensions of a taxonomy (industry perspective, business process)
  • Size of taxonomy
  • Facets (can add to query to express in natural language)
  • Intended use of taxonomy

Steps in developing a taxonomy:

  • Identify scope, purpose, content format, subject/facet coverage, depth, type of content, volume of content, target audience, user needs, technological requirements
  • Identify concepts (source materials, analyze search logs, inventory content, analyze content, determine content types, interview SMEs, identify existing taxonomies, extract candidate terms)
  • Develop draft taxonomy with common rules, reconciliation of terminology issues, use concepts universally, start broad not deep, develop upper levels of structure (7-10 major buckets), work from bottom-up and top-down
  • Review with users and SMEs (provide draft for review, conduct usability studies, build consensus, keep a history of decisions, involve stakeholders, SMEs and users across the business). Iterative process.
  • Refine taxonomy (incorporate refinements, review and refine cycle, know when to quit - don’t overbuild, low level of detail vs. value at the leaf node, establish test criteria)
  • Apply taxonomy to content (provide guidelines for use, deploy - navigate web sites, tag content and integrate with existing applications)
  • Manage and maintain taxonomy (establish ownership, establish governance and change control processes, develop maintenance plan, review content for new concepts, develop user feedback process for new concepts, maintain lifecycle - version control, review success criteria, provide documentation)

Review taxonomy periodically for currency, create a candidate list of terms for consideration, analyze items returned in error, sample newly added content, consider terms used excessively or infrequently

Testing the taxonomy

  • Does the taxonomy provide appropriate search results
  • Does the taxonomy match user expectations
  • Evaluation criteria - should support taxonomy purpose
  • Testing methods - heuristic evaluation (expert evaluation), affinity modeling (card sorting), usability testing (overall system)

Qualitative testing: demonstration to SMEs, conducting user satisfaction surveys, performing usability studies, analysis of items returned in error, tagging of sample content, testing of relevancy

Quantitative testing: How evenly does the taxonomy divide the content, how well does the taxonomy match the content, how well does the taxonomy cover the field, is the indexing repeatable?
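Two of those quantitative questions can be given rough numbers. The Python sketch below uses normalized entropy for "how evenly does the taxonomy divide the content" and a simple tagged-share for coverage of the content; both measures are my choice for illustration, not necessarily what the presenters used, and the sample data is made up.

```python
import math

def evenness(category_counts):
    """Normalized entropy of the content split: 1.0 is perfectly even, 0.0 is one bucket."""
    total = sum(category_counts.values())
    if len(category_counts) < 2 or total == 0:
        return 0.0
    probs = [n / total for n in category_counts.values() if n > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(category_counts))

def coverage(item_terms):
    """Share of content items that received at least one taxonomy term."""
    if not item_terms:
        return 0.0
    return sum(1 for terms in item_terms if terms) / len(item_terms)

# Hypothetical numbers: documents per top-level bucket, and terms per sampled item.
buckets = {"engineering": 50, "operations": 50}
sample = [{"engineering"}, set(), {"operations"}, {"engineering"}]

print(evenness(buckets))
print(coverage(sample))
```

A low evenness score suggests one bucket is swallowing everything and may need splitting; low coverage suggests the taxonomy is missing concepts the content actually uses.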

Engaging people:

  • Find a strong sponsor and champion
  • Build a multi-disciplinary team (including end users as well as information professionals, IT, SMEs)
  • Task IT with software maintenance
  • Give SMEs, content owners and librarians responsibility for taxonomy
  • Obtain end user buy-in

At this point, my battery died. Topics I remember being covered:

  • Governance processes
  • Taxonomy drivers
  • Taxonomy benefits (productivity, searching, business)
  • ROI considerations and analysis
  • Best practices
  • Things to avoid
  • Critical success factors

Some of the questions:

  • How to test a taxonomy before it goes public?
  • Who applies the taxonomy?
  • What software did Boeing use?
  • Could they give a sample taxonomy?



[il2007] Librarians as Knowledge Managers

October 30, 2007

Susan Braun of The Aerospace Institute led this session.

She opened by introducing the Aerospace Corporation - interesting bit is that the average age of employees is 47 (needed credibility to act as trusted technical interface between government and contractors). This led to a concern about knowledge being locked in people’s heads. Also, a lot of these programs are 20-25 years old (e.g., launch vehicles), so there was concern about loss of historical perspective. Combined with new hires expecting desktop access to information and increasing geographic diversification.

Library has staff of 27, have historically been knowledge managers. Today they’re funded by the Knowledge Management Office, the Library Director is on the Corporate Livelink Steering Committee and Subcouncil on Information Policy, facilitators/knowledge managers/content managers of communities of practice (which is a paradigm shift for the company). 2 librarians are Livelink certified trainers. All Knowledge Stewards are trained by librarians.

Goals:

  • Broad collaboration
  • Centralization of knowledge for ease of access and reuse
  • Cultural paradigm shift away from NIH (not invented here) syndrome, hoarding knowledge

The expectation is that all the reference staff (7) will be embedded in communities of practice as technical experts/in-house researchers.

Different organizations are at different levels of comfort with various technologies, librarians spread basic best practices across the organization, train in usage of document management system, etc. but don’t have any illusions about 100% standardization.

Plans for FY08:

  • Revising the corporate taxonomy with the assistance of an outside taxonomist, to allow for metadata tagging outside the library
  • Upgrade the document management system (revise training materials, more training sessions, embedded tutorials)
  • Federated search
  • Expand the use of blogs, RSS, wikis
  • More librarians supporting communities of practice

Questions:

  • How do you balance organizations doing their own thing (and potentially introducing useful innovations) while you’re trying to standardize?
  • Is the print collection being maintained and is the value of it seen (context: the company is doing a massive digitization initiative)
  • Are you conducting oral histories to get knowledge out of people’s heads? (The answer was that they do both formal interviews available in QuickTime and informal storytelling sessions to pass on tacit knowledge)
  • Any statistics on people downloading materials and printing them out versus viewing them on the screen? (No stats, but people do both)
  • Are people assigned to communities of practice? (Yes, based on existing connections or who’s available)
  • What’s the progress of digitization? (Over 200,000 documents digitized, moving on to things like annual reports which are saved in PDF format)
