KMDI Final Summary

Collaborative Filtering

Don Turnbull

Collaborative Filtering fits into our Designing Information Spaces group by admitting it's hard to design an interface to do all of the information display for an application - sometimes it's easier to leverage the intelligence of its users. Typical interface issues are how to display a large amount of data at once and how to show (the perceived) most relevant information first.

Even with the best-designed interface, users will gradually use some features more than others and read certain types of notes more often. Filtering helps use this feedback to enhance the application's usability (or the information gathered via the application).

This paper overviews some filtering systems and current interfaces for coordinating and sharing information. The idea behind collaborative filtering is that we can leverage others' reading/browsing preferences and actions to make each other's interactions with the system more relevant.

I'm aware that there may be expert help (a human coordinator or an expert system) to help focus group work around a problem and to help fuel collaboration. However, most of these systems require either constant attention (the system) or face-to-face human contact. Different types of filtering, and even Collaborative Filtering, can help alleviate these resource problems while accomplishing some of the same goals.

Collaborative Filtering can help us design information spaces better. The main hope is that filtering of any kind can help weed through the data and yield new points of view for users. Filtering should make navigating the message base much like being in a live conversation (or a room full of conversations): you'd get an idea of the tempo, the climate and the directions the conversations are going. Collaborative filtering should go even further and show what's attracting attention, what's agreed upon and the general "zeitgeist" of the group. More specifically, you could see what the people you usually agree with are posting and whether consensus is being built in the group.

What is Collaborative Filtering?

Taking DeKerchove's ideas of collective and connective mind, CSCW systems can let us work with others' knowledge at a different level than plain group conversation. Collaborative Filtering gives us different views of the body of knowledge as the group (or an individual) sees it.

Roots of Collaborative Filtering

Collaborative Filtering naturally follows from combining various aspects of each of these fields. Their

Information Filtering and Information Retrieval: Two Sides of the Same Coin?

Belkin (1990-96) defines filtering in many contexts:

Belkin is suggesting that Collaborative Filtering is a new implementation of Information Retrieval. What makes it different from IR is that feedback is more important and aggregate feedback (how the information retrieved is used) among groups of users is measured as well.

Filtering, IR the next generation?

These are the key issues that take filtering beyond typical Information Retrieval:

The next addition to IR is the idea of profiles. For example, we might automatically weight notes from others in our own group more heavily than notes from everyone else when deciding what to view or read. We might then have a better idea of how our group is responding to and thinking about others' topics.
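As a rough illustration of a profile, the sketch below boosts notes written by members of the reader's own group. The note data, the function name `profile_scores` and the weight of 2.0 are all assumptions made up for this example, not part of any system described here.

```python
# Hypothetical notes: (note id, author's group, number of reads so far).
NOTES = [
    ("note-1", "DIS", 3),
    ("note-2", "other", 5),
    ("note-3", "DIS", 1),
]


def profile_scores(notes, my_group, own_group_weight=2.0):
    """Scale each note's read count up when its author is in my group."""
    return {
        note_id: reads * (own_group_weight if group == my_group else 1.0)
        for note_id, group, reads in notes
    }


# Rank notes for a reader whose profile says they belong to the DIS group.
scores = profile_scores(NOTES, my_group="DIS")
ranked = sorted(scores, key=scores.get, reverse=True)
```

With this weighting, a moderately read note from the reader's own group can outrank a more widely read note from outside it; how large the weight should be is an open design choice.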

Filtering Methods

A good conferencing system should give users access to all of these features. Most systems already have some of them, but techniques like vector matching (the root of Collaborative Filtering, where a user's queries or accesses are compared against those of other users and similarities are matched) will give us a much more insightful view of what's going on in the database of notes.
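One way to picture vector matching is to record each user's accesses as a vector and compare vectors pairwise. The sketch below uses cosine similarity, which is only one possible measure; the access data and function names are hypothetical.

```python
import math

# Hypothetical access vectors: 1 if the user read the note, 0 otherwise.
ACCESSES = {
    "ann": [1, 1, 0, 0],
    "bob": [1, 1, 1, 0],
    "cy": [0, 0, 1, 1],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length access vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no accesses matches nobody
    return dot / (norm_a * norm_b)


def most_similar_user(user, accesses):
    """Return the other user whose access vector is closest to `user`'s."""
    others = (name for name in accesses if name != user)
    return max(
        others,
        key=lambda name: cosine_similarity(accesses[user], accesses[name]),
    )
```

Once the closest match is known, the notes that user has read and the current user has not become natural candidates to show first.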

This assumes that the database is fairly rich in notes (I'd think filtering of any kind would grow in importance as the number of notes approaches 100 and become downright essential near 400 and above). Of course, this depends on how much time users spend browsing through the notes: do they keep up to date, so there aren't many new notes coming in, or do they log on only occasionally and have to catch up quickly by reading entire message threads?

A Filtering System Model

Yet another reason for Collaborative Filtering: users can help lead each other through the data. It's hard for the interface designers to know how the system will be used, and filtering may help users find the more important information more easily. I could log in to the system, see what is hot for the DIS group and read those messages first; then we could all talk about the same things. Message flags and colored links don't give me this information at all without some digging. Filtering, in conjunction with a good interface, makes using the system even easier.

Here is the breadth of a user model domain:

Filtering System Implementations

Filtering on the Web

Rich Information

Some of the ideas here suggest that topic groupings might even be changed if the way users are reading and interacting in the message base has shifted. We might decide that by analyzing usage patterns, we need to develop new groups of messages or even new groups of people because of what they are reading and commenting on.

Services like Yahoo are doing this the hard way. Humans are digging through vast amounts of user data (from users who visit their site) and readjusting their categories (or at least the links among the categories) to make more sense to the users. Think of it as an interactive yellow pages based on how you use it. If you look up an obscure term and don't find it, then shift back to looking for a simpler term and find a link suitable to you (then leave Yahoo to go to that site), the yahoos (that's what they call themselves) might add that term to their alias list. (If this sounds like keywords, well, it is - it's just interactive usage of them.)

Collaborative Software

Applications that organize information around a group engaged in a common task (or goal):

Collaborative Filtering

Simply put, Collaborative Filtering involves filtering and users:

Lots of filtering basics can be helped by good database design and smart database reporting. When you combine that with a nice GUI, filtering is just a bonus in helping users surf through information.

CSCW systems

The Internet as a medium

Collaborative Filtering Drawbacks

Here are some of the problems with filtering and Collaborative Filtering, most notably that the filtering rules take time to develop:

  • Collaboration takes time
  • People are smarter than algorithms
  • People aren't consistent or perfect communicators
  • All preferences might not be shared or immediately definable
  • Groupthink (too much or over-reliance on filtering)

    Collaborative Filtering Systems

    First Generation

    Newer Collaborative Filtering Systems on the Web

    Mosaic

    Berners-Lee, CERN

    Firefly

    previously HOMR and Ringo - Metral, Shardanand, and Maes, MIT

    Expanded from movies and music to community interests and other media.

    Yahoo

    Point's Top 5% might also be in this category. (Although no one knows exactly how the sites are chosen.)

    Lotus Notes

    Maltz and Ehrlich, Lotus Corp.

    InterNotes Navigator also adds browsing features to Notes and allows for sharing of bookmarks.

    Occam

    Kwok and Weld, Univ. of Washington

    Interfaces for Collaborative Systems/Browsers

    The following systems also provide some good design examples of how to use filtering and Collaborative Filtering to make navigating the information space easier.

    GroupLens

    GroupLens is the venerable and most talked about system. It focuses on Usenet news, a rather simple model of notes, and includes the following features:

    Of course, most of its benefits are derived mathematically, not from its interface (to the information space). However, its long success shows that its filtering capabilities are useful enough that users continue to use the system.
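This summary does not spell out the mathematics. As a sketch of what a mathematically derived prediction can look like, the code below weights other readers' ratings by how strongly their past ratings correlate with the current reader's (Pearson correlation). The ratings, names and 1-5 scale are assumptions for illustration, and this should not be read as the exact GroupLens implementation.

```python
import math

# Hypothetical ratings on a 1-5 scale: reader -> {article: rating}.
RATINGS = {
    "ann": {"a1": 5, "a2": 1, "a3": 4},
    "bob": {"a1": 4, "a2": 2, "a3": 5, "a4": 5},
    "cy": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(r1, r2):
    """Correlation between two readers over the articles both have rated."""
    common = sorted(set(r1) & set(r2))
    if len(common) < 2:
        return 0.0
    m1 = mean(r1[a] for a in common)
    m2 = mean(r2[a] for a in common)
    cov = sum((r1[a] - m1) * (r2[a] - m2) for a in common)
    var1 = sum((r1[a] - m1) ** 2 for a in common)
    var2 = sum((r2[a] - m2) ** 2 for a in common)
    if var1 == 0 or var2 == 0:
        return 0.0
    return cov / math.sqrt(var1 * var2)


def predict(user, article, ratings):
    """User's mean rating plus correlation-weighted deviations of other raters."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or article not in theirs:
            continue
        weight = pearson(ratings[user], theirs)
        num += weight * (theirs[article] - mean(theirs.values()))
        den += abs(weight)
    return base if den == 0 else base + num / den
```

In this toy data, ann tends to agree with bob and disagree with cy, so bob's high rating and cy's low rating of article a4 both push ann's predicted rating above her average.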

    SenseMaker

    Baldonado and Winograd, Stanford Digital Libraries Project

    Many prototype systems that can give us good examples of information space design are underway to help users tame digital libraries full of information. The same tasks are relevant when trying to sort through a large (enough) body of conversation notes as well. SenseMaker's real strengths are:

    While many of these ideas are seen in many other examples, the right combination seems to be found here. SenseMaker provides a good interface and view into the data.

    WebBook/Web Forager

    Card, Robertson, and York, Xerox PARC

    Soon to be a commercial offering through Xerox's spin-off, InXight, WebBook/Web Forager basically groups information into virtual books. These books facilitate moving among various books and pages of information (mostly web sites and web pages) in a manner that feels more natural to beginning users. Its main strengths are:

    Unfortunately, its weakness is that it tries to use a bad metaphor for a large amount of information. Who would want to be constrained by physically oriented models when dealing with such a vast amount of information?

    Collaborative Combinations

    Perhaps the best design solution is to have a good interface, but also to provide the right combination of computer-supported and human-supported systems. All of the following kinds of help would be advantageous in a good information system:

    I think many systems in use today (Lotus Notes, most notably) use this combination of methods to make a usable, realistic collaborative system.

    In the future we will see more of these ideas implemented as push technologies (which I think are really "intelligent pull") that start to deliver information to us directly. We might not even have to go out and seek the information in a notes database; we might have it sent to us based on some pre-programmed preferences.

    Research Problems

    Despite the many good system designs shown, there is a lot of research to be done. This is just a short list:

    Design Problems

    In looking at a number of good examples of information design, we can see some obvious areas where more design work is needed:

    A more fundamental problem is that filtering may heavily bias what gets read: the most-read news gets read even more, and so on, until more obscure information isn't seen at all. A good system might have randomly ignored filters or some other device to help get new news out.
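One way to read "randomly ignored filters" is to let the system occasionally skip its own ranking. The sketch below assumes a list of notes already ranked by the filter; the function name `pick_notes` and the 20% default exploration rate are hypothetical choices, not taken from any system discussed here.

```python
import random


def pick_notes(ranked_notes, count, explore_rate=0.2, rng=None):
    """Fill `count` slots from a ranked list, occasionally ignoring the ranking.

    Each slot takes the best remaining note, except that with probability
    `explore_rate` it takes a randomly chosen remaining note instead.
    """
    rng = rng or random.Random()
    remaining = list(ranked_notes)
    chosen = []
    while remaining and len(chosen) < count:
        if rng.random() < explore_rate:
            index = rng.randrange(len(remaining))  # ignore the filter this time
        else:
            index = 0  # follow the filter's ranking
        chosen.append(remaining.pop(index))
    return chosen
```

Setting `explore_rate` to 0 reproduces the pure filter, while higher values trade some relevance for a chance that obscure notes get seen at all.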

    Back to the Future: Personal Filtering

    A possible solution to some of these problems is to experiment with users working through their own personal data. The same systems that provide groups with views of others' notes can also be used to organize and access a single user's information. Such a system might support:

    Filtering Interfaces

    The interface design for filtering is a challenge. At its best, it might be totally transparent. Users may not want to see unfiltered messages. However, some of us might want to see EVERYTHING in the database, but then have the more important messages highlighted in some way. Ideas like Joanna's time-based visualization are a good start.

    This leads into other Design of Information Spaces group members' work.

    Conclusion

    Now that a number of information systems have been overviewed and their strengths and weaknesses have been shown, how do we take what we've learned and go further? Here is a first cut at some ideas and questions to address:


    ©1997 Don Turnbull