In this chapter you'll prototype a discussion forum, conduct a usability test, and then refine your system based on what you learned from observing the users.
Aviation in itself is not inherently dangerous. But to an even greater
degree than the sea, it is terribly unforgiving of any carelessness,
incapacity or neglect.
-- Captain A. G. Lamplugh, 1930s
Consider the USENET group rec.aviation.soaring, where people talk about
flying around in airplanes without engines.
In a USENET group the magnet content can be any longish posting from a
recognized expert. Keep in mind that the number of people using a
group such as rec.aviation.soaring
is fairly small—most
people get nervous in little planes and even more nervous in a little
plane with no engine. An analysis of October 2004's activity
by Marc Smith's Netscan service (netscan.research.microsoft.com)
shows that the group had only 174 "Returnees". Thus it will
be fairly straightforward for these core users to recognize each other
by name or email address. A typical magnet content posting in a
newsgroup is the FAQ or frequently asked questions summary in which
each question has an agreed-upon-by-the-group-experts answer.
If the engine stops for any reason, you are due to tumble, and that's
all there is to it!
-- Clyde Cessna
The means of collaboration in the USENET group is the ability for any member to start a new thread or reply to a message within an existing thread. In the early days of USENET, the means of browsing and searching were reasonably good for recent messages, but terrible or non-existent for learning from older exchanges. Starting in the mid-1990s, Web-based search engines such as DejaNews provided fast and easy access to old messages.
USENET has traditionally been weak on the fourth required element ("means of delegation of moderation"). Not enough people have volunteered to moderate, software to divide the effort of moderating a single forum among multiple moderators was non-existent, and the news protocols had security holes that let commercial spam messages through even on moderated groups. For an overview of the circa 2001 state of the art, read http://www.landfield.com/usenet/moderators/handbook/. For a discussion of the history of spam, see "Origin of the term 'spam' to mean net abuse" by Brad Templeton at http://www.templetons.com/brad/spamterm.html, a site that contains a lot of other interesting articles on the history of the Internet.
Flying is inherently dangerous. We like to gloss that over with clever
rhetoric and comforting statistics, but these facts remain: gravity is
constant and powerful, and speed kills. In combination, they are
particularly destructive.
-- Dan Manningham
Where USENET has fallen tragically short is element 5: "Means of excluding burdensome people." Most USENET clients include "bozo filters" that enable an individual user to filter out messages from a persistently troublesome poster. But there is no collective way for a group to exclude a person who consistently starts irrelevant threads, spams the group, abuses others, or otherwise becomes unwelcome.
With regard to element 6, software extension by community members themselves, USENET has done remarkably well. USENET servers and clients tend to be monolithic C programs where small modifications can have catastrophic consequences. On the other hand, the average user of the early Internet was a skilled software developer. So if not every USENET user was a programmer of USENET tools, it was at least safe to say that every programmer of USENET tools was a user of USENET.
When building our own database-backed discussion forum system, there are some simple improvements that we can add over the traditional USENET system:
As the semester proceeds, you'll discover another advantage of building your own discussion forum, which is that it becomes an integrated part of your service. All of a user's contributions in different areas, including the discussion forum, are queryable from a single database and viewable on a single page.
I certainly had no feeling for harmony, and Schoenberg thought that
that would make it impossible for me to write music. He said, 'You'll
come to a wall you won't be able to get through.' So I said, 'I'll
beat my head against that wall.'
-- John Cage
If something is boring after two minutes, try it for four. If still
boring, then eight. Then sixteen. Then thirty-two. Eventually one
discovers that it is not boring at all.
-- John Cage
It would be easy to justify the creation of 100 separate forums on our
music site. And indeed USENET contains more than 50 rec.music.* groups,
including rec.music.beatles.moderated, for example. That turns out to be
the tip of the iceberg, for the alternative hierarchy sports more than
700 alt.music.* groups, including alt.music.celine-dion and
alt.music.j-s-bach. If USENET can support nearly 1000 discussion forums,
surely a popular comprehensive music site ought to have at least 100.
Maybe not.
She had a voice like the New Jersey State Anthem played on an electric razor.
-- Bright Lights, Big City by Jay McInerney
When discussion is fragmented, it is hard for a community to get off
the ground. If there are 50 users and 100 forums, how will those
users find each other? The average visit will result in a user
concluding that the community isn't active. Such a user is unlikely
to return or refer a friend to the site. Even when a community is
large enough to support numerous forums, presenting discussion in a
fragmented manner leads to extra work for the user whose interests are
diverse. Suppose that a music scholar comes to USENET looking to see
if there has been any recent discussion of Bach's "Schübler Chorales"
and their influence on later composers. That's as simple as visiting
alt.music.j-s-bach. If that scholar wants to check up on recent postings
concerning Celine Dion's "My Heart Will Go On", he or she will have to
scan alt.music.celine-dion separately.
A good example of a thriving community with a single discussion forum is slashdot.org. It is very easy to find the topics being actively discussed on slashdot: look at the front page.
It is possible to take the "one forum" and "many forums" approaches on the same site at the same time. For example, look at http://www.photo.net/bboard/ (static copy at http://philip.greenspun.com/seia/images-discussion/photonet-bboard-original.htm). There are separate Medium Format, Nature Photography, and Photo Critique forums. For a user to browse the new postings in these three forums will require six mouse clicks: down into this page, down into Medium Format, back, down into Nature, back, down into Critique. With a different SQL query, however, postings from all these very same forums can be combined on one page, as in http://www.photo.net/bboard/unified/ (static copy at http://philip.greenspun.com/seia/images-discussion/photonet-bboard-unified.htm). Postings from particular forum topics may be distinguished with a special publisher-chosen color or icon. Suppose that the user finds the Photo Critique forum overwhelming and uninteresting. These postings can be excluded from his or her personalized unified view by clicking on the "Customize forums" link at the top (static copy at http://philip.greenspun.com/seia/images-discussion/unified-forum-personalization.htm) and unchecking those forums that are no longer of interest.
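The "different SQL query" behind a unified, personalized view can be sketched as follows. This is a minimal illustration, not photo.net's actual data model: the forums, postings, and user_excluded_forums tables, their columns, and the sample rows are all assumptions, and SQLite stands in for the production database.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are
# assumptions, not photo.net's actual data model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table forums (
    forum_id integer primary key,
    name     text not null
);
create table postings (
    posting_id integer primary key,
    forum_id   integer not null references forums,
    subject    text not null,
    posted_at  text not null          -- ISO 8601 timestamp
);
-- one row per forum that a user has unchecked on the "Customize forums" page
create table user_excluded_forums (
    user_id  integer not null,
    forum_id integer not null references forums,
    primary key (user_id, forum_id)
);
""")
conn.executemany("insert into forums values (?, ?)", [
    (1, "Medium Format"), (2, "Nature Photography"), (3, "Photo Critique"),
])
conn.executemany("insert into postings values (?, ?, ?, ?)", [
    (10, 1, "Which 6x7 camera?", "2005-03-01T09:00:00"),
    (11, 2, "Lens for birds in flight", "2005-03-01T10:00:00"),
    (12, 3, "Please critique my portrait", "2005-03-01T11:00:00"),
])
# made-up user 42 has unchecked the Photo Critique forum
conn.execute("insert into user_excluded_forums values (?, ?)", (42, 3))


def unified_postings(conn, user_id):
    """Newest-first postings from every forum the user has not excluded."""
    return conn.execute("""
        select f.name, p.subject
        from postings p
        join forums f on f.forum_id = p.forum_id
        where p.forum_id not in (select forum_id
                                 from user_excluded_forums
                                 where user_id = ?)
        order by p.posted_at desc
    """, (user_id,)).fetchall()
```

A user with no rows in user_excluded_forums sees postings from all three forums on one page; the forum name returned with each posting is what a page script could map to a publisher-chosen color or icon.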
Recall from the "User Registration" chapter an important user interface principle to keep in mind: it is more natural for most computer users to pick the noun first and then the verb. For example, the forum moderator might first click on a message's subject line to select it and then, on a subsequent page, select an action to perform on this message: delete, approve, rate, categorize, etc. It is technically feasible to build a system in which the moderator is first asked "Would you like to delete some messages?" and then is prompted for the messages to be deleted. However, this is not how the Apple Macintosh was designed, and therefore anyone who has used the Macintosh user interface or its derivatives, notably Microsoft Windows, will be accustomed to the noun-verb order.
This is your community and these are your users. So in the long run only you can know what administrative actions are most needed. At a minimum, however, you should support the following:
A suggested outline for the presentation is the following:
The user experience gap has grown larger because the users are less
sophisticated while the applications have grown more complex. In 2005
the average Web user is a first-time computer user and the Web
browser may be the only application that he or she knows how to use.
Despite the manifest inability of these users to cope with a complex
user interface, Web sites have been tarted up with JavaScript,
ActiveX, Java, and Flash to the point where they are as hard to use and
different from each other as old Unix applications. Users unable or
unwilling to deal with the horrors of custom user interfaces have voted
with their mice. They buy at Amazon. They search at Google. They
get their information from Yahoo! and nytimes.com.
Idiosyncratic ideas make sense for magazine and television advertisements. Different is good when it takes the user the same 30 seconds to absorb the message. But different is bad if it means the user needs extra time or extra clicks to get to the desired task. Some studies show that on each extra click there is a 50 percent chance that a user will abandon the site altogether.
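To see how quickly that adds up, here is a small worked calculation. It takes the 50 percent figure at face value and assumes, purely for illustration, that each extra click loses the same share of users independently of the others.

```python
def fraction_remaining(extra_clicks, abandon_probability=0.5):
    """Fraction of users expected to reach the task after the extra clicks,
    assuming each click independently loses the same share of users."""
    return (1 - abandon_probability) ** extra_clicks


# Print the expected fraction of surviving users for 0 through 3 extra clicks.
for clicks in range(4):
    print(clicks, fraction_remaining(clicks))
```

Under that assumption, three extra clicks leave only one user in eight still on the site.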
As an aid to deciding whether to spend your future as an engineer or go on to business school, note that Webvan CEO George Shaheen ran the company into the ground, then resigned shortly before the bankruptcy filing, collecting a retirement package of $375,000 per year for life.
How is it possible that people follow what they imagine to be their own good taste instead of either copying the successful Internet services (e.g., Yahoo!, Amazon, Google) or listening to the users? And that people continue to believe in the value of their own ideas even as the red ink starts to dominate their financial reports? Justin Kruger and David Dunning, experimental psychologists at Cornell University, wondered the same thing and wrote up their findings in "Unskilled and Unaware of It: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments" (Journal of Personality and Social Psychology; Vol 77, No. 6, pp 1121-1134; http://www.phule.net/mirrors/unskilled-and-unaware.html). Kruger and Dunning found that people in the 12th percentile of skill estimated themselves to be in the 62nd. Furthermore, these incompetent people failed to recalibrate themselves when shown the range of performance by their peer group. The authors concluded that "those with limited knowledge in a domain suffer a dual burden: Not only do they reach mistaken conclusions and make regrettable errors, but their incompetence robs them of the ability to realize it."
A scientist is someone who measures her results against Nature. An
engineer is someone who measures her results against human needs. A
computer scientist is someone who doesn't measure his results.
-- us
Use the following script of tasks (cut and paste these into a separate document and print it out, after filling in the bracketed sections), with no extra hints:
Stand as far away from the subject as you possibly can while still being able to see the computer screen and hear the subject's comments. Force yourself to remain absolutely silent. If the subject is completely confused and clicking around randomly, let the subject continue until he or she figures it out. Keep track of the number of seconds each subject requires to complete each task.
Post a report on your team server at /doc/testing/discussion-usability.
This report will contain a summary of what you learned from this test
with average task times and average total time (we can use these to
compare the efficiency of various teams' solutions). The report should
contain hyperlinks to sub-pages that contain transcripts of individual
user sessions, what each test subject said, and what happened. Link to
your report from your main documentation index page.
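The averages for the report can be computed from the per-task timings in a few lines. The sketch below assumes that every subject attempted every task; the subject labels, task names, and numbers are made up for illustration.

```python
from statistics import mean

# Hypothetical timings in seconds: one dict per test subject, keyed by task.
# Subject labels, task names, and numbers are made up for illustration.
sessions = {
    "subject-1": {"find forum": 20, "post question": 95, "post answer": 60},
    "subject-2": {"find forum": 40, "post question": 125, "post answer": 90},
}


def summarize(sessions):
    """Return (average seconds per task, average total seconds per subject)."""
    tasks = next(iter(sessions.values())).keys()
    per_task = {t: mean(s[t] for s in sessions.values()) for t in tasks}
    avg_total = mean(sum(s.values()) for s in sessions.values())
    return per_task, avg_total


per_task, avg_total = summarize(sessions)
```

Here per_task holds one average per task and avg_total the average time a subject needed for the whole script, which are the two figures the report asks for.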
Building more structure into a discussion forum is an option worth considering, especially if your discussion forum is supporting an organized class. The Berkman Center at Harvard Law School (HLS) was a pioneer in this area. The teachers at HLS weren't happy with the bias in favor of early responders inherent in a standard discussion forum system. The first response to a question gets the most readers because it is near the top of the page, so it might be more ego-gratifying to be first than to spend more time crafting a thoughtful response. This shortcoming was addressed by writing what they call a semi-synchronous discussion forum. Responses are collected for a period of time, but not made public until the deadline for responses is reached. The system is called the Rotisserie.
An additional capability of the Rotisserie is the ability to randomly assign participants to respond to postings. For example, every student in a class will be required to post an essay in response to a question. After a deadline lapses, those essays are made public. The Rotisserie then assigns to each participant the task of responding to a particular essay. Every student must write an essay. Every essay gets a response. A particularly good or controversial essay might get additional responses. A particularly loudmouthed participant might elect to respond to many essays.
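One simple way to make such an assignment is sketched below. This is not necessarily how the Rotisserie itself does it; the function name is hypothetical and participant names are assumed to be unique. The idea is to shuffle the class into a random circle and have each student respond to the next student's essay.

```python
import random


def assign_responders(participants, rng=random):
    """Map each participant to the author whose essay he or she must answer.

    Shuffling and then pairing each person with the next one around the
    circle guarantees that every essay gets exactly one assigned response
    and that nobody is assigned his or her own essay.
    """
    if len(participants) < 2:
        raise ValueError("need at least two participants")
    order = list(participants)
    rng.shuffle(order)
    return {order[i]: order[(i + 1) % len(order)] for i in range(len(order))}
```

Passing a seeded random.Random instance as rng makes the assignment reproducible, which is handy when a teacher needs to regenerate the same pairing later.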
See http://h2o.law.harvard.edu for more information about the Rotisserie, to try it out, or to download the software.
Suppose that your online learning community is more open and fluid. You can't insist that particular people respond at all or that people respond on any kind of schedule. Is there anything that can be done with software to help ensure that all questions get answered appropriately? Yes! Build server-mediated mentoring.
Server-mediated mentoring requires, at a minimum, two things: (1) a mechanism for novice members (mentees) to be connected with more experienced members (mentors), and (2) asking people who post questions whether or not their question has been adequately answered. To make the service as effective as possible, you'll probably want to add at least the following: (3) automated reminders from the server to mentors who have left mentees hanging, and (4) rewards, rankings, and distinguishing typography to recognize community members who are answering a lot of questions and mentoring a lot of novices.
Imagine the following interaction:
Let's start with the data model. To support requests for and
assignment of mentors, you'll need at least one table,
mentor_mentee_map, with the following columns: mentee, mentor (NULL if
not assigned), date_of_request, date_of_assignment, and mentee_goal. To
support the query "who is the currently connected member mentoring" and
build the workspace subsection page for Jane, you'll want to add an
index on the mentor column. To support the query "are there any mentors
who should be notified about a message posted by a member", you would
add an index on the mentee column. If you were to make this a
concatenated index on mentee, mentor, it would help the database
identify outstanding requests for mentors (mentor is NULL) efficiently
for the "be a mentor" page.
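The table and indices just described might look like the following sketch. The column names come from the text; the column types, the stand-in users table, the index names, and the sample rows are assumptions, and SQLite is used only so that the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column names come from the text; the types, the users table, and the
-- index names are assumptions made for this sketch.
create table users (
    user_id integer primary key,
    name    text not null
);
create table mentor_mentee_map (
    mentee             integer not null references users,
    mentor             integer references users,   -- NULL until assigned
    date_of_request    text not null,
    date_of_assignment text,
    mentee_goal        text
);
-- "who is the currently connected member mentoring?"
create index mentor_mentee_map_by_mentor on mentor_mentee_map(mentor);
-- "which mentors should hear about a message posted by this member?"
create index mentor_mentee_map_by_mentee on mentor_mentee_map(mentee, mentor);
""")
# made-up sample data: Jane mentors user 2; user 3 is still waiting
conn.executemany("insert into users values (?, ?)",
                 [(1, "Jane"), (2, "Ted"), (3, "Sam")])
conn.executemany("insert into mentor_mentee_map values (?, ?, ?, ?, ?)", [
    (2, 1, "2005-03-01", "2005-03-02", "learn to rebuild an engine"),
    (3, None, "2005-03-05", None, "pass the instrument checkride"),
])


def mentees_of(conn, mentor_id):
    """user_ids of the members that this mentor is currently mentoring."""
    rows = conn.execute(
        "select mentee from mentor_mentee_map where mentor = ? order by mentee",
        (mentor_id,))
    return [row[0] for row in rows]


def outstanding_requests(conn):
    """Oldest-first requests with no mentor yet, for the "be a mentor" page."""
    return conn.execute("""
        select mentee, mentee_goal
        from mentor_mentee_map
        where mentor is null
        order by date_of_request
    """).fetchall()
```

Whether a given database will actually use the concatenated index for the mentor is NULL query depends on how that database indexes NULLs, so it is worth checking the query plan on your own system.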
Attempting to support the open/closed question status display and the
query "Which members have answered a lot of questions well?" might make
you regret some of the data model decisions that you made in the
preceding exercises and/or in the "Content Management" chapter exercises.
In the "Content Management" chapter we have a headline asking "What is
Different about Discussion?" above the suggestion that the content_raw
table can be used to support forum questions and answers. If you went
down that route and were implementing the mentoring user experience,
this is where discussion would diverge a bit from the rest of the
content on the site. You need a way to represent in the database
management system whether a discussion forum question is open or
closed. If you add a discussion_forum_question_status column to the
content_raw table, you'll have a NULL column value whenever the content
item is not a discussion forum question. That's not very clean. You may
also be adding a closed_question_p boolean column to indicate that a
forum posting had been identified by the original questioner as having
answered the question. This will be NULL for more than 99 percent of
content items. That's not a storage efficiency problem, but it is sort
of ugly.
An alternative to adding columns is to build some sort of bag-on-the-side table recording which questions are open and closed and which answers closed them. To decide whether or not this is a reasonable approach, it is worth starting by asking "In what percentage of queries will the helper table need to be JOINed in?" When presenting articles and comments, you wouldn't need the table. When presenting the discussion forum to a public user, i.e., someone who wasn't logged in, the discussion forum page scripts wouldn't need the table data. You might need these data only when serving workspace pages to members and when serving an individual discussion forum thread to a logged-in member. It might be worth considering a table of the following form:
-- content_id is the primary key here; it is possible to have at most
-- one row in this table for a row in the content_raw table
-- (Oracle syntax: column types are omitted where a column references
-- content_raw, and Oracle takes the type from the referenced key)
create table discussion_question_status (
        content_id      not null primary key references content_raw,
        status          varchar(10) check (status in ('open', 'closed')),
        -- if the question is closed the next column will contain
        -- the content_id of the posting that closed it
        closed_by       references content_raw
);

-- make it fast to figure out whether a posting closed a question
create index discussion_question_status_by_closed_by on
        discussion_question_status(closed_by);
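To see how page scripts might use the helper table, here is a runnable sketch. It uses SQLite in place of the course database, so the column types are written out explicitly. The cut-down content_raw table (including its refers_to and body columns) and the helper function names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Minimal stand-in for content_raw; the real table has many more columns.
create table content_raw (
    content_id integer primary key,
    refers_to  integer references content_raw,  -- assumed: NULL for a question
    body       text not null
);
create table discussion_question_status (
    content_id integer not null primary key references content_raw,
    status     varchar(10) check (status in ('open', 'closed')),
    closed_by  integer references content_raw
);
create index discussion_question_status_by_closed_by
    on discussion_question_status(closed_by);
""")


def post_question(conn, content_id, body):
    """Store a new question and mark it open in the helper table."""
    conn.execute("insert into content_raw (content_id, body) values (?, ?)",
                 (content_id, body))
    conn.execute("insert into discussion_question_status (content_id, status) "
                 "values (?, 'open')", (content_id,))


def post_answer(conn, content_id, question_id, body):
    """Store an answer; the helper table is untouched until someone closes."""
    conn.execute("insert into content_raw values (?, ?, ?)",
                 (content_id, question_id, body))


def close_question(conn, question_id, answer_id):
    """Record that the questioner accepted answer_id as answering the question."""
    conn.execute("update discussion_question_status "
                 "set status = 'closed', closed_by = ? where content_id = ?",
                 (answer_id, question_id))


def open_questions(conn):
    """content_ids of questions still waiting for a satisfactory answer."""
    rows = conn.execute("select content_id from discussion_question_status "
                        "where status = 'open' order by content_id")
    return [row[0] for row in rows]
```

Only post_question, close_question, and open_questions touch the helper table, which matches the observation that most page scripts never need to JOIN it in.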
As the community gains experience with this system, it will probably
eventually want to give greater prominence to responses from members
with a history of writing good answers. In a fully normalized data
model, for each answer displayed, the server would have to count up
the number of old answers from the author and query the
discussion_question_status table to figure out what percentage of those
were marked as closing the question. In practice, you'd probably want to
maintain a denormalized metric as an extra column or columns in the
users table, perhaps columns for n_answers_posted and n_answers_closing,
counts maintained by nightly batch updates or database triggers.
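A nightly batch update of those two counters could be sketched as follows. The cut-down users and content_raw tables, the author_id and refers_to column names, and the rule that any row with a non-NULL refers_to counts as an answer are assumptions made for illustration. A real content_raw table would also need a filter to keep comments on articles out of the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down stand-ins for the real tables; author_id and refers_to are
-- hypothetical column names.
create table users (
    user_id           integer primary key,
    n_answers_posted  integer not null default 0,
    n_answers_closing integer not null default 0
);
create table content_raw (
    content_id integer primary key,
    author_id  integer not null references users,
    refers_to  integer references content_raw      -- NULL for a question
);
create table discussion_question_status (
    content_id integer not null primary key references content_raw,
    status     varchar(10) check (status in ('open', 'closed')),
    closed_by  integer references content_raw
);
""")


def refresh_answer_counts(conn):
    """Recompute both counters for every user, as a nightly batch job might."""
    conn.execute("""
        update users set
          n_answers_posted = (select count(*)
                              from content_raw c
                              where c.author_id = users.user_id
                                and c.refers_to is not null),
          n_answers_closing = (select count(*)
                               from content_raw c
                               join discussion_question_status s
                                 on s.closed_by = c.content_id
                               where c.author_id = users.user_id)
    """)
    conn.commit()
```

Recomputing from scratch is slower than incrementing counters in a trigger, but it cannot drift out of step with the underlying rows, which makes it a reasonable starting point for a small community.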
Supporting the "initially show only to my mentor" option for new
content would require the addition of a show_only_to_mentor column to
the content_raw table, where it could be used for discussion forum
postings, comments on articles, and any other content item. Rather than
changing all of the pages that use the content tables, it would be
easier to update the SQL views that those pages use, e.g.,
articles_approved, so as to exclude content that should be shown only
to a mentor.
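The view change might look like the sketch below. The cut-down content_raw table, the content_type and approved_p columns, and this particular definition of articles_approved are assumptions made for illustration; your own view definition from the "Content Management" exercises will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down stand-in for content_raw; content_type and approved_p are
-- hypothetical columns used only to make the view definition concrete.
create table content_raw (
    content_id          integer primary key,
    content_type        varchar(100) not null,
    title               text,
    approved_p          integer not null default 0
                        check (approved_p in (0, 1)),
    show_only_to_mentor integer not null default 0
                        check (show_only_to_mentor in (0, 1))
);
-- Pages that select from the view pick up the new restriction without
-- any change to their own queries.
create view articles_approved as
    select content_id, title
    from content_raw
    where content_type = 'article'
      and approved_p = 1
      and show_only_to_mentor = 0;
""")
conn.executemany("insert into content_raw values (?, ?, ?, ?, ?)", [
    (1, "article", "Public", 1, 0),
    (2, "article", "Draft for mentor", 1, 1),
    (3, "article", "Unapproved", 0, 0),
])
visible_titles = [row[0] for row in
                  conn.execute("select title from articles_approved")]
```

The mentor's own workspace page would query content_raw directly, or a separate view, so that the restricted items still reach the one person meant to see them.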
Some new page scripts would be required, at least the following:
mentor_mentee_map
table
mentor_mentee_map
table
For the purposes of this course, you need not implement all of these grand ideas, and indeed some of them don't make sense when a community is just getting started because the number of members is so small. If, however, some of these ideas strike you as interesting, consider adding them to your project implementation plan.
/doc/planning/YYYYMMDD-discussion. (If you name files with
year-month-day at the beginning, they will sort in order of creation.)