summarydesk
What is SummaryDesk?
SummaryDesk is a Web-based interface for writing mailing list
summaries. It takes care of all the bookkeeping, and lets humans
concentrate on the non-automatable part: actually writing the
summaries.
What are "summaries", and why do we need a special system to write
them?
A summary is a condensed version of all important traffic
that happens on a project mailing list. Summaries themselves cannot
be automated: a human has to read the emails, decide what's going on,
and write a shorter version for an audience that doesn't have time to
follow the details. However, many of the most time-consuming aspects
of producing summaries can be automated. Some examples:
- The summarizer should be able to include URLs to specific
messages or threads with a single click, instead of
cutting-and-pasting manually.
- The system should take care of publishing the summaries
automatically. The human summarizer should only be required
to write the summary texts, and flag them as ready for
publication or not. The system should do the rest.
- The system should make it easy for multiple humans to
collaborate on summarizing a busy list, by managing
in-progress summaries centrally, in a way that is visible to
all the summarizers.
- Quantitative / statistical data about the mailing list (such
as who posted the most, what topics were most popular, etc.)
can be tracked entirely by the system; humans should not
have to spend time on that.
In other words, summarizers can benefit from good tools just like
anyone who faces a complex, repeated task.
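As an illustration of the last point, here is a minimal Python sketch of the kind of statistics a system could track without human help. It is not SummaryDesk's or ktpub's code; the function name `posting_stats` and the sample messages are made up for this example, and it assumes the messages are available as raw RFC 822 text.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def posting_stats(raw_messages):
    """Count messages per sender address and per (normalized) subject."""
    posters = Counter()
    subjects = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # parseaddr splits "Name <addr>" into (name, addr).
        _, address = parseaddr(msg.get("From", ""))
        posters[address.lower()] += 1
        # Strip leading "Re:" prefixes so replies count toward the original subject.
        subject = msg.get("Subject", "").strip()
        while subject.lower().startswith("re:"):
            subject = subject[3:].strip()
        subjects[subject] += 1
    return posters, subjects


# Hypothetical sample traffic, for illustration only.
SAMPLE = [
    "From: Alice <alice@example.org>\nSubject: Build fails\n\nIt broke.",
    "From: Bob <bob@example.org>\nSubject: Re: Build fails\n\nWorks for me.",
    "From: Alice <alice@example.org>\nSubject: Re: Build fails\n\nFixed.",
]

posters, subjects = posting_stats(SAMPLE)
print(posters.most_common(1))   # the most active poster
print(subjects.most_common(1))  # the most discussed topic
```

Grouping by subject line is only a rough stand-in for real thread detection, but it shows why this bookkeeping does not need a human.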
Unfortunately, there don't seem to be any really good tools out
there. Most summaries today use a system called ktpub, named
after the mailing list summary for which it was invented: the Linux
Kernel Traffic series produced by Zack Brown. Using ktpub is much
better than trying to do summaries entirely by hand (in particular,
ktpub does the statistical analyses mentioned above), but it still
leaves vast room for improvement. The summarizer must do many tasks
manually which could be automated. In ktpub, the summarizer produces
a master XML file containing the week's summary, and then runs tools
to convert that to HTML, text, or whatever consumable format is
desired. The process of producing the master XML, however, is highly
idiosyncratic: it involves lots of dedicated hacks and editor tricks
to save time writing the XML (e.g., special scripts to grab URLs).
The problem is that these tricks are local to Zack Brown, or
whoever the summarizer is. If the summarizer has to hand off
editorship to someone else, or get assistance, the new people will
have to come up with their own tricks, even though everyone is
dealing with the exact same set of problems! (See our conversation
with Zack Brown about this; it turns out that he'd
been wanting a system like SummaryDesk all along.)
SummaryDesk is intended to solve the summarization problem
completely. We mean it to be the next-generation ktpub: a
centralized, web-based, highly automated system for producing
summaries. It will incorporate every identifiable efficiency that we
can think of a way to implement, so that all users benefit from the
best practices available.
Overview
You configure SummaryDesk to watch a set of mailing lists. For
each mailing list, it keeps track of each thread that takes place on
the list, and associates a summary with each thread; the summary
starts out empty. From time to time, a human visits the SummaryDesk
main page, and selects a list and thread(s) to work on. SummaryDesk
presents the selected threads in a conveniently browseable form, and
by each thread is a text box, where the summarizer can enter that
thread's summary. As she updates the summary, she can save her
work-in-progress at any time. At some point, she marks the thread's
summary as "publishable", meaning that it will be included in the next
scheduled auto-publication of the summary newsletter. Marking a
summary as publishable doesn't mean she has to stop working on that
summary; it just means that whenever the newsletter goes out, the
current state of the summary will be used. The summarizer can also
write a "header" and "footer" summary for the list for that week, to
give an overview of what list activity has been like. Like the
individual thread summaries, these overviews are not published until
marked as publishable.
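The publishing rule described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not SummaryDesk's implementation: the `Summary` class, its field names, and `build_newsletter` are all hypothetical, and the real system keeps this state in its database.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    title: str                 # thread subject, or "header" / "footer"
    text: str = ""             # summaries start out empty
    publishable: bool = False  # set by the summarizer when ready


def build_newsletter(header, thread_summaries, footer):
    """Assemble the newsletter from whatever is currently marked publishable."""
    parts = []
    if header.publishable:
        parts.append(header.text)
    for summary in thread_summaries:
        if summary.publishable:
            parts.append(f"{summary.title}\n{summary.text}")
    if footer.publishable:
        parts.append(footer.text)
    return "\n\n".join(parts)


header = Summary("header", "A quiet week.", publishable=True)
footer = Summary("footer", "See you next week.")  # not yet publishable
threads = [
    Summary("Build fails", "Fixed by Alice.", publishable=True),
    Summary("Release plan", "Still being discussed."),
]

# Editing after marking is allowed; the current text is what gets published.
threads[0].text = "Fixed by Alice in the nightly build."

newsletter = build_newsletter(header, threads, footer)
print(newsletter)
```

The point of the sketch is that "publishable" is a flag on the summary, not a snapshot: the newsletter always picks up the latest saved text.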
SummaryDesk stores all its data in a database. It is a
self-updating system: that is, no manual update process is required
when SummaryDesk comes back online after being offline for a while.
SummaryDesk just looks at the mailing list archives and brings itself
up-to-date whenever it is invoked. (Strictly speaking, it doesn't look
directly at the archives; it looks at the ThreadFind reflection of the
archives. See Dependencies for more on that.)
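One way to picture "self-updating" is as an idempotent catch-up step that runs on every invocation. The Python sketch below is only a model of that idea under stated assumptions: a dict stands in for the database, a list of `(message_id, thread_id)` pairs stands in for what ThreadFind exposes, and the names `catch_up` and `invoke` are hypothetical.

```python
def catch_up(stored_ids, archive_messages):
    """Return the archive entries that are not stored yet, in archive order."""
    return [(mid, tid) for mid, tid in archive_messages if mid not in stored_ids]


def invoke(database, archive_messages):
    """Bring the stand-in database up to date; return how many messages were added."""
    new = catch_up(set(database), archive_messages)
    for message_id, thread_id in new:
        database[message_id] = thread_id
    return len(new)


# Stand-in state: one message was stored before the system went offline.
database = {"m1": "t1"}
archive = [("m1", "t1"), ("m2", "t1"), ("m3", "t2")]

first_run = invoke(database, archive)   # picks up everything that was missed
second_run = invoke(database, archive)  # nothing left to do
print(first_run, second_run)
```

Because the step compares stored state against the archive rather than relying on a log of what happened while offline, running it again changes nothing.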
To-Do List
As you may have guessed by now, SummaryDesk is not a
production-ready system yet. Remaining work, in no particular
order:
- Text and XML output formats for the publication system, and toggling
of the publishable fields in the database.
- Beautify the HTML pages.
- List messages in thread order.
- Devise a more efficient way of displaying threads in the
summary-status page. (This problem becomes apparent as one starts
doing summaries.)
- More keybindings (similar to Emacs?) so that the user does not need
the mouse to navigate between the message-list and the
summary-editor pages.
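For the "thread order" item, one possible approach is a depth-first walk over reply relationships. The Python sketch below assumes each message is known as a `(message_id, in_reply_to)` pair; that representation and the function name `thread_order` are assumptions for illustration, not something SummaryDesk or ThreadFind is known to provide.

```python
from collections import defaultdict


def thread_order(messages):
    """Return (depth, message_id) pairs in depth-first thread order.

    messages: list of (message_id, in_reply_to) pairs in arrival order;
    in_reply_to is None for a message that starts a thread.
    """
    known = {mid for mid, _ in messages}
    children = defaultdict(list)
    roots = []
    for mid, parent in messages:
        # A reply whose parent is missing from the list is treated as a root.
        if parent is None or parent not in known:
            roots.append(mid)
        else:
            children[parent].append(mid)

    ordered = []
    stack = [(0, mid) for mid in reversed(roots)]
    while stack:
        depth, mid = stack.pop()
        ordered.append((depth, mid))
        # Push replies in reverse so the earliest reply is visited first.
        for child in reversed(children[mid]):
            stack.append((depth + 1, child))
    return ordered


# Hypothetical arrival order: "d" replies to "b", which replies to "a".
messages = [("a", None), ("b", "a"), ("c", None), ("d", "b"), ("e", "a")]
ordered = thread_order(messages)
for depth, mid in ordered:
    print("  " * depth + mid)
```

The depth value can drive indentation in the message list, so replies appear under the message they answer.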
Dependencies
SummaryDesk uses ThreadFind to actually gather the messages. ThreadFind is an
independent system whose purposes are beyond the scope of this
document. However, having SummaryDesk watch a mailing list requires
also having ThreadFind watch that list; this is easy to configure and
will be covered in the documentation, which we're still writing.
How to get it working, from scratch.
Create the database users.
Make sure the MySQL users 'summarydeskrw' and 'summarydeskro'
exist, that the first has read/write access to the database
named summarydesk (created in the next step), and that the second
has read-only access:
$ mysql -u root -p
Password: *******
mysql> grant all on summarydesk.* to summarydeskrw@localhost
identified by 'SECRET';
mysql> grant select on summarydesk.* to summarydeskro@localhost
identified by 'SECRET';
mysql> ^D
$
Create the database.
$ echo "create database summarydesk;" \
| mysql -u summarydeskrw --password=SECRET
$ cat init-summarydesk.sql \
| mysql -u summarydeskrw --password=SECRET summarydesk
Configure an instance of ThreadFind (http://threadfind.tigris.org/).
Configure your Web server for SummaryDesk:
Alias /summarydesk /path/to/summarydesk/folder/ending/with/summarydesk
<Directory /path/to/summarydesk/folder/ending/with/summarydesk>
    Options Indexes ExecCGI
    <FilesMatch "^summar">
        SetHandler cgi-script
    </FilesMatch>
    <FilesMatch "^message-list$">
        SetHandler cgi-script
    </FilesMatch>
    <FilesMatch "^publish$">
        SetHandler cgi-script
    </FilesMatch>
    <FilesMatch "^mailing-list-view$">
        SetHandler cgi-script
    </FilesMatch>
</Directory>
Run summarydesk-ctl -c config-file [-d DD-MM-YYYY] start
Start summarizing at http://yourhosthere/summarydesk/ !
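The FilesMatch patterns in the Web server configuration above decide which files in the SummaryDesk directory are run as CGI scripts. The short Python check below mimics that matching on bare file names so the effect of each pattern is easier to see. It is only an approximation: it uses Python's `re` module rather than the server's own regex engine, and the file names tested are examples, not a listing of the actual distribution.

```python
import re

# The four patterns from the <FilesMatch> sections in the configuration.
CGI_PATTERNS = [r"^summar", r"^message-list$", r"^publish$", r"^mailing-list-view$"]


def is_cgi(filename):
    """True if any FilesMatch pattern matches the bare file name."""
    return any(re.search(pattern, filename) for pattern in CGI_PATTERNS)


# Example names only; which files really exist depends on the installation.
for name in ["summary-status", "message-list", "publish", "publish.bak", "README"]:
    print(f"{name}: {'cgi-script' if is_cgi(name) else 'served as a normal file'}")
```

Note that `^summar` is unanchored at the end, so it covers every file whose name begins with "summar", while the other three patterns match exactly one name each.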