CSL 85 9_Walnut_Storing_Electronic_Mail_in_a_Database 9 Walnut Storing Electronic Mail In A Database

CSL-85-9_Walnut_Storing_Electronic_Mail_in_a_Database CSL-85-9_Walnut_Storing_Electronic_Mail_in_a_Database

User Manual: CSL-85-9_Walnut_Storing_Electronic_Mail_in_a_Database

Open the PDF directly: View PDF PDF.
Page Count: 20

Walnut:
·Storing Electronic Mail
in
a Database
James Donahue and Willie-Sue Orr
Walnut: Storing Electronic Mail
in
a
Database
James Donahue and Willie-Sue Orr
CSL-85-9 November 1985 [P85-00028]
© Copyright 1985 Xerox Corporation. All rights reserved.
Abstract:
Walnut is
an
electronic mail storage and retrieval system developed for the Cedar
environment. The most novel aspect of Walnut is its use of a general·purpose relational
database for storage of messages. This paper discusses the design and implementation of
Walnut, with particular attention to the problems of using a database system for this type of
application.
CR
Categories
and
Subject
Descriptors:
H.2
[Database
Management]:
Systems -
transaction processing; H.4.3
[Communications
Applications]:
Electronic mail.
Additional
Keywords
and
Phrases:
Cedar, integrated applications, background
processing.
XEROX
Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill
Road
Palo Alto, California 94304
WALNUT:"
STORING ELECTRONIC MAIL
IN
A
DATABASE
3
designers
was
the emphasis
placed
on
asynchrony.
One
of
the
main
goals
of
the
Cedar
system
was
easy
switching
between
tasks
and
allowing
the"
user
to
work
on
new
tasks
while
previously
started
ones completed. This emphasis
imposed
several
requirements
on
Walnut:
1.
The user
can
view
several
components of a Walnut database
simultaneously.
Thus,
Walnut
operations must
be
prepared
to
keep
the
contents of multiple
windows
up-to-date;
we
guarantee that
the
effect
of
any
user
operation
on
the
database
will
be
consistent
with
the
portion of the database that
is
currently
displayed.
2.
More
importantly, Cedar
users
want
to
leave
Walnut running constantly-they don't
want
to
have
to
restart the
system
to
begin
reading their
mail
again.
Much
of
what
we
do
can
be
characterized
as
responding
to
our
mail,
either
by
answering
messages
directly
or
incorporating information
from
our
mail
into the
programs
and
papers that
we
write.
Because
we
store Walnut databases
on
a shared
file
server,
it
is
important
to
be
able
to
give
up
the connection to the server
when
the
user
is
idle.
Providing
the
consistency
guarantee
stated
above,
while
not
going
to
the
extremes
of either
holding
onto
long
transactions or
redisplaying the contents of
several
Walnut
windows
when
starting
up
after a
long
idle
period,
was
a major
design
problem.
3.
Finally,
having
plenty
of
things
to
do
is
no
help
when
the
only
thing
that
you
really
want
to
do
takes
a
very
long
time.
Walnut
has
two
operations that
are
potentially
much
more
expensive
th~n
all
of
the others: reading
mail"
from
mail
servers
and
expunging
old
mail
from
the
system.
New
mail
retrieval
is
the
more
important,
because
it
is
done
frequently
by
most
users.
We
currently
use
background
processing
to
add
new
mail
to
the
system
and
have
designed (but not
yet
implemented) the background
collection
of
space
consumed
by
expunged
messages.
In
this
way,
we
were
able
to
smooth
out
the
performance
of
the
system
substantially;
however,
as
we
discuss
below,
the
cost
of
these
improvements
in
perceived
performance
is
a rather substantial complication
in
the
internal
processing
of
the
system.
2.2
The Mail Distribution Facilities
Walnut
is
only
a
mail
storage and retrieval
system;
it
relies
on
the
Grapevine
message
transport
system
to
provide the
new
mail
to
store
in
a Walnut database and
to
deliver
the
messages
that a
Walnut user
composes.
Grapevine
is
discussed
in
[Birre1l83].
It
is
used
by
Walnut through a
GrapevineUser
package
that
runs
on
the
client workstation;
for
the
purposes
of
this
paper, the
GrapevineUser
package
provides
four
operations:
1.
Wait until
new
mail
exists
for
the
current user
on
qne
or
more
of
the
Grapevine
servers,
2.
" Enumerate
the
servers
on
which
the
current
user
has
a
mail
box
(in
which
there might
be
new
mail)
and establish a connection
with
each
of
them
in
turn,
3.
Copy
messages
from
the current server
to
a
file,
and
4.
Flush previously copied
messages
from
the
server.
(Copying
and
then
flushing
allows
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
WALNUT: STORING ELECTRONIC MAIL
IN
A
DATABASE
protection against crashes.)
It
is
easy
to write a "new
mail
retrieval" section
of
Walnut that
would
perform correctly:
when
new
mail
exists,
copy
the
new
mail
into the database and then
flush
the
messages
from
the Grapevine
servers. However, there are
some
performance characteristics
of
Grapevine that
have
to
be
considered
when
building a responsive
system.
1.
The Grapevine servers are slower than most
of
our workstations (and the servers
are
generally
heavily
loaded). Thus, fetching
new
mail
is
relatively
slow
-it
can
take several
minutes
to
retrieve
new
messages
when
returning
from
a
vacation
or a
long
weekend.
2.
Grapevine server crashes
are
rare, but
they
do
occur. Unfortunately, a Grapevine server
may
crash between the time at
which
the
user has been told that
new
mail
exists
and the
time at
which
he attempts to
fetch
it.
And
when
this happens, the error indication does
not happen until the connection attempt times out a
few
minutes later.
Both
of
these considerations make the cost of retrieving
new
mail
hard to predict.
If
the servers
are up and lightly loaded, and there aren't
many
messages
to
fetch,
and
you
are
near the server
on
the network, and
...
, it
may
go
quickly.
On
the other hand, there
are
frequently situations
in
which
new
mail
is
going
to
arrive
slowly,
or
possibly
not at
all.
As
we
discuss
later,
we
use
background
processing
in
Walnut
to
attempt to minimize the variance
in
responsiveness this
causes.
2.3 File Storage and Databases
in
Cedar
It generally takes three or more machines to read
mail:
one or more Grapevine servers (where
the user's
mail
boxes
reside), one or more
file
servers
(where the user's
mail
database resides), and
the user's workstation (which
is
running the Walnut program).
"
2.3.1
Alpine
Walnut stores
mail
files
on
Alpine
file
servers
[Brown84].
Alpine
is
a
file
server program (written
in
Cedar) that provides page- and
file-level
locking and transactions: a general-purpose remote
procedure
call
package
is
used to transfer
pages
to
and
from
Alpine
servers.
Walnut
uses
Alpine
transactions
to
prevent inconsistent updates due
to
either concurrent
access
or server crashes.
Although Alpine
can
"run
on
a client workstation (it
will
coexist
with
the normal Cedar
file
system), most Walnut users store their
files
on separate Alpine server
machines.
The reasons
for
storing
files
on
a server are simple: backup
of
the server
is
done
by
the administration and the
Walnut user
is
then
free
to
read
his
mail
from
any
Cedar machine that happens
to
be
available. The
ability to
easily
move
among machines
is
particularly important
for
our summer interns,
who
typically
do not have personal workstations at their
disposal
but must share a
pool
of public machines;
using
a shared server
also
makes it possible
to
have public
mail
databases.
However,
using
a
file
server
machine introduces some interesting failure modes and makes it
necessary
to
manage the connections
with the shared resource.
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
WALNUT:
STORING ELECTRONIC
MAIL
IN A DATABASE 5
2.3.2 Cypress
Walnut manipulates Cypress databases [Catte1l83]. Cypress
is
an entity-relationship database
system developed
in
Cedar.
The
details
of
the Cypress data model are not particularly important;
almost any relational system would have served us as well. Cypress provides a fairly low-level
programmer's interface, which
is
important since
we
use the database to record and to manipulate a
large amount
of
information about the state
of
Walnut processing. The database
is
a natural place
to store state information and the Cypress programmer's interface
is
efficient enough to make this
practical.
Cypress currently runs on the client workstations,
i.e.,
the workstation running Walnut also runs
Cypress. We have plans to change Cypress so that, like Alpine, it can run on a separate server;
when this
is
done,
we
will
be able either to run Alpine and Cypress on the same machine (as a
"database server")
or
to run Alpine on one machine, providing only
file
service and run Cypress on
a separate machine, communicating with both the client .machine (running Walnut) and the Alpine
servers using remote procedure calls.
Cypress uses Alpine transactions in a straightforward manner: instead
of
doing logical locking
of
tuples
or
relations, Cypress simply locks the pages
of
the database
as
tuples are read
or
written.
Because the mapping from Cypress entities and relationships to Alpine pages
is
not under the control
of
the Cypress client, the client cannot depend on logically independent operations to be physically
independent on the Alpine server. (Cypress attempts to perform some co-location
of
tuples for the
same entity on the same page,
but
this
is
done to improve performance rather than to reduce
transaction conflicts.)
3.
The Walnut User Interface
As
we
mentioned above, the user's view
of
a Walnut database
is
relatively straightforward. A
database consists
of
messages with immutable properties and message sets with changeable contents.
This
is
reflected in the user interface
by
having three separate types
of
windows on a Walnut
database:
1.
The
Walnut "control window," which contains the list
of
all
the current message sets (in
Figure
1.
the built-in message sets Active and Deleted are highlighted), buttons for basic
operations such
as
shutdown and retrieval
of
new mail, and a typescript that records the
history
of
user interactions.
2.
A Walnut message set displayer lists the messages in a message set and provides a menu
of
operations for them (Figure
2).
The message names displayed
in
italics have not been read:
the message name displayed
in
bold
is
the "current selection"
in
the message set. (The
choice
of
these fonts
is
a user-specified option.) The operations
in
the menu
use
the
"current selection"
as
an implied argument: the AddTo and MoveTo operations use the
selected message sets
in
the Walnut control window
as
arguments. A message
in
a message
XEROX PARe.
CSL-85-9.
NOVEMBER
1985
6 WALNUT: STORING ELECTRONIC MAIL
IN
A
DATABASE
; , _
1j.i
'
iialnut
6.0 : "
Sender
N
ewMail
CloseAll
IvIs
gSetOps
NumMs
gs
Explnfo
Find
There
is
no
nevI
mail
at
.November
11, 1985 5:01:47
pm
PST
Act
i
ve
De 1
eted
Addresses
ADForum
Alp;
ne
Arpanet
Banyan
Bass
b;
b
1;
ography
Bu
11
et;
nBoard
Cedar6.
0
CedarArch
i
ve
CedarD;
scuss
i
on
CedarLang
CedarLore
CedarLoreNew
CedarP
1
an
Chr
i s
C8L
Cypress
DBP
1
ans
Disks
DocMgmt
Dragon
finger
Grapev
i
ne
Hi
ckory
icons
I
ntegrat;
on
I
nterest
I
nterscr;
pt
Junk
Object8erverDiscussion
Oct17MeetingMinutes
Projects
Recru
i t i ng
Russe
11
8em
i
nars
80ftwareForum
8qu;
rre
1
TechReports
Tr
i
ps
Types
UGrant
UN
I X
~~a
1
nut
Wa
1
nut8tats
~~a
1
nut8uggest;
ons
Wh;
teboards
~~;
shL
;
st
Addin
g
new
mail
to
the
database
Sal v
ador
.ms
delivered
1
message
Msg:
Ending
backup
(file
s
...
:
has
been
Deleted
from
Active
Copyright © 1985 by Xerox Corporation.
All
rights reserved
Figure
1.
The Walnut Control Viewer
set can be displayed simply by pointing the mouse at it and clicking one
of
the mouse
buttons.
3.
A Walnut message displayer. which displays the text
of
a message (Figure
3).
The menu
items in the window include operations to generate message-sending windows to Answer or,
Forward a message. a Print button for producing hardcopy. and operations to manipulate
the contents
of
the message
as
a normal Cedar document (the "Places" and "Levels"
buttons). Note that messages in Walnut can be formatted
documents-the
mail system uses
the same editor as the rest
of
Cedar.
,
V'~jI'hi
te boa.rds IvIessa.ges
'"
"..
..
-'
. >
MoveTo
Display
Delete
AddTo
Places
Levels
MsgOps
7
Aug
85
To: CedarUs...
Whlteboard
for
CedarChest
6.0
13
Aug
85
PeterKessler.pa
WhiteBoardDoc
17
Sep
85
jw@GVAX""
Whiteboards
paper
2.4
Sep
85
To: jw@G... Re:
Whiteboards
paper
2.5
Sep
85
jw@GVAX....
Found
it
...
2.6
Sep
85
jw@GVAXIIII
Version
Starnps
and
The
Promised
Paragraph
2.6
Sep
85
To: jw@G... Re:
Version
Stamps
and
The
Promised
Paragraph
'/7
Oct
85
To.'
Barth
Re.·
Core
discussion
17
Oct
85
Sprei
tzer
.pa
Re: Core discu.ssion
11
Oct
8S
Spreitzer.pa
Re:
Core
discussion
18
Oct
85
Barth.pa
Large
text
boxes
on
whiteboards
18
Oct
85
To:
Barth
Re:
Lar
ge
text
boxes
on
white
boards
25
Oct
85
To.'
CedarUse...
Hthiteboard
for
CedarChest
6.0
Copyright © 1985 by Xerox Corporation. All rights reserved
Figure
2.
A Walnut Message Set Viewer
XEROX
PARC,
CSL-85-9.
NOVEMBER
1985
WALNUT: STORING
ELECTRONIC
MAIL
IN
A
DATABASE
Freeze
Categorles
Answer
Forward
Prmt
gvID Split Places Levels
Date:
Mon~
4 Nov
85
14:34:43
PST
From:
Terry.pa
Suhject:
J.
Ousterhout
on
"The Sprite Network Operating System," December
10
at
2.
p.m.
in
the
CSL
Commons
To:
Compu terResearch t .pa
Reply-to:
Terry.pa
I am
announcing
this
now
for those
of
you
that
may
have
been
planning
to
drive
to
IBM
to
hear
John's
talk. I'll send
out
a reminder
in
a few weeks
...
Tuesday~
December
10~
1985
2.
p.m.~
CSL
Commons
(35-2.2.30)
THE
SPRITE
NETWORK
OPERATING
SYSTEM
John
K.
Ousterhout
Computer Science Division
University
of
California~
Berkeley
Sprite is a new network operating system built by a team
of
graduate students and myself as part
of
the
SPUR
workstation project. The talk will focus
on
three parts
of
Sprite: the filesystem, process
offloading, and the virtual mem.ory system. The filesystem will provide a single· shared Unix-like file
hierarchy distributed across several servers. Clients will use dynamically-constructed prefix tables
to
associate file names with servers. Sprite will include a process offloading mechanism that allows jobs
to
be run on idle workstations in the same way that jobs may be placed in background
in
Unix. The
virtual memory system will be Unix-like with simple extensions
to
permit shared data segments and
synChronization.
I'll
talk about how changes
in
technology have influenced the design
of
Sprite and
try
to
convince you that
a)
a simple network filesystem eliminates the need
for
other network protocols,
RPC,
name servers, and the like; b) magnetic disks will soon be obsolete; and
c)
paging is also about
to
be obsolete (sort
of).
Host: Doug
Terry
Copyright
© 1985
by
Xerox
Corporation.
All
rights
reserved
Figure
3.
A
Walnut
Message
Viewer
7
Something that
is
missing from the user interface for Walnut
is
a more general mail database
querying facility than simply being able to display the contents
of
a message set. A query .
facility has been built for Walnut. but
it
is
not an integral part
of
the system (and currently
is
not widely used). There are two reasons why
we
did not build a general query processor
as
part
of
Walnut:
1.
Our
first goal was to get the underpinnings
of
the system right before adding the substantial
complexity that a queryer would entail. We were conservative in
our
ambitions because
of
the importance
our
users place on high reliability and performance.
2.
More importantly, a good test
of
the programmer's interfaces to Walnut
was
whether
we
could build a query processor on top
of
Walnut, rather than inside it. Being able to do so
made
it
much easier to modify the queryer
as
we
gained experience with it and made it
possible for the Walnut users who did not need
or
want a query processor to avoid paying
any performance penalties. We
will
discuss below the query processor and the other
packages built on top
of
Walnut.
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
8 WALNUT: STORING
ELECTRONIC
MAIL
IN
A
DATABASE
4.
The
Walnut
Internal Structure
4.1
Files
Instead
of
storing
all
of
the information pertaining to messages and message sets
in
a Cypress
database. Walnut uses multiple
files
for storage: a Cypress database
file
and a collection
of
log
files.
Both the logs and the database are stored on Alpine servers. The most important
of
the
log
files
is
the "current log" file, which contains both the contents
of
all
of
the messages and a history
of
all
of
the operations that a Walnut user has performed (expunging the database compresses this history
by
discarding entries for deleted messages); thus, the contents
of
the database can always be reconstructed
by
replaying the current
log.
The database contains the current association
of
messages with message
sets and the information necessary to display messages
(e.g.,
the label for a message
in
a message set
viewer and the position
in
the log
of
the message body). Additional log
files
are used to store
incoming messages and to execute
an
expunge operation -the details
of
their use
is
given below.
This separation
of
the information into several
files
was
motivated
by
several concerns:
1.
Cypress, like most database systems, does not handle large varying length strings
well.
We
knew that Walnut would have to handle large messages,
so
the use
of
a separate
log
to store
the message bodies
was
necessary. Messages
of
approximately half a megabyte have been
sent and received
by
Walnut; the system
was
designed to handle messages much larger.
2.
Initially, there were some worries about the reliability
of
Cypress; indeed, a
few
bugs were
found in Cypress when Walnut use became heavy. Having a separate log meant that one
could always
fall
back to reconstructing the database rather than having to find
ways
of
recovering data from damaged databases. Reliability
was
critical
in
the acceptance
of
the
system; our users were willing to have the system crash
if
the recovery
was
simple. and
certain.
3.
Having a log also meant that
we
had more freedom
in
changing the database schema,
something that
we
have done during the course
of
the Walnut development.
We
can release
a new version with a change
in
the database schema without issuing complicated instructions
to our users. Instead,
we
record
in
the database a "database schema version stamp" that
is
checked by each Walnut release; having the wrong version stamp
is
logically no different
than having a mangled database -
in
each case, recovery
is
done
by
replaying the
log.
(Still,
rebuilding databases
is
not something that
we
do regularly; most Walnut users rebuild only
when forced to do
so
because
of
serious hardware or software failures.
In
particular,
we
do not depend on rebuilding daiabases to reclaim
file
space.)
A "root file"
is
used to provide a logical connection between the log
files
an~
the database.
Again, the root file
is
stored on Alpine. The root
file
contains the name
of
the database
file
and the
list ·of the log files; additionally it records which user
may
add
new
mail to the database and a
user-specified "key" that
is
stored
in
all
of
the
log
files
and
in
the database to help guarantee that
the logs and database mentioned
in
the root really do belong together (this
key
is
usually some
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
WALNUT: STORING ELECTRONIC MAIL
IN
A
DATABASE
9
descriptive phrase, such
as
"Donahue's mail database").
The fact that the information
in
a Walnut database
is
spread over several
files
is
carefully hidden
from both the Walnut client programs and (as best
we
can)
from
the user. The root
file
is
really
the
file
that describes a Walnut
database-it
is
the responsibility
of
the
low
levels
of
Walnut to
interpret the contents
of
the root
file
and to make the collection
of
files
described therein behave
as
if
all
of
the data were stored in a single database.
4.2
Programmer's
Interfaces
Walnut makes available three programmer's interfaces. Together, these capture the semantics
of
a Walnut database
as
a collection
of
"abstract datatypes." The lowest level interface, WalnutOps,
provides all
of
the operations that can be legally performed on a Walnut database. The user level
interface, WalnutWindow, gives a client program access to the same set
of
operations that a Walnut
user can perform through the buttons
on
the screen. Finally, the WalnutRegistry interface provides
a client program with a history
of
the WalnutOps operations that are performed
so
that the state
transformations
of
the database can be monitored.
4.2.1
WalnutOps
The semantics
of
a Walnut database are
completely
characterized for Walnut clients by the
WalnutOps interface. WalnutOps includes
all
of
the operations needed for the Walnut user interface
code (such
as
"get the contents
of
this message," or "move this message to this message set") and
all
of
the operations that are used to synchronize the various Walnut processes (more about this
will
be given below). The operations
in
WalnutOps can be separated into the following categories:
1.
Starting and
stopping
Walnut:
The StartUp operation takes the name
of
a root
file
and
initializes the system to operate on the database and
log
files
named
in
the root; the
ShutDown operation shuts down all
of
the server
c<?nnections
and prohibits
all
other
operations until a subsequent StartUp
is
performed.
2.
Primitive
message
set
operations:
Given the name
of
a message set, the operations
CreateMessageSet, DestroyMessageSet, and
EmptyMessag~Set
do the obvious things;
additional operations are provided to enumerate the messages in a message set, to move
messages between message sets, and to add or remove a message from a message set.
Although message sets are entities in the Walnut database schema, there
is
nothing in the
WalnutOps interface that commits
us
to this
choice-the
interface only uses names. Thus.
we
could change the Walnut schema to use a
diff~rent
encoding
of
message sets without
changing the abstract characterization provided
by
WalnutOps.
3.
Primitive
message
operations:
The only operations currently provided on messages are ones
that produce the contents·
of
a message and the message sets to which a message belongs.
4.
New
Mail:
WalnutOps contains no operations to retrieve messages from the Grapevine
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
10
WALNUT: STORING
ELECTRONIC
MAIL
IN
A
DATABASE
servers.
Instead, it provides operations
to
obtain a
write
lock
on
the
new
mail
log,
to
mark
a portion of the
new
mail.log
as
information
to
be
preserved,
and
to
copy
the
contents
of
the
new
mail
log
onto the current
log
(to
fetch
new
mail).
StartNewMail
opens
the
new
mail
log
for
writing;
EndNewMail
operation
closes
the
new
mail
log.
GetNewMail
first
copies
the contents
of
the
new
mail
log
onto
the
current
log
and
resets
the
new
mail
log
to
be
empty,
then
replays
the
resulting
new
"log
tail"
to
add
the
new
messages
to
the database.
and
finally
returns information about
the
new
messages.
Finally,
the
RecordNewMailInfo
operation
records
in
the
database
the
current
length
of
the
new
mail
log.
After
RecordNewMailInfo
finishes,
messages
that
have
been
written
to
the
new
mail
log
can
be
flushed
from
Grapevine -
they
are
guaranteed
to
eventually
be
entered into
the
Walnut
database.
When
restarting Walnut after a
crash
the
new
mail
log
is
shortened
to
the
length
given
by
the
last
RecordNewMailInfo operation
(since
this
is
the
position
at
which
messages
may
last
have
been
flushed
from
Grapevine); truncating
the
log
prevents
large
numbers of
duplicate
messages
from
being
created
in
the
event of a
crash
while
reading
new
mail.
4.2.2
Walnut
Window
For
many
client programs, WalnutOps
is
at
too
Iowa
level;
for
instance,
the
Cedar calendar
system
simply
needs
to
be
able
to
display
a
named
message.
WalnutWindow
provides
program
access
to
all
of
the
user-level
operations,
such
as
displaying
messages,
reading
new
mail,
etc.
The
most
interesting operation
in
the
interface
is
QueueCall.
which
provides
a
WalnutWindow
client
with
the
means
of
executing a "Walnut
user
level
transaction."
QueueCall
takes
as
its
argument
an
arbitrary procedure.
which
is
called
atomically-no other concurrent Walnut
user
activity
is
allowed.
The procedure provided
can
perform arbitrary
WalnutOps
operations.
so
it
can
change
the
database
in
an
arbitrary
fashion;
when
the
call
is
finished.
WalnutWindow
ensures
that the current
state
of
the
display
is
consistent
with
the
new
state
of
the
d~tabase.
4.2.3
WalnutRegistry
Cypress
provides
no
provision
for
"triggers" to.
be
stored
in
the
database,
yet
we
have
found
need
for
other applications
to
be
notified of
changes
in
a Walnut
database.
For
example.
we
can
attach
voice
annotations
to
electronic
mail;
to
ensure that
the
space
taken
up
on
a
Voice
File
Server
[Swinehart84]
is
reclaimed
when
the
message
to
which
an
annotation
is
attached
is
expunged. it
is
necessary
for
the
voice
mail
program
to
track
the
movement of
messages
through
a Walnut
database.
This
is
done
through the WalnutRegistry interface. WalnutRegistry
allows
a client
program
to
register a
collection
of
procedures
to
be
called
when
events
of interest
occur;
these
events
include
the starting and stopping
of
Walnut. the
reading
of
new
mail.
the
movement
of
messages
among
message
sets,
and
the
creation or deletion of
message
sets.
When
a
WalnutOps
operation
is
performed
that
causes
an
event of interest
to
occur.
the
set
of
procedures registered
for
the
event
are
invoked.
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
WALNUT: STORING
ELECTRONIC
MAIL
IN
A
DATABASE
11
5.
Implementation
Problems
and
Solutions
5.1
Managing Large Objects
The most obvious effect
of
our concern to handle large. messages
was
the use
of
a separate log
to store the bodies
of
the messages. This made it possible for us to implement straightforward
techniques to manage the storage used for message bodies. Also, this decision also made it easier to
build a robust system (for reasons captured nicely in [Lampson84]).
The combination
of
large size and high volatility makes space reclamation for messages a major
concern. Using a separate log
file
makes it easy to collect unused space
by
rewriting the log to
remove entries for messages that no longer exist in the database. The Walnut expunge code (which
does the "garbage collection") goes through the following phases:
1.
Destroy all
of
the messages
in
the database that belong to the Deleted message set.
2.
Read the root
file
to determine which
file
is
to be the "expunge log." Open the expunge
log for writing.
3.
Parse the current log and copy to the expunge log each entry
in
the current log that refers
to a message that still exists in the database. When copying a message, update the indices
stored in the database to refer to the message's new position in the expunge
log.
4.·
Rewrite the root file to swap the expunge log and the current
log.
The expunge log 'now
becomes the current log. Set the length
of
the old current log (the new expunge log) to
zero, since its contents are no longer important.
We spent a good deal
of
time tuning the parsing routines (and designing the format
of
the log
entries themselves) so that expunging is not prohibitively expensive, even though it must scan the
entire log. Currently, expunging takes approximately two minutes per megabyte
of
log (i.e., it takes
about ten minutes to reclaim the space on a log
of
four
or
five
megabytes; there
is
substantial
variation in this figure depending on how busy the Alpine servers. are). This
is
sufficient performance
that people can start expunges when they go to lunch
or
to a meeting and have the expunge
completed with certainty by the time they return. However,
we
also have designed a background
collection algorithm that would do the copying
as
a background process; to handle databases larger
than around
20
or
30
megabytes, background collection would be necessary.
Below,
we
sketch some
of
the basic ideas
of
how this could be done.
The first idea
is
to generalize the single "current log"
of
Walnut to a collection
of
log files that
together represent the current log. This would allow all
of
the log files except that last one to be
rewritten by a background garbage collector without having the collector attempting to rewrite the
file being used to log user actions. The format
of
the "root file"
was
designed with this
in
mind:
while
we
currently only use entries for the current log and the expunge log,
we
could use the same
format to describe a sequence
of
log files.
The hardest part in doing background operations
of
this sort
is
to minimize the
ch~mces
of
XEROX
PARC,
CSL-85-9,
NOVEMBER
1985
12
WALNUT: STORING
ELECTRONIC
MAIL
IN
A
DATABASE
conflict between the background process (which
will
have its own transaction) and the normal
processing.
To
do this for expunging.
we
split the "copy and update" step
of
the current expunge
algorithm into separate copy and update steps: the destruction
of
the messages in the Deleted message
set
is
still performed synchronously, but the cost
of
this operation
is
determined
by
the size
of
the
Deleted message set rather than the size
of
the entire database. The copy step parses a
log
segment
(one
or
more log files) and produces a compressed version
of
it-copying
needs to read the database
but does not write into it. And the reading
of
the database does not need to be done under the
copy transaction. since a "fuzzy dump"
is
adequate:
we
only need to ensure that all existing messages
get copied. Thus, this copying can use the "normal processing" transaction when reading the
database; using the same transaction
as
the normal processing guarantees that no transaction conflicts
can occur with the user's activities.
The update step
is
then performed. The compacted
log
is
again parsed and each time a message
entry
is
found in the
log,
the database
is
updated to point to the new location. Here the updates
need to be done atomically, so it
is
possible that there may be conflict with normal processing.
However.
as
we
discuss below, these transaction conflicts can be effectively hidden from the user.
Finally. the length
of
the old log
file
is
set to zero and the root
file
is
rewritten to make the old log
file the new "expunge log." (This avoids the cost
of
a
file
creation when a
log.
compression is begun.)
Managing large objects also involves another, less obvious, problem:
if
the objects are very
large. even simple operations can no longer be performed
in
a single transaction. For example, the
GetNewMail operation o(WalnutOps copies newly retrieved messages from the
new
mail
log
to the
current log; even
if
the new mail log contains only a single message, it may not be practical to
perform the copy in one transaction (the cost
of
redoing the copy
if
the transaction aborts may be
too high to be practical). The major technique used in Walnut to ensure that the system can be
used to manage large messages (as
well
as
large numbers
of
messages)
is
to break up any potentially
long-running operation into a sequence
of
constant-sized pieces such· that the transaction can be
committed after execution
of
one (or more) piece.
Breaking long transactions into small pieces
is
an idea that
is
widely used in high-performance
systems such
as
airline reservation systems (for instance. [Gifford84]). but it came
as
a surprise to
us
that an electronic mail system would have the same requirements! One corollary
of
breaking all
operations into constant-sized chunks
is
that Walnut must do its own crash recovery. A crash may
occur during a long-running operation that has partially completed; since one
or
more transactions
may have committed during the operation, it
is
necessary to complete the operation before further
processing can be performed.
So
when a long running operation (like copying the
new
mail log)
is
begun. it
is
logged and recorded in the database; the progress·
of
the operation
is
also recorded
as
transactions are committed that complete portions·
of
it. (The progress need not be recorded each
time a transaction commits, but only frequently enough
so
that substantial amounts
of
work are not
lost.) When a restart
is
performed following a crash, Walnut ensures that any in-progress operation·
is
completed before control
is
returned to the higher levels
of
the system that handle user requests.
XEROX
PARC,
CSL-S5-9.
NOVEMBER
1985
WALNUT: STORING ELECTRONIC MAIL
IN·
A
DATABASE
13
5.2 Caching and Transactions
The serializability guarantee provided
by
any transaction system
is
a weak one:
if
the transaction
commits, then it must have been serializable with other activity
in
the system. The user would like
to believe that serializability
is
sufficient to prevent aborts;
in
particular.
if
the user
is
manipulating
a private database, he generally never expects to see an aborted transaction. But there
may
be many
conditions under which the transaction system may abort a transaction for reasons other than
interference with another transaction:
1.
The server may recover from a crash by aborting
all
uncommitted transactions,
2.
Since Alpine pages do not correspond nicely to Cypress entities and relationships, Alpine
may abort a transaction that
is
logically independent
of
other activity, but happens to
manipulate the same physical page
as
another transaction.
3.
The server may time out an idle transaction by aborting it.
4.
The current Alpine backup system aborts all transactions on a
file
before copying it.
While the page-level transactions
of
Alpine make it somewhat more likely that a spurious
transaction abort
will
occur, in any system there
will
be some ,cases
in
which a transaction
will
be
aborted even though there
was
no interference. In a system such
as
Walnut, where most
of
the time
users are browsing essentially private databases, it
is
important to handle aborted transactions to
minimize the possibility that spurious aborts
will
be reflected back to the user.
The other problem that transactions cause for systems like Walnut involves cache management.
The collection
of
messages, message sets, and message set buttons that are displayed
is
a large cache
of
data extracted from the database; when a transaction
is
aborted (for whatever reason), this cache
must be revalidated (at least) or recomputed (at worst). Since Walnut cannot control
how
large this
cache
is
(we can't limit the number
of
message sets that the user has displayed),
we
must ensure that·
the cache revalidation
is
extremely efficient.
It
doesn't do to recompute the contents
of
the screen
when an aborted transaction occurs.
Walnut uses version stamps on each message set and on the set
of
message sets (the buttons
that appear in the control window) to ensure consistency between the display and the database. The
user interface code maintains a set
of
"expected version stamps;" these are checked for consistency
with the database when operations are performed that involve a particular message set (such
as
moving a message from one message set to another) or the set
of
message sets (such
as
adding a
new
message set to the database). In this
way,
we
ensure that the effect
of
operations
as
displayed on
the screen are consistent with the current contents
of
the database. However. changes to the database
that invalidate the contents
of
the display are not noticed until some operation
is
performed that
causes a version stamp to be checked -since most databases' are private,
it
is
generally the case that
this
is
not a problem.
An
aborted transaction
in
Walnut
is
just taken
as
evidence
of
possible
interference; interference
is
certain
only when the expected and actual version stamps disagree.
If
execution
of
a WalnutOps
operation causes a transaction abort and the version stamps agree after opening a
new
transaction,
XEROX
PARCo
CSL-85-9.'NOVEMBER
1985
14
WALNUT:
STORING
ELECTRONIC
MAIL
IN
A
DATABASE
then the operation
is
simply
retried -if it
succeeds,
then
the
transaction abort
must
have
been
spurious. It
is
possible
that
this
retry
logic
on
two
machines
will
produce a
"recovery
deadlock,"
where
each
machine attempts
to
recover,
again
causing
aborted transactions;
however,
we
avoid·
this
by
giving
up
if
the
second
transaction
is
aborted
~
if
this
happens,
the
chances
that
the
interference
is
real
are
very
high.
This
scheme
works
quite
well;
it
is
also
used
in
the
Whiteboards application
[Donahue85b],
where
public
databases
are
more
common.
It
is
also
important
to
note
that
the
version
stamps
used
in
Walnut
cover
many
different
pieces
of
many
different relations; there
is
no
simple
mapping
from
the
underlying structure
of
the
database
to
the collection
of
version
stamps
maintained.
This
suggests
that a database
system
that provided
logical
locking
at the
level
of
an
individual relation
or
tuple
(a
Cypress
"relationship")
might
be
as
difficult
to
deal
with
as
the
page-level
locking
of
Alpine.
For
instance,
it
is
important that
the
system
be
able
to
change
message
pointers without holding
logical
locks.
An
attempt
to
read
a
message
from
the database might
fail
because
another transaction
is
changing
the
message
pointers
in
the
database, but this
is
not a transaction abort that should
be
reflected
back
to
the
user.
Message
pointers
are
not part of
the
user's
view
of
the
database
-
they
are
simply
an
artifact of
the
implementation that should
be
hidden if at
all
possible.
Using
the
version
stamps
to
ensure
cache
consistency
means
that
we
can
also
close
the
transaction
during
long
idle periods;
we
do
not
need
to
hold a
transaction
open
to
ensure
the
consistency
of
the
cached data being
displayed.
The
program
that
implements
the
WalnutOps
interface contains a
simple
watch-dog
process
that
checks
for
long
idle
periods (currently
five
minutes)
and
performs a
shutdown
of
the
system
if it
is
idle.
This
shutdown
is
completely
transparent
to
the
WalnutOps
clients;
when
the next operation
is
executed, the
system
performs
a
normal
startup
before
proceeding~
For
an
application
like
Walnut that
is
idle
most
of
the
time,
this
is
particularly important; it
means
that the utilization
of
the
resources
of
the server
is
determined
solely
by
the
active
Walnut
users,
rather than
by
the
(much
larger)
number of
users
that currently
may
have
a portion of a Walnut
database displayed
on
their
screen.
5.3
Integrating
Applications
We
currently
have
three
fairly
well-developed
database
applications running
in
the Cedar
environment. The
first
is
Walnut.Whiteboards
provides
an
electronic
equivalent of
the
whiteboards
and corkboards that
·we
have
in
our
offices
[Donahue85b].
Finally,
Finger
manages
a
database
of
information
on
machines
and
users
in
our laboratory.
We
also
have
developed
a number of
applications programs that
run
on
top
of
Walnut:
the
two
mostjmportant
are
the
Walnut
query
processor mentioned
above
and a program that
automatically
sorts
incoming
messages
into
message
sets
according
to
user-given
classification
specifications.
Additionally,
Finger
uses
WalnutRegistry
to
record
when
a user
reads
his
mail
and
several
programs
use
WalnutWindow
operations
to
display
data stored
in
a Walnut
database.
This
experience
has
taught
us
some
important
lessons
about
how
databases affect applications development.
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
WALNUT:
STORING ELECTRONIC
MAIL
IN
A DATABASE 15
One
of
the hopes
of
the design
of
Cypress
was
that storing things in a database would make the
integration
of
applications easier. The Cypress report [Catte1l83] argues that the introduction
of
a
common data model should simplify sharing
of
data among multiple applications. The general belief
of
the relational database community seems to be that the simplicity
of
the relational model and the .
elegance
of
relational operators as a common query language should make it very easy for many
programs to share a common database. Unfortunately,
our
experiences with integrating the
applications that
we
have developed,
in
particular, layering other programs on top
of
Walnut, argue
that this
is
not quite true.
The belief that a common database query language can integrate a collection
of
applications
seems founded on the premise that the structure
of
the database
is
semantically significant. The
"universal relation" approach
of
Ullman [Ullman82]
is
one attempt to hide some
of
the underlying
structure
of
a relational database by allowing the user to specify relational queries
as
if
there were a
single "universal" relation in the database.
As
Ullman notes, this
prese~ts
an attractive user interface
by removing some
of
the necessity to deal with the underlying logical structure, which may have
been chosen for reasons other than ease
of
understandability. Working on Walnut has shown
us
two other reasons why the logical structure
of
the database
is
not a reasonable basis for a query
processor; basically the database may contain both too little and too much information for standard
relational queries to be
of
much use.
First, the logical structure
of
the Cypress databases built by Walnut does not contain one
of
the
most important parts
of
an electronic mail database: the messages themselves. These are stored
in
separate log files. And there are good reasons not to store the messages
in
the database: space
reclamation
is
easier and the system
is
far more robust in the face
of
failures
of
the database software.
But.
if
one were to apply a relational query language directly to a database built
by
Walnut. the
connection with the log would be
missing-which
would make the query results rather uninteresting!
Second,
as
we
described above, a Walnut database also contains information about the progress
of
long-running operations.
The
database
is
a natural place to store such information: however, it
is
of
no utility to the normal Walnut user and should be carefully hidden. In fact,
if
the database
currently holds information about an operation in progress, then the database state may be inconsistent
and the entire database should be protected until the operation completes.
What
we
have done
in
Walnut (and
our
other database applications)
is
to define a programmer's
interfaces that give a characterization
of
the database
as
an abstract datatype. The WalnutOps
interface hides the details
of
managing a separate
log
and database, but presents the client with a
view
of
a database containing messages and with no special handling
of
long-running operations.
WalnutOps completely defines the structure
of
a Walnut database (comprising both a Cypress database
and a collection
of
log files)
as
far as client programs are concerned. The use
of
a programmer's
interface, rather than a query language that directly manipulates the structure
of
the database, has
several important benefits:
1.
The operations exported
by
the interface are
all
defined to preserve the invariants that
Walnut depends on. rather than using lower-level operations and run-time checks to ensure
XEROX PARe. CSL-85-9. NOVEMBER
1985
16 WALNUT: STORING ELECTRONIC MAIL
IN
A
DATABASE
consistency. In some cases this means that
we
can use the type-checking
of
the Cedar
programming language to do invariant-checking, with the obvious performance gains
of
static over dynamic checks. More importantly, some
of
the invariants upon which Walnut
depends are very expensive to check.
It
is
far more efficient to prove that:
a.
the WalnutOps operations preserve the invariant "every message belongs to one
or
more message sets;
if
a message belongs to no other message sets, then it belongs to
the Deleted message set," and that
b. only WalnutOps operations are used to manipulate the database
than it
is
to check the database for the truth
of
the invariant.
(It
is
possible in Cedar to
subvert the WalnutOps interface, but
it
is extremely unlikely that this would be
done-it's
very hard to do inadvertently and Cedar programmers believe in the importance
of
interfaces
strongly enough to ask for interface changes rather than trying to access unsafe lower-level
procedures.)
2.
The interface can hide the details
of
transaction management; when writing cooperating
applications this
is
an important benefit. Except in cases where separate programs must
commit changes independently, it
is
better to share a common transaction than to have to
worry about inadvertent conflict caused
by
using separate transactions. WalnutOps hides
transactions inside the implementation -interference (i.e., the notion
of
why
a transaction
might abort) for WalnutOps clients
is
reflected in the mismatch between expected and actual
version stamps. (Separate transactions are used when background new mail processing
is
being performed,
but
there
is
no chance for conflict since new mail
is
written to a separate
log file.) For applications that require a sequence
of
WalnutOps operations to be performed
before another user request
is
serviced, the WalnutWindow interface provides a procedure
that allows "Walnut user level transactions."
3.
Providing a programmer's interface also makes it relatively easy to implement "triggered
procedures" safely and efficiently;
we
have done this for Walnut in the WalnutRegistry
interface.
As
we
described above, WalnutRegistry allows client programs to register
procedures to be called when events
of
interest occur; when a WalnutOps operation
is
performed that causes an event
of
interest to occur, the set
of
procedures registered for the
event are invoked. Because WalnutOps
is
the programmer's interface to the database and
each WalnutOps operation preserves the database invariants. there
is
no
way
a procedure
registered with WalnutRegistry can destroy the invariants
of
a Walnut database. Also,
we
can use the process creation mechanism
of
Cedar to ensure that the procedures called when
an event occurs cannot hold up normal processing. either because they take a long time to
execute
or
(even worse) go into an infinite loop. The implementation
of
WalnutRegistry
uses a rather complicated process structure to guarantee that the procedures registered
by
WalnutRegistry
will
be called in the order in which the events for which they were registered
occurred (this might not be true
if
a new process were simply forked for each procedure
XEROX
PARe.
CSL-85-9.
NOVEMBER
1985
WALNUT: STORING ELECTRONIC MAIL
IN
A
DATABASE
-1-1
call); however, because the invocation
of
a procedure registered with the interface may be
delayed, no guarantees are made about the contents
of
the database at the time
of
the call.
4.
The
final benefit
of
building a programmer's interface as a collection
of
procedures
is
that
it
makes
it
easier to change the structure
of
the underlying database without changing the
applications that access the database -for precisely the same reasons that one uses a database
schema rather than coding access paths into programs. We have taken advantage
of
this
several times in the development
of
Walnut; the low-level components
of
the system have
been reworked many times without having to change the·upper-Ievel pieces.
The
WalnutOps
interface took a long time to design,
but
has served as a very important fixed point in the
development
of
the system.
One
open question
is
how to connect several applications written in this fashion.
One
of
the
useful properties
of
building query processors
at
the database schema level is the uniformity
of
structure that using schemas provides -there
is
much less uniformity in a collection
of
programmer's
interfaces designed by several individuals. How to build query processors that span applications
like Walnut with more traditional database applications (where the schema may reflect the semantics
of
the database)
is
an interesting question for future research.
5.4
Background
Processing
One
of
the major successes
of
Walnut has been its use
of
background processing to retrieve new
mail.
The
implementation
of
background new mail, though, turned
out
to be harder than
we
anticipated;
our
experience identified two problems that need to be considered when designing such
systems: isolation and robustness. We discuss each
of
these below.
5.4.1
Isolation
Background processes need to be carefully isolated from normal· processing; the foreground and
background activities should require very limited synchronization and have no chance for transaction
conflicts.
The
new mail process
of
Walnut synchronizes with the normal processing at three points:
1.
When the new mail log
is
opened. A call
is
made to WalnutOps to open the log and the
WalnutOps transaction
is
used to read the root file to determine which file to open. The
file
is
opened, however, under a new transaction; while new mail
is
written to the log,
it
is
impossible for any conflicts between this transaction and the WalnutOps transaction to
occur.
2.
After the mail from a server has been completely read. A call
is
made to WalnutOps to
record the information about the current length
of
the new mail log. Again, the WalnutOps
transaction
is
used to record the information in the database; the new mail process uses the
completion
of
the call to WalnutOps
as
the
indication that the information has been
successfully written. This
is
also a cheap operation, so the time necessary to synchronize
XEROX
PARCo
CSL-85-9.
NOVEMBER
1985
1e
WALNUT; STORING.'ELECfRONICMAIL:lN A
DATABASE
~)d
\'
the'
twoprocessesds')verY" smaIL(normal'processing
is
not held up for long while this
.J':
synchronization'occurs)~!;:;
'til;
j("r)\
,!fl'
ljfi
11;~;
'3.
When
-the
servers are all, emptiedj', A 'call
is
made to WalnutOps to "return" the
new
mail
log.
This is done because::theimpiementation
of
WalnutOps also maintains a copy
of
the
open file
"handle",to
'the
new~
mail log;
if
the user requests
new
mail while a background
retrieval
is
,underway,'we can abort the retrieval (by aborting the transaction being used to
wpte
,on'
the,log)'and,
then' returning whatever mail happens to exist
of
the new mail
log
from' previous; retrievals. Our users corne first;
we
prefer to give them what
we
have rather
than
make them 'wait for (possibly) more mail. "Returning" the
new
mail
log
causes the
log
file
to' be closed and the open
file
handle to be discarded.
The ability to stop reading mail at any time
by
aborting the
new
mail transaction points out the
second concern
we
had when designing background retrieval-that it be extremely robust in the
face
of-failures.
5.4.2
Robustness
One
of
the responsibilities
of
implementing a background processing algorithm
is
to make the
user unaware
of
its existence. except that things sometimes go much faster than he might expect. In
particular. the implementor must assume that
the
user cannot be relied upon to avoid doing nasty
things (like rebooting his machine) during sensitive periods. The most difficult part
of
getting this
right for Walnut
was
to ensure that the mail
was
flushed from Grapevine only when it
was
certain
that the database recorded its existence on the new mail log. Since this could not be done under a
transaction (there
is
no means for allowing Grapevine and Alpine to
use
a common transaction). a
protocol
was
designed to ensure that the necessary operations would happen in the right order and
the chance
of
unnecessary duplication
of
messages (messages appearing both on Grapevine and in
the new mail log)
was
acceptably
low
and would be properly, handled. Having a "transaction server"
that could be used by all network· resources to agree on transaction commits
or
aborts would have
made the robustness
of
this part
of
Walnut much easier to guarantee.
6.
Conclusions
Walnut
is
definitely an atypical database application; yet systems like Walnut
will
become
increasingly important
as
databases become a more integral component
of
our computing
environments. One conclusion to come out
of
this work
is
that there
is
less
difference between
database applications and database systems than seems obvious at first glance. Walnut uses many
of
the same techniques that appear in Alpine
or
Cypress: logging. caching. version stamps. and crash
recovery are obvious examples. This
is
an example
of
Lampson's dictum:
Use
a
good
idea
again
[Lampson84. pg.
18].
One natural reaction to this
is
to claim that the database system must be poorly
designed for so many things to have to be duplicated. However. much
of
Walnut's complexity
XEROX
PARC,
CSL-85-9,
NOVEMBER
1985
...
D>
3
III
en
15'
~
D>
'3
c:
CD
D>
~
Co
~
co
·
en
c:
CD
~

Navigation menu