SRC Technical Note
1997 - 002
February 21, 1997
Average-Case Analysis of First Fit and Random Fit Bin Packing
Susanne Albers and Michael Mitzenmacher
digital
Systems Research Center
130 Lytton Avenue
Palo Alto, California 94301
http://www.research.digital.com/SRC/
Copyright © Digital Equipment Corporation 1997. All rights reserved.

Abstract
We prove that the First Fit bin packing algorithm is stable under the input distribution U{k−2,k} for all k ≥ 3, settling an open question from the recent survey by Coffman, Garey, and Johnson [3]. Our proof generalizes the multi-dimensional Markov chain analysis used by Kenyon, Rabani, and Sinclair to prove that Best Fit is also stable under these distributions [11]. Our proof is motivated by an analysis of Random Fit, a new simple packing algorithm related to First Fit, that is interesting in its own right. We show that Random Fit is stable under the input distributions U{k−2,k}, as well as present worst-case bounds and some results on the distributions U{k−1,k} and U{k,k} for Random Fit.
1 Introduction
In the one-dimensional bin packing problem, one is given a sequence a_1, ..., a_n ∈ (0,1] of items to pack
into bins of unit capacity so as to minimize the number of bins used. A great deal of literature has focused
on this problem, perhaps because, as Coffman, Garey, and Johnson [3] observe in their recent survey on bin
packing, “The classical one-dimensional bin packing problem has long served as a proving ground for new
approaches to the analysis of approximation algorithms.” For example, recently the study of Best Fit bin
packing under discrete uniform distributions has led to a novel analysis technique, based on the theory of
multi-dimensional Markov chains. In this paper we extend this approach to analyze First Fit and a new bin
packing algorithm, called Random Fit, under discrete uniform distributions.
First Fit and Best Fit are two classical algorithms for online bin packing. With First Fit, the bins are indexed in increasing order of their creation. Each item is sequentially placed into the lowest indexed bin into which it will fit, or into an empty bin if no such bin is available. With the Best Fit algorithm, each incoming item is placed into the non-empty bin with smallest residual capacity that can contain it; if no such bin exists, the item is placed in an empty bin. The performance of First Fit and Best Fit in the worst case and uniform average case has been settled for quite some time. In the worst case, the number of bins used by either of these algorithms is at most 17/10 times the optimum number of bins, as shown by Johnson et al. [10]. When item sizes are generated by U(0,1], the continuous uniform distribution on (0,1], the performance measure of interest is the expected waste, which is the difference between the number of bins used and the total size of the items packed so far. Shor [16] showed that the expected waste created by First Fit is Θ(n^{2/3}). Shor [16] and Leighton and Shor [13] proved that Best Fit does better, generating expected waste Θ(√n log^{3/4} n).
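As a concrete illustration of the two placement rules, here is a minimal sketch (the function names, the example sequence, and the small floating-point tolerance are our own choices, not part of the paper):

```python
def first_fit(items):
    """Pack items (sizes in (0, 1]) into unit-capacity bins with First Fit."""
    bins = []  # bins[i] = total size packed into bin i, in order of creation
    for a in items:
        for i, load in enumerate(bins):
            if load + a <= 1 + 1e-9:  # lowest indexed bin that fits
                bins[i] += a
                break
        else:
            bins.append(a)            # no open bin fits: open a new one
    return len(bins)

def best_fit(items):
    """Pack items into unit-capacity bins with Best Fit."""
    bins = []
    for a in items:
        # non-empty bin with smallest residual capacity (largest load) that fits
        fitting = [i for i, load in enumerate(bins) if load + a <= 1 + 1e-9]
        if fitting:
            bins[max(fitting, key=lambda i: bins[i])] += a
        else:
            bins.append(a)
    return len(bins)
```

On the sequence (0.4, 0.7, 0.2, 0.5), for example, First Fit opens three bins while Best Fit needs only two, since Best Fit routes the 0.2 item into the fuller bin.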
Because of these tight bounds, research on First Fit and Best Fit is now focused on analyzing expected waste when item sizes are generated by discrete uniform distributions. A discrete uniform distribution, denoted by U{j,k}, 1 ≤ j ≤ k, is one where item sizes are chosen uniformly from the set {1/k, 2/k, ..., j/k}. For U{k,k}, k > 1, First Fit and Best Fit achieve expected waste Θ(√(nk)) and O(√n log k), respectively (see Coffman et al. [2]). Similar bounds hold for U{k−1,k}. Of particular interest are distributions for which the algorithms are stable. We say that an algorithm is stable under a distribution if the expected waste remains bounded (that is, O(1)), even as the number of items n goes to infinity. Coffman et al. [2] proved that First Fit is stable under U{j,k} when k ≥ j², and Best Fit is stable under U{j,k} when k ≥ j(j+3)/2.
Later, Coffman et al. [4] introduced a novel method for proving the stability (and instability) of bin packing algorithms based on multi-dimensional Markov chains. Their methodology allowed them to show that Best Fit is stable under U{j,k} for several specific pairs of values for j and k. Kenyon et al. [11] expanded on this work by proving that Best Fit is stable under the entire family of distributions U{k−2,k}, using a complex analysis of the underlying Markov chains.
We briefly describe the Markov chain setting used in the results described above. Using the Best Fit algorithm under a discrete uniform distribution, a packing can be represented by the number of bins of each possible residual capacity. The order of the bins is irrelevant. This packing process can therefore be easily represented by a Markov chain, where the state at any time is a vector s = (s_1, ..., s_{k−1}), and s_i is the number of bins of residual capacity i/k.
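This state representation is easy to track directly. The sketch below (our own illustration; the equivalent scaled setting with integer item sizes 1..j and bins of capacity k, and the choices k = 7 and items (3, 3, 3), are assumptions for the example) runs Best Fit while maintaining only the vector s:

```python
def best_fit_state(items, k):
    """Track the Markov chain state under Best Fit:
    s[r] = number of open bins with residual capacity r (1 <= r <= k-1)."""
    s = [0] * k  # index 0 unused; completely full bins are dropped
    for a in items:
        for r in range(a, k):          # smallest residual capacity >= a
            if s[r] > 0:
                s[r] -= 1
                if r - a > 0:
                    s[r - a] += 1
                break
        else:                          # no open bin can hold the item
            if k - a > 0:
                s[k - a] += 1
    return s

state = best_fit_state([3, 3, 3], k=7)
waste = sum(r * n for r, n in enumerate(state))
```

After the three items of size 3, the state has s_1 = s_4 = 1: the second item fell into the first bin, and the third opened a new bin.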
The Best Fit algorithm is well suited to the Markov chain approach, because the order of the bins is irrelevant,
leading to a simple representation of the packing. In contrast, in the First Fit algorithm, the order of the bins
cannot be dismissed. Because of the difficulty of representing the state in the First Fit algorithm, until now
these Markov chain techniques have not been successfully applied to the First Fit algorithm.
In this paper, we remedy this problem by demonstrating a Markov chain argument that shows that First Fit
is in fact stable under the family of distributions U{k−2,k}. This result disproves a conjecture made by Coffman et al. [3], who state that limited experiments suggest that the expected waste may grow unbounded on U{k−2,k} for sufficiently large k. Moreover, it demonstrates that the Markov chain approach may be
more generally applicable than previously believed.
Our proof emerges from an analysis of a new bin packing algorithm, called Random Fit (RF). Random Fit
is a simple randomized variant of First Fit. With Random Fit, each time an item is to be placed in a bin, the bins are indexed in an order determined by a permutation chosen independently and uniformly at random. Each item is sequentially placed into the lowest indexed bin into which it will fit, or into an empty bin if no such bin is available.
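A direct sketch of this rule (our own rendering; the seeded generator is only for reproducibility):

```python
import random

def random_fit(items, seed=0):
    """Pack items (sizes in (0, 1]) with Random Fit: for each item, scan the
    open bins in a fresh uniformly random order and use the first that fits."""
    rng = random.Random(seed)
    bins = []
    for a in items:
        order = list(range(len(bins)))
        rng.shuffle(order)            # fresh random permutation per item
        for i in order:
            if bins[i] + a <= 1 + 1e-9:
                bins[i] += a
                break
        else:
            bins.append(a)            # no open bin fits: open a new one
    return len(bins)
```

On (0.6, 0.6, 0.4, 0.4) every permutation leads to two bins: the two large items cannot share a bin, and each 0.4 item fits whichever partially full bin the random order reaches first.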
In Section 2 we introduce Random Fit by analyzing its worst-case behavior. In the following sections we then concentrate on average-case analysis. Random Fit has the advantage that, like Best Fit, a packing can be represented by the number of bins of each possible residual capacity. Therefore, in Section 3, we first generalize the analysis of Best Fit shown in [11] to Random Fit. We prove stability of Random Fit under the input distribution U{k−2,k} and derive some related results for U{k−1,k} and U{k,k}. Using ideas developed in Section 3, we proceed to prove stability of First Fit under the input distribution U{k−2,k} in Section 4.
2 Worst-case analysis of Random Fit
We compare the behavior of Random Fit with an optimal offline algorithm. Recall that with Random Fit, each time an item is to be placed in a bin, the bins are indexed in an order determined by a permutation chosen independently and uniformly at random. Each item is sequentially placed into the lowest indexed bin into which it will fit, or into an empty bin if no such bin is available.
Given a sequence S = (a_1, a_2, ..., a_n) of items and a bin packing algorithm A, let A(S) denote the number of bins used by A to pack S. In particular, OPT(S) is the number of bins used by an optimal offline algorithm, i.e., it is the minimum number of bins required to pack S.
Theorem 1 a) For every sequence S, RF(S)≤2·OPT(S)−1.
b) There exist sequences S, with arbitrarily large values of OPT(S), such that with high probability
RF(S)=2·OPT(S)−1.
Proof: Part a) At any time, the sequence of bins used by RF contains at most one bin with residual capacity of at least 1/2. Thus, for any sequence S, the number of bins used by OPT is at least ⌊RF(S)/2⌋ + 1.

Part b) For any integer n ≥ 2, let S_n be a sequence that contains n large items of size 1/2. In addition, in between any two large items, n² small items each of size 1/(2n³) must be inserted. Thus

S_n = (1/2, 1/(2n³), ..., 1/(2n³), 1/2, 1/(2n³), ..., 1/(2n³), ..., 1/(2n³), 1/2).

Note that the sum of all the small items is (1/(2n³)) · n²(n−1) < 1/2.

Clearly, OPT(S_n) = ⌊n/2⌋ + 1. We show that with high probability Random Fit uses n bins on this sequence. More precisely, immediately before an insertion of a large item, the probability that a bin holding a large item does not contain a small item is bounded by (1 − 1/n)^{n²} ≤ e^{−n}. Thus, the probability that, at any of the n insertions of large items, some open bin holding a large item does not contain a small item is bounded by n²e^{−n}. We conclude that with probability at least 1 − n²e^{−n}, RF needs n bins to pack S_n.
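The construction of part b) is easy to exercise numerically. The sketch below builds S_n for n = 7 (an arbitrary odd choice, so that 2·OPT(S_n) − 1 = n), packs it with a seeded Random Fit, and checks the properties that hold surely; the seeded helper and tolerance are our own:

```python
import random

def random_fit(items, seed=0):
    """Random Fit: scan open bins in a fresh random order, use the first fit."""
    rng = random.Random(seed)
    bins = []
    for a in items:
        order = list(range(len(bins)))
        rng.shuffle(order)
        for i in order:
            if bins[i] + a <= 1 + 1e-9:
                bins[i] += a
                break
        else:
            bins.append(a)
    return len(bins)

n = 7                                   # odd, so 2 * OPT(S_n) - 1 = n
small = 1.0 / (2 * n ** 3)
seq = [0.5]
for _ in range(n - 1):                  # n^2 small items between consecutive larges
    seq += [small] * n ** 2 + [0.5]

assert n ** 2 * (n - 1) * small < 0.5   # total volume of the small items < 1/2
opt = n // 2 + 1                        # pair the large items; smalls share one bin
rf = random_fit(seq)
assert opt <= rf <= 2 * opt - 1         # consistent with Theorem 1 a)
```

With probability at least 1 − n²e^{−n} a run attains rf = n = 2·opt − 1; the assertions only check the bounds that hold for every run.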
While RF has a guaranteed worst-case performance, it does not achieve the same bounds as First Fit and
Best Fit. In the worst case, RF is only as good as Next Fit and Worst Fit.
Motivated by recent work [1, 15], we also consider an extension of Random Fit, called Random Fit(d), that is defined for any integer d ≥ 2. Whenever a new item arrives, RF(d) examines bins in the same way as RF until d bins are found that can hold the item. Among these d bins, the item is inserted into the bin with smallest residual capacity, i.e., the Best Fit rule is applied. If there are only i < d open bins that can hold the item, then the item is inserted into one of these i bins, again using the Best Fit strategy. If none of the open bins can hold the item, then the item is inserted into a new bin.
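A sketch of RF(d) (our own rendering; tie-breaking among bins of equal load is arbitrary, and the seeded generator is only for reproducibility):

```python
import random

def random_fit_d(items, d=2, seed=0):
    """Random Fit(d): scan bins in random order until d fitting bins are
    found, then apply the Best Fit rule among those candidates."""
    rng = random.Random(seed)
    bins = []
    for a in items:
        order = list(range(len(bins)))
        rng.shuffle(order)
        candidates = []
        for i in order:
            if bins[i] + a <= 1 + 1e-9:
                candidates.append(i)
                if len(candidates) == d:   # stop after d fitting bins
                    break
        if candidates:
            # smallest residual capacity = largest load
            bins[max(candidates, key=lambda i: bins[i])] += a
        else:
            bins.append(a)                 # no open bin fits
    return len(bins)
```

On (0.6, 0.6, 0.4, 0.4) with d = 2, each 0.4 item sees every fitting bin among its candidates, so the outcome (two bins) is independent of the permutation.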
Interestingly, when making the transition from RF to RF(d), the worst-case performance improves.
For any algorithm A, let

R^∞_A = inf{r ≥ 1 | for some N > 0, A(S)/OPT(S) ≤ r for all S with OPT(S) ≥ N}.

Theorem 2 For every d ≥ 2, R^∞_{RF(d)} ≤ 17/10.

Proof: Follows from a result by Johnson [7, 8], because RF(d) belongs to the class of Almost Any Fit algorithms.
3 Average-case analysis of Random Fit
In this section we prove that Random Fit is stable under the input distribution U{k−2,k} and derive some related results for U{k−1,k} and U{k,k}.
3.1 Preliminaries
We begin by reviewing some important definitions and lemmas from [11]. When considering the distribution U{j,k}, rather than have bins of size 1, we shall instead think of having bins of size k and item sizes chosen uniformly from {1, ..., j}. The two notions are clearly equivalent. We shall model the system using k−1 tokens that move on the non-negative integers. The value of token i at time t, denoted by s_i(t), represents the number of bins with residual capacity i after t items have been placed. The state of the system at time t is given by a vector s(t) = (s_1(t), ..., s_{k−1}(t)). Initially, s(0) = (0, ..., 0), as there are no open bins with residual capacity. We shall often drop the explicit reference to t when the meaning is clear. The waste at time t is given by Σ_{i=1}^{k−1} i·s_i(t). We wish to show that the expected waste as t → ∞ remains bounded.
We shall divide the tokens into classes. The token i is called small if 1 ≤ i ≤ ⌈j/2⌉ and is called large if ⌊j/2⌋ + 2 ≤ i ≤ j. In the case where j is even, there is also a middle token, namely ⌈j/2⌉ + 1. For convenience, we shall temporarily restrict ourselves to the case where j is odd, as the case where j is even requires some additional work to handle the middle token. We shall explain the modifications necessary for the case where j is even after the proof of the case where j is odd.
We begin with the following lemma:
Lemma 3 State s is reachable from the initial state s(0) = (0, ..., 0) only if

1. For distinct indices i and i′ with i + i′ ≥ k, either s_i = 0 or s_{i′} = 0.

2. Σ_{i not small} s_i ≤ 1.

Proof: By induction; it follows from the fact that we will not open a new bin if an item can be packed in a current bin.
It is also not hard to see that all states that satisfy the conditions of Lemma 3 are reachable, and hence we
assume hereafter that our state space consists exclusively of all states satisfying the conditions of Lemma 3.
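The conditions of Lemma 3 can also be checked empirically. This sketch simulates Random Fit in the token representation under U{5,7} (the choices j = 5, k = 7, the seed, and the step count are our own) and asserts both conditions after every arrival:

```python
import math
import random
from collections import Counter

j, k = 5, 7                   # the distribution U{k-2,k} with k = 7
rng = random.Random(1)
bins = []                     # residual capacities of the open bins

for _ in range(3000):
    a = rng.randint(1, j)
    order = rng.sample(range(len(bins)), len(bins))  # fresh random order
    for idx in order:
        if bins[idx] >= a:    # first fitting bin in the random order
            bins[idx] -= a
            break
    else:
        bins.append(k - a)    # open a new bin
    bins = [r for r in bins if r > 0]   # full bins never change again
    s = Counter(bins)
    # Condition 1: tokens i, i' with i + i' >= k cannot both be positive.
    positive = sorted(s)
    for x in positive:
        for y in positive:
            assert x == y or x + y < k
    # Condition 2: at most one open bin whose token is not small.
    assert sum(s[i] for i in range(math.ceil(j / 2) + 1, k)) <= 1
```

If either assertion ever failed, the state would not satisfy Lemma 3; the run completing silently is consistent with the lemma.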
From Lemma 3, if s_{⌈j/2⌉}(t) > 0, then all large tokens must be 0 at time t. Our proof of stability will rely on this simple feature of the chain. In particular, this feature allows us to focus on the behavior of the small tokens, which is considered in the following lemma:
Lemma 4 Using Random Fit, the motion of a small token i has the following properties:

1. For i > 1, the motion of s_i at all positions other than 0 is a random walk on Z⁺, such that a positive step is taken with probability at least 1/j and a negative step is taken with probability at most 1/j + s_i/(s_{i−1} + s_i).

2. The time spent by s_i on each visit to 0 is stochastically dominated by a random variable D with constant expectation and variance (that depend only on j).

Proof: For the first part, note that, if s_i > 0, then s_i increases whenever an item of size k−i enters the system, by Lemma 3. Hence we need only consider negative steps. If an item of size i enters, then s_i may decrease; if an item of size less than i enters, then the probability of it landing in a bin of capacity i is at most s_i/(s_{i−1} + s_i). The result follows.
The second part is almost exactly the same as in Proposition 4 of [11], which we sketch here for completeness. If s_i = 0 and s_{i′} = 0 for all i′ ≥ k−i, then clearly s_i moves to 1 with probability at least 1/j. If s_{i′} = 1 for some i′ ≥ k−i, however, this is not the case. It suffices to note that if two consecutive items have size k−i, then s_i will go to 1 even in this case. One may check that this fact suffices to prove the lemma.
3.2 Outline of the proof

We now sketch how we will prove that RF is stable, following the same approach as [11]. We first note that, by Lemma 3, the amount of waste from non-small tokens is bounded by a constant. Hence we need only consider the waste due to small tokens, which we denote by f(t) = Σ_{i=1}^{⌈j/2⌉} i·s_i(t).

The proof breaks down into three steps. For the first step, we show that if s_{⌈j/2⌉}(t) > 0, then the expected change in f(t), or E[f(t+1) − f(t) | f(t)], is negative. For the second step, we show that if we begin in a state where f(t) is large, then for some suitably large T, s_{⌈j/2⌉} > 0 for almost all of the next T steps with a suitably high probability. Combining these two steps, we find that, whenever f(t) is sufficiently large, the expected change in f(t) is negative over a suitably long interval T. The third step is to use this fact, together with results from the general theory of Markov chains, to conclude that the expected waste is bounded.
The challenging part of the proof is the second step, where we must show that s_{⌈j/2⌉} > 0 for most of a suitably large interval. The first step is actually a simple lemma, entirely similar to one given in [11]. However, since the lemma is heavily based on the fact that j = k−2, we present a proof here.
Lemma 5 ([11], Proposition 5) Suppose that s_{⌈j/2⌉}(t) > 0. Then E[f(t+1) − f(t) | f(t)] = −1/j.

Proof: Consider the size i of the item inserted at time t+1. If 1 ≤ i ≤ ⌈j/2⌉, then the new item is assigned to a bin with remaining capacity l, i ≤ l ≤ ⌈j/2⌉, and f decreases by i. If ⌈j/2⌉ < i ≤ j, then, since s_{⌈j/2⌉} > 0, Lemma 3 implies that there is no bin with remaining capacity i. Thus, the incoming item is put into a new bin, i.e., s_{k−i} increases by 1 and f increases by k−i. The expected change in f is therefore

(1/j) ( Σ_{i=1}^{⌈j/2⌉} (−i) + Σ_{i=⌈j/2⌉+1}^{j} (k−i) ) = (1/j) ( Σ_{i=1}^{⌈j/2⌉} (−i) + Σ_{i′=2}^{⌈j/2⌉} i′ ),   (1)

because k−j = k−(k−2) = 2 and, since j is odd, k−(⌈j/2⌉+1) = ⌈j/2⌉. It is easy to verify that equation (1) evaluates to −1/j.
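Equation (1) can be checked mechanically for small odd j (with k = j + 2, as in the lemma):

```python
import math

for j in (3, 5, 7, 9, 11):        # odd j, distribution U{k-2,k} with k = j + 2
    k = j + 2
    h = math.ceil(j / 2)          # largest small token
    # the bracketed quantity in equation (1), i.e., j times the expected change
    change = (sum(-i for i in range(1, h + 1))
              + sum(k - i for i in range(h + 1, j + 1)))
    assert change == -1           # so the expected change in f is -1/j
```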
The third step relies on general conditions for a multi-dimensional Markov chain to be ergodic; we cite the appropriate lemma from [11], which is derived from [5].

Lemma 6 ([11], Lemma 6, or [5], Corollary 7.1.3) Let M be an irreducible, aperiodic Markov chain with state space S ⊆ Z^k, and b a positive integer. Denote by p^b_{ss′} the transition probability from s to s′ in M^b, the b-step version of M. Let Φ : S → R⁺ be a non-negative real-valued function on S which satisfies the following conditions:

1. There are constants C₁, μ > 0 such that Φ(s) > C₁||s||^μ for all s ∈ S.

2. There is a constant C₂ > 0 such that p^b_{ss′} = 0 whenever |Φ(s) − Φ(s′)| > C₂, for all s, s′ ∈ S.

3. There is a finite set B ⊂ S and a constant ε > 0 such that Σ_{s′∈S} p^b_{ss′} (Φ(s′) − Φ(s)) < −ε for all s ∈ S∖B.

Then M is ergodic with stationary distribution π satisfying π(s) < C e^{−δΦ(s)} for all s ∈ S, where C and δ are positive constants.
For the bin-packing problem, we shall use Φ(s) = Σ_{i=1}^{⌈j/2⌉} i·s_i + k − 1 = f + k − 1, where f is the waste from small tokens. This is an upper bound on the total waste. One may check that the first two conditions of Lemma 6 are satisfied for any choice of b. It remains to find appropriate b, B, and ε; this is equivalent to the second step of our proof sketch, which we now focus on.
3.3 Random Fit over long intervals
We now show that, for all but a finite number of starting states, s_{⌈j/2⌉} > 0 for most of sufficiently large intervals. We shall often compare the behavior of a token with a random walk over an interval [0,R]. We shall use p↑(i) to denote the probability that a walk at i moves to i+1 in one step. Similarly, p↓(i) is the probability that a walk at i moves to i−1 in one step, and p→(i) = 1 − p↑(i) − p↓(i) (the self-loop probability) is the probability that the walk remains at i when at i. We shall drop the i in cases where p↑(i) is independent of i (except at 0 and R, as p↓(0) and p↑(R) are necessarily 0, and the self-loop probabilities are increased accordingly); this is called the homogeneous case. A random walk is downward biased if p↑(i) ≤ p↓(i) for all i in the range of the walk (except the boundaries).
In order to bound the behavior of the random walks we study, we shall require the following lemma, which is a weak bound derived from Corollary 4.2 of [12]:

Lemma 7 Let λ₁ < 1 denote the second largest eigenvalue of the transition matrix for a random walk W on [0,R]. Let π(A) = Σ_{a∈A} π_a be the stationary probability that the walk lies in A ⊆ [0,R], and W_l(A) be the number of steps the walk spends in A during the first l time steps. If the walk starts at 0, then for any integer l ≥ 1 and 2 ≤ β < 1/π(A),

Pr[W_l(A) ≥ βπ(A)l] ≤ (β/√π₀) exp(−π(A)²(1−λ₁)l).
To use the above lemma we will require the following fact about the eigenvalues:
Lemma 8 For the random walk on [0,R] with p↑ = p↓ = α, we have λ₁ ≤ 1 − 2α/R².
We start with a preliminary lemma that provides both the first step and the main idea of the proof. In this lemma, and in all that follows, we assume that T is at least as large as some constant chosen so that the bounds hold.

Lemma 9 For sufficiently large T, if s_i > T⁴ over the time interval [0,T], then s_{i+1} ≥ T^{1/16} for all but at most T^{15/16} steps, with probability at least 1 − 2/T².
Proof: By Lemma 4, the behavior of the token s_{i+1} at any point in the interval [0,T] can be related to a random walk over the positive integers, where p↑(i) ≥ 1/j and p↓(i) ≤ 1/j + s_{i+1}/(s_i + s_{i+1}) (except at i = 0).
Furthermore, the probability that s_{i+1} < T^{1/16} for more than T^{15/16} steps, which we shall hereafter call z, is clearly maximized if we start s_{i+1} at 0. This information is sufficient to prove that z ≤ 2/T² directly; however, we suggest an easier approach.

We first note that, since we are comparing the behavior of s_{i+1} to a specific random walk, z can only increase if we restrict the walk (or, equivalently, the token s_{i+1}) to the interval [0, T^{1/4} − 1]. Bounding the walk in this way will simplify the analysis. Also, for convenience, we temporarily ignore the problem of the waiting time when s_{i+1} = 0, as described in Lemma 4.

We now split each step, or item arrival, into two phases. In phase one, a random permutation order is chosen for the open bins. In phase two, an item size is chosen from the distribution U{j,k}, and this item is placed according to the RF rule.

By breaking each step up in this manner, we see that whenever the permutation chosen in phase one has a bin with remaining capacity i ahead of all bins of remaining capacity i+1, then for phase two, the worst possible case is that s_{i+1} behaves like an unbiased random walk, with p↑ = p↓ = 1/j. (Note that it is possible that p↓ ≤ 1/j, but we overestimate z by assuming that p↓ = 1/j.) In the alternate case, where a bin with remaining capacity i+1 lies ahead of all bins of capacity i in phase one, we may again overestimate z by assuming that p↑ = 0 and p↓ = 1 in phase two. As we now show, by splitting each step into two phases in this way, we have essentially reduced the problem to an unbiased walk.
We note that, over the interval [0,T], we have enforced the restrictions s_{i+1} ≤ T^{1/4} and s_i ≥ T⁴. Hence, with probability at least 1 − 1/T², at no step in this interval do we place a bin of capacity i+1 ahead of all bins of capacity i in phase one. We call this event E. Conditioned on E, s_{i+1} behaves like an unbiased random walk on [0, T^{1/4} − 1] over the entire interval. In particular, the stationary distribution is uniform, so π_i = T^{−1/4} for all i. Let Z be the number of steps for which s_{i+1} ≤ T^{1/16}. From Lemmas 7 and 8, we find that for sufficiently large T,

Pr[Z ≥ T^{15/16} | E] ≤ T^{1/8} · T^{1/8} exp(−2T^{1/8}/j)   (2)

≤ 1/T².   (3)
Using a union bound on probabilities now yields the lemma.
To handle the discrepancy when the walk is at 0, we note that we can explicitly bound the total number of
steps at 0 with sufficiently high probability using part 2 of Lemma 4. The bounds given by equations (2)
and (3) can also be tightened so that for sufficiently large T, the lemma as stated holds.
We have shown that if s_i is extremely large over a sufficiently long interval, then s_{i+1} is also large over most of the interval with high probability. Our actual goal is to show that if any s_i is extremely large (for i ≤ ⌈j/2⌉), then s_{⌈j/2⌉} > 0 over most of the interval. Hence we will require an inductive, but slightly weaker, version of Lemma 9.
One problem in generalizing Lemma 9 is that if s_i is large only for most, and not all, of the steps, then there are several steps where we cannot explicitly say how s_{i+1} behaves. Moreover, these steps may affect the behavior of s_{i+1} at any point. We avoid the problem by introducing an adversary model, generalizing a similar argument from [11]. This adversary model allows us to consider the worst possible case for the steps where s_i is smaller than we need.
We consider how an adversary can affect a homogeneous downward biased random walk on [0,R]. The
goal of the adversary is to keep the random walk at or below some level l,l≥2, for as many steps as
possible. The adversary may control a fixed number of steps. In a controlled step, the adversary may specify
any probability distribution on the legal moves from the current state; the step of the walk is then made
according to that distribution. In all the other steps, the process behaves like a homogeneous downward
biased random walk.
In the following, given an adversary strategy A, let p_A(y, i, n, m, l) denote the probability that a homogeneous downward biased random walk of n steps on the interval [0,R], starting at i with y controlled steps used according to A, spends at least m steps at or below l.
Lemma 10 For all non-negative integers y, i, n, m and l, with l < R and i < R−1, there exists an adversary strategy A₀

(a) that never uses a controlled step when the walk is below l,

(b) that always uses a controlled step as soon as possible when the walk is at or above l+1, to push the walk downwards,

such that p_{A₀}(y, i, n, m, l) ≥ p_A(y, i, n, m, l) for all adversaries A.
Proof: The case where l = 0, the walk is unbiased, and the self-loop probability is 0 corresponds to what is proven in [11, Lemma 7]; we extend the argument to this more general case. We use induction on n. We first note that any adversary that uses a downward move when the walk is below l can be replaced by one that does not. This follows by a simple coupling argument. Compare the strategy where the adversary uses a downward move below l to one where the adversary waits until the walk is at l, by coupling all random moves; the second strategy will be at the same height or below the first after the downward move. (It will end up below only if the walk reaches 0.) Thus we have shown that there is an optimal adversary strategy that satisfies condition (a).
We now concentrate on adversary strategies that use their moves at or above l+1. Let D^y R denote the strategy A₁ which uses the y adversary-controlled steps as soon as possible when the walk is at or above l+1, and then follows the random walk. Let R D^y denote the strategy A₂ that begins with a random step, and then uses the adversary-controlled steps as soon as possible when the walk is at or above l+1. Let p_{A₁}(y, i, n, m, l) be the probability of the event that the walk is at or below l for at least m of the next n steps after starting at i when adversary strategy A₁ = D^y R is used. Similarly, let p_{A₂}(y, i, n, m, l) be the probability of the event that the walk is at or below l for at least m of the next n steps after starting at i when adversary strategy A₂ = R D^y is used. We claim

p_{A₁}(y, i, n, m, l) ≥ p_{A₂}(y, i, n, m, l),   (4)

and by induction this suffices to prove that there is an optimal strategy satisfying condition (b).
We first present two useful propositions.
Proposition 11 p_{A₁}(y, l, n, m, l) ≥ p_{A₁}(y, l, n, m+1, l).

Proposition 12 p_{A₁}(y, l−1, n, m, l) ≥ p_{A₁}(y+1, l, n+1, m+1, l).
Proposition 11 is easy to verify. We prove Proposition 12. Let W_{l−1} be the walk that starts at l−1 and follows strategy D^y R; similarly, let W_l be the walk that starts at l and follows strategy D^{y+1} R. Let T_{l−1} be the time when W_{l−1} first makes the transition (l−1) → l, and let T_l be the time when W_l first makes the transition l → (l+1). Clearly, T_{l−1} = T_l in distribution. We only have to consider the event that T_{l−1} = T_l ≤ n+1 and T_{l−1} = T_l ≥ m. Then, the remainder of W_{l−1} is a walk starting at l that follows strategy D^y R and must be at or below l for at least m − T_{l−1} of the next n − T_{l−1} steps. In the case of W_l, the adversary first pushes the walk down to l, and the remainder is also a walk that starts at l, follows strategy D^y R, and must be at or below l for at least m + 1 − T_l = m + 1 − T_{l−1} of the next n + 1 − (T_l + 1) = n − T_l steps. Using Proposition 11, and taking again into account that T_{l−1} = T_l in distribution, we conclude that the probability of the first walk is not smaller than that of the second walk, i.e., p_{A₁}(y, l−1, n, m, l) ≥ p_{A₁}(y+1, l, n+1, m+1, l).
We return to the proof of inequality (4). If i ≤ l, both strategies A₁ and A₂ start the same and we are done by induction. If i > y + l, both strategies give the same distribution after y + 1 steps, so the two quantities p_{A₁}(y, i, n, m, l) and p_{A₂}(y, i, n, m, l) are equal. The interesting case is when l < i ≤ y + l. In this case, strategy A₁ forces the walk from i down to l using i − l controlled steps. Thus,

p_{A₁}(y, i, n, m, l) = p_{A₁}(y′, l, n′, m−1, l),

where n′ = n − i + l and y′ = y − i + l. Also,
p_{A₂}(y, i, n, m, l) = p↑ · p_{A₁}(y′−1, l, n′−2, m−1, l) + p↓ · p_{A₁}(y′+1, l, n′, m−1, l)
+ p→ · p_{A₁}(y′, l, n′−1, m−1, l)

and

p_{A₁}(y′, l, n′, m−1, l) = p↑ · p_{A₁}(y′−1, l, n′−2, m−2, l) + p↓ · p_{A₁}(y′, l−1, n′−1, m−2, l)
+ p→ · p_{A₁}(y′, l, n′−1, m−2, l).

Using Proposition 11, we have p_{A₁}(y′−1, l, n′−2, m−2, l) ≥ p_{A₁}(y′−1, l, n′−2, m−1, l) and p_{A₁}(y′, l, n′−1, m−2, l) ≥ p_{A₁}(y′, l, n′−1, m−1, l). Thus,

p_{A₁}(y, i, n, m, l) − p_{A₂}(y, i, n, m, l) ≥ p↓ · (p_{A₁}(y′, l−1, n′−1, m−2, l) − p_{A₁}(y′+1, l, n′, m−1, l)).

Proposition 12 implies that the last term is non-negative.
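Inequality (4) can be verified exactly for small parameters by dynamic programming. The sketch below fixes an arbitrary downward biased walk (p↑ = 0.3, p↓ = 0.4 on [0,5] with l = 2, and a counting convention in which the position at the start of each step is what counts; all of these choices are our own) and compares the strategies D^y R and R D^y:

```python
from functools import lru_cache

# Toy downward biased walk on [0, R] with level l = L (all values our own choices).
P_UP, P_DOWN, R, L = 0.3, 0.4, 5, 2

def random_step(pos, y, n, m2):
    """Take one ordinary walk step from pos, then continue with strategy D^y R."""
    up = P_UP if pos < R else 0.0
    down = P_DOWN if pos > 0 else 0.0
    total = (1.0 - up - down) * p_greedy(pos, y, n - 1, m2)   # self-loop
    if up:
        total += up * p_greedy(pos + 1, y, n - 1, m2)
    if down:
        total += down * p_greedy(pos - 1, y, n - 1, m2)
    return total

@lru_cache(maxsize=None)
def p_greedy(pos, y, n, m):
    """P[walk spends >= m steps at or below L within n steps] under D^y R:
    a controlled (downward) step is used as soon as the walk is at or above L+1."""
    if m <= 0:
        return 1.0
    if n <= 0:
        return 0.0
    m2 = m - 1 if pos <= L else m       # a step spent at or below L counts
    if m2 <= 0:
        return 1.0
    if y > 0 and pos >= L + 1:          # adversary pushes the walk down
        return p_greedy(pos - 1, y - 1, n - 1, m2)
    return random_step(pos, y, n, m2)

def p_random_first(pos, y, n, m):
    """The same quantity under R D^y: the first step is always an ordinary one."""
    if m <= 0:
        return 1.0
    if n <= 0:
        return 0.0
    m2 = m - 1 if pos <= L else m
    if m2 <= 0:
        return 1.0
    return random_step(pos, y, n, m2)

# Inequality (4): D^y R serves the adversary at least as well as R D^y.
for pos in range(R - 1):
    for y in range(4):
        for n in range(1, 9):
            for m in range(1, 5):
                assert p_greedy(pos, y, n, m) >= p_random_first(pos, y, n, m) - 1e-12
```

For instance, from position L+1 with one controlled step and two time steps remaining, D^y R succeeds with certainty for m = 1, while R D^y succeeds only when the random first step happens to move down.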
Lemma 13 Suppose that, over a period of T steps, s_{i−1} ≥ T^α over all but T^{1−α} steps, for some α ≤ 1/16. Then s_i ≥ T^{α/16} for all but at most T^{1−α/16} steps, with probability at least 1 − 3T^{−α/4}.
Proof: As in Lemma 9, we may, without loss of generality, restrict s_i to the interval [0, T^{α/4} − 1]. Then s_i behaves like a slightly biased random walk on all but the T^{1−α} steps for which s_{i−1} lies below T^α. Rather than consider the biased walk, however, we use the same technique as in Lemma 9 to reduce the problem to an unbiased random walk, by splitting each step into two phases. We give the adversary control on all steps in which a bin with remaining capacity i lies ahead of all bins with capacity i−1 after the first phase. On any step for which s_{i−1} ≥ T^α and s_i ≤ T^{α/4}, the probability that a bin with remaining capacity i lies ahead of all bins with capacity i−1 after the first phase is at most 1/T^{3α/4}. Hence, the expected number of such steps is at most T^{1−3α/4}, and by Markov's inequality, the number of such steps is at most T^{1−α/2} with probability at least 1 − T^{−α/4}. Let E be the event that there are no more than T^{1−α/2} such steps.

Conditioned on E, the adversary controls at most 2T^{1−α/2} steps: at most T^{1−α/2} from the above paragraph, and T^{1−α} from the steps where s_{i−1} < T^α. On all other steps the walk behaves like an unbiased random walk with p↑ = p↓ = 1/j. (Again, this is not quite true when s_i = 0, but this small discrepancy can be easily handled explicitly as described in Lemma 9; for convenience we dismiss the problem here.) We use this to bound the probability that s_i lies below T^{α/16} for more than T^{1−α/16} steps.

We first consider the moves controlled by the adversary. In the worst case, s_i begins at 0. By Lemma 10, there exists an optimal adversary strategy A₀ that uses a controlled step whenever s_i reaches T^{α/16} − 1 or T^{α/16}. Hence, to overestimate the effect of the adversary, we assume the following: the adversary uses its moves whenever s_i reaches T^{α/16}; each adversary move returns the walk to s_i = 0; and all steps until the adversary's moves are used count as steps where s_i ∈ [0, T^{α/16} − 1]. These assumptions can only increase the time until the adversary's moves are used. The expected time for s_i to reach T^{α/16} from 0 is cT^{α/8} for some constant c. Thus the expected number of steps until A₀ has used all of its moves is bounded by 2cT^{1−3α/8}. Let Z₁ be the number of steps until A₀ uses all of its moves. Then by Markov's inequality,

Pr[Z₁ ≥ T^{1−α/16}/2 | E] ≤ 4cT^{−5α/16} ≤ T^{−α/4}

for sufficiently large T.
After the adversary steps are used, the number of steps that s_i spends in the interval I = [0, T^{α/16} − 1] is stochastically dominated by that of an unbiased random walk U on [0, T^{α/4} − 1] that runs for T steps and begins at 0. Let Z₂ be the number of steps U spends in I. As in the proof of Lemma 9, the equilibrium distribution of U is uniform over [0, T^{α/4} − 1]. Thus π(I) = T^{−3α/16}. Using Lemmas 7 and 8, we obtain

Pr[Z₂ ≥ T^{1−α/16}/2] ≤ (T^{α/8} · T^{α/8}/2) exp(−T^{−3α/8} · T^{−α/2} · T/j) ≤ T^{−α/4}

for sufficiently large T.

Taking a union bound, we find that the probability that Z₁ + Z₂ ≥ T^{1−α/16} is at most 3T^{−α/4}, which proves the lemma.
We are now ready to prove the main theorem:
Theorem 14 Random Fit is stable under the distribution U{k−2,k} for all k ≥ 3.

Proof: As in our previous calculations, we first assume that k is odd. As in Theorem 1 of [11], it suffices to consider the drift of f(s) over a suitably large interval T, and show that it is negative for all but a finite number of states. The excluded set of states G will be

G = {s ∈ S : s_i ≤ T⁴ for all i},

where T will be determined. Consider any starting state outside of this set G. Applying Lemma 9 and then Lemma 13 inductively, we find that with probability at least 1 − (c₁/T^{ε₁}), s_{⌈j/2⌉} > 0 over all but T^{ε₂} of the steps, for some constants c₁ and ε₁, ε₂ < 1 dependent only on j. Let E be the event that s_{⌈j/2⌉} > 0 over all but T^{ε₂} of the steps. As the expected value of f decreases by 1/j whenever s_{⌈j/2⌉} > 0, by Lemma 5, and increases by at most j otherwise,

E[f(T) − f(0) | f(0)] ≤ E[f(T) − f(0) | f(0) ∧ E] + (1 − Pr[E]) E[f(T) − f(0) | f(0) ∧ ¬E]   (5)

≤ −(1/j)(T − T^{ε₂}) + jT^{ε₂} + c₁jT^{1−ε₁}.   (6)

By choosing T sufficiently large, we may make this expression smaller than −δ for some constant δ. This suffices to prove the theorem, by Lemma 6.
If k is even, then there is a middle token s_{⌈j/2⌉+1}. If s_{⌈j/2⌉+1} = 0, everything is exactly as in the case where k is odd. If s_{⌈j/2⌉+1} > 0, then by Lemma 3, s_{⌈j/2⌉+1} = 1 and no bins with larger capacity are open. We consider the time steps when s_{⌈j/2⌉+1} = 1. In these steps f might increase, because a small item may be inserted in the bin of capacity ⌈j/2⌉+1. Lemmas 9 and 13, which apply when k is even, give that with probability at least 1 − (c₁/T^{ε₁}), s_{⌈j/2⌉} > T^{1−ε₂} over all but T^{ε₂} of the steps, for some constants c₁ and ε₁, ε₂ < 1 dependent only on j. Hence it should be a very rare event for a small item to be placed into a bin of capacity ⌈j/2⌉+1.
In fact, in exactly the same manner as shown in Lemma 5, one may show the following:
Proposition 15 Suppose that k is even and sdj/2e>Z. ThenE[f(t+1)−f(t)|f(t)]≤−1/j+2/Z.
We conclude that in this case

E[f(T) − f(0) | f(0)] ≤ E[f(T) − f(0) | f(0) ∧ E] + (1 − Pr[E]) E[f(T) − f(0) | f(0) ∧ ¬E]   (7)
                      ≤ (−1/j + 2/T^{1−ε_2})(T − T^{ε_2}) + j T^{ε_2} + c_1 j T^{1−ε_1}.   (8)
This expression can also be bounded by −δ if T is chosen large enough.
One may check that, from the inductive use of Lemma 13, the ε_2 in Theorem 14 is exponentially small in j, and
hence our bound on the expected waste is doubly exponential in j. It is an interesting question whether
better bounds are possible.
It is also worthwhile to note the following:
Theorem 16 Random Fit(d) for d ≥ 2 is stable under the distribution U{k−2,k} for all k ≥ 3.
The proof is entirely similar to that for Random Fit. Simulations suggest that as d increases, the behavior of
Random Fit(d) rapidly approaches that of Best Fit, as one might expect.
Theorem 17 Random Fit and Random Fit(d), for d ≥ 2, have expected waste o(n) under the distributions
U{k−1,k} and U{k,k}, for all k ≥ 3.
Proof: We only consider the distribution U{k−1,k}, as the waste under the distribution U{k,k} is entirely
similar. Under this distribution, the statement corresponding to Lemma 5 is that if s_{⌈j/2⌉}(t) > 0, then
E[f(t+1) − f(t) | f(t)] = 0. Using the same notation as in the proof of Theorem 14, we obtain

E[f(T) − f(0) | f(0)] ≤ j T^{ε_2} + c_1 j T^{1−ε_1}
for some constants c_1 and ε_1, ε_2 < 1 dependent only on j. Hence, once the expected waste reaches a certain
constant, its expected growth is sublinear, proving the theorem.
Whether tighter bounds, more like those known for Best Fit and First Fit, are possible for Random Fit under
these distributions remains an open question.
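Theorem 17 can be illustrated by direct simulation in the discrete model, where bins have integer capacity k and, under U{k−1,k}, item sizes are uniform on {1, ..., k−1}. The sketch below implements Random Fit by shuffling the open bins at every step; the parameters (k = 5, n = 10000, the seed) are arbitrary illustrative choices, and the measured waste should come out well below linear in n.

```python
# Monte Carlo sketch of Random Fit in the discrete model: bins of integer
# capacity k, items uniform on {1,...,k-1} (the U{k-1,k} distribution).
# Random Fit scans the open bins in a fresh random order at each step and
# places the item in the first bin that fits, opening a new bin otherwise.
# Per Theorem 17 the waste should be sublinear; k, n, and the seed are
# arbitrary choices for illustration.
import random

def random_fit_waste(n, k, rng):
    gaps = []                      # remaining capacities of open, non-full bins
    bins_opened = 0
    total = 0
    for _ in range(n):
        item = rng.randint(1, k - 1)
        total += item
        rng.shuffle(gaps)          # a fresh independent order every step
        for idx, g in enumerate(gaps):
            if g >= item:
                if g == item:
                    gaps.pop(idx)  # bin is now full; it is never reused
                else:
                    gaps[idx] = g - item
                break
        else:
            bins_opened += 1       # no open bin fits: open a new one
            gaps.append(k - item)  # item < k, so the new bin is non-full
    return bins_opened * k - total # total empty space over all bins used

rng = random.Random(12345)
n = 10000
waste = random_fit_waste(n, 5, rng)
print(waste)
```

Since every item has size at most k−1, it always fits in a fresh bin, so all items are placed; the waste is simply the empty space summed over all bins ever opened.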
4 Analysis of First Fit under distribution U{k−2,k}
We now consider how to modify the proof of RF on the distribution U{k−2,k} to work for First Fit. Again
we focus on the case where k is odd; the case where k is even requires some minor additional work, as for
Random Fit, which we omit here.
One way of thinking about the difficulty in extending the results from RF to FF is to consider the dependence
among the steps. In RF, at each step we have an independent random ordering assigned to the bins, while
in FF, the orders of the bins at different steps are clearly dependent. In particular, the order of the bins at
each step depends on the initial state, over which we have negligible control. The work of this section will
focus on finding ways to circumvent the effects of these dependencies so that we can apply the same ideas
that we used in Section 3.
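The contrast between the two scan orders can be made concrete in a few lines. This is a minimal sketch in the discrete model (integer bin capacity k); the function name and the sample input are our own inventions.

```python
# The two placement rules differ only in the order in which open bins are
# examined: First Fit always scans bins in creation order (so scan orders
# at different steps are highly dependent), while Random Fit draws a fresh
# independent order at every step.
import random

def pack(items, k, rule, rng=None):
    """rule is 'FF' or 'RF'; returns remaining capacities in creation order."""
    gaps = []                                   # gap of each bin, creation order
    for item in items:
        order = list(range(len(gaps)))
        if rule == 'RF':
            rng.shuffle(order)                  # independent order every step
        for idx in order:                       # 'FF' keeps creation order
            if gaps[idx] >= item:
                gaps[idx] -= item
                break
        else:
            gaps.append(k - item)               # no bin fits: open a new bin
    return gaps

items = [3, 3, 1, 2, 4]
print(pack(items, 5, 'FF'))                     # deterministic
print(pack(items, 5, 'RF', random.Random(0)))   # depends on the random orders
```

For any run of either rule, the invariant (number of bins)·k − (total item size) = (total empty space) holds, which the test below exploits for the randomized case.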
Let us consider an initial state, given at time t = 0. In order to avoid problems caused by the order of bins
in the initial state, we focus on bins that are created after time 0. In fact, we are even more restrictive: let a
single i bin at time t be a bin created after time 0 that has remaining capacity i and contains only one item,
and denote the number of single i bins by u_i(t). Instead of the vector s we considered previously, we shall
primarily consider the vector u = (u_1, ..., u_{⌈j/2⌉}). The following important points about u make it useful:
•If u_{⌈j/2⌉} > 0, then s_{⌈j/2⌉} > 0 also. Hence, proving u_{⌈j/2⌉} > 0 over most of the steps is sufficient.
•Regardless of the initial state, (u_1, ..., u_{⌈j/2⌉}) = (0, ..., 0) at time 0.
To see how considering u makes things easier, let us prove a lemma similar to Lemma 9 for First Fit.
Lemma 18 Suppose s_i(0) ≥ T. Then when u_{i+1} > 0, u_{i+1} behaves like a random walk with probability
at least 1/j of increasing at each step and probability at most 1/j of decreasing at each step. Also, the
time spent by u_{i+1} on each visit to 0 is stochastically dominated by a random variable D with constant
expectation (that depends only on j). In particular, u_{i+1} ≥ T^{1/16} for all but at most T^{15/16} steps with
probability at least 1 − 1/T^2.
Proof: Since s_i(0) ≥ T, over the next T steps there is always a bin with remaining capacity i ahead of all
single bins with remaining capacity i+1 created after time 0. This implies that u_{i+1} can decrease only when
an item of size i+1 arrives, and hence decreases with probability at most 1/j at each step. When u_{i+1} > 0,
then u_{i+1} increases whenever an item of size k−i−1 arrives, and hence it increases with probability at least
1/j. The case where u_{i+1} = 0 is special, and is handled as in Lemma 4. The final result, that u_{i+1} ≥ T^{1/16}
most of the time, now follows using an argument similar to Lemma 9.
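The random-walk behavior in Lemma 18 can be illustrated empirically: a lazy walk that moves up with probability 1/j and down with probability 1/j (never going below 0) spends few of its T steps below T^{1/16}, in the spirit of the "all but T^{15/16} steps" conclusion. The values of j, T, and the seed below are arbitrary illustrative choices.

```python
# Sketch of the walk in Lemma 18: up with probability 1/j, down with
# probability 1/j (reflected at 0), lazy otherwise.  We count how many of
# the T steps find the walk below T^(1/16); this count should be far
# smaller than T^(15/16).  j, T, and the seed are arbitrary.
import random

def walk_low_steps(T, j, seed):
    rng = random.Random(seed)
    pos, low = 0, 0
    threshold = T ** (1 / 16)
    for _ in range(T):
        if pos < threshold:
            low += 1
        r = rng.random()
        if r < 1 / j:
            pos += 1
        elif r < 2 / j and pos > 0:
            pos -= 1
    return low

T, j = 16 ** 4, 4
low = walk_low_steps(T, j, seed=7)
print(low, T ** (15 / 16))
```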
As in the proof for RF, we now want to extend the above lemma inductively. Similar to the RF case, we
would like to say that a bin of size i lies ahead of all single i+1 bins most of the time, whenever the number
of single i+1 bins is sufficiently small. In Lemma 13, we accomplished this by splitting each step into two
substeps, with the first substep re-ordering the bins randomly. We do not have this luxury in the FF case.
However, it seems intuitive that the bins should be “almost” randomly distributed at each step. This point is
made explicit in the following lemma:
Lemma 19 Let E be the event that a single i+1 bin at time t lies ahead of all single i bins. Let
z_t^{b,c} = Pr{E | u_i(t) = b, u_{i+1}(t) = c}. Then z_t^{b,c} ≤ c/(b+c).
Proof: Consider any sequence a = a_1, a_2, ..., a_t of t items that ends with a single i+1 bin ahead of all
single i bins, with u_i(t) = b and u_{i+1}(t) = c. We focus on the steps at which the single i and i+1 bins were
created. We first claim that if a single i bin was created at step g and a single i+1 bin was created at step
h, then switching the entering items at steps g and h switches the order of these two bins, but has no other
effect on the algorithm. This can easily be proven by induction for all bins behind the first single i+1 bin,
since there is no way a second item could have been placed in any of these bins. The only difficult case is
that of the first single i+1 bin, call it B. The reason that B is a special case is that, since B is the frontmost
single bin, a second item could possibly have been placed in it if we change its capacity. However, since
switching the appropriate steps g and h would only lower the capacity of B, it is clear that if B did not
obtain a second item in the original sequence, it cannot obtain one in the modified sequence either.
We now divide the sequences into equivalence classes. For a sequence a, let Y_t^i(a) be the set of times at
which the single i bins at time t were created. Two sequences a and a′ are equivalent if
Y_t^i(a) ∪ Y_t^{i+1}(a) = Y_t^i(a′) ∪ Y_t^{i+1}(a′) and u_i(t) = b, u_{i+1}(t) = c for both sequences.
Take any sequence a with a single i+1 bin ahead of all single i bins at time t. From the above paragraph,
permuting the times at which a single i+1 bin and a single i bin were created yields equivalent sequences.
Hence, by taking all ways of splitting Y_t^i(a) ∪ Y_t^{i+1}(a) into two groups of sizes b and c, and using this
division to determine when single i and i+1 bins are created, we find that every sequence a has at least
(b+c choose b) sequences in its equivalence class. Since a and any of these other (b+c choose b) sequences
occur with equal probability, it is straightforward to show combinatorially that there are at least b/c times
as many sequences with a single i bin ahead of all single i+1 bins as there are with a single i+1 bin ahead
of all single i bins. Hence z_t^{b,c} ≤ c/(b+c).
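The counting step can be checked exhaustively for small b and c: fix b + c creation times, enumerate all (b+c choose b) labellings within an equivalence class, and observe that the fraction of labellings in which the earliest-created (hence frontmost) single bin is an i+1 bin is exactly c/(b+c). The function name and parameters below are our own.

```python
# Toy check of the counting in Lemma 19: label b+c creation times with
# b single i bins and c single i+1 bins, all labellings equally likely;
# the frontmost single bin is the earliest-created one.
from itertools import combinations
from math import comb
from fractions import Fraction

def fraction_i_plus_1_ahead(b, c):
    """Fraction of labellings of b+c creation times (b single i bins,
    c single i+1 bins) whose earliest time carries a single i+1 bin."""
    times = list(range(b + c))                  # creation times, earliest first
    ahead = sum(1 for ip1 in combinations(times, c) if min(ip1) == 0)
    return Fraction(ahead, comb(b + c, c))

print(fraction_i_plus_1_ahead(3, 2))   # c/(b+c) = 2/5
```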
Lemma 19 suggests that the behavior of FF should be no worse than that of RF, with the understanding that
the u_i now play the role of the s_i. As in the case of RF, we would like to say that the small tokens u_i behave
like an unbiased random walk over most of the steps. This leads us to prove a variant of Lemma 13 in this
setting, which is phrased slightly differently in order to handle the conditioning appropriately.
Lemma 20 Suppose that, over a period of T steps, u_{i−1} ≥ T^α over all but at most T^{1−α} steps for some
α ≤ 1/16, with probability at least 1/2. Then, conditioned on u_{i−1} ≥ T^α over all but at most T^{1−α} steps,
u_i ≥ T^{α/16} for all but at most T^{1−α/16} steps with probability at least 1 − 4T^{−α/4}.
Proof: As in Lemma 13, we must bound the number of steps for which the behavior of u_i is not that of
an unbiased random walk, and then apply an adversary argument. Also as in Lemma 13, we will restrict
our consideration of the behavior of u_i to the interval [0, T^{α/4} − 1]. (This can be interpreted as saying
that whenever u_i ≥ T^{α/4}, we may assume that a single i bin lies ahead of all single i−1 bins, which is a
conservative assumption.)
To bound the number of steps the adversary controls, then, we bound the number of steps X that satisfy the
following conditions:
•u_{i−1} ≥ T^α.
•u_i ≤ T^{α/4} − 1.
•A single i bin lies ahead of all single i−1 bins.
The value of X, in addition to the number of steps for which u_{i−1} < T^α, bounds the number of steps where
the adversary controls the walk; on all other steps, we either have that u_i ≥ T^{α/4} or a single i−1 bin lies
in front of all single i bins, and so u_i behaves (at worst) as an unbiased random walk with p↑ = p↓ = 1/j.
(As usual, we ignore the discrepancy at u_i = 0.)
Let y_t be the probability that on the t-th step the above conditions hold. Then

E[X] = E[ Σ_{t=0}^{T−1} y_t ] = Σ_{t=0}^{T−1} E[y_t] ≤ Σ_{t=0}^{T−1} T^{α/4} / (T^α + T^{α/4}) < T^{1−3α/4}.
Although it would seem that this is enough to bound the number of adversary steps, we must be careful. Let
E be the event that u_{i−1} ≥ T^α over all but T^{1−α} steps. The expected number of additional adversary steps
from single i bins being frontmost is not E[X], but E[X | E]. From the hypothesis of the lemma that
Pr(E) ≥ 1/2, however, we must have E[X | E] ≤ 2T^{1−3α/4}. Using Markov’s inequality, we have

Pr( X ≥ T^{1−α/2} | E ) ≤ 2T^{−α/4}.

Hence, conditioned on E, the number of steps the adversary controls is at most 2T^{1−α/2} with probability at
least 1 − 2T^{−α/4}. The rest of the proof now proceeds as in Lemma 13.
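The exponent arithmetic in the Markov step can be verified mechanically: if E[X] ≤ T^{1−3α/4} and Pr(E) ≥ 1/2, then E[X | E] ≤ 2T^{1−3α/4}, and Markov's inequality at the threshold T^{1−α/2} gives exactly 2T^{−α/4}. The specific T and α below are arbitrary (the lemma only requires α ≤ 1/16).

```python
# Exponent check for the Markov step:
#   2 * T^(1 - 3a/4) / T^(1 - a/2) = 2 * T^(-a/4).
T, a = 10.0 ** 40, 1 / 16
ex_bound = T ** (1 - 3 * a / 4)          # bound on E[X]
cond_bound = 2 * ex_bound                # bound on E[X | E], using Pr(E) >= 1/2
markov = cond_bound / T ** (1 - a / 2)   # Markov's inequality at T^(1 - a/2)
print(markov, 2 * T ** (-a / 4))
```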
We are now ready to prove the main theorem:
Theorem 21 First Fit is stable under the distribution U{k−2,k} for all k ≥ 3.
Proof: As in Theorem 14, it suffices to consider the drift of f(s) over a suitably large interval T, and show
that it is negative for all but a finite number of states. The excluded set of states G will be

G = {s ∈ S : ∀i, s_i ≤ T},

for some suitably large T. We now apply Lemma 18 and Lemma 20 to obtain a bound on
E[f(T) − f(0) | f(0)] similar to that in Theorem 14.
We would then like to apply Lemma 6; however, technically we cannot do so, as the state space of the
underlying Markov chain is not embedded in a fixed dimensional space. Similar results, however, can be
applied in this setting, once we have shown that the expected change in the waste f is negative for a suitably
large T. For example, [14, Theorem 13.0.1] can be used to show that the chain is ergodic, and [6, Theorem
3.1] implies that in the stationary distribution, the distribution of the waste has an exponentially decreasing
tail.
5 Conclusions
We have demonstrated that the First Fit bin packing algorithm is stable on the distribution U{k−2,k}. We
believe that our result demonstrates that the Markov chain approach may be useful, even in situations where
the natural description of a problem does not have a convenient state space. Our analysis made use of insight
gained from a novel packing algorithm, Random Fit, which appears interesting in its own right.
An open question is to tighten the bounds developed in this paper. For both First Fit and Random Fit, our
bounds for the expected waste are doubly exponential in j. Simulations suggest that the expected waste for
First Fit may only be exponential in j [9]. Unfortunately, the simulations for Random Fit seem to suggest
that the expected waste for Random Fit may indeed be doubly exponential in j, in which case it seems that
another approach may be necessary to achieve better bounds for First Fit.
References
[1] Y. Azar, A. Broder, A. Karlin, and E. Upfal. Balanced allocations. In Proc. 26th Annual ACM Symposium
on Theory of Computing, pages 593–602, 1994.
[2] E.G. Coffman, C. Courcoubetis, M.R. Garey, D.S. Johnson, L.A. McGeoch, P.W. Shor, R.R. Weber,
and M. Yannakakis. Fundamental discrepancies between average-case analyses under discrete and
continuous distributions: A bin packing case study. In Proc. 23rd Annual ACM Symposium on Theory
of Computing, pages 230–240, 1991.
[3] E.G. Coffman Jr., M.R. Garey and D.S. Johnson. Approximation algorithms for bin packing: A survey.
In Approximation Algorithms for NP-hard Problems, edited by D. Hochbaum, PWS, 1996.
[4] E.G. Coffman Jr., D.S. Johnson, P.W. Shor, and R.R. Weber. Markov chains, computer proofs, and
average-case analysis of Best Fit bin packing. In Proc. 25th Annual ACM Symposium on Theory of
Computing, pages 412–421, 1993.
[5] G. Fayolle, V.A. Malyshev, and M.V. Menshikov. Topics in the Constructive Theory of Countable
Markov Chains. Cambridge University Press, 1995.
[6] B. Hajek. Hitting-time and occupation-time bounds implied by drift analysis with applications. Ad-
vances in Applied Probability, 14:502–525, 1982.
[7] D.S. Johnson. Near-Optimal Bin Packing Algorithms. PhD Thesis, Massachusetts Institute of Tech-
nology, Department of Mathematics, 1973.
[8] D.S. Johnson. Fast algorithms for bin packing. Journal of Computer and System Sciences, 8:272–314,
1974.
[9] D.S. Johnson. Private communication, 1996.
[10] D.S. Johnson, A. Demers, J.D. Ullman, M.R. Garey and R.L. Graham. Worst case bounds for simple
one-dimensional packing algorithms. SIAM Journal on Computing, 3:299–325, 1974.
[11] C. Kenyon, Y. Rabani, and A. Sinclair. Biased random walks, Lyapunov functions, and stochastic
analysis of Best Fit bin packing. In Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms,
pages 351–358, 1996. Submitted to Journal of Algorithms.
[12] N. Kahale. Large Deviation Bounds for Markov Chains. DIMACS Technical Report 94-39.
[13] T. Leighton and P. Shor. Tight bounds for minimax grid matching with applications to average case
analysis of algorithms. Combinatorica, 9:161–187, 1989.
[14] S. Meyn and R. Tweedie. Markov Chains and Stochastic Stability, Springer-Verlag, 1996.
[15] M. Mitzenmacher. Load balancing and density dependent jump Markov processes. In Proc. 37th IEEE
Annual Symposium on Foundations of Computer Science, 1996.
[16] P.W. Shor. The average-case analysis of some on-line algorithms for bin packing. Combinatorica,
6:179–200, 1986.