A Practitioner's Guide To Estimation Of Random Coefficients Logit S Demand BLP
A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of Demand

Aviv Nevo
University of California–Berkeley, Berkeley, CA 94720-3880 and NBER

Estimation of demand is at the heart of many recent studies that examine questions of market power, mergers, innovation, and valuation of new brands in differentiated-products markets. This paper focuses on one of the main methods for estimating demand for differentiated products: random-coefficients logit models. The paper carefully discusses the latest innovations in these methods with the hope of increasing the understanding, and therefore the trust, among researchers who have never used them, and reducing the difficulty of their use, thereby aiding in realizing their full potential.

1. Introduction

Estimation of demand has been a key part of many recent studies examining questions regarding market power, mergers, innovation, and valuation of new brands in differentiated-product industries.1 This paper explains the random-coefficients (or mixed) logit methodology for estimating demand in differentiated-product markets using market-level data. This methodology can be used to estimate the demand for a large number of products using market data and allowing for the endogeneity of price. While this method retains the benefits of alternative discrete-choice models, it produces more realistic demand elasticities. With better estimates of demand, we can, for example, better judge market power, simulate the effects of mergers, measure the benefits from new goods, or formulate innovation and competition policy. This paper carefully discusses the recent innovations in these methods with the intent of reducing the barriers to entry and increasing the trust in these methods among researchers who are not familiar with them.

An earlier version of this paper circulated under the title “A Research Assistant’s Guide to Random Coefficients Discrete Choice Model of Demand.” I wish to thank Steve Berry, Iain Cockburn, Bronwyn Hall, Ariel Pakes, various lecture and seminar participants, and an anonymous referee for comments, discussions, and suggestions. Financial support from the UC Berkeley Committee on Research Junior Faculty Grant is gratefully acknowledged.

1. Just to mention some examples, Bresnahan (1987) studies the 1955 price war in the automobile industry; Gasmi et al. (1992) empirically study collusive behavior in a soft-drink market; Hausman et al. (1994) study the beer industry; Berry et al. (1995, 1999) examine equilibrium in the automobile industry and its implications for voluntary trade restrictions; Goldberg (1995) uses estimates of the demand for automobiles to investigate trade policy issues; Hausman (1996) studies the welfare gains generated by a new brand of cereal; Berry et al. (1996) study hubs in the airline industry; Bresnahan et al. (1997) study rents from innovation in the computer industry; Nevo (2000a, b) examines price competition and mergers in the ready-to-eat cereal industry; Davis (1998) studies spatial competition in movie theaters; and Petrin (1999) studies the welfare gains from the introduction of the minivan.

© 2000 Massachusetts Institute of Technology. Journal of Economics & Management Strategy, Volume 9, Number 4, Winter 2000, 513–548

Probably the most straightforward approach to specifying demand for a set of closely related but not identical products is to specify a system of demand equations, one for each product. Each equation specifies the demand for a product as a function of its own price, the price of other products, and other variables. An example of such a system is the linear expenditure model (Stone, 1954), in which quantities are linear functions of all prices.
Subsequent work has focused on specifying the relation between prices and quantities in a way that is both flexible (i.e., allows for general substitution patterns) and consistent with economic theory.2

2. Examples include the Rotterdam model (Theil, 1965; Barten, 1966), the translog model (Christensen et al., 1975), and the almost ideal demand system (Deaton and Muellbauer, 1980).

Estimating demand for differentiated products adds two additional nontrivial concerns. The first is the large number of products, and hence the large number of parameters to be estimated. Consider, for example, a constant-elasticity or log–log demand system, in which logarithms of quantities are linear functions of logarithms of all prices. Suppose we have 100 differentiated products; then without additional restrictions this implies estimating at least 10,000 parameters (100 demand equations, one for each product, with 100 prices in each). Even if we impose symmetry and adding-up restrictions, implied by economic theory, the number of parameters will still be too large to estimate. The problem becomes even harder if we want to allow for more general substitution patterns.

3. For an example of a representative consumer approach to demand for differentiated products see Dixit and Stiglitz (1977) or Spence (1976).

An additional problem, introduced when estimating demand for differentiated products, is the heterogeneity in consumer tastes: if all consumers were identical, we would not observe the level of differentiation we see in the marketplace. One could assume that preferences are of the right form [the Gorman form: see Gorman (1959)], so that an aggregate, or average, consumer exists and has a demand function that satisfies the conditions specified by economic theory.3 However, the required assumptions are strong and for many applications seem to be empirically false.
The difference between an aggregate model and a model that explicitly reflects individual heterogeneity can have profound effects on economic and policy conclusions.

The logit demand model (McFadden, 1973)4 solves the dimensionality problem by projecting the products onto a space of characteristics, making the relevant size the dimension of this space and not the square of the number of products. A problem with this model is the strong implication of some of the assumptions made. Due to the restrictive way in which heterogeneity is modeled, substitution between products is driven completely by market shares and not by how similar the products are. Extensions of the basic logit model relax these restrictive assumptions, while maintaining the advantage of the logit model in dealing with the dimensionality problem. The essential idea is to explicitly model heterogeneity in the population and estimate the unknown parameters governing the distribution of this heterogeneity. These models have been estimated using both market- and individual-level data.5 The problem with the estimation is that it treats the regressors, including price, as exogenously determined. This is especially problematic when aggregate data is used to estimate the model.

This paper describes recent developments in methods for estimating random-coefficients discrete-choice models of demand [Berry, 1994; Berry et al., 1995 (henceforth BLP)]. The new method maintains the advantage of the logit model in handling a large number of products. It is superior to prior methods because (1) the model can be estimated using only market-level price and quantity data, (2) it deals with the endogeneity of prices, and (3) it produces demand elasticities that are more realistic; for example, cross-price elasticities are larger for products that are closer together in terms of their characteristics.

The rest of the paper is organized as follows.
Section 2 describes a model that encompasses, with slight alterations, the models previously used in the literature. The focus is on the various modeling assumptions and their implications for estimation and the results. In Section 3 I discuss estimation, including the data required, an outline of the algorithm, and instrumental variables. Many of the nitty-gritty details of estimation are described in an appendix (available from http://elsa.berkeley.edu/~nevo). Section 4 provides a brief example of the type of results the estimation can produce. Section 5 concludes and discusses various extensions and alternatives to the method described here.

4. A related literature is the characteristics approach to demand, or the address approach (Lancaster, 1966, 1971; Quandt, 1968; Rosen, 1974). For a recent exposition of it and a proof of its equivalence to the discrete choice approach see Anderson et al.

5. For example, the generalized extreme-value model (McFadden, 1978) and the random-coefficients logit model (Cardell and Dunbar, 1980; Boyd and Mellman, 1980; Tardiff, 1980; Cardell, 1989; and references therein). The random-coefficients model is often called the hedonic demand model in this earlier literature; it should not be confused with the hedonic price model (Court, 1939; Griliches, 1961).

2. The Model

In this section I discuss the model with an emphasis on the various modeling assumptions and their implications. In the next section I discuss the estimation details. However, for now I want to stress two points. First, the method I discuss here uses (market-level) price and quantity data for each product, in a series of markets, to estimate the model.
Some information regarding the distribution of consumer characteristics might be available, but a key benefit of this methodology is that we do not need to observe individual consumer purchase decisions to estimate the demand parameters.6 Second, the estimation allows prices to be correlated with the econometric error term. This will be modeled in the following way. A product will be defined by a set of characteristics. Producers and consumers are assumed to observe all product characteristics. The researcher, on the other hand, is assumed to observe only some of the product characteristics. Each product will be assumed to have a characteristic that influences demand but that either is not observed by the researcher or cannot be quantified into a variable that can be included in the analysis. Examples are provided below. The unobserved characteristics will be captured by the econometric error term. Since the producers know these characteristics and take them into account when setting prices, this introduces the econometric problem of endogenous prices.7 The contribution of the estimation method presented below is to transform the model in such a way that instrumental-variable methods can be used.

6. If individual decisions are observed, the method of analysis differs somewhat from the one presented here. For clarity of presentation I defer discussion of this case to Section 5.

7. The assumption that when setting prices firms take account of the unobserved (to the econometrician) characteristics is just one way to generate correlation between prices and these unobserved variables. For example, correlation can also result from the mechanics of the consumer’s optimization problem (Kennan, 1989).

2.1 The Setup

Assume we observe $t = 1, \dots, T$ markets, each with $i = 1, \dots, I_t$ consumers. For each such market we observe aggregate quantities, average prices, and product characteristics for $J$ products.8 The definition of a market will depend on the structure of the data. BLP use annual automobile sales over a period of twenty years, and therefore define a market as the national market for year $t$, where $t = 1, \dots, 20$. On the other hand, Nevo (2000a) observes data in a cross section of cities over twenty quarters, and defines a market $t$ as a city–quarter combination, with $t = 1, \dots, 1124$. Yet a different example is given by Das et al. (1994), who observe sales for different income groups, and define a market as the annual sales to consumers of a certain income level.

The indirect utility of consumer $i$ from consuming product $j$ in market $t$, $U(x_{jt}, \xi_{jt}, p_{jt}, \tau_i; \theta)$,9 is a function of observed and unobserved (by the researcher) product characteristics, $x_{jt}$ and $\xi_{jt}$ respectively; price, $p_{jt}$; individual characteristics, $\tau_i$; and unknown parameters, $\theta$. I focus on a particular specification of this indirect utility,10

$$u_{ijt} = \alpha_i (y_i - p_{jt}) + x_{jt}\beta_i + \xi_{jt} + \varepsilon_{ijt}, \qquad i = 1, \dots, I_t, \quad j = 1, \dots, J, \quad t = 1, \dots, T, \tag{1}$$

where $y_i$ is the income of consumer $i$, $p_{jt}$ is the price of product $j$ in market $t$, $x_{jt}$ is a $K$-dimensional (row) vector of observable characteristics of product $j$, $\xi_{jt}$ is the unobserved (by the econometrician) product characteristic, and $\varepsilon_{ijt}$ is a mean-zero stochastic term. Finally, $\alpha_i$ is consumer $i$’s marginal utility from income, and $\beta_i$ is a $K$-dimensional (column) vector of individual-specific taste coefficients.

Observed characteristics vary with the product being considered. BLP examine the demand for cars, and include as observed characteristics horsepower, size, and air conditioning. In estimating demand for ready-to-eat cereal Nevo (2000a) observes calories, sodium, and fiber content.
Unobserved characteristics, for example, can include the impact of unobserved promotional activity, unquantifiable factors (brand equity), or systematic shocks to demand. Depending on the structure of the data, some components of the unobserved characteristics can be captured by dummy variables. For example, we can model $\xi_{jt} = \xi_j + \xi_t + \Delta\xi_{jt}$ and capture $\xi_j$ and $\xi_t$ by brand- and market-specific dummy variables.

8. For ease of exposition I have assumed that all products are offered to all consumers in all markets. The methods described below can easily deal with the case where the choice set differs between markets and also with different choice sets for different consumers.

9. This is sometimes called the conditional indirect utility, i.e., the indirect utility conditional on choosing this option.

10. The methods discussed here are general and with minor adjustments can deal with different functional forms.

Implicit in the specification given by equation (1) are three things. First, this form of the indirect utility can be derived from a quasilinear utility function, which is free of wealth effects. For some products (for example, ready-to-eat cereals) this is a reasonable assumption, but for other products (for example, cars) it is an unreasonable one. Including wealth effects alters the way the term $y_i - p_{jt}$ enters equation (1). For example, BLP build on a Cobb-Douglas utility function to derive an indirect utility that is a function of $\log(y_i - p_{jt})$. In principle, we can include $f(y_i - p_{jt})$, where $f(\cdot)$ is a flexible functional form (Petrin, 1999).

Second, equation (1) specifies that the unobserved characteristic, which among other things captures the elements of vertical product differentiation, is identical for all consumers. Since the coefficient on price is allowed to vary among individuals, this is consistent with the theoretical literature of vertical product differentiation.
An alternative is to model the distribution of the valuation of the unobserved characteristics, as in Das et al. (1994). As long as we have not made any distributional assumptions on consumer-specific components (i.e., anything with subscript $i$), their model is not more general. Once we make such assumptions, their model has slightly different implications for some of the normalizations usually made. An exact discussion of these implications is beyond the scope of this paper.

Finally, the specification in equation (1) assumes that all consumers face the same product characteristics. In particular, all consumers are offered the same price. Depending on the data, if different consumers face different prices, using either a list or average transaction price will lead to measurement error bias. This is another reason why prices might be correlated with the error term, and it motivates the instrumental-variable procedure discussed below.11

11. However, as noted by Berry (1994), the method proposed below can deal with measurement error only if the variable measured with error enters in a restrictive way. Namely, it only enters the part of utility that is common to all consumers, $\delta_{jt}$ in equation (3) below.

The next component of the model describes how consumer preferences vary as a function of the individual characteristics, $\tau_i$. In the context of equation (1) this amounts to modeling the distribution of consumer taste parameters. The individual characteristics consist of two components: demographics, which I refer to as observed, and additional characteristics, which I refer to as unobserved, denoted $D_i$ and $\nu_i$ respectively. Given that no individual data is observed, neither component of the individual characteristics is directly observed in the choice data set. The distinction between them is that even though we do not observe individual data, we know something about the distribution of the demographics, $D_i$, while for the additional characteristics, $\nu_i$, we have no such information. Examples of demographics are income, age, family size, race, and education. Examples of the type of information we might have include a large sample we can use to estimate some feature of the distribution (e.g., we could use Census data to estimate the mean and standard deviation of income). Alternatively, we might have a sample from the joint distribution of several demographic variables (e.g., the Current Population Survey might tell us about the joint distribution of income, education, and age in different cities in the US). The additional characteristics, $\nu_i$, might include things like whether the individual owns a dog, a characteristic that might be important in the decision of which car to buy, yet even very detailed survey data will usually not include this fact.

Formally, this will be modeled as

$$\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \Pi D_i + \Sigma \nu_i, \qquad \nu_i \sim P_\nu^*(\nu), \quad D_i \sim \hat{P}_D^*(D), \tag{2}$$

where $D_i$ is a $d \times 1$ vector of demographic variables, $\nu_i$ captures the additional characteristics discussed in the previous paragraph, $P_\nu^*(\cdot)$ is a parametric distribution, $\hat{P}_D^*(\cdot)$ is either a nonparametric distribution known from other data sources or a parametric distribution with the parameters estimated elsewhere, $\Pi$ is a $(K + 1) \times d$ matrix of coefficients that measure how the taste characteristics vary with demographics, and $\Sigma$ is a $(K + 1) \times (K + 1)$ matrix of parameters.12 If we assume that $P_\nu^*(\cdot)$ is a standard multivariate normal distribution, as I do in the example below, then the matrix $\Sigma$ allows each component of $\nu_i$ to have a different variance and allows for correlation between these characteristics. For simplicity I assume that $\nu_i$ and $D_i$ are independent. Equation (2) assumes that demographics affect the distribution of the coefficients in a fairly restrictive linear way.
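As a rough illustration of how equation (2) maps one consumer's demographics and unobserved characteristics into taste coefficients, the following sketch computes $(\alpha_i, \beta_i)$ for a single simulated consumer. It is not taken from the paper: the function name `draw_tastes`, all numerical values, and the use of a standard normal draw as a stand-in for a demographic draw are assumptions made for the example, and only the Python standard library is used.

```python
import random

def draw_tastes(theta1, Pi, Sigma, D_i, nu_i):
    """Return [alpha_i, beta_i1, ..., beta_iK] = theta1 + Pi D_i + Sigma nu_i."""
    n = len(theta1)
    return [
        theta1[r]
        + sum(Pi[r][c] * D_i[c] for c in range(len(D_i)))
        + sum(Sigma[r][c] * nu_i[c] for c in range(n))
        for r in range(n)
    ]

# Hypothetical values: K = 2 characteristics, d = 1 demographic variable.
theta1 = [2.0, 1.0, 0.5]        # mean coefficients (alpha, beta_1, beta_2)
Pi = [[-0.5], [0.0], [0.2]]     # (K+1) x d
Sigma = [[0.3, 0.0, 0.0],
         [0.0, 0.4, 0.0],
         [0.0, 0.0, 0.0]]       # (K+1) x (K+1); a zero row switches off unobserved heterogeneity

rng = random.Random(0)
D_i = [rng.gauss(0.0, 1.0)]                     # stand-in for a draw from the demographic distribution
nu_i = [rng.gauss(0.0, 1.0) for _ in range(3)]  # standard normal draws
tastes_i = draw_tastes(theta1, Pi, Sigma, D_i, nu_i)
```

With $D_i$ and $\nu_i$ set to zero the function returns the mean coefficients, which is one way to see that $\Pi$ and $\Sigma$ govern only deviations from the mean.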
For those coefficients that are most important to the analysis (e.g., the coefficients on price), relaxing the linearity assumption could have important implications [for example, see the results reported in Nevo (2000a, b)].

As we will see below, the way we model heterogeneity has strong implications for the results. The advantage of letting the taste parameters vary with the observed demographics, $D_i$, is twofold. First, it allows us to include additional information, about the distribution of demographics, in the analysis. Furthermore, it reduces the reliance on parametric assumptions. Therefore, instead of letting a key element of the method, the distribution of the random coefficients, be determined by potentially arbitrary distributional assumptions, we bring in additional information.

12. To simplify notation I assume that all characteristics have random coefficients. This need not be the case. I return to this in the appendix, when I discuss the details of estimation.

The specification of the demand system is completed with the introduction of an outside good: the consumers may decide not to purchase any of the brands. Without this allowance, a homogeneous price increase (relative to other sectors) of all the products does not change quantities purchased. The indirect utility from this outside option is

$$u_{i0t} = \alpha_i y_i + \xi_{0t} + \pi_0 D_i + \sigma_0 \nu_{i0} + \varepsilon_{i0t}.$$

The mean utility from the outside good, $\xi_{0t}$, is not identified (without either making more assumptions or normalizing one of the inside goods). Also, the coefficients $\pi_0$ and $\sigma_0$ are not identified separately from coefficients on an individual-specific constant term in equation (1). The standard practice is to set $\xi_{0t}$, $\pi_0$, and $\sigma_0$ to zero, and since the term $\alpha_i y_i$ will eventually vanish (because it is common to all products), this is equivalent to normalizing the utility from the outside good to zero.

Let $\theta = (\theta_1, \theta_2)$ be a vector containing all the parameters of the model.
The vector $\theta_1 = (\alpha, \beta)$ contains the linear parameters, and the vector $\theta_2 = (\Pi, \Sigma)$ the nonlinear parameters.13 Combining equations (1) and (2), we have

$$u_{ijt} = \alpha_i y_i + \delta_{jt}(x_{jt}, p_{jt}, \xi_{jt}; \theta_1) + \mu_{ijt}(x_{jt}, p_{jt}, \nu_i, D_i; \theta_2) + \varepsilon_{ijt},$$
$$\delta_{jt} = x_{jt}\beta - \alpha p_{jt} + \xi_{jt}, \qquad \mu_{ijt} = [-p_{jt}, x_{jt}]\,(\Pi D_i + \Sigma \nu_i), \tag{3}$$

where $[-p_{jt}, x_{jt}]$ is a $1 \times (K + 1)$ (row) vector. The indirect utility is now expressed as a sum of three (or four) terms. The first term, $\alpha_i y_i$, is given only for consistency with equation (1) and will vanish, as we will see below. The second term, $\delta_{jt}$, which is referred to as the mean utility, is common to all consumers. Finally, the last two terms, $\mu_{ijt} + \varepsilon_{ijt}$, represent a mean-zero heteroskedastic deviation from the mean utility that captures the effects of the random coefficients.

13. The reasons for the names will become apparent below.

Consumers are assumed to purchase one unit of the good that gives the highest utility.14 Since in this model an individual is defined as a vector of demographics and product-specific shocks, $(D_i, \nu_i, \varepsilon_{i0t}, \dots, \varepsilon_{iJt})$, this implicitly defines the set of individual attributes that lead to the choice of good $j$. Formally, let this set be

$$A_{jt}(x_{\cdot t}, p_{\cdot t}, \delta_{\cdot t}; \theta_2) = \{ (D_i, \nu_i, \varepsilon_{i0t}, \dots, \varepsilon_{iJt}) \mid u_{ijt} \ge u_{ilt} \ \ \forall\, l = 0, 1, \dots, J \},$$

where $x_{\cdot t} = (x_{1t}, \dots, x_{Jt})'$, $p_{\cdot t} = (p_{1t}, \dots, p_{Jt})'$, and $\delta_{\cdot t} = (\delta_{1t}, \dots, \delta_{Jt})'$ are observed characteristics, prices, and mean utilities of all brands, respectively. The set $A_{jt}$ defines the individuals who choose brand $j$ in market $t$.

14. A comment is in place here about the realism of the assumption that consumers choose no more than one good. We know that many households own more than one car, that many of us buy more than one brand of cereal, and so forth. We note that even though many of us buy more than one brand at a time, fewer actually consume more than one at a time. Therefore, the discreteness of choice can sometimes be defended by defining the choice period appropriately. In some cases this will still not be enough, in which case the researcher has one of two options: either claim that the above model is an approximation, or reduced form, to the true choice model, or model the choice of multiple products, or continuous quantities, explicitly [as in Dubin and McFadden (1984) or Hendel (1999)].

Assuming ties occur with zero probability, the market share of the $j$th product is just an integral over the mass of consumers in the region $A_{jt}$. Formally, it is given by

$$s_{jt}(x_{\cdot t}, p_{\cdot t}, \delta_{\cdot t}; \theta_2) = \int_{A_{jt}} dP^*(D, \nu, \varepsilon) = \int_{A_{jt}} dP^*(\varepsilon \mid D, \nu)\, dP^*(\nu \mid D)\, dP_D^*(D) = \int_{A_{jt}} dP_\varepsilon^*(\varepsilon)\, dP_\nu^*(\nu)\, d\hat{P}_D^*(D), \tag{4}$$

where $P^*(\cdot)$ denotes population distribution functions. The second equality is a direct application of Bayes’ rule, while the last is a consequence of the independence assumptions previously made.

Given assumptions on the distribution of the (unobserved) individual attributes, we can compute the integral in equation (4), either analytically or numerically. Therefore, for a given set of parameters equation (4) predicts the market share of each product in each market, as a function of product characteristics, prices, and unknown parameters. One possible estimation strategy is to choose parameters that minimize the distance (in some metric) between the market shares predicted by equation (4) and the observed shares. This estimation strategy will yield estimates of the parameters that determine the distribution of individual attributes, but it does not account for the correlation between prices and the unobserved product characteristics. The method proposed by Berry (1994) and BLP, which is presented in detail in Section 3, accounts for this correlation.
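To make the numerical route to equation (4) concrete, here is a minimal simulation sketch. It assumes that $\varepsilon_{ijt}$ is i.i.d. Type I extreme value, so that the integral over $\varepsilon$ has the familiar logit form, and it approximates the remaining integral over $D$ and $\nu$ by averaging over random draws. The function name `simulated_shares`, the number of draws, and every numerical value are hypothetical, and normal draws stand in for draws from the demographic distribution.

```python
import math
import random

def simulated_shares(delta, p, x, Pi, Sigma, draws):
    """Monte Carlo approximation of equation (4) under logit errors.

    delta: mean utilities, one per product; p: prices; x: rows of K characteristics;
    Pi: (K+1) x d; Sigma: (K+1) x (K+1); draws: list of (D_i, nu_i) pairs.
    """
    J = len(delta)
    shares = [0.0] * J
    for D_i, nu_i in draws:
        # Deviation of this consumer's coefficients from the mean: Pi D_i + Sigma nu_i.
        dev = [
            sum(Pi[r][c] * D_i[c] for c in range(len(D_i)))
            + sum(Sigma[r][c] * nu_i[c] for c in range(len(nu_i)))
            for r in range(len(Sigma))
        ]
        # exp(delta_j + mu_ij), with mu_ij = [-p_j, x_j] (Pi D_i + Sigma nu_i).
        expu = [
            math.exp(delta[j] + sum(a * b for a, b in zip([-p[j]] + list(x[j]), dev)))
            for j in range(J)
        ]
        denom = 1.0 + sum(expu)  # the 1 is the outside good (utility normalized to zero)
        for j in range(J):
            shares[j] += expu[j] / denom / len(draws)
    return shares

# Hypothetical market: J = 3 products, K = 1 characteristic, d = 1 demographic.
rng = random.Random(0)
draws = [([rng.gauss(0.0, 1.0)], [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
         for _ in range(2000)]
delta = [-1.0, -0.5, -1.5]
p = [1.0, 1.5, 0.8]
x = [[0.2], [0.9], [0.1]]
Pi = [[-0.3], [0.5]]
Sigma = [[0.2, 0.0], [0.0, 0.6]]
shares = simulated_shares(delta, p, x, Pi, Sigma, draws)
```

Setting `Pi` and `Sigma` to zero removes all heterogeneity other than $\varepsilon$, in which case the simulated shares coincide with the plain logit shares; this is a convenient sanity check on any implementation of the share function.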
2.2 Distributional Assumptions

The assumptions on the distribution of individual attributes made in order to compute the integral in equation (4) have important implications for the own- and cross-price elasticities of demand. In this section I discuss some possible assumptions and their implications.

Possibly the simplest distributional assumption one can make in order to evaluate the integral in equation (4) is that consumer heterogeneity enters the model only through the separable additive random shock, $\varepsilon_{ijt}$. In our model this implies $\theta_2 = 0$, or $\beta_i = \beta$ and $\alpha_i = \alpha$ for all $i$, and equation (1) becomes

$$u_{ijt} = \alpha (y_i - p_{jt}) + x_{jt}\beta + \xi_{jt} + \varepsilon_{ijt}, \qquad i = 1, \dots, I_t, \quad j = 1, \dots, J, \quad t = 1, \dots, T. \tag{5}$$

At this point, before we specify the distribution of $\varepsilon_{ijt}$, the model described by equation (5) is as general as the model given in equation (1).15 Once we assume that $\varepsilon_{ijt}$ is i.i.d., then the implied substitution patterns are severely restricted, as we will see below. If we also assume that $\varepsilon_{ijt}$ are distributed according to a Type I extreme-value distribution, this is the (aggregate) logit model. The market share of brand $j$ in market $t$, defined by equation (4), is

$$s_{jt} = \frac{\exp(x_{jt}\beta - \alpha p_{jt} + \xi_{jt})}{1 + \sum_{k=1}^{J} \exp(x_{kt}\beta - \alpha p_{kt} + \xi_{kt})}. \tag{6}$$

Note that income drops out of this equation, since it is common to all options.

Although the model implied by equation (5) and the extreme-value distribution assumption is appealing due to its tractability, it restricts the substitution patterns to depend only on the market shares. The price elasticities of the market shares defined by equation (6) are

$$\eta_{jkt} = \frac{\partial s_{jt}}{\partial p_{kt}} \frac{p_{kt}}{s_{jt}} = \begin{cases} -\alpha p_{jt} (1 - s_{jt}) & \text{if } j = k, \\ \alpha p_{kt} s_{kt} & \text{otherwise.} \end{cases}$$

There are two problems with these elasticities. First, since in most cases the market shares are small, the factor $\alpha (1 - s_{jt})$ is nearly constant; hence, the own-price elasticities are proportional to own price.
Therefore, the lower the price, the lower the elasticity (in absolute value), which implies that a standard pricing model predicts a higher markup for the lower-priced brands. This is possible only if the marginal cost of a cheaper brand is lower (not just in absolute value, but as a percentage of price) than that of a more expensive product. For some products this will not be true. Note that this problem is a direct implication of the functional form in price. If, for example, indirect utility were a function of the logarithm of price, rather than price, then the implied elasticity would be roughly constant. In other words, the functional form directly determines the patterns of own-price elasticity.

15. To see this compare equation (5) with equation (3).

An additional problem, which has been stressed in the literature, is with the cross-price elasticities. For example, in the context of RTE cereals the cross-price elasticities imply that if Quaker CapN Crunch (a children’s cereal) and Post Grape Nuts (a wholesome simple nutrition cereal) have similar market shares, then the substitution from General Mills Lucky Charms (a children’s cereal) toward either of them will be the same. Intuitively, if the price of one children’s cereal goes up, we would expect more consumers to substitute to another children’s cereal than to a nutrition cereal. Yet, the logit model restricts consumers to substitute towards other brands in proportion to market shares, regardless of characteristics.

The problem in the cross-price elasticities comes from the i.i.d. structure of the random shock. In order to understand why this is the case, examine equation (3). A consumer will choose a product either because the mean utility from the product, $\delta_{jt}$, is high or because the consumer-specific shock, $\mu_{ijt} + \varepsilon_{ijt}$, is high. The distinction becomes important when we consider a change in the environment.
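Both properties of the logit elasticities can be checked numerically. The sketch below is illustrative only: it computes the shares in equation (6) and the implied elasticity matrix for three hypothetical products, with the mean utilities $x_{jt}\beta - \alpha p_{jt} + \xi_{jt}$ supplied directly as made-up numbers. The function names and all values are assumptions for the example.

```python
import math

def logit_shares(delta):
    """Equation (6): delta[j] = x_j beta - alpha p_j + xi_j; outside good has utility 0."""
    expd = [math.exp(d) for d in delta]
    denom = 1.0 + sum(expd)
    return [e / denom for e in expd]

def logit_elasticities(alpha, p, s):
    """eta[j][k]: elasticity of the share of product j with respect to the price of k."""
    J = len(p)
    return [[-alpha * p[j] * (1.0 - s[j]) if j == k else alpha * p[k] * s[k]
             for k in range(J)] for j in range(J)]

# Hypothetical values for three products.
alpha = 1.0
p = [2.0, 4.0, 3.0]
delta = [-4.0, -4.0, -3.5]
s = logit_shares(delta)
eta = logit_elasticities(alpha, p, s)

# Own-price elasticities scale with price: products 0 and 1 have equal shares,
# so eta[1][1] / eta[0][0] equals p[1] / p[0].
own_ratio = eta[1][1] / eta[0][0]

# Cross-price elasticities depend only on the product whose price changes:
# every off-diagonal entry in column k is alpha * p[k] * s[k].
same_cross = eta[0][2] == eta[1][2]
```

In this example a price change for product 2 draws the same percentage response from products 0 and 1 whatever their characteristics, which is the restriction the text describes.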
Consider, for example, the increase in the price of Lucky Charms discussed in the previous paragraph. For some consumers who previously consumed Lucky Charms, the utility from this product decreases enough so that the utility from what was the second choice is now higher. In the logit model different consumers will have different rankings of the products, but this difference is due only to the i.i.d. shock. Therefore, the proportion of these consumers who rank each brand as their second choice is equal to the average in the population, which is just the market share of each product.

In order to get around this problem we need the shocks to utility to be correlated across brands. By generating correlation we predict that the second choice of consumers that decide to no longer buy Lucky Charms will be different than that of the average consumer. In particular, they will be more likely to buy a product with a shock that was positively correlated to Lucky Charms, for example CapN Crunch. As we can see in equation (3), this correlation can be generated either through the additive separable term $\varepsilon_{ijt}$ or through the term $\mu_{ijt}$, which captures the effect of the demographics, $D_i$ and $\nu_i$. Appropriately defining the distributions of either of these terms can yield the exact same results. The difference is only in modeling convenience. I now consider models of the two types.

Models are available that induce correlation among options by allowing $\varepsilon_{ijt}$ to be correlated across products rather than independently distributed (see the generalized extreme-value model, McFadden, 1978). One such example is the nested logit model, in which all brands are grouped into predetermined exhaustive and mutually exclusive sets, and $\varepsilon_{ijt}$ is decomposed into an i.i.d. shock plus a group-specific component.16 This implies that correlation between brands within a group is higher than across groups; thus, in the example given above, if the price of Lucky Charms goes up, consumers are more likely to rank CapN Crunch as their second choice. Therefore, consumers that currently consume Lucky Charms are more likely to substitute towards CapN Crunch than the average consumer. Within the group the substitution is still driven by market shares, i.e., if some children’s cereals are closer substitutes for Lucky Charms than others, this will not be captured by the simple grouping.17

The main advantage of the nested logit model is that, like the logit model, it implies a closed form for the integral in equation (4). As we will see in Section 3, this simplifies the computation.

This nested logit can fit into the model described by equation (1) in one of two ways: by assuming a certain distribution of $\varepsilon_{ijt}$, as was motivated in the previous paragraph, or by assuming one of the characteristics of the product is a segment-specific dummy variable and assuming a particular distribution on the random coefficient of that characteristic. Given that we have shown that the model described by equations (1) and (2) can also be described by equation (3), it should not be surprising that these two ways of describing the nested logit are equivalent. Cardell (1997) shows the distributional assumptions required for this equivalence to hold.

The nested logit model allows for somewhat more flexible substitution patterns. However, in many cases the a priori division of products into groups, and the assumption of i.i.d. shocks within a group, will not be reasonable, either because the division of segments is not clear or because the segmentation does not fully account for the substitution patterns. Furthermore, the nested logit does not help with the problem of own-price elasticities.
This is usually handled by assuming some “nice” functional form (i.e., one that yields patterns consistent with some prior), but that does not solve the problem of having the elasticities driven by the functional-form assumption.

16. For a formal presentation of the nested logit model in the context of the model presented here, see Berry (1994) or Stern (1995).

17. Of course, one does not have to stop at one level of nesting. For example, we could group all children’s cereals into family-acceptable and not acceptable. For an example of such grouping for automobiles see Goldberg (1995).

In some industries the segmentation of the market will be multilayered. For example, computers can be divided into branded versus generic and into frontier versus nonfrontier technology. It turns out that in the nested logit specification the order of the nests matters.18 For this reason Bresnahan et al. (1997) build on the generalized extreme-value model (McFadden, 1978) to construct what they call the principles-of-differentiation generalized extreme-value (PD GEV) model of demand for computers. In their model they are able to use two dimensions of differentiation, without ordering them. With the exception of dealing with the problem of ordering the nests, this model retains all the advantages and disadvantages of the nested logit. In particular it implies a closed-form expression for the integral in equation (4).

In principle one could consider estimating an unrestricted variance-covariance matrix of the shock, $\varepsilon_{ijt}$. This, however, reintroduces the dimensionality problem discussed in the Introduction.19 If in the full model, described by equations (1) and (2), we maintain the i.i.d. extreme-value distribution assumption on $\varepsilon_{ijt}$, correlation between choices is obtained through the term $\mu_{ijt}$.
The correlation will be a function of both product and consumer characteristics: the correlation will be between products with similar characteristics, and consumers with similar demographics will have similar rankings of products and therefore similar substitution patterns. Therefore, rather than having to estimate a large number of parameters, corresponding to an unrestricted variance-covariance matrix, we only have to estimate a smaller number.

18. So, for example, classifying computers first into branded/nonbranded and then into frontier/nonfrontier technology implies different substitution patterns than classifying first into frontier/nonfrontier technology and then into branded/nonbranded, even if the classification of products does not change.

19. See Hausman and Wise (1978) for an example of such a model with a small number of products.

The price elasticities of the market shares, $s_{jt}$, defined by equation (4) are

$$
\eta_{jkt} = \frac{\partial s_{jt}}{\partial p_{kt}} \frac{p_{kt}}{s_{jt}} =
\begin{cases}
-\dfrac{p_{jt}}{s_{jt}} \displaystyle\int \alpha_i \, s_{ijt} (1 - s_{ijt}) \, d\hat{P}^*_D(D) \, dP^*_\nu(\nu) & \text{if } j = k, \\[2ex]
\dfrac{p_{kt}}{s_{jt}} \displaystyle\int \alpha_i \, s_{ijt} \, s_{ikt} \, d\hat{P}^*_D(D) \, dP^*_\nu(\nu) & \text{otherwise,}
\end{cases}
$$

where $s_{ijt} = \exp(\delta_{jt} + \mu_{ijt}) / [1 + \sum_{k=1}^{K} \exp(\delta_{kt} + \mu_{ikt})]$ is the probability of individual $i$ purchasing product $j$. Now the own-price elasticity will not necessarily be driven by the functional form. The partial derivative of the market shares will no longer be determined by a single parameter, $\alpha$. Instead, each individual will have a different price sensitivity, which will be averaged to a mean price sensitivity using the individual-specific probabilities of purchase as weights. The price sensitivity will be different for different brands. So if, for example, consumers of Kellogg’s Corn Flakes have high price sensitivity, then the own-price elasticity of Kellogg’s Corn Flakes will be high despite the low prices and the fact that prices enter linearly.
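The elasticity formula above can be approximated by averaging over simulated consumers. The following is a minimal sketch rather than the paper's code: it assumes a toy specification in which only price has a random coefficient, $\alpha_i = \alpha + \sigma \nu_i$ with standard normal $\nu_i$, and the function names and all numbers are hypothetical.

```python
import math
import random

def individual_shares(delta, mu_i):
    """Choice probabilities s_ij of one simulated consumer (outside-good utility is 0)."""
    expu = [math.exp(d + m) for d, m in zip(delta, mu_i)]
    denom = 1.0 + sum(expu)
    return [e / denom for e in expu]

def elasticities(prices, x_beta, alpha, sigma, nu):
    """Simulated shares s[j] and price elasticities eta[j][k] for one market.

    Assumed toy specification: alpha_i = alpha + sigma * nu_i, so that
    u_ij = x_beta[j] - alpha_i * p_j + eps_ij.
    """
    J, ns = len(prices), len(nu)
    delta = [xb - alpha * p for xb, p in zip(x_beta, prices)]   # mean utilities
    alpha_i = [alpha + sigma * v for v in nu]                   # individual price sensitivities
    # mu_ij = -(alpha_i - alpha) * p_j is the individual deviation from delta_j
    s_i = [individual_shares(delta, [-sigma * v * p for p in prices]) for v in nu]
    s = [sum(row[j] for row in s_i) / ns for j in range(J)]     # average over consumers
    eta = [[0.0] * J for _ in range(J)]
    for j in range(J):
        for k in range(J):
            if j == k:
                avg = sum(a * r[j] * (1.0 - r[j]) for a, r in zip(alpha_i, s_i)) / ns
                eta[j][k] = -prices[j] / s[j] * avg
            else:
                avg = sum(a * r[j] * r[k] for a, r in zip(alpha_i, s_i)) / ns
                eta[j][k] = prices[k] / s[j] * avg
    return s, eta

rng = random.Random(0)
nu = [rng.gauss(0.0, 1.0) for _ in range(500)]   # draws standing in for P*_nu
prices = [1.0, 1.5, 2.0]                         # hypothetical prices
x_beta = [1.0, 2.0, 3.0]                         # hypothetical x_j * beta
s, eta = elasticities(prices, x_beta, alpha=2.0, sigma=0.5, nu=nu)
```

Each entry of `eta` weights the individual price sensitivities by individual purchase probabilities, which is why products bought by price-sensitive consumers end up with larger own-price elasticities.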
Therefore, substitution patterns are not driven by functional form, but by the differences in the price sensitivity, or the marginal utility from income, between consumers who purchase the various products.

The full model also allows for flexible substitution patterns, which are not constrained by a priori segmentation of the market (yet at the same time can take advantage of this segmentation by including a segment dummy variable as a product characteristic). The composite random shock, $\mu_{ijt} + \varepsilon_{ijt}$, is then no longer independent of the product and consumer characteristics. Thus, if the price of a brand goes up, consumers are more likely to switch to brands with similar characteristics rather than to the most popular brand.

Unfortunately, these advantages do not come without cost. Estimation of the model specified in equation (3) is not as simple as that of the logit, nested logit, or GEV models. There are two immediate problems. First, equation (4) no longer has an analytic closed form [like that given in equation (6) for the logit case]. Furthermore, the computation of the integral in equation (4) is difficult. This problem is solved using simulation methods, as described below. Second, we now require information about the distribution of consumer heterogeneity in order to compute the market shares. This could come in the form of a parametric assumption on the functional form of the distribution or by using additional data sources. Demographics of the population, for example, can be obtained by sampling from the CPS.

3. Estimation

3.1 The Data

The market-level data required to consistently estimate the model previously described consist of the following variables: market shares and prices in each market, and brand characteristics. In addition, information on the distribution of demographics, $\hat{P}^*_D$, is useful,20 as are marketing mix variables (such as advertising expenditures or the availability of coupons or promotional activity).
In principle, some of the parameters of the model are identified even with data on one market. However, it is highly recommended to gather data on several markets with variation in relative prices of the products and/or products offered.

20. Recall that we divided the demographic variables into two types. The first were those variables for which we had some information regarding the distribution. If such information is not available we are left with only the second type, i.e., variables for which we assume a parametric distribution.

Market shares are defined using a quantity variable, which depends on the context and should be determined by the specifics of the problem. BLP use the number of automobiles sold, while Nevo (2000a, b) converts pounds of cereal into servings. Probably the most important consideration in choosing the quantity variable is the need to define a market share for the outside good. This share will rarely be observed directly, and will usually be defined as the total size of the market minus the shares of the inside goods. The total size of the market is assumed according to the context. So, for example, Nevo (2000a, b) assumes the size of the market for ready-to-eat cereal to be one serving of cereal per capita per day. Bresnahan et al. (1997), in estimating demand for computers, take the potential market to be the total number of office-based employees.

In general I found the following rules useful when defining the market size. You want to make sure to define the market large enough to allow for a nonzero share of the outside good. When looking at historical data one can use eventual growth to learn about the potential market size. One should check the sensitivity of the results to the market definition; if the results are sensitive, consider an alternative. There are two parts to defining the market size: choosing the variable to which the market size is proportional, and choosing the proportionality factor.
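As a small illustration of the two parts of the market-size definition, the sketch below computes inside-good shares and the outside-good share for one market. The proportionality variable (population times days) and the factor (one serving per capita per day) follow the cereal example in the text; the function name and all numbers are hypothetical.

```python
def market_shares(quantities, market_size):
    """Return the inside-good shares and the outside-good share for one market."""
    if sum(quantities) >= market_size:
        # The market must be defined large enough for a nonzero outside share.
        raise ValueError("market size must exceed total inside-good sales")
    shares = [q / market_size for q in quantities]
    return shares, 1.0 - sum(shares)

# Hypothetical city-quarter: market size proportional to population and days,
# with a proportionality factor of one serving per capita per day.
population = 1_000_000
days = 91
servings_per_capita_per_day = 1
market_size = population * days * servings_per_capita_per_day

servings_sold = [9.1e6, 4.55e6, 2.275e6]   # hypothetical servings sold, by brand
shares, outside_share = market_shares(servings_sold, market_size)
```

Rerunning the same calculation with a different proportionality factor is one simple way to check how sensitive the results are to the market definition.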
For example, one can assume that the market size is proportional to the size of the population with the proportionality factor equal to a constant factor, which can be estimated (Berry et al., 1996). From my own (somewhat limited) experience, choosing the right variable to which the market size is proportional is the harder, and more important, component of this process.

An important part of any data set required to implement the models described in Section 2 consists of the product characteristics. These can include physical product characteristics and market segmentation information. They can be collected from manufacturers’ descriptions of the product, the trade press, or the researcher’s prior. In collecting product characteristics we recall the two roles they play in the analysis: explaining the mean utility level $\delta(\cdot)$ in equation (3), and driving the substitution patterns through the term $\mu(\cdot)$ in equation (3). Ideally, these two roles should be kept separate. If the number of markets is large enough relative to the number of products, the mean utility can be explained by including product dummy variables in the regression. These variables will absorb any product characteristics that are constant across markets. A discussion of the issues arising from including brand dummy variables is given below.

Relying on product dummy variables to guide substitution patterns is equivalent to estimating an unrestricted variance-covariance matrix of the random shock $\varepsilon_{ijt}$ in equation (1). Both imply estimating $J(J-1)/2$ parameters. Since part of our original motivation was to reduce the number of parameters to be estimated, this is usually not a feasible option. The substitution patterns are explained by the product characteristics, and in deciding which attributes to collect the researcher should keep this in mind.

The last component of the data is information regarding the demographics of the consumers in different markets.
Unlike market shares, prices, and product characteristics, demographic information is not essential: the estimation can proceed without it. In this case the estimation will rely on assumed distributions rather than empirical distributions. The Current Population Survey (CPS) is a good, widely available source for demographic information.

3.2 Identification

This section discusses, informally, some of the identification issues. There are several layers to the argument. First, I discuss how in general a discrete choice model helps us identify substitution patterns, using aggregate data from (potentially) a small number of markets. Second, I ask what in the data helps us distinguish between different discrete choice models, for example, how we can distinguish between the logit and the random-coefficients logit.

A useful starting point is to ask how one would approach the problem of estimating price elasticities if a controlled experiment could be conducted. The answer is to expose different consumers to randomly assigned prices and record their purchasing patterns. Furthermore, one could relate these purchasing patterns to individual characteristics. If individual purchases could not be observed, the experiment could still be run by comparing the different aggregate purchases of different groups. Once again, in principle, these patterns could be related to the difference in individual characteristics between the groups.

There are two potential problems with mapping the data described in the previous section into the data that arise from the ideal controlled experiment, described in the previous paragraph. First, prices are not randomly assigned; rather, they are set by profit-maximizing firms that take into account information that, due to inferior knowledge, the researcher has to include in the error term. This problem can be solved, at least in principle, by using instrumental variables.
The second, somewhat more conceptual difficulty arises because discrete choice models, for example the logit model, can be estimated using data from just one market; hence, we are not mimicking the experiment previously described. Instead, in our experiment we ask consumers to choose between products, which are perceived as bundles of attributes. We then reveal the preferences for these attributes, one of which is price. The data from each market should not be seen as one observation of purchases when faced with a particular price vector; rather, it is an observation on the relative likelihood of purchasing $J$ different bundles of attributes. The discrete choice model ties these probabilities to a utility model that allows us to compute price elasticities. The identifying power of this experiment increases as more markets are included with variation both in the characteristics of products and in the choice set. The same (informal) identification argument holds for the nested logit, GEV, and random-coefficients models, which are generalized forms of the logit model.

There are two caveats to the informal argument previously given. First, if one wants to tie demographic variables to observed purchases [i.e., allow for $\Pi D_i$ in equation (2)], several markets, with variation in the distribution of demographics, have to be observed. Second, if not all the product characteristics are observed and these unobserved attributes are correlated with some of the observed characteristics, then we are faced with an endogeneity problem. The problem can be solved by using instrumental variables, but we note that the formal requirements from these instrumental variables depend on what we believe goes into the error term. In particular, if brand-specific dummy variables are included, we will need the instrumental variables to satisfy different requirements. I return to this point in Section 3.4.
A different question is: What makes the random-coefficients logit respond differently to product characteristics? In other words, what pins down the substitution patterns? The answer goes back to the difference in the predictions of the two models and can be best explained with an example. Suppose we observe three products: A, B, and C. Products A and B are very similar in their characteristics, while products B and C have the same market shares. Suppose we observe market shares and prices in two periods, and suppose the only change is that the price of product A increases. The logit model predicts that the market shares of both products B and C should increase by the same amount. On the other hand, the random-coefficients logit allows for the possibility that the market share of product B, the one more similar to product A, will increase by more. By observing the actual relative change in the market shares of products B and C we can distinguish between the two models. Furthermore, the degree of change will allow us to identify the parameters that govern the distribution of the random coefficients.

This argument suggests that having data from more markets helps identify the parameters that govern the distribution of the random coefficients. Furthermore, observing the change in market shares as new products enter or as characteristics of existing products change provides variation that is helpful in the estimation.

3.3 The Estimation Algorithm

In this subsection I outline how the parameters of the models described in Section 2 can be consistently estimated using the data described in Section 3.1. Following Berry’s (1994) suggestion, a GMM estimator is constructed. Given a value of the unknown parameters, the implied error term is computed and interacted with the instruments to form the GMM objective function.
Next, a search is performed over all the possible parameter values to find those values that minimize the objective function. In this subsection I discuss what the error term is, how it can be computed, and some computational details. Discussion of the instrumental variables is deferred to the next section.

As previously pointed out, a straightforward approach to the estimation is to solve

$$\min_{\theta} \; \lVert s(x, p, \delta(x, p, \xi; \theta_1); \theta_2) - S \rVert, \tag{7}$$

where $s(\cdot)$ are the market shares given by equation (4), and $S$ are the observed market shares. However, this approach is usually not taken, for several reasons. First, all the parameters enter the minimization in equation (7) in a nonlinear fashion. In some applications the inclusion of brand and time dummy variables results in a large number of parameters and a costly nonlinear minimization problem. The estimation procedure suggested by Berry (1994), which is described below, avoids this problem by transforming the minimization problem so that some (or all) of the parameters enter the objective function linearly.

Fundamentally, though, the main contribution of the estimation method proposed by Berry (1994) is that it allows one to deal with correlation between the (structural) error term and prices (or other variables that influence demand). As we saw in Section 2.1, there are several variables that are unobserved by the researcher. These include the individual-level characteristics, denoted $(D_i, \nu_i, \varepsilon_i)$, as well as the unobserved product characteristics, $\xi_j$. As we saw in equation (4), the unobserved individual attributes $(D_i, \nu_i, \varepsilon_i)$ were integrated over. Therefore, the econometric error term will be the unobserved product characteristics, $\xi_{jt}$. Since it is likely that prices are correlated with this term, the econometric estimation will have to take account of this.
The standard nonlinear simultaneous-equations model [see, for example, Amemiya (1985, Chapter 8)] allows both parameters and variables to enter in a nonlinear way, but requires a separable additive error term. Equation (7) does not meet this requirement. The estimation method proposed by Berry (1994), and described below, shows how to adapt the model described in the previous section to fit into the standard (linear or) nonlinear simultaneous-equations model.

Formally, let $Z = [z_1, \ldots, z_M]$ be a set of instruments such that

$$E[Z_m \, \omega(\theta^*)] = 0, \qquad m = 1, \ldots, M, \tag{8}$$

where $\omega$, a function of the model parameters, is an error term defined below, and $\theta^*$ denotes the “true” values of the parameters. The GMM estimate is

$$\hat{\theta} = \arg\min_{\theta} \; \omega(\theta)' Z \Phi^{-1} Z' \omega(\theta), \tag{9}$$

where $\Phi$ is a consistent estimate of $E[Z' \omega \omega' Z]$. The logic driving this estimate is simple enough. At the true parameter value, $\theta^*$, the population moment, defined by equation (8), is equal to zero. So we choose our estimate such that it sets the sample analog of the moments defined in equation (8), i.e., $Z' \hat{\omega}$, to zero. If there are more independent moment equations than parameters [i.e., $\dim(Z) > \dim(\theta)$], we cannot set all the sample analogs exactly to zero and will have to set them as close to zero as possible. The weight matrix, $\Phi^{-1}$, defines the metric by which we measure how close to zero we are. By using the inverse of the variance-covariance matrix of the moments, we give less weight to those moments (equations) that have a higher variance.

Following Berry (1994), the error term is not defined as the difference between the observed and predicted market shares, as in equation (7); rather it is defined as the structural error, $\xi_{jt}$. The advantage of working with a structural error is that the link to economic theory is tighter, allowing us to think of economic theories that would justify various instrumental variables.
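The quadratic form in equation (9) is straightforward to evaluate once the error vector is in hand. The sketch below is illustrative only: the helper names are hypothetical, the linear solver is a plain Gaussian elimination written to keep the example free of dependencies, and the choice $\Phi = Z'Z$ in the example is an assumption made for simplicity.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gmm_objective(omega, Z, Phi):
    """Evaluate omega' Z Phi^{-1} Z' omega for an N-vector omega and N x M instruments Z."""
    n_moments = len(Z[0])
    # Sample moments g = Z' omega, one per instrument.
    g = [sum(Z[n][m] * omega[n] for n in range(len(omega))) for m in range(n_moments)]
    # g' Phi^{-1} g, computed by solving Phi w = g instead of inverting Phi.
    w = solve(Phi, g)
    return sum(gi * wi for gi, wi in zip(g, w))

# Hypothetical data: four observations, two instruments (a constant and one shifter).
Z = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
Phi = [[sum(row[a] * row[b] for row in Z) for b in range(2)] for a in range(2)]  # Z'Z (assumed)
omega = [1.0, 0.0, 0.0, 0.0]
value = gmm_objective(omega, Z, Phi)
```

An error vector that is exactly orthogonal to every instrument gives an objective of zero, which is the sample counterpart of the moment condition in equation (8).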
In order to use equation (9) we need to express the error term as an explicit function of the parameters of the model and the data. The key insight, which can be seen in equation (3), is that the error term, $\xi_{jt}$, only enters the mean utility level, $\delta(\cdot)$. Furthermore, the mean utility level is a linear function of $\xi_{jt}$; thus, in order to obtain an expression for the error term we need to express the mean utility as a linear function of the variables and parameters of the model. In order to do this, we solve for each market the implicit system of equations

$$s(\delta_{\cdot t}; \theta_2) = S_{\cdot t}, \qquad t = 1, \ldots, T, \tag{10}$$

where $s(\cdot)$ are the market shares given by equation (4), and $S$ are the observed market shares. The intuition for why we want to do this is given below.

In solving this system of equations we have two steps. First, we need a way to compute the left-hand side of equation (10), which is defined by equation (4). For some special cases of the general model (e.g., logit, nested logit, and PD GEV) the market-share equation has an analytic formula. For the full random-coefficients model the integral defining the market shares has to be computed by simulation. There are several ways to do this. Probably the most common is to approximate the integral given by equation (4) by

$$
s_{jt}(p_{\cdot t}, x_{\cdot t}, \delta_{\cdot t}, P_{ns}; \theta_2) = \frac{1}{ns} \sum_{i=1}^{ns} s_{jti} = \frac{1}{ns} \sum_{i=1}^{ns} \frac{\exp\left[\delta_{jt} + \sum_{k=1}^{K} x_{jt}^{k} (\sigma_k \nu_i^k + \pi_{k1} D_{i1} + \cdots + \pi_{kd} D_{id})\right]}{1 + \sum_{m=1}^{J} \exp\left[\delta_{mt} + \sum_{k=1}^{K} x_{mt}^{k} (\sigma_k \nu_i^k + \pi_{k1} D_{i1} + \cdots + \pi_{kd} D_{id})\right]}, \tag{11}
$$

where $(\nu_i^1, \ldots, \nu_i^K)$ and $(D_{i1}, \ldots, D_{id})$, $i = 1, \ldots, ns$, are draws from $P^*_\nu(\nu)$ and $\hat{P}^*_D(D)$, respectively, while $x_{jt}^k$, $k = 1, \ldots, K$, are the variables that have random slope coefficients. Note that we use the extreme-value distribution, $P^*_\varepsilon(\varepsilon)$, to integrate the $\varepsilon$’s analytically.
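The simulator in equation (11) can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the author's Matlab code: the function and argument names are hypothetical, and both the $\nu$ draws and the single demographic variable are taken to be standard normal purely for the example (in practice the demographics would be sampled from a source such as the CPS).

```python
import math
import random

def simulated_shares(delta, x, sigma, pi, nu, D):
    """Approximate the market shares of equation (11) for one market.

    delta : mean utilities, length J
    x     : x[j][k], the K characteristics that have random coefficients
    sigma : scale of the unobserved taste shock for each characteristic, length K
    pi    : pi[k][d], effect of demographic d on the coefficient of characteristic k
    nu    : nu[i][k], simulated taste draws for consumer i
    D     : D[i][d], simulated demographic draws for consumer i
    """
    J, K, ns = len(delta), len(sigma), len(nu)
    shares = [0.0] * J
    for i in range(ns):
        # Deviation of consumer i's coefficient on characteristic k from its mean.
        dev = [sigma[k] * nu[i][k] + sum(p * d for p, d in zip(pi[k], D[i]))
               for k in range(K)]
        expu = [math.exp(delta[j] + sum(x[j][k] * dev[k] for k in range(K)))
                for j in range(J)]
        denom = 1.0 + sum(expu)          # the 1 is the outside good
        for j in range(J):
            shares[j] += expu[j] / denom / ns
    return shares

rng = random.Random(1)
ns = 200
nu = [[rng.gauss(0.0, 1.0)] for _ in range(ns)]   # K = 1 random coefficient
D = [[rng.gauss(0.0, 1.0)] for _ in range(ns)]    # d = 1 demographic (hypothetical)
delta = [0.5, 0.0, -0.5]                          # hypothetical mean utilities
x = [[1.0], [2.0], [3.0]]                         # hypothetical characteristic
shares = simulated_shares(delta, x, sigma=[0.8], pi=[[0.3]], nu=nu, D=D)
```

Setting `sigma` and `pi` to zero removes all heterogeneity, and the function then returns the closed-form logit shares, which is a convenient sanity check.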
Issues regarding sampling from $P^*_\nu(\nu)$ and $\hat{P}^*_D(D)$, alternative methods to approximate the market shares, and their advantages are discussed in detail in the appendix (available from http://elsa.berkeley.edu/~nevo).

Second, using the computation of the market share, we invert the system of equations. For the simplest special case, the logit model, this inversion can be computed analytically by $\delta_{jt} = \ln S_{jt} - \ln S_{0t}$, where $S_{0t}$ is the market share of the outside good. Note that it is the observed market shares that enter this equation. This inversion can also be computed analytically in the nested logit model (Berry, 1994) and the PD GEV model (Bresnahan et al., 1997).

For the full random-coefficients model the system of equations (10) is nonlinear and is solved numerically. It can be solved by using the contraction mapping suggested by BLP (see there for a proof of convergence), which amounts to computing the series

$$\delta_{\cdot t}^{h+1} = \delta_{\cdot t}^{h} + \ln S_{\cdot t} - \ln s(p_{\cdot t}, x_{\cdot t}, \delta_{\cdot t}^{h}, P_{ns}; \theta_2), \qquad t = 1, \ldots, T, \quad h = 0, \ldots, H, \tag{12}$$

where $s(\cdot)$ are the predicted market shares computed in the first step, $H$ is the smallest integer such that $\lVert \delta_{\cdot t}^{H} - \delta_{\cdot t}^{H-1} \rVert$ is smaller than some tolerance level, and $\delta_{\cdot t}^{H}$ is the approximation to $\delta_{\cdot t}$.

Once the inversion has been computed, either analytically or numerically, the error term is defined as

$$\omega_{jt} = \delta_{jt}(S_{\cdot t}; \theta_2) - (x_{jt}\beta + \alpha p_{jt}) \equiv \xi_{jt}. \tag{13}$$

Note that it is the observed market shares, $S$, that enter this equation. Also, we can now see the reason for distinguishing between $\theta_1$ and $\theta_2$: $\theta_1$ enters this term, and the GMM objective, in a linear fashion, while $\theta_2$ enters nonlinearly.

The intuition for the definition is as follows. For given values of the nonlinear parameters $\theta_2$, we solve for the mean utility levels $\delta_{\cdot t}(\cdot)$ that set the predicted market shares equal to the observed market shares.
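The inversion step in equation (12) can be sketched as a simple fixed-point loop. The code below is a minimal illustration under assumptions that are not in the text: a single market, one characteristic with a random coefficient, standard normal draws, the logit inversion as a starting value, and a sup-norm stopping rule. All function names and numbers are hypothetical.

```python
import math
import random

def predicted_shares(delta, x, sigma, nu):
    """Simulated shares with one random coefficient: u_ij = delta_j + sigma*nu_i*x_j + eps_ij."""
    J, ns = len(delta), len(nu)
    shares = [0.0] * J
    for v in nu:
        expu = [math.exp(d + sigma * v * xj) for d, xj in zip(delta, x)]
        denom = 1.0 + sum(expu)
        for j in range(J):
            shares[j] += expu[j] / denom / ns
    return shares

def invert_shares(S_obs, x, sigma, nu, tol=1e-10, max_iter=5000):
    """Iterate equation (12) until successive delta vectors differ by less than tol."""
    S0 = 1.0 - sum(S_obs)                                  # outside-good share
    delta = [math.log(S) - math.log(S0) for S in S_obs]   # logit inversion as a start
    for _ in range(max_iter):
        s_pred = predicted_shares(delta, x, sigma, nu)
        new = [d + math.log(S) - math.log(s) for d, S, s in zip(delta, S_obs, s_pred)]
        if max(abs(a - b) for a, b in zip(new, delta)) < tol:
            return new
        delta = new
    raise RuntimeError("contraction mapping did not converge")

rng = random.Random(0)
nu = [rng.gauss(0.0, 1.0) for _ in range(200)]
x = [0.5, 1.0, 1.5]                 # hypothetical characteristic
delta_true = [-1.0, -1.5, -2.0]     # hypothetical mean utilities
# Generate "observed" shares from known mean utilities, then recover them.
S_obs = predicted_shares(delta_true, x, 0.5, nu)
delta_hat = invert_shares(S_obs, x, 0.5, nu)
```

Because the same draws are used to generate and to invert the shares, `delta_hat` reproduces `delta_true` up to the tolerance. In an estimation routine this loop runs once per market for every trial value of $\theta_2$, so a tight tolerance matters for the stability of the outer search.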
We define the residual as the difference between this valuation and the one predicted by the linear parameters $\alpha$ and $\beta$. The estimator, defined by equation (9), is the one that minimizes the distance between these different predictions.

Usually,21 the error term, as defined by equation (13), is the unobserved product characteristic, $\xi_{jt}$. However, if enough markets are observed, then brand-specific dummy variables can be included as product characteristics. The coefficients on these dummy variables capture both the mean quality of observed characteristics that do not vary over markets, $\beta x_j$, and the overall mean of the unobserved characteristics, $\xi_j$. Thus, the error term is the market-specific deviation from the mean valuation, i.e., $\Delta\xi_{jt} \equiv \xi_{jt} - \xi_j$. The inclusion of brand dummy variables introduces a challenge in estimating the taste parameters, $\beta$, which is dealt with below.

In the logit and nested logit models, with the appropriate choice of a weight matrix,22 this procedure simplifies to two-stage least squares. In the full random-coefficients model, both the computation of the market shares and the inversion in order to get $\delta_{jt}(\cdot)$ have to be done numerically. The value of the estimate in equation (9) is then computed using a nonlinear search. This search is simplified by noting that the first-order conditions of the minimization problem defined in equation (9) with respect to $\theta_1$ are linear in these parameters. Therefore, these linear parameters can be solved for (as a function of the other parameters) and plugged into the rest of the first-order conditions, limiting the nonlinear search to the nonlinear parameters only. The details of the computation are given in the appendix.

21. See for example Berry (1994), BLP, Berry et al. (1996), and Bresnahan et al. (1997).

22. That is, $\Phi = Z'Z$, which is the “optimal” weight matrix under the assumption of homoskedastic errors.
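The step of solving for the linear parameters can be written out explicitly. As a sketch, assume the variables that enter linearly (the $x_{jt}$, $p_{jt}$, and any dummy variables) are stacked into a matrix $X_1$, a label introduced here for illustration, so that $\omega(\theta) = \delta(\theta_2) - X_1 \theta_1$. Setting the derivative of the objective in equation (9) with respect to $\theta_1$ to zero then gives:

```latex
% First-order condition of (9) in \theta_1, holding \theta_2 fixed:
%   X_1' Z \Phi^{-1} Z' \bigl( \delta(\theta_2) - X_1 \theta_1 \bigr) = 0
\hat{\theta}_1(\theta_2)
  = \left( X_1' Z \, \Phi^{-1} Z' X_1 \right)^{-1}
    X_1' Z \, \Phi^{-1} Z' \, \delta(\theta_2)
```

With $\Phi = Z'Z$ this is the two-stage least squares formula applied to $\delta(\theta_2)$, which is consistent with the statement that the procedure collapses to two-stage least squares when $\delta$ can be computed analytically.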
3.4 Instruments

The identifying assumption in the algorithm previously given is equation (8), which requires a set of exogenous instrumental variables. As is the case with many hard problems, there is no global solution that applies to all industries and data sets. Listed below are some of the solutions offered in the literature. The appropriateness of each set of assumptions has to be assessed on a case-by-case basis, but several advantages and problems are mentioned below.

The first set of variables that comes to mind are the instrumental variables defined by ordinary (or nonlinear) least squares, namely the regressors (or more generally the derivative of the moment function with respect to the parameters). As previously discussed, there are several reasons why these are invalid. For example, a variety of differentiated-products pricing models predict that prices are a function of marginal cost and a markup term. The markup term is a function of the unobserved product characteristic, which is also the error term in the demand equation. Therefore, prices will be correlated with the error term, and the estimate of the price sensitivity will be biased.

A standard place to start the search for demand-side instrumental variables is to look for variables that shift cost and are uncorrelated with the demand shock. These are the textbook instrumental variables, which work quite well when estimating demand for homogeneous products. The problem with the approach is that we rarely observe cost data fine enough that the cost shifters will vary by brand. A restricted version of this approach uses whatever cost information is available in combination with some restrictions on the demand specification [for example, see the cost variables used in Nevo (2000a)]. Even the restricted version is rarely feasible, due to lack of any cost data.
The most popular identifying assumption used to deal with the above endogeneity problem is to assume that the location of products in the characteristics space is exogenous, or at least determined prior to the revelation of the consumers’ valuation of the unobserved product characteristics. This assumption can be combined with a specific model of competition and functional-form assumptions to generate an implicit set of instrumental variables (as in Bresnahan, 1981, 1987). BLP derive a slightly more explicit set of instrumental variables, which build on a similar economic assumption. They use the observed product characteristics (excluding price and other potentially endogenous variables), the sums of the values of the same characteristics of other products offered by that firm (if the firm produces more than one product), and the sums of the values of the same characteristics of products offered by other firms.23 Instrumental variables of this type have been quite successful in the study of many industries, including automobiles, computers, and pharmaceutical drugs.

One advantage of this approach is that the instrumental variables vary by brand. The main problem is that in some cases the assumption that observed characteristics are uncorrelated with the unobserved components is not valid. One example is when certain types of products are better characterized by observed attributes. Another example is if the time required to change the observed characteristics is short and therefore changes in characteristics could be reacting to the same sort of shocks as prices. Finally, once a brand dummy variable is introduced, a problem arises with these instrumental variables: unless there is variation in the products offered in different markets, there is no variation between markets in these instruments.

The last set of instrumental variables I discuss here was introduced by Hausman et al.
(1994) and Hausman (1996) and was used in the context of the model described here by Nevo (2000a, b). The essential idea is to exploit the panel structure of the data. This argument is best demonstrated by an example. Nevo (2000a, b) observes quantities and prices for ready-to-eat cereal in a cross section of cities over twenty quarters. Following Hausman (1996), the identifying assumption made is that, controlling for brand-specific intercepts and demographics, the city-specific valuations of the product, $\Delta\xi_{jt} = \xi_{jt} - \xi_j$, are independent across cities but are allowed to be correlated within a city over time. Given this assumption, the prices of the brand in other cities are valid instruments; prices of brand $j$ in two cities will be correlated due to the common marginal cost, but due to the independence assumption will be uncorrelated with the market-specific valuation of the product.

23. Just to be sure, suppose the product has two characteristics: horsepower (HP) and size (S), and assume there are two firms producing three products each. Then we have six instrumental variables: the values of HP and S for each product, the sum of HP and S for the firm’s other two products, and the sum of HP and S for the three products produced by the competition.

There are several plausible situations in which the independence assumption will not hold. Suppose there is a national (or regional) demand shock, for example, discovery that fiber may reduce the risk of cancer. This discovery will increase the unobserved valuation of all fiber-intensive cereal brands in all cities, and the independence assumption will be violated. Alternatively, suppose one believes that local advertising and promotions are coordinated across city borders and that these activities influence demand. Then the independence assumption will be violated.
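Both the characteristics-based instruments of footnote 23 and the other-city price instruments are simple to construct. The sketch below is illustrative only: the function names and data are hypothetical, and using the mean price across the other cities is an assumed simplification of "the prices of the brand in other cities".

```python
def blp_instruments(firm, chars):
    """Characteristics-based instruments for one market.

    firm[j]  : identifier of the firm producing product j
    chars[j] : exogenous characteristics of product j
    Returns, per product: own characteristics, sums over the same firm's
    other products, and sums over rival firms' products.
    """
    J, C = len(firm), len(chars[0])
    rows = []
    for j in range(J):
        own = list(chars[j])
        same_firm = [sum(chars[r][c] for r in range(J)
                         if r != j and firm[r] == firm[j]) for c in range(C)]
        rivals = [sum(chars[r][c] for r in range(J)
                      if firm[r] != firm[j]) for c in range(C)]
        rows.append(own + same_firm + rivals)
    return rows

def other_city_prices(price):
    """price[c][j] is the price of brand j in city c in one period.

    Returns, for each city and brand, the mean price of that brand in the
    other cities (an assumed way of summarizing other-city prices).
    """
    n_city, J = len(price), len(price[0])
    out = []
    for c in range(n_city):
        out.append([sum(price[o][j] for o in range(n_city) if o != c) / (n_city - 1)
                    for j in range(J)])
    return out

# The setting of footnote 23: two firms, three products each, HP and size.
firm = ["A", "A", "A", "B", "B", "B"]
chars = [[100, 4], [150, 5], [200, 6], [110, 4], [160, 5], [210, 6]]
iv = blp_instruments(firm, chars)          # six instruments per product

# Hypothetical prices of two brands in three cities.
price = [[2.0, 3.0], [2.5, 3.5], [3.0, 4.0]]
hausman_iv = other_city_prices(price)
```

Note that `iv` is identical in every market that offers the same products, which is the lack of between-market variation that arises once brand dummy variables are included.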
The extent to which the assumptions needed to support any of the above instrumental variables are valid in any given situation is an empirical issue. Resolving this issue beyond any reasonable doubt is difficult and requires comparing results from several sets of instrumental variables, combining additional data sources, and using the researcher’s knowledge of the industry.

3.5 Brand-Specific Dummy Variables

As previously pointed out, I believe that brand-specific fixed effects should be used whenever possible. There are at least two good reasons to include these dummy variables. First, in any case where we are unsure that the observed characteristics capture the true factors that determine utility, fixed effects should be included in order to improve the fit of the model. We note that this helps fit the mean utility level $\delta_j$, while substitution patterns are still driven by observed characteristics (either physical characteristics or market segmentation), as is the case if we do not include a brand fixed effect.

Second, the major motivation (Berry, 1994) for the estimation scheme previously described is the need to instrument for the correlation between prices and the unobserved quality of the product, $\xi_{jt}$. A brand-specific dummy variable captures the characteristics that do not vary by market and the product-specific mean of unobserved components, namely, $x_j\beta + \xi_j$. Therefore, the correlation between prices and the brand-specific mean of unobserved quality is fully accounted for and does not require an instrument. In order to introduce a brand dummy variable we require observations on more than one market. However, even without brand dummy variables, fitting the model using observations from a single market is difficult (see BLP, footnote 30).

Once brand dummy variables are introduced, the error term is no longer the unobserved characteristics. Rather, it is the market-specific deviation from this unobserved mean.
This additional variance was not introduced by the dummy variables; it is present in all models that use observations from more than one market. The use of brand dummy variables forces the researcher to discuss this additional variance explicitly.

There are two potential objections to the use of brand dummy variables. First, as previously mentioned, a major difficulty in estimating demand in differentiated-product markets is that the number of parameters increases proportionally to the square of the number of products. The main motivation for the use of discrete-choice models was to reduce this dimensionality problem. Does the introduction of parameters that increase in proportion to the number of brands defeat the whole purpose? No. The number of parameters increases only with $J$ (the number of brands) and not $J^2$. Furthermore, the brand dummy variables are linear parameters and do not increase the computational difficulty. If the number of brands is large, the size of the design matrix might be problematic, but given the computing power required to run the full model, this is unlikely to be a serious difficulty.

A more serious objection to the use of brand dummy variables is that the taste coefficients $\beta$ cannot be identified. Fortunately, this is not true. The taste parameters can be retrieved by using a minimum-distance procedure (as in Chamberlain, 1982). Let $\delta = (\delta_1, \ldots, \delta_J)'$ denote the $J \times 1$ vector of brand dummy coefficients, $X$ be the $J \times K$ ($K < J$) matrix of product characteristics that are fixed across markets, and $\xi = (\xi_1, \ldots, \xi_J)'$ be the $J \times 1$ vector of unobserved product qualities. Then from equation (1),

$$\delta = X\beta + \xi.$$

If we assume that $E[\xi \mid X] = 0$,24 then the estimates of $\beta$ and $\xi$ are

$$\hat{\beta} = (X' V_\delta^{-1} X)^{-1} X' V_\delta^{-1} \hat{\delta}, \qquad \hat{\xi} = \hat{\delta} - X\hat{\beta},$$

where $\hat{\delta}$ is the vector of coefficients estimated from the procedure described in the previous section, and $V_\delta$ is the variance-covariance matrix of these estimates.
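The minimum-distance step can be sketched numerically as follows. This is a hedged illustration under the assumption that the unobserved qualities are mean-independent of the characteristics; the function name `minimum_distance` and the toy inputs are hypothetical, and in practice the estimated brand effects and their covariance matrix would come from the GMM stage rather than being made up as they are here.

```python
import numpy as np

def minimum_distance(delta_hat, X, V_delta):
    """GLS projection of estimated brand effects on fixed characteristics.

    delta_hat : (J,)   estimated brand dummy coefficients
    X         : (J, K) product characteristics that do not vary by market
    V_delta   : (J, J) covariance matrix of delta_hat
    Returns (beta_hat, xi_hat).
    """
    delta_hat = np.asarray(delta_hat, dtype=float)
    X = np.asarray(X, dtype=float)
    V_delta = np.asarray(V_delta, dtype=float)
    Vinv_X = np.linalg.solve(V_delta, X)            # V^{-1} X
    Vinv_d = np.linalg.solve(V_delta, delta_hat)    # V^{-1} delta_hat
    beta_hat = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_d)
    xi_hat = delta_hat - X @ beta_hat               # implied unobserved quality
    return beta_hat, xi_hat

# Toy example (made-up numbers): 5 brands, a constant and one characteristic.
X = np.array([[1.0, 2.0], [1.0, 18.0], [1.0, 4.0], [1.0, 3.0], [1.0, 12.0]])
beta_true = np.array([1.0, 0.5])
xi_true = np.array([0.3, -0.2, 0.1, -0.4, 0.2])
delta_hat = X @ beta_true + xi_true
V_delta = np.diag([0.1, 0.2, 0.1, 0.3, 0.2])

beta_hat, xi_hat = minimum_distance(delta_hat, X, V_delta)
print(beta_hat)
print(xi_hat)
```

Solving linear systems instead of forming the inverse of the covariance matrix is a numerical choice of this sketch. The "sample size" of the regression is the number of brands, so with few brands the recovered taste parameters will be imprecise.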
The minimum-distance estimator is simply a GLS regression in which the dependent variable consists of the estimated brand effects, estimated using the GMM procedure previously described and the full sample. The number of “observations” in this regression is the number of brands. The correlation in the values of the dependent variable is treated by weighting the regression by the estimated covariance matrix, $V_\delta$, which is the estimate of this correlation. The coefficients on the brand dummy variables provide an unrestricted estimate of the mean utility. The minimum-distance procedure projects these estimates onto a lower-dimensional space, which is implied by a restricted model that sets $\xi$ to zero. Chamberlain (1982) provides a chi-square test to evaluate these restrictions.

24. Note that this is the assumption required to justify the use of observed product characteristics as instrumental variables. Here, however, this assumption is only used to recover the taste parameters. If one is unwilling to make it, the price sensitivity can still be recovered using the other assumptions discussed in the previous section.

4. AN APPLICATION

In this section I briefly present the type of estimates one can obtain from the random-coefficients logit model. The data used for this demonstration were motivated by real scanner data (from the ready-to-eat cereal industry). However, the data are not real and should not be used for any analysis. The focus is on providing a data set that can easily be used to learn the method. Therefore, I estimate a somewhat restricted version of the model (with a limited amount of data). For a more detailed and realistic use of the models presented here see either BLP or Nevo (2000a, b). The data set used to generate the results and the Matlab code used to perform the computation are available from http://elsa.berkeley.edu/~nevo.
The data used for the analysis below consist of quantities and prices for 24 brands of a differentiated product in 47 cities over 2 quarters. The data were generated from a model of demand and supply.25 The marginal cost and the parameters required to simulate this model were motivated by the estimates of Nevo (2000b). I use two product characteristics: Sugar, which measures sugar content, and Mushy, a dummy variable equal to one if the product gets soggy in milk. Demographics were drawn from the Current Population Survey. They include the log of income (Income), the log of income squared (Income Sq), Age, and Child, a dummy variable equal to one if the individual is less than sixteen. The unobserved demographics, $v_i$, were drawn from a standard normal distribution. For each market I draw 20 individuals [i.e., $ns = 20$ in equation (11)].

25. The demand model of Section 2 was used. Supply was modeled using a standard multiproduct-firms differentiated-product Bertrand model.

The results of the estimation can be found in Table I. The means of the distribution of marginal utilities (the $\beta$'s) are estimated by the minimum-distance procedure described above and presented in the first column. The results suggest that for the average consumer more sugar increases the utility from the product. Estimates of heterogeneity around these means are presented in the next few columns. The column labeled “Standard Deviations” captures the effects of the unobserved demographics. The effects are insignificant, both economically and statistically. I return to this below. The last four columns present the effect of demographics on the slope parameters. The point estimates are economically significant; I return to statistical significance below. The estimates suggest that while the average consumer might like a soggy cereal, the marginal valuation of sogginess decreases with age and income. In other words, adults are less sensitive to the crispness of a cereal, as are the wealthier consumers. The distribution of the Mushy coefficient can be seen in Figure 1. Most of the consumers value sogginess in a positive way, but approximately 31% of consumers actually prefer a crunchy cereal.

TABLE I. Results from the Full Model

(The last four columns are interactions with demographic variables.)

Variable   Means (β's)       Standard Deviations (σ's)   Income             Income Sq       Age              Child
Price      -32.433 (7.743)   1.848 (1.075)               16.598 (172.334)   0.659 (8.955)   --               11.625 (5.207)
Constant   1.841^a (0.258)   0.377 (0.129)               3.089 (1.213)      --              1.186 (1.016)    --
Sugar      0.148^a (0.258)   0.004 (0.012)               0.193 (0.005)      --              0.029 (0.036)    --
Mushy      0.788^a (0.013)   0.081 (0.205)               -1.468 (0.697)     --              -1.514 (1.103)   --

GMM objective (degrees of freedom): 14.9 (7)

Based on 2256 observations. Except where noted, parameters are GMM estimates. All regressions include brand and time dummy variables. Asymptotically robust standard errors are given in parentheses.
a Estimates from a minimum-distance procedure.

The mean price coefficient is negative. Coefficients on the interaction of price with demographics are economically significant, while the estimate of the standard deviation suggests that most of the heterogeneity is explained by the demographics (an issue we shall return to below). Children and above-average-income consumers tend to be less price-sensitive. The distribution of the individual price sensitivity can be seen in Figure 2. It does not seem to be normal, which is a result of the empirical distribution of demographics. In principle, the tail of the distribution can reach positive values, implying that the higher the price, the higher the utility. However, this is not the case for these results. Most of the coefficients are not statistically significant.
This is due for the most part to the simplifications I made in order to make this example more accessible. If the focus were on the real use of these estimates, their efficiency could be greatly improved by using more data, increasing the number of simulations (ns), improving the simulation methods (see the appendix on the web), or adding a supply side (see Section 5). For now I focus on the economic significance.

FIGURE 1. FREQUENCY DISTRIBUTION OF TASTE FOR SOGGINESS

FIGURE 2. FREQUENCY DISTRIBUTION OF PRICE COEFFICIENT

As noted above, all the estimates of the standard deviations are economically insignificant,26 suggesting that the heterogeneity in the coefficients is mostly explained by the included demographics. A measure of the relative importance of the demographics and random shocks can be obtained from the ratios of the variance explained by the demographics to the total variation in the distribution of the estimated coefficients; these are over 90%. This result is somewhat at odds with previous work.27 The results here do not suggest that observed demographics can explain all heterogeneity; they only suggest that the data rejects the assumed normal distribution. An alternative explanation for this result has to do with the structure of the data used here. Unlike other work (for example, BLP), by construction I have no variation across markets in the choice set. As I mentioned in Section 3.2, this sort of variation in the choice set helps identify the variance of the random shocks. This explanation, however, does not explain why the point estimates are low (as opposed to the standard errors being high) and why the effect of demographic variables is significant.

Table II presents a sample of estimated own- and cross-price elasticities. Each entry i, j, where i indexes row and j column, gives the elasticity of brand i with respect to a change in the price of j.
Since the model does not imply a constant elasticity, this matrix will depend on the values of the variables used to evaluate it. Rather than choosing a particular value (say the average, or a value at a particular market), I present the median of each entry over the 94 markets in the sample. The results demonstrate how the substitution patterns are determined in this model. Products with similar characteristics will have larger substitution patterns, all else equal. For example, brands 14 and 15 have identical observed characteristics, and therefore their cross-price elasticities are essentially identical.

26. Unlike the interactions with demographics, even after taking measures to improve the efficiency of the estimates, they will still stay statistically insignificant.

27. Rossi et al. (1996) find that using previous purchasing history helps explain heterogeneity above and beyond what is explained by demographics alone. Berry et al. (1998) reach a similar conclusion using second-choice data.

TABLE II. Median Own- and Cross-Price Elasticities

Brand characteristics
Brand: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Sugar: 2 18 4 3 12 14 3 4 14 1 11 4 3 13 13 16 10 3 20 7 14 6 12 0
Mushy: 0 0 0 0 1 0 0 1 0 0 1 1 0 0 0 1 0 0 0 1 1 1 0 0

Elasticities with respect to the price of the brand named at left (25 entries per column: the 24 brands and the outside good)
Brand 1: 0.0303 0.0773 0.1500 0.0854 0.2249 0.1430 0.1138 0.1900 0.0944 0.1420 0.0967 0.2711 0.2827 0.0834 0.0889 0.0699 0.2828 0.1743 0.0735 0.2109 -2.1689 0.1119 0.2715 0.1868 0.0965
Brand 2: 0.2433 0.1706 0.0950 0.1561 0.0548 0.2282 0.1399 0.0717 0.2691 0.0999 0.1556 0.0691 0.0597 0.1825 0.1763 0.1642 0.0606 0.0789 0.1704 0.0643 0.0562 -3.4505 0.0674 0.0736 0.1540
Brand 3: 0.0396 0.0741 0.1283 0.0815 0.1731 0.1523 0.1005 0.1542 0.1130 0.1270 0.0890 0.2339 0.2390 0.0786 0.0814 0.0698 0.2390 0.1476 0.0717 0.1682 0.2429 0.1302 -2.9208 0.1501 0.0947
Brand 4: 0.0552 0.1160 0.1602 0.1307 0.1619 0.0873 0.1466 0.1656 0.0771 0.1546 0.1377 0.1121 0.0983 0.1247 0.1256 0.1123 0.1035 0.1622 0.1128 0.1665 0.1161 0.0924 0.1083 -3.2709 0.1326
Brand 5: 0.2279 0.2534 0.1797 0.2559 0.1110 0.1447 0.2172 0.1362 0.2163 0.1791 0.2426 0.0735 0.0632 0.2697 0.2491 0.2518 0.0648 0.1524 0.2596 0.1224 0.0645 0.1905 0.0709 0.1470 -5.5116
Brand 14: 0.3349 0.3573 0.2096 0.3484 0.1044 0.1442 0.2867 0.1508 0.2618 0.2074 0.3322 0.0662 0.0554 -3.8393 0.3542 0.3628 0.0563 0.1620 0.3586 0.1199 0.0596 0.2305 0.0627 0.1557 0.2852
Brand 15: 0.3716 0.3794 0.2085 0.3753 0.1114 0.1702 0.2912 0.1578 0.3107 0.2080 0.3427 0.0704 0.0582 0.3922 -4.3982 0.3971 0.0598 0.1656 0.3966 0.1298 0.0611 0.2606 0.0684 0.1564 0.3085
Brand 19: 0.3896 0.3530 0.1530 0.3154 0.0733 0.3270 0.2498 0.1026 -3.3335 0.1628 0.2899 0.0985 0.0866 0.3380 0.3383 0.3513 0.0865 0.1180 0.3443 0.0817 0.0759 0.4380 0.0944 0.1037 0.2988
Brand 24: 0.0399 0.1416 0.2311 0.1608 -4.1215 0.1264 0.1875 0.2660 0.1013 0.2251 0.1769 0.2260 0.2189 0.1600 0.1582 0.1369 0.2225 0.2555 0.1333 0.2809 0.2375 0.1165 0.2237 0.2670 0.1715

Cell entries i, j, where i indexes row and j column, give the percent change in market share of brand i with a one-percent change in price of j. Each entry represents the median of the elasticities from the 94 markets.

A diagnostic of how far the results are from the restrictive form imposed by the logit model is given by examining the variation in the cross-price elasticities in each column. As discussed in Section 2, the logit model restricts all elasticities within a column to be equal. Therefore, an indicator of how well the model has overcome these restrictions is to examine the variation in the estimated elasticities. One such measure is given by examining the ratio of the maximum to the minimum cross-price elasticity within a column (the logit model implies that all cross-price elasticities within a column are equal and therefore a ratio of one). This ratio ranges from 3 to 9.
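The max-to-min diagnostic just described can be computed from any simulated elasticity matrix. The sketch below is a minimal illustration under stated assumptions, not the code that produced Table II: it assumes a random-coefficients logit for a single market in which individual i has a price coefficient `alpha_i` entering utility linearly, the function names `elasticity_matrix` and `max_min_ratio` are hypothetical, and all inputs are made-up numbers.

```python
import numpy as np

def elasticity_matrix(delta, mu, alpha_i, prices):
    """Simulated price elasticities for one market.

    delta   : (J,)    mean utilities
    mu      : (ns, J) individual deviations from the mean utilities
    alpha_i : (ns,)   individual coefficients on price (negative values
                      mean that a higher price lowers utility)
    prices  : (J,)    prices
    Returns E with E[j, k] = percent change in the share of brand j for a
    one-percent change in the price of brand k.
    """
    v = np.exp(delta[None, :] + mu)
    s_i = v / (1.0 + v.sum(axis=1, keepdims=True))   # outside good has utility 0
    s = s_i.mean(axis=0)                             # simulated market shares
    ns, J = s_i.shape
    weighted = alpha_i[:, None] * s_i
    ds = -(weighted.T @ s_i) / ns                    # mean of -alpha_i * s_ij * s_ik
    ds[np.diag_indices(J)] += weighted.mean(axis=0)  # own: alpha_i * s_ij * (1 - s_ij)
    return ds * prices[None, :] / s[:, None]

def max_min_ratio(E):
    """Ratio of the largest to the smallest cross-price elasticity per column."""
    J = E.shape[0]
    off_diag = ~np.eye(J, dtype=bool)
    return np.array([E[off_diag[:, k], k].max() / E[off_diag[:, k], k].min()
                     for k in range(J)])

rng = np.random.default_rng(0)
ns = 500
x = np.array([2.0, 18.0, 4.0, 13.0])          # made-up characteristic
prices = np.array([0.10, 0.12, 0.11, 0.13])   # made-up prices
alpha_mean = -30.0
delta = -1.0 + 0.05 * x + alpha_mean * prices

# Plain logit: no individual heterogeneity at all.
E_logit = elasticity_matrix(delta, np.zeros((ns, 4)),
                            np.full(ns, alpha_mean), prices)

# Random coefficients: heterogeneous price sensitivity and taste for x.
alpha_dev = rng.uniform(-10.0, 10.0, size=ns)
taste_dev = 0.1 * rng.standard_normal(ns)
mu = alpha_dev[:, None] * prices[None, :] + taste_dev[:, None] * x[None, :]
E_rc = elasticity_matrix(delta, mu, alpha_mean + alpha_dev, prices)

print(max_min_ratio(E_logit))  # equal to one in every column: the logit restriction
print(max_min_ratio(E_rc))     # larger values indicate a larger departure from logit
```

In the plain logit case every cross-price elasticity in column k reduces to the same expression in the price and share of brand k, which is why the ratio collapses to one regardless of the characteristics.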
Not only does this ratio tell us the results have overcome the logit restrictions, but more importantly it suggests for which brands the characteristics do not seem strong enough to overcome the restrictions. This test therefore suggests which characteristics we might want to add.28

28. A formal specification test of the logit model [in the spirit of Hausman and McFadden (1984)] is the test of the hypothesis that all the nonlinear parameters are jointly zero. This hypothesis is easily rejected.

5. CONCLUDING REMARKS

This paper has carefully discussed recent developments in methods of estimating random-coefficients (mixed) logit models. The emphasis was on simplifying the exposition, and as a result several possible extensions were not discussed. I briefly mention these now.

5.1 Supply Side

In the above presentation the supply side was used only in order to motivate the instrumental variables; it was not fully specified and estimated. In some cases we will want to fully specify a supply relationship and estimate it jointly with the demand-side equations (for example, see BLP). This fits into the above model easily by adding moment conditions to the GMM objective function. The increase in computational and programming complexity is small for standard static supply-side models. As usual, estimating demand and supply jointly has the advantage of increasing the efficiency of the estimates, at the cost of requiring more structure. The costs and benefits are specific to each application and data set.

5.2 Consumer-Level Data

This paper has assumed that the researcher does not observe the purchase decisions of individuals. There are many cases where this is not true. In cases where only consumer data is observed, usually estimation is conducted using either maximum likelihood or the simulated method of moments [for recent examples and details see Goldberg (1995), Rossi et al. (1996), McFadden and Train (2000), or Shum (1999)].
The method discussed here can be applied in such cases by using the consumer-level data to estimate the mean utility level $\delta_{jt}$. The estimated mean utility levels can now be treated in a similar way to the mean utility levels computed from the inversion of the aggregate market shares. Care has to be taken when computing the standard errors, since the mean utility levels are now measured with error.

In most studies that use consumer-level data, the correlation between the regressors and the error term, which was the main motivation for the method discussed here, is usually ignored [one notable exception is Villas-Boas and Winer (1999)]. This correlation might still be present, for at least two reasons. First, even though consumers take prices and other product characteristics as given, their optimal choice from a menu of offerings could imply that econometric endogeneity might still exist (Kennan, 1989). Second, unless enough control variables are included, common unobserved characteristics, $\xi_{jt}$, could still bias the estimates. The method proposed here could, in principle, deal with the latter problem.

Potentially, one could observe both consumer and aggregate data. In such cases the analysis proposed here could be enriched. Petrin (1999) observes, in addition to the aggregate market shares of automobile models, the probability of purchase by consumers of different demographic groups. He uses this information in the form of additional moment restrictions (thus forcing the estimated probabilities of purchase to predict the observed probabilities). Although technically somewhat different, the idea is similar to using multiple observations on the same product in different markets (i.e., different demographic groups). Berry et al.
(1998) generalize this strategy by fitting three sets of moments to their sample counterparts: (1) the market shares, as above, (2) the covariance of the product characteristics and the observed demographics, and (3) the covariance of first and second choice (they have a survey that describes what the consumer's second choice was). As Berry et al. (1998) point out, the algorithm they use is very similar to the one they introduced in BLP, which was the basis for the discussion above.

5.3 Alternative Methods

An alternative to the discrete-choice methods discussed here is a multilevel demand model. The essential idea is to use aggregation and separability assumptions to justify different levels of demand [see Gorman (1959, 1971), or Deaton and Muellbauer (1980) and references therein]. Originally these methods were developed to deal with demand for broad categories like food, clothing, and shelter. Recently, however, they have been adapted to demand for differentiated products [see Hausman et al. (1994) or Hausman (1996)]. The top level is the overall demand for the product category (for example, RTE cereal). Intermediate levels of the demand system model substitution between various market segments (e.g., between children's and natural cereals). The bottom level is the choice of a brand within a segment. Each level of the demand system can be estimated using a flexible functional form. This segmentation of the market reduces the number of parameters in inverse proportion to the number of segments. Therefore, with either a small number of brands or a large number of (a priori) reasonable segments, these methods can use flexible functional forms [for example, the almost ideal demand system of Deaton and Muellbauer (1980)] to give good first-order approximations to any demand system. However, as the number of brands in each segment increases beyond a handful, this method becomes less feasible.
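To see the order of magnitude behind the claim that segmentation cuts the parameter count, consider a rough count of price coefficients only. The assumptions are made here purely for illustration: $J = 24$ brands split into $S = 4$ equal-sized segments, one price coefficient for every ordered pair of products within a level, and no symmetry or adding-up restrictions imposed.

```latex
\begin{align*}
\text{single-level system:}\quad & J^2 = 24^2 = 576,\\
\text{bottom level (within segments):}\quad & S\left(\tfrac{J}{S}\right)^2 = \tfrac{J^2}{S} = \tfrac{576}{4} = 144,\\
\text{segment level:}\quad & S^2 = 4^2 = 16.
\end{align*}
```

The bottom-level count falls in proportion to $1/S$, but each segment still requires $(J/S)^2$ coefficients, which is why the approach strains once a segment holds more than a handful of brands.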
For a comparison between the methods described here and these multilevel models see Nevo (1997, Chapter 6).

5.4 Dynamics

The model presented here is static. However, it has close links to several dynamic models. The first class of dynamic models are models of dynamic firm behavior. The links are twofold. The model used here can feed into the dynamic model as in Pakes and McGuire (1994). On the other hand, the dynamic model can be used to characterize the endogenous choice of product characteristics, therefore supplying more general identifying conditions.

An alternative class of dynamic models examines demand-side dynamics (Erdem and Keane, 1996; Ackerberg, 1996). These models generalize the demand model described here and are estimated using consumer-level data. Although in principle these models could also be estimated using aggregate (high-frequency) data, consumer-level data is better suited for the task.

5.5 Instruments and Additional Applications

As was mentioned in Sections 3.2–3.4, the identification of parameters in these models relies heavily on having an adequate set of exogenous instrumental variables. Finding such instrumental variables is crucial for any consistent estimation of demand parameters, and in models of demand for differentiated products this problem is further complicated by the fact that cost data are rarely observed and proxies for cost will rarely exhibit much cross-brand variation. Some of the solutions available in the literature have been presented, yet all suffer from potential drawbacks. It is important not to get carried away in the technical fireworks and to remember this most basic, yet very difficult, identification problem.

This paper has surveyed some of the growing literature that uses the methods described here. The scope of application and potential of use are far from exhausted.
Of course, there are many more potential applications within the study of industrial economics, both in studying new industries and in answering different questions. However, the full scope of these methods is not limited to industrial organization. It is my hope that this paper will facilitate further application of these methods.

REFERENCES

Ackerberg, D., 1996, “Empirically Distinguishing Informative and Prestige Effects of Advertising,” Mimeo, Boston University.
Anderson, S., A. de Palma, and J.F. Thisse, 1992, Discrete Choice Theory of Product Differentiation, Cambridge, MA: The MIT Press.
Amemiya, T., 1985, Advanced Econometrics, Cambridge, MA: Harvard University Press.
Barten, A.P., 1966, “Theorie en Empirie van een Volledig Stelsel van Vraagvergelijkingen,” Doctoral Dissertation, Rotterdam: University of Rotterdam.
Berry, S., 1994, “Estimating Discrete-Choice Models of Product Differentiation,” Rand Journal of Economics, 25, 242–262.
Berry, S., J. Levinsohn, and A. Pakes, 1995, “Automobile Prices in Market Equilibrium,” Econometrica, 63, 841–890.
Berry, S., M. Carnall, and P. Spiller, 1996, “Airline Hubs: Costs and Markups and the Implications of Consumer Heterogeneity,” Working Paper No. 5561, National Bureau of Economic Research.
Berry, S., J. Levinsohn, and A. Pakes, 1998, “Differentiated Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market,” Working Paper No. 6481, National Bureau of Economic Research; also available at http://www.econ.yale.edu/~steveb.
Berry, S., J. Levinsohn, and A. Pakes, 1999, “Voluntary Export Restraints on Automobiles: Evaluating a Strategic Trade Policy,” American Economic Review, 89 (3), 400–430.
Boyd, H.J. and R.E. Mellman, 1980, “The Effect of Fuel Economy Standards on the U.S. Automotive Market: An Hedonic Demand Analysis,” Transportation Research, 14A, 367–378.
Bresnahan, T., 1981, “Departures from Marginal-Cost Pricing in the American Automobile Industry,” Journal of Econometrics, 17, 201–227.
Bresnahan, T., 1987, “Competition and Collusion in the American Automobile Oligopoly: The 1955 Price War,” Journal of Industrial Economics, 35, 457–482.
Bresnahan, T., S. Stern, and M. Trajtenberg, 1997, “Market Segmentation and the Sources of Rents from Innovation: Personal Computers in the Late 1980's,” RAND Journal of Economics, 28, S17–S44.
Cardell, N.S., 1989, “Extensions of the Multinomial Logit: The Hedonic Demand Model, The Non-independent Logit Model, and the Ranked Logit Model,” Ph.D. Dissertation, Harvard University.
Cardell, N.S., 1997, “Variance Components Structures for the Extreme-Value and Logistic Distributions with Application to Models of Heterogeneity,” Econometric Theory, 13 (2), 185–213.
Cardell, N.S., and F. Dunbar, 1980, “Measuring the Societal Impacts of Automobile Downsizing,” Transportation Research, 14A, 423–434.
Chamberlain, G., 1982, “Multivariate Regression Models for Panel Data,” Journal of Econometrics, 18 (1), 5–46.
Christensen, L.R., D.W. Jorgenson, and L.J. Lau, 1975, “Transcendental Logarithmic Utility Functions,” American Economic Review, 65, 367–383.
Court, A.T., 1939, “Hedonic Price Indexes with Automotive Examples,” in Anon., The Dynamics of Automobile Demand, New York: General Motors.
Das, S., S. Olley, and A. Pakes, 1994, “Evolution of Brand Qualities of Consumer Electronics in the U.S.,” Mimeo, Yale University.
Davis, P., 1998, “Spatial Competition in Retail Markets: Movie Theaters,” Mimeo, Yale University.
Deaton, A., and J. Muellbauer, 1980, “An Almost Ideal Demand System,” American Economic Review, 70, 312–326.
Dixit, A., and J.E. Stiglitz, 1977, “Monopolistic Competition and Optimum Product Diversity,” American Economic Review, 67, 297–308.
Dubin, J. and D. McFadden, 1984, “An Econometric Analysis of Residential Electric Appliance Holding and Consumption,” Econometrica, 52, 345–362.
Erdem, T. and M. Keane, 1996, “Decision-making under Uncertainty: Capturing Dynamic Brand Choice Processes in Turbulent Consumer Goods Markets,” Marketing Science, 15 (1), 1–20.
Gasmi, F., J.J. Laffont, and Q. Vuong, 1992, “Econometric Analysis of Collusive Behavior in a Soft-Drink Market,” Journal of Economics & Management Strategy, 1 (2), 277–311.
Goldberg, P., 1995, “Product Differentiation and Oligopoly in International Markets: The Case of the Automobile Industry,” Econometrica, 63, 891–951.
Gorman, W.M., 1959, “Separable Utility and Aggregation,” Econometrica, 27, 469–481.
Gorman, W.M., 1971, Lecture Notes, Mimeo, London School of Economics.
Griliches, Z., 1961, “Hedonic Price Indexes for Automobiles: An Econometric Analysis of Quality Change,” in The Price Statistics of the Federal Government, hearing before the Joint Economic Committee of the U.S. Congress, 173-76, Pt. 1, 87th Cong., 1st sess. Reprinted in Z. Griliches, ed., 1971, Price Indexes and Quality Change: Studies in New Methods in Measurement, Cambridge, MA: Harvard University Press.
Hausman, J., 1996, “Valuation of New Goods under Perfect and Imperfect Competition,” in T. Bresnahan and R. Gordon, eds., The Economics of New Goods, Studies in Income and Wealth, Vol. 58, Chicago: National Bureau of Economic Research.
Hausman, J., and D. McFadden, 1984, “Specification Tests for the Multinomial Logit Model,” Econometrica, 52 (5), 1219–1240.
Hausman, J., and D. Wise, 1978, “A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences,” Econometrica, 49, 403–426.
Hausman, J., G. Leonard, and J.D. Zona, 1994, “Competitive Analysis with Differentiated Products,” Annales d'Economie et de Statistique, 34, 159–180.
Hendel, I., 1999, “Estimating Multiple Discrete Choice Models: An Application to Computerization Returns,” Review of Economic Studies, 66, 423–446.
Lancaster, K., 1966, “A New Approach to Consumer Theory,” Journal of Political Economy, 74, 132–157.
Lancaster, K., 1971, Consumer Demand: A New Approach, New York: Columbia University Press.
Kennan, J., 1989, “Simultaneous Equations Bias in Disaggregated Econometric Models,” Review of Economic Studies, 56, 151–156.
McFadden, D., 1973, “Conditional Logit Analysis of Qualitative Choice Behavior,” in P. Zarembka, ed., Frontiers of Econometrics, New York: Academic Press.
McFadden, D., 1978, “Modeling the Choice of Residential Location,” in A. Karlqvist et al., eds., Spatial Interaction Theory and Planning Models, Amsterdam: North-Holland.
McFadden, D., and K. Train, 2000, “Mixed MNL Models for Discrete Response,” Journal of Applied Econometrics, forthcoming; available from http://emlab.berkeley.edu/~train.
Nevo, A., 1997, “Demand for Ready-to-Eat Cereal and Its Implications for Price Competition, Merger Analysis, and Valuation of New Goods,” Ph.D. Dissertation, Harvard University.
Nevo, A., 2000a, “Measuring Market Power in the Ready-to-Eat Cereal Industry,” Econometrica, forthcoming; also available from http://emlab.berkeley.edu/~nevo.
Nevo, A., 2000b, “Mergers with Differentiated Products: The Case of the Ready-to-Eat Cereal Industry,” Rand Journal of Economics, forthcoming, 31 (Autumn).
Pakes, A. and P. McGuire, 1994, “Computation of Markov Perfect Equilibria: Numerical Implications of a Dynamic Differentiated Product Model,” Rand Journal of Economics, 25 (4), 555–589.
Petrin, A., 1999, “Quantifying the Benefits of New Products: The Case of the Minivan,” Mimeo, University of Chicago; available from http://gsbwww.uchicago.edu/fac/amil.petrin.
Quandt, R.E., 1968, “Estimation of Modal Splits,” Transportation Research, 2, 41–50.
Rosen, S., 1974, “Hedonic Prices and Implicit Markets,” Journal of Political Economy, 82, 34–55.
Rossi, P., R.E. McCulloch, and G.M. Allenby, 1996, “The Value of Purchase History Data in Target Marketing,” Marketing Science, 15 (4), 321–340.
Shum, M., 1999, “Advertising and Switching Behavior in the Breakfast Cereal Market,” Mimeo, University of Toronto; available from http://www.chass.utoronto.ca/eco/eco.html.
Spence, M., 1976, “Product Selection, Fixed Costs, and Monopolistic Competition,” Review of Economic Studies, 43, 217–235.
Stern, S., 1995, “Product Demand in Pharmaceutical Markets,” Mimeo, Stanford University.
Stone, J., 1954, “Linear Expenditure Systems and Demand Analysis: An Application to the Pattern of British Demand,” Economic Journal, 64, 511–527.
Tardiff, T.J., 1980, “Vehicle Choice Models: Review of Previous Studies and Directions for Further Research,” Transportation Research, 14A, 327–335.
Theil, H., 1965, “The Information Approach to Demand Analysis,” Econometrica, 6, 375–380.
Villas-Boas, M. and R. Winer, 1999, “Endogeneity in Brand Choice Models,” Management Science, 45, 1324–1338.