A Structural Model of Dense Network Formation, May 2017
Econometrica, vol. 85, issue 3, pages 825-850
published version

Abstract. This paper proposes an empirical model of network formation, combining strategic and random network features. Payoffs depend on direct links, but also on link externalities. Players meet sequentially at random, myopically updating their links. Under mild assumptions, the network formation process is a potential game and converges to an exponential random graph model (ERGM), generating directed dense networks. I provide new identification results for ERGMs in large networks: if link externalities are non-negative, the ERGM is asymptotically indistinguishable from an Erdős-Rényi model with independent links. We can identify the parameters only when at least one of the externalities is negative and sufficiently large. However, the standard estimation methods for ERGMs can have exponentially slow convergence, even when the model has asymptotically independent links. I thus estimate parameters using a Bayesian MCMC method. When the parameters are identifiable, I show evidence that the estimation algorithm converges in almost quadratic time.
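To make the meeting process in the abstract concrete, here is a minimal Python sketch of myopic link updating. It is an illustration under stated assumptions, not the paper's code or the replication package: the potential (alpha times the number of links plus beta times the number of reciprocated pairs), the logistic preference shock, and the names link_gain, simulate, and density are all hypothetical choices made for this example. Under these assumptions each update is a Gibbs step, so the chain's stationary distribution is proportional to the exponential of the potential, which is an ERGM.

```python
import math
import random


def link_gain(g, i, j, alpha, beta):
    """Change in the assumed potential when link i->j is switched on.

    Assumed potential: alpha * (#links) + beta * (#reciprocated pairs),
    so the gain is alpha plus beta if j already links to i.
    """
    return alpha + beta * g[j][i]


def simulate(n, alpha, beta, steps, seed=0):
    """Run the meeting process and return the final adjacency matrix."""
    rng = random.Random(seed)
    g = [[0] * n for _ in range(n)]
    for _ in range(steps):
        # A random ordered pair (i, j) of distinct players meets.
        i, j = rng.sample(range(n), 2)
        # Player i keeps or forms the link with logit probability
        # (myopic choice under an assumed logistic shock).
        p = 1.0 / (1.0 + math.exp(-link_gain(g, i, j, alpha, beta)))
        g[i][j] = 1 if rng.random() < p else 0
    return g


def density(g):
    """Share of possible directed links that are present."""
    n = len(g)
    return sum(map(sum, g)) / (n * (n - 1))
```

With beta set to zero the links are independent and the long-run density should be close to the logistic function of alpha, which is the Erdős-Rényi benchmark mentioned in the abstract. For example, density(simulate(20, -1.0, 0.0, 20000)) can be compared with 1 / (1 + exp(1)).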

R Package for replication: Contains functions for replication.

Appendix A: Model equilibrium: proofs
Appendix B: Computational Details
Appendix C: Unobserved Heterogeneity
Appendix D: Large Networks Analysis and Convergence
Appendix E: Additional Simulation Results

Old Version (March 2015): Second Revision for Econometrica
Old Version (July 2013): First Revision for Econometrica

Replication files for 2013 version: Includes R codes, Fortran 90 codes and results.
Old Version (Nov 2011): Includes additional results and policy experiments.

Viral Altruism? Generosity and social contagion in online networks, March 2016, Sociological Science
(with Mario Macis and Nicola Lacetera)

Abstract. How do social media affect the success of charitable promotional campaigns? We use individual-level longitudinal data and experimental data from a social-media application that facilitates donations while broadcasting donors' activities to their contacts. We find that broadcasting is positively associated with donations, although some individuals appear to opportunistically broadcast a pledge, and then delete it. Furthermore, broadcasting a pledge is associated with more pledges by a user's contacts. However, results from a field experiment where broadcasting of the initial pledges was randomized suggest that the observational findings were likely due to homophily rather than genuine social contagion effects. The experiment also shows that, although our campaigns generated considerable attention in the form of clicks and "likes," only a small number of donations (30 out of 6.4 million users reached) were made. Finally, an online survey experiment showed that both the presence of an intermediary and a fee contributed to the low donation rate. Our findings suggest that online platforms for charitable giving may stimulate costless forms of involvement, but have a smaller impact on actual donations, and that network effects might be limited when it comes to contributing real money to charities.

Structural Models of Complementary Choices
Marketing Letters, vol. 25, issue 3, pages 245-256, Springer
(with Steve Berry, Ahmed Khwaja, Vineet Kumar, Andres Musalem, Ken Wilbur, Greg Allenby, Bharat Anand, Pradeep Chintagunta, Michael Hanemann, Przemek Jeziorski)
Choice Symposium Invited Submission from Marketing Letters

Poisson Indices of Segregation
Regional Science and Urban Economics 43 (2013) 65–85

Abstract. Existing indices of residential segregation are based on a partition of the city in neighborhoods: given a spatial distribution of racial groups, the index measures different segregation levels for different partitions. I propose a spatial approach, which estimates segregation at the individual level and produces the entire spatial distribution of segregation. This method provides different rankings of cities in terms of segregation and new insights on the effect of segregation on socioeconomic outcomes. Using Census data and controlling for endogeneity using instrumental variables, I show that reduced form estimates of the impact of segregation on socioeconomic outcomes are not robust to the spatial approach.
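The partition dependence described in the abstract can be seen with the classic dissimilarity index, D = 0.5 * sum_i |a_i/A - b_i/B|, where a_i and b_i are the counts of two groups in neighborhood i and A, B are the city totals. The Python sketch below is only an illustration of that standard index on made-up counts; it is not the Poisson index proposed in the paper, and the function names and the toy city are hypothetical.

```python
def dissimilarity(a_counts, b_counts):
    """Classic dissimilarity index for two groups over a given partition."""
    total_a = sum(a_counts)
    total_b = sum(b_counts)
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(a_counts, b_counts)
    )


def merge(counts, size):
    """Coarsen the partition by merging `size` adjacent blocks into one."""
    return [sum(counts[k:k + size]) for k in range(0, len(counts), size)]


# Toy city: eight blocks along a street, alternating between the two groups.
group_a = [10, 0, 10, 0, 10, 0, 10, 0]
group_b = [0, 10, 0, 10, 0, 10, 0, 10]

# Each block is its own neighborhood: the groups never share a unit.
fine = dissimilarity(group_a, group_b)

# Neighborhoods are pairs of adjacent blocks: every unit is perfectly mixed.
coarse = dissimilarity(merge(group_a, 2), merge(group_b, 2))

print(fine, coarse)
```

The same spatial distribution of residents gives the maximum value of the index under the fine partition and the minimum under the coarse one, which is the problem a partition-free spatial approach is meant to avoid.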

Example: Includes R code, R package, data for New York, readme files.

Spatial Dissimilarity 1990     Spatial Dissimilarity 2000
Spatial Exposure 1990          Spatial Exposure 2000
Spatial Fragmentation 1990     Spatial Fragmentation 2000


A Structural Model of Homophily and Clustering in Social Networks, September 2017, submitted

Abstract. Social networks display homophily and clustering, and are usually sparse. I develop and estimate a structural model of strategic network formation with heterogeneous players and latent community structure, whose equilibrium networks are sparse and exhibit homophily and clustering. Each player belongs to a community unobserved by the econometrician. Players' payoffs vary by community and depend on the composition of direct links and common neighbors, allowing preferences to have a bias for similar people. Players meet sequentially and decide whether to form bilateral links, after receiving a random matching shock. The probability of meeting people in different communities is smaller than the probability of meeting people in the same community, and it decreases with the size of the network. The model converges to an exponential family random graph, with weak dependence among links. As a consequence the equilibrium networks are sparse and the sufficient statistics of the network are asymptotically normal. The posterior distribution of structural parameters and unobserved heterogeneity is estimated with school friendship network data from Add Health, using a Bayesian exchange algorithm. The estimates detect high levels of racial homophily, and heterogeneity in both costs of links and payoffs from common friends. The posterior predictions show that the model is able to replicate the homophily levels and the aggregate clustering of the observed network, in contrast with standard exponential family network models.

Approximate Variational Estimation for a Model of Network Formation, February 2017 (with Lingjiong Zhu), submitted

Abstract. We study an equilibrium model of sequential network formation with heterogeneous players. The payoffs depend on the number and composition of direct connections, but also the number of indirect links. We show that the network formation process is a potential game and in the long run the model converges to an exponential random graph (ERGM). Since standard simulation-based inference methods for ERGMs could have exponentially slow convergence, we propose an alternative deterministic method, based on a variational approximation of the likelihood. We compute bounds for the approximation error for a given network size and we prove that our variational method is asymptotically exact, extending results from the large deviations and graph limits literature to allow for covariates in the ERGM. A simple Monte Carlo exercise shows that our deterministic method provides more robust estimates than standard simulation-based inference.

mfergm: Replication package on Github

Older Version: June 2015

Approximate Variational Inference for a model of social interactions (NET Institute Working Paper 13-16). New version coming soon

Abstract. This paper proposes approximate variational inference methods for estimation of a strategic model of social interactions. Players interact in an exogenous network and sequentially choose a binary action. The utility of an action is a function of the choices of neighbors in the network. I prove that the interaction process can be represented as a potential game and it converges to a unique stationary equilibrium distribution. However, exact inference for this model is infeasible because of a computationally intractable likelihood, which cannot be evaluated even when there are few players. To overcome this problem, I propose variational approximations for the likelihood that allow approximate inference. This technique can be applied to any discrete exponential family, and therefore it is a general tool for inference in models with a large number of players. The methodology is illustrated with several simulated datasets and compared with MCMC methods.
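As a rough illustration of the idea in the abstract, the Python sketch below applies a mean-field variational approximation to a small binary-action model and compares it with the exact log normalizing constant, which can only be enumerated when the number of players is tiny. Everything specific here is an assumption made for the example rather than the paper's specification: the potential alpha * sum_i y_i + beta * sum_{i<j} w_ij y_i y_j, the ring network, the parameter values, and all function names.

```python
import itertools
import math


def ring_network(n):
    """Symmetric 0/1 adjacency matrix of a ring with n players."""
    return [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)]
            for i in range(n)]


def potential(y, w, alpha, beta):
    """Assumed potential: own-action term plus pairwise interaction term."""
    n = len(y)
    pairs = sum(w[i][j] * y[i] * y[j]
                for i in range(n) for j in range(i + 1, n))
    return alpha * sum(y) + beta * pairs


def exact_log_z(w, alpha, beta):
    """Log normalizing constant by enumerating all 2^n action profiles."""
    n = len(w)
    return math.log(sum(math.exp(potential(y, w, alpha, beta))
                        for y in itertools.product((0, 1), repeat=n)))


def mean_field(w, alpha, beta, iters=200):
    """Coordinate updates mu_i = logistic(alpha + beta * sum_j w_ij mu_j)."""
    n = len(w)
    mu = [0.5] * n
    for _ in range(iters):
        for i in range(n):
            field = alpha + beta * sum(w[i][j] * mu[j]
                                       for j in range(n) if j != i)
            mu[i] = 1.0 / (1.0 + math.exp(-field))
    return mu


def lower_bound(mu, w, alpha, beta):
    """Mean-field bound on log Z: expected potential plus entropy."""
    entropy = -sum(m * math.log(m) + (1 - m) * math.log(1 - m) for m in mu)
    # Under independent actions, E[y_i y_j] = mu_i * mu_j for i != j,
    # so the expected potential is the potential evaluated at mu.
    return potential(mu, w, alpha, beta) + entropy


w = ring_network(6)
mu = mean_field(w, alpha=-0.5, beta=0.4)
print(lower_bound(mu, w, -0.5, 0.4), exact_log_z(w, -0.5, 0.4))
```

The enumeration in exact_log_z doubles in cost with every additional player, while the mean-field updates only loop over players and their neighbors. The bound never exceeds the exact value, and in this toy model the two coincide when beta is zero because the actions are then independent.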

Segregation in Social Networks: Monte Carlo Maximum Likelihood Estimation, November 2011

Racial Segregation and Public School Expenditure (with Eliana La Ferrara)
(New version coming soon)

Abstract. This paper explores the effect of racial segregation on public school expenditure in US metropolitan areas and school districts. Our starting point is the literature that relates public good provision to the degree of racial fragmentation in the community. We argue that looking at fragmentation alone may be misleading and that the geographic distribution of different racial groups needs to be taken into account. Greater segregation is associated with more homogeneity in some sub-areas and more heterogeneity in others, and this matters if decisions on spending are taken at aggregation levels lower than the MSA. For given fragmentation, the extent of segregation conveys information on households' possibility to sort into relatively more or less homogeneous jurisdictions. We account for the potential endogeneity of racial segregation and find that the latter has a positive impact on average public school expenditure both at the MSA and at the district level. At the same time, increased segregation leads to more inequality in spending across districts of the same MSA, thus worsening the relative position of poorer districts.


The Effects of FDA Recalls on The Network of Strategic Alliances in The Medical Devices Industry (with Shweta Gaonkar)

Creativity and Market Success: Evidence from the movie industry (with Sharon Kim and Shweta Gaonkar)


Who's Afraid of the Big Bad Wolf? Do Pedophiles live close to schools?

Abstract. I analyze the geographic distribution of sex offenders in Urbana and Champaign and test whether they live closer to schools than generic residents. If the spatial distribution of sex offenders is the same as that of the population as a whole, we should not observe a sex offender systematically closer to schools than a random resident. Using a simple statistical model, I test whether, conditioning on the distance from schools, the spatial distributions of sex offenders and residents are the same. The results show that this is not the case: on average, sex offenders are less likely to locate close to schools than other people. However, I show evidence that there is some heterogeneity among individual schools, with some acting as sex offender "magnets".

The Determinants of Social Networks in Developing Countries: Some Evidence from Ghana

Abstract. I explore the determinants of social network formation in a developing economy, using data from Ghana. The main contribution is the use of an empirical model of network formation that takes into account the interdependence of the linking decisions among agents. The interdependence implies a simultaneity problem that I solve by estimating the joint distribution of the model instead of a conditional specification. The computational complexity increases as a result, so I use MCMC maximum likelihood methods to estimate the relevant parameters. I impose functional forms and restrictions on the parameters that guarantee identification. The results show that the network formation process is driven mainly by gender and wealth. In several specifications religion and clan are also important determinants of the network shape.