PUBLICATIONS

A Structural Model of Dense Network Formation, May 2017
Econometrica, vol. 85, issue 3, pages 825-850
published version

Abstract. This paper proposes an empirical model of network formation, combining features of strategic and random networks. Payoffs depend on direct links, but also on link externalities. Players meet sequentially at random, myopically updating their links. Under mild assumptions, the network formation process is a potential game and converges to an exponential random graph model (ERGM), generating directed dense networks. I provide new identification results for ERGMs in large networks: if link externalities are non-negative, the ERGM is asymptotically indistinguishable from an Erdős-Rényi model with independent links. The parameters can be identified only when at least one of the externalities is negative and sufficiently large. However, the standard estimation methods for ERGMs can have exponentially slow convergence, even when the model has asymptotically independent links. I thus estimate parameters using a Bayesian MCMC method. When the parameters are identifiable, I show evidence that the estimation algorithm converges in almost quadratic time.
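
The sequential meeting process described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's specification: it assumes a hypothetical potential with a direct-link payoff `alpha` and a reciprocity externality `beta`, and logistic matching shocks, so that a link is formed with probability exp(dQ) / (1 + exp(dQ)), where dQ is the change in the potential.

```python
import math
import random


def simulate_network(n, alpha, beta, steps, seed=0):
    """Simulate myopic link updates under logistic shocks (illustrative only).

    alpha and beta are hypothetical parameters: alpha is the payoff of a
    direct link, beta the extra payoff when the link is reciprocated.
    """
    rng = random.Random(seed)
    g = [[0] * n for _ in range(n)]  # g[i][j] = 1 if player i links to j
    for _ in range(steps):
        # Two distinct players meet at random; i decides on the link i -> j.
        i, j = rng.sample(range(n), 2)
        # Change in the assumed potential from setting g[i][j] = 1.
        delta = alpha + beta * g[j][i]
        p_link = 1.0 / (1.0 + math.exp(-delta))
        g[i][j] = 1 if rng.random() < p_link else 0
    return g


def density(g):
    """Share of ordered pairs (i, j), i != j, that are linked."""
    n = len(g)
    return sum(map(sum, g)) / (n * (n - 1))


if __name__ == "__main__":
    dense = simulate_network(n=10, alpha=5.0, beta=0.0, steps=5000)
    sparse = simulate_network(n=10, alpha=-5.0, beta=0.0, steps=5000)
    print(density(dense), density(sparse))
```

With `beta = 0` the updates do not depend on other links, so the simulated network behaves like an independent-link model with link probability 1 / (1 + exp(-alpha)).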

R Package for replication: Contains functions for replication.

Appendix A: Model Equilibrium: Proofs
Appendix B: Computational Details
Appendix C: Unobserved Heterogeneity
Appendix D: Large Networks Analysis and Convergence
Appendix E: Additional Simulation Results

Old Version (March 2015): Second Revision for Econometrica
Old Version (July 2013): First Revision for Econometrica

Replication files for 2013 version: Includes R code, Fortran 90 code and results.
Old Version (Nov 2011): Includes additional results and policy experiments.

Viral Altruism? Generosity and social contagion in online networks, March 2016, Sociological Science
(with Mario Macis and Nicola Lacetera)

Abstract. How does social media affect the success of charitable promotional campaigns? We use individual-level longitudinal data and experimental data from a social-media application that facilitates donations while broadcasting donors' activities to their contacts. We find that broadcasting is positively associated with donations, although some individuals appear to opportunistically broadcast a pledge, and then delete it. Furthermore, broadcasting a pledge is associated with more pledges by a user's contacts. However, results from a field experiment where broadcasting of the initial pledges was randomized suggest that the observational findings were likely due to homophily rather than genuine social contagion effects. The experiment also shows that, although our campaigns generated considerable attention in the form of clicks and "likes," only a small number of donations (30 out of 6.4 million users reached) were made. Finally, an online survey experiment showed that both the presence of an intermediary and a fee contributed to the low donation rate. Our findings suggest that online platforms for charitable giving may stimulate costless forms of involvement, but have a smaller impact on actual donations, and that network effects might be limited when it comes to contributing real money to charities.

Structural Models of Complementary Choices
Marketing Letters, vol. 25, issue 3, pages 245-256, Springer
(with Steve Berry, Ahmed Khwaja, Vineet Kumar, Andres Musalem, Ken Wilbur, Greg Allenby, Bharat Anand, Pradeep Chintagunta, Michael Hanemann, Przemek Jeziorski)
Choice Symposium Invited Submission from Marketing Letters

Poisson Indices of Segregation
Regional Science and Urban Economics 43 (2013) 65–85

Abstract. Existing indices of residential segregation are based on a partition of the city into neighborhoods: given a spatial distribution of racial groups, the index measures different segregation levels for different partitions. I propose a spatial approach, which estimates segregation at the individual level and produces the entire spatial distribution of segregation. This method provides different rankings of cities in terms of segregation and new insights on the effect of segregation on socioeconomic outcomes. Using Census data and controlling for endogeneity using instrumental variables, I show that reduced-form estimates of the impact of segregation on socioeconomic outcomes are not robust to the spatial approach.
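
The partition dependence mentioned in the abstract can be seen in a small Python sketch. It uses the classic (non-spatial) dissimilarity index, D = 0.5 * sum_i |a_i/A - b_i/B|, as a stand-in for partition-based indices in general; it is not the spatial method proposed in the paper, and the block counts and groupings below are hypothetical.

```python
def dissimilarity(counts):
    """Classic dissimilarity index for (group_a, group_b) counts per neighborhood."""
    total_a = sum(a for a, _ in counts)
    total_b = sum(b for _, b in counts)
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in counts)


def merge(blocks, groups):
    """Aggregate blocks into neighborhoods; each group is a list of block indices."""
    return [
        (sum(blocks[k][0] for k in grp), sum(blocks[k][1] for k in grp))
        for grp in groups
    ]


# Hypothetical city: four blocks, each with (group A, group B) residents.
blocks = [(10, 0), (0, 10), (10, 0), (0, 10)]

fine = merge(blocks, [[0], [1], [2], [3]])       # every block is a neighborhood
coarse_mixed = merge(blocks, [[0, 1], [2, 3]])   # pairs of unlike blocks
coarse_sorted = merge(blocks, [[0, 2], [1, 3]])  # pairs of like blocks

print(dissimilarity(fine))           # 1.0
print(dissimilarity(coarse_mixed))   # 0.0
print(dissimilarity(coarse_sorted))  # 1.0
```

The residents never move, yet the measured segregation ranges from 0 to 1 depending only on how the neighborhood boundaries are drawn.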

Example: Includes R code, R package, data for New York, readme files.

Spatial Dissimilarity 1990       Spatial Dissimilarity 2000
Spatial Exposure 1990            Spatial Exposure 2000
Spatial Fragmentation 1990       Spatial Fragmentation 2000

WORKING PAPERS

A Structural Model of Homophily and Clustering in Social Networks, December 2018, revision requested, Journal of Business and Economic Statistics

Abstract. I develop and estimate a structural model of network formation with heterogeneous players and latent community structure, whose equilibria exhibit levels of sparsity, homophily and clustering that match those usually observed in real-world social networks. Players belong to communities unobserved by the econometrician and have community-specific payoffs, allowing preferences to have a bias for similar people. Players meet sequentially and decide whether to form bilateral links, after receiving a random matching shock. The model converges to a sparse hierarchical exponential family random graph, with weakly dependent links. Using school friendship network data from Add Health, I estimate the posterior distribution of parameters and unobserved heterogeneity, detecting high levels of racial homophily and payoff heterogeneity across communities. The posterior predictions show that the model is able to replicate the homophily levels and the aggregate clustering of the observed network, in contrast with standard exponential family network models.

Older Version: September 2017

Approximate Variational Estimation for a Model of Network Formation, April 2019 (with Lingjiong Zhu)
Re-submitted, Review of Economics and Statistics

Abstract. We develop approximate estimation methods for exponential random graph models (ERGMs), whose likelihood depends on an intractable normalizing constant. The usual approach approximates this constant with Monte Carlo simulations; however, convergence may be exponentially slow. We propose a deterministic method, based on a variational mean-field approximation of the ERGM's normalizing constant. We compute lower and upper bounds for the approximation error for any network size, using nonlinear large deviations results. This translates into bounds on the distance between the true likelihood and the mean-field likelihood, as well as bounds on the distance between the approximate parameter estimates and the MLE, assuming the likelihood is not very flat. In small networks, a simple Monte Carlo exercise shows that our deterministic method provides estimates similar to those of simulation-based methods, with the advantage of converging in quadratic time.
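
For readers unfamiliar with the mean-field idea in the abstract, the generic bound can be written as follows. The notation is an assumption made here for illustration and may differ from the paper's: g is a network, \theta the parameter vector, T(g;\theta) the ERGM potential, and \mathcal{M} the set of product distributions over links.

```latex
% ERGM likelihood and its normalizing constant (assumed notation).
\pi(g;\theta) = \frac{\exp\{T(g;\theta)\}}{Z(\theta)},
\qquad
Z(\theta) = \sum_{\omega} \exp\{T(\omega;\theta)\}

% Mean-field lower bound: restrict the variational problem to
% independent-link (product) distributions \mu in \mathcal{M}.
\log Z(\theta) \;\ge\; \sup_{\mu \in \mathcal{M}}
\Big\{ \mathbb{E}_{\mu}\big[T(g;\theta)\big] + H(\mu) \Big\},
\qquad
H(\mu) = -\sum_{\omega} \mu(\omega) \log \mu(\omega)
```

The inequality becomes an equality when the supremum is taken over all distributions on networks rather than only product distributions, which is why the gap between the two sides measures the approximation error.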

mfergm: Replication package on GitHub

Older Version: September 2017
Older Version: February 2017
Older Version: June 2015

Does school desegregation promote diverse interactions? An equilibrium model of segregation within schools, April 2019
Re-submitted, American Economic Journal: Economic Policy (previously circulated as: Segregation in Social Networks: A structural approach)

Abstract. This paper studies racial segregation in schools using data on student friendships from Add Health. I estimate a structural equilibrium model of friendship formation among students, with preferences that allow for both homophily (a bias for similar people) and heterophily (a bias for different people) on different characteristics. Preferences also depend on link externalities, such as having common friends or reciprocated links. I find that students tend to interact with similar people. Homophily goes beyond direct links: students also prefer a racially homogeneous set of indirect friends. However, I find heterophily in parental income levels and for the group of Hispanic students. I simulate several re-allocation programs, showing that policies that transport minorities to other schools have nonlinear effects on within-school segregation and other network features such as clustering and centrality. In some instances, these interventions increase segregation within schools. Policies that increase racial diversity by re-allocating students according to their parental income have less impact on racial segregation within schools.

Older Version: October 2017

Approximate Variational Inference for a model of social interactions (NET Institute Working Paper 13-16)
(New version coming soon)

Abstract. This paper proposes approximate variational inference methods for estimation of a strategic model of social interactions. Players interact in an exogenous network and sequentially choose a binary action. The utility of an action is a function of the choices of neighbors in the network. I prove that the interaction process can be represented as a potential game and that it converges to a unique stationary equilibrium distribution. However, exact inference for this model is infeasible because of a computationally intractable likelihood, which cannot be evaluated even when there are few players. To overcome this problem, I propose variational approximations for the likelihood that allow approximate inference. This technique can be applied to any discrete exponential family, and therefore it is a general tool for inference in models with a large number of players. The methodology is illustrated with several simulated datasets and compared with MCMC methods.

Racial Segregation and Public School Expenditure (with Eliana La Ferrara)
(New version coming soon)

Abstract. This paper explores the effect of racial segregation on public school expenditure in US metropolitan areas and school districts. Our starting point is the literature that relates public good provision to the degree of racial fragmentation in the community. We argue that looking at fragmentation alone may be misleading and that the geographic distribution of different racial groups needs to be taken into account. Greater segregation is associated with more homogeneity in some sub-areas and more heterogeneity in others, and this matters if decisions on spending are taken at aggregation levels lower than the MSA. For given fragmentation, the extent of segregation conveys information on households' possibility to sort into relatively more or less homogeneous jurisdictions. We account for the potential endogeneity of racial segregation and find that the latter has a positive impact on average public school expenditure both at the MSA and at the district level. At the same time, increased segregation leads to more inequality in spending across districts of the same MSA, thus worsening the relative position of poorer districts.

WORK IN PROGRESS

The Effects of FDA Recalls on the Network of Strategic Alliances in the Medical Devices Industry (with Shweta Gaonkar)

Creativity and Market Success: Evidence from the movie industry (with Sharon Kim and Shweta Gaonkar)

OLD STUFF

Who's Afraid of the Big Bad Wolf? Do Pedophiles live close to schools?

Abstract. I analyze the geographic distribution of sex offenders in Urbana and Champaign and test whether they live closer to schools than other residents. If the spatial distribution of sex offenders is the same as that of the population as a whole, we should not observe a sex offender systematically closer to schools than a random resident. Using a simple statistical model, I test whether, conditioning on the distance from schools, the spatial distributions of sex offenders and residents are the same. The results show that this is not the case: on average, sex offenders are less likely to locate close to schools than other people. However, I show evidence of some heterogeneity among individual schools, with some acting as sex offender "magnets".

The Determinants of Social Networks in Developing Countries: Some Evidence from Ghana

Abstract. I explore the determinants of social network formation in a developing economy, using data from Ghana. The main contribution is the use of an empirical model of network formation that takes into account the interdependence of the linking decisions among agents. The interdependence implies a simultaneity problem that I solve by estimating the joint distribution of the model instead of a conditional specification. This increases the computational complexity, so I use MCMC Maximum Likelihood methods to estimate the relevant parameters. I impose functional forms and restrictions on the parameters that guarantee identification. The results show that the network formation process is driven mainly by gender and wealth. In several specifications, religion and clan are also important determinants of the network shape.