The investigators of this pilot are developing a computational method that uses egocentric network samples to learn about the structure of the underlying unobserved network. They use this method to estimate the expected overall impact of a behavioral intervention that is applied to a subset of a population and expected to spread to others in the population. This pilot is focused on methodological innovations in network science and uses anonymized mobile phone communication records to construct a large weighted social network. The aims of the project are: (1) to develop a scalable computational method that enables us to estimate the structure of the underlying sociocentric (un-sampled) social network from a collection of egocentric samples; (2) to simulate the intervention on a set of estimated sociocentric networks and to determine the size and uncertainty of the overall effect of an intervention in a social network; and (3) to investigate what type of additional information would need to be collected as part of the standard egocentric survey design to allow more accurate inferences about the sociocentric network structure to be drawn from egocentric network samples. Results from this pilot are expected to yield important insights into designing and measuring the effects of behavioral interventions.
Essentially all studies that collect empirical network data from subjects using any type of survey approach may suffer from truncation bias. During the current reporting period, we investigated the impact of degree truncation, i.e., limiting how many social contacts a respondent could provide details on in a survey (the “fixed choice design”). Specifically, we modelled how different degrees of truncation (at twice, once, or half the population mean number of contacts) affected the structure of reported social networks, and then how differences in these structures affected estimates of how a generic process (e.g., disease, information) might spread across the network. Truncation and process spread were simulated both on synthetic networks with particular properties (e.g., assortativity, clustering) and on contact network data collected in rural Indian villages. In this reporting period, we have built a significant body of code for the project. We expect to have output data ready by the end of April 2015. Additionally, we have conducted a literature review on (i) how fixed choice designs affect measured network structure, and (ii) how network structure affects estimates of process spread dynamics.
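To make the truncation procedure concrete, the following is a minimal sketch and not the project's actual code. It makes several assumptions that the report does not specify: the synthetic network is a simple random graph, each respondent names at most k contacts chosen uniformly at random, an undirected tie is retained if either endpoint names the other, and the spreading process is a discrete-time susceptible-infected model. All function names and parameter values are hypothetical.

```python
import random


def random_graph(n, p, rng):
    """Undirected random graph as an adjacency dict (assumed test network)."""
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def mean_degree(adjacency):
    """Average number of contacts per node."""
    return sum(len(c) for c in adjacency.values()) / len(adjacency)


def truncate_fixed_choice(adjacency, k, rng):
    """Simulate a fixed choice design: each respondent names at most k contacts.

    Assumption: contacts are named uniformly at random, and a tie is kept
    if either endpoint names the other, so a node's observed degree can
    still exceed k.
    """
    truncated = {node: set() for node in adjacency}
    for node, contacts in adjacency.items():
        named = sorted(contacts)
        if len(named) > k:
            named = rng.sample(named, k)
        for other in named:
            truncated[node].add(other)
            truncated[other].add(node)
    return truncated


def simulate_spread(adjacency, seed_node, beta, steps, rng):
    """Discrete-time SI process; returns the number of nodes reached."""
    infected = {seed_node}
    for _ in range(steps):
        newly = set()
        for node in sorted(infected):
            for other in sorted(adjacency[node]):
                if other not in infected and rng.random() < beta:
                    newly.add(other)
        infected |= newly
    return len(infected)


rng = random.Random(0)
full = random_graph(200, 0.05, rng)
full_mean = mean_degree(full)

# Truncate at twice, once, and half the population mean number of contacts.
results = {}
for factor in (2.0, 1.0, 0.5):
    k = max(1, round(factor * full_mean))
    observed = truncate_fixed_choice(full, k, rng)
    reached = simulate_spread(observed, seed_node=0, beta=0.1, steps=10, rng=rng)
    results[factor] = (k, observed, reached)
    print(f"factor={factor} k={k} mean_degree={mean_degree(observed):.2f} reached={reached}")
```

Comparing the mean degree and the number of nodes reached across the three truncation levels gives a rough picture of how a contact cap could distort both the observed structure and the simulated spread. A single run is noisy, so a real comparison would average over many seeds and replicates.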
This work will be significant for those planning surveys that will be used to predict the spread of dynamic processes (e.g., behavior change, epidemics). It will allow them to make informed decisions about whether the reduced resource requirements of fixed choice designs justify the reduced validity of the conclusions drawn from such studies.
Pilot Leader: Jukka-Pekka Onnela, PhD, Harvard T.H. Chan School of Public Health