Actor and item non-response in social networks: Are different non-response treatments able to reveal underlying blockmodel structure?

Anja Žnidaršič

University of Maribor, Faculty of Organizational Sciences

Anuška Ferligoj

University of Ljubljana, Faculty of Social Sciences

Patrick Doreian

University of Pittsburgh, Department of Sociology and

University of Ljubljana, Faculty of Social Sciences

Abstract

Social networks are fundamental to social life. Intuitively, a social network can be defined as follows: "A social network consists of a finite set or sets of actors and the relation or relations defined on them" (Wasserman and Faust, 1998, pg. 21). One purpose of social network analysis is to extract, from large and seemingly incoherent networks, simple and useful descriptions of the fundamental structure of relationships. One widely used technique for finding such structural patterns is generalized blockmodeling (Doreian et al., 2005). The result of a blockmodeling procedure is a partition of actors into positions and a corresponding partition of ties into blocks. It can be presented as a reduced graph or as an image matrix that represents the relationships among the obtained positions.

Social network data are gathered using different techniques, and one of the most widely used data collection methods is the survey (Marsden, 2005; Wasserman and Faust, 1998). Surveys, and more generally the whole research design, are sources of error that can be roughly classified into three categories: errors introduced by misspecified network boundaries, errors introduced by the questionnaire format, and errors created by actors when they respond.

In this paper, two types of error from the last category are discussed and their impact on the obtained blockmodel(s) is studied. Errors due to actors can be further classified into three subcategories: actor non-response (Stork and Richards, 1992; Costenbader and Valente, 2003; Kossinets, 2006; Knoke and Yang, 2008; Huisman, 2009), item (tie) non-response (Rumsey, 1993; Borgatti et al., 2006; Huisman and Steglich, 2008; Huisman, 2009), and measurement error (Holland and Leinhardt, 1973).

Each non-responding actor produces (n−1) missing ties, where n is the number of actors in the network. Let m denote the number of non-respondents. The actor response rate is 1−m/n, which equals the 'relational' response rate. The total number of missing ties due to the m non-respondents is (n−1)m (Knoke and Yang, 2008).
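The counts above can be checked with a few lines of Python. This is only an illustrative sketch: it assumes a directed network without self-ties, so that n actors define n(n−1) possible ties, and the function name and the example values (n = 20, m = 4) are hypothetical.

```python
def missing_tie_counts(n: int, m: int) -> dict:
    """Tie counts under actor non-response.

    Assumes a directed network without self-ties, so each actor
    has (n - 1) possible outgoing ties.
    """
    if n < 2 or not 0 <= m <= n:
        raise ValueError("need n >= 2 and 0 <= m <= n")
    total_ties = n * (n - 1)      # all possible directed ties
    missing_ties = m * (n - 1)    # outgoing ties of the m non-respondents
    return {
        "total_ties": total_ties,
        "missing_ties": missing_ties,
        "actor_response_rate": 1 - m / n,
        "relational_response_rate": (total_ties - missing_ties) / total_ties,
    }


# Hypothetical example: 20 actors, 4 of whom do not respond.
counts = missing_tie_counts(n=20, m=4)
```

Under this assumption the two response rates coincide because (n−m)(n−1) / (n(n−1)) simplifies to 1−m/n.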

There are several ways of treating networks with actor non-response. If there is a non-respondent in a network, no outgoing ties from that actor are recorded, and the result is a row of missing ties in the matrix representation. In most cases described in the literature, the complete case approach is used: in addition to the row of missing ties, all incoming ties of the actor are deleted. The row and column of each non-respondent are removed and the result is a smaller network. Robins et al. (2004) argue that this approach in fact leads to a respecification of the network boundaries.

We also consider approaches that take the observed incoming ties of a non-respondent (the partially described ties between a respondent and a non-respondent) into account. The first is the available case approach, also called reconstruction, where a row of missing ties is replaced by the corresponding column (Stork and Richards, 1992; Huisman, 2009). As a result, ties between non-respondents and respondents become symmetric. The second option that takes partially described ties into account is imputation: missing ties are replaced by estimates to create an apparently full data set. One possibility is to impute the average value of the incoming ties of an actor, which is known as 'item mean' (Huisman, 2009). More precisely, the mean value of all available incoming ties for each non-respondent is imputed. For a binary network this means that ties (ones) are imputed for popular actors, according to the ties they receive; in practice the mode of the incoming ties is imputed, and therefore the term 'imputation based on the mode value' is used. In the reconstruction procedure, the unobserved ties between two non-respondents cannot be replaced without additional imputations. Therefore, we combine the reconstruction procedure with imputations based on mode values for ties between non-respondents.

If we retain the non-respondents in the network and do not use any missing data treatment, the matrix has a row of 0s for each non-respondent. This treatment is called 'null tie imputation' and is used in our study for comparison with the other approaches, because it represents the situation where nothing is done about the missing data.
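The treatments described above can be sketched for a small binary network as follows. This is a minimal illustration, not the procedure used in the study. Several choices are assumptions made only for the sketch: missing ties are coded as None, the diagonal is ignored, the mode imputation for a missing tie from i to j uses the observed incoming ties of the receiving actor j (one possible reading of the 'item mean' rule), and an exact tie between 0s and 1s is resolved to 0. All function names are hypothetical.

```python
from typing import List, Optional, Sequence

Matrix = List[List[Optional[int]]]  # None marks a missing tie


def impose_actor_nonresponse(net: Matrix, nonresp: Sequence[int]) -> Matrix:
    """Copy of net in which all outgoing ties of non-respondents are missing."""
    out = [row[:] for row in net]
    for i in nonresp:
        for j in range(len(net)):
            if i != j:
                out[i][j] = None
    return out


def complete_case(net: Matrix, nonresp: Sequence[int]) -> Matrix:
    """Delete the row and column of every non-respondent."""
    drop = set(nonresp)
    keep = [i for i in range(len(net)) if i not in drop]
    return [[net[i][j] for j in keep] for i in keep]


def null_tie(net: Matrix) -> Matrix:
    """Replace every missing tie by 0 (the 'do nothing' situation)."""
    return [[0 if v is None else v for v in row] for row in net]


def incoming_mode(net: Matrix, j: int) -> int:
    """Mode of the observed incoming ties of actor j (assumed: a draw gives 0)."""
    observed = [net[i][j] for i in range(len(net))
                if i != j and net[i][j] is not None]
    return 1 if 2 * sum(observed) > len(observed) else 0


def mode_imputation(net: Matrix, nonresp: Sequence[int]) -> Matrix:
    """Impute each missing tie i -> j by the mode of j's observed incoming ties."""
    out = [row[:] for row in net]
    for i in nonresp:
        for j in range(len(net)):
            if i != j:
                out[i][j] = incoming_mode(net, j)
    return out


def reconstruction(net: Matrix, nonresp: Sequence[int]) -> Matrix:
    """Replace a missing tie i -> j by the observed tie j -> i; ties between
    two non-respondents fall back to the mode-based imputation."""
    miss = set(nonresp)
    out = [row[:] for row in net]
    for i in miss:
        for j in range(len(net)):
            if i == j:
                continue
            out[i][j] = incoming_mode(net, j) if j in miss else net[j][i]
    return out


# Hypothetical 4-actor network; actor 3 does not respond.
true_net = [[0, 1, 1, 0],
            [1, 0, 1, 1],
            [1, 1, 0, 1],
            [0, 1, 1, 0]]
observed = impose_actor_nonresponse(true_net, [3])
treated = {
    "complete_case": complete_case(observed, [3]),
    "null_tie": null_tie(observed),
    "reconstruction": reconstruction(observed, [3]),
    "mode": mode_imputation(observed, [3]),
}
```

Each treated matrix can then be passed to a blockmodeling procedure and its result compared with the blockmodel of the known network.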

Although actor non-response seems more natural in the social network context, item non-response is also possible. Non-reported data may occur when a roster is used and respondents do not report the presence or absence of a tie (Huisman, 2009). The treatments presented above for actor non-response (e.g. reconstruction, imputations based on the mode) can also be applied to item non-response.
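For item non-response, only individual cells of the matrix are missing rather than whole rows. The following sketch shows one way reconstruction with a mode-based fallback might be applied cell by cell. It rests on assumptions made only for illustration: missing ties are coded as None, a missing tie from i to j is replaced by the reverse tie from j to i when that tie is observed, and otherwise by the mode of the observed incoming ties of j, with a draw or an empty set of observations giving 0. The function name is hypothetical.

```python
from typing import List, Optional

Matrix = List[List[Optional[int]]]  # None marks a missing tie


def reconstruct_items(net: Matrix) -> Matrix:
    """Treat item non-response in a binary network cell by cell."""
    n = len(net)
    out = [row[:] for row in net]
    for i in range(n):
        for j in range(n):
            if i == j or net[i][j] is not None:
                continue  # diagonal or observed tie: nothing to do
            if net[j][i] is not None:
                # Reconstruction: copy the observed reverse tie.
                out[i][j] = net[j][i]
            else:
                # Both directions missing: mode of j's observed incoming ties.
                seen = [net[k][j] for k in range(n)
                        if k != j and net[k][j] is not None]
                out[i][j] = 1 if 2 * sum(seen) > len(seen) else 0
    return out


# Hypothetical 3-actor network with three unreported ties.
observed_items = [[0, None, 1],
                  [1, 0, None],
                  [0, None, 0]]
treated_items = reconstruct_items(observed_items)
```

Reading from the original matrix and writing to a copy keeps the result independent of the order in which cells are visited.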

The problem we study is the impact of different non-response treatments on the identified blockmodel structures. We start with a whole (or known) network, impose different regimes of actor or item non-response on it, treat the missing data with the treatments described above, establish the blockmodel of the treated network, and compare the blockmodel structures. The blockmodel structures of the known and treated networks are compared with two indices. The first is the Adjusted Rand Index, which measures the agreement between two partitions; the second compares block types in the image matrices and gives the proportion of incorrect blocks.
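Both indices can be illustrated in plain Python. The Adjusted Rand Index below follows the standard pair-counting formula; returning 1.0 in the degenerate case where the denominator is zero is a convention chosen for this sketch. The block comparison assumes that both image matrices have the same number of positions and that positions have already been matched; the block type labels ("com", "null") and the function names are hypothetical.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Adjusted Rand Index of two partitions given as cluster labels per actor."""
    if len(a) != len(b):
        raise ValueError("partitions must cover the same actors")
    n = len(a)
    if n < 2:
        return 1.0
    # Pairs of actors placed together in both, in a, and in b respectively.
    together_both = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    together_a = sum(comb(c, 2) for c in Counter(a).values())
    together_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions have one cluster
    return (together_both - expected) / (maximum - expected)


def proportion_incorrect_blocks(image_true: Sequence[Sequence[str]],
                                image_treated: Sequence[Sequence[str]]) -> float:
    """Share of blocks whose type differs between two aligned image matrices."""
    pairs = [(t, s)
             for row_t, row_s in zip(image_true, image_treated)
             for t, s in zip(row_t, row_s)]
    return sum(t != s for t, s in pairs) / len(pairs)


# Hypothetical partitions of four actors and 2 x 2 image matrices.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
ari_diff = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
wrong = proportion_incorrect_blocks([["com", "null"], ["null", "com"]],
                                    [["com", "null"], ["com", "com"]])
```

The index is invariant to relabeling of clusters, so the first example counts as perfect agreement even though the labels are swapped.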

The impact of different non-response treatments on the results of blockmodeling, for both actor and item non-response, is presented through real networks and through simulations. Based on the results, we determine which treatment of actor and item non-response is most appropriate when a blockmodeling procedure is used. There is no general answer, because the best non-response treatment depends on the network and/or blockmodel structure. The consistent finding is that the 'do nothing' approach, simulated with null tie imputations, which appears to be the default and widely used approach in blockmodeling, performs the worst.

References:

Borgatti, S.P., Carley, K.M. and Krackhardt, D., 2006. On the robustness of centrality measures under conditions of imperfect data. Social Networks 28 (2), 124-126.

Costenbader, E. and Valente, T.W., 2003. The stability of centrality measures when networks are sampled. Social Networks 25, 283-307.

Doreian, P., Batagelj, V. and Ferligoj, A., 2005. Generalized Blockmodeling. Cambridge University Press, New York, NY.

Holland, P.W. and Leinhardt, S., 1973. The structural implications of measurement error in sociometry. The Journal of Mathematical Sociology 3, 85-111.

Huisman, M., 2009. Effects of missing data in social networks. Journal of Social Structure 10. Available at: http://www.cmu.edu/joss/content/articles/volume10/huisman.pdf.

Huisman, M. and Steglich, C., 2008. Treatment of non-response in longitudinal network studies. Social Networks 30, 297-308.

Knoke, D. and Yang, S., 2008. Social Network Analysis. 2nd edition. Sage Publications, Los Angeles.

Kossinets, G., 2006. Effects of missing data in social networks. Social Networks 28, 247-268.

Marsden, P.V., 2005. Recent developments in network measurement. In: Carrington, P.J., Scott, J. and Wasserman, S. (Eds.), Models and Methods in Social Network Analysis. Cambridge University Press, New York, pp. 8-30.

Robins, G., Pattison, P. and Woolcock, J., 2004. Missing data in networks: exponential random graph (p*) models for networks with non-respondents. Social Networks 26, 257-283.

Rumsey, D.J., 1993. Nonresponse models for social network stochastic processes. Ph.D. thesis. The Ohio State University.

Stork, D. and Richards, W.D., 1992. Nonrespondents in communication network studies: problems and possibilities. Group and Organization Management 17, 193-209.

Wasserman, S. and Faust, K., 1998. Social Network Analysis: Methods and Applications. 2nd edition. Cambridge University Press, Cambridge.