Getting Started with the testinterference Package

Introduction

The testinterference package provides tools to test SUTVA and several hypotheses about spillover effects. Specifically, the package enables to perform randomization tests on whether or not the following null hypotheses are plausible:

Fisher: One’s outcome does not depend on one’s own treatment (i.e., Fisher’s sharp null hypothesis).
SUTVA: One’s outcome is determined by one’s own treatment (i.e., the stable unit treatment value assumption).
Exposure 1: One’s outcome is determined by whether there is at least one treated unit in one’s neighborhood including oneself.
Exposure 2: One’s outcome is determined by one’s own treatment and whether there is at least one treated peer.

The testing procedures are developed by Hoshino and Yanagi (2023) “Randomization test for the specification of interference structure”.

Installation

Get the package from GitHub:

# install.packages("devtools") # if needed
devtools::install_github("tkhdyanagi/testinterference", build_vignettes = TRUE)

Package Function

The testinterference package provides the following function:

testinterference(): Randomization tests for the following null hypotheses: (1) Fisher, (2) SUTVA, (3) Exposure 1, (4) Exposure 2.

Arguments

The testinterference() function has the following arguments:

Y: The \(n\)-dimensional outcome vector
Z: The \(n\)-dimensional binary treatment assignment vector
A: The \(n \times n\), binary, possibly asymmetric, adjacency matrix. The diagonal elements must be zero. This argument can be NULL if hypothesis = Fisher. Default is NULL.
hypothesis: A character specifying the null hypothesis of interest. Options are “Fisher”, “SUTVA”, “exposure1”, and “exposure2”. “Fisher” stands for the sharp null hypothesis in the canonical Fisher randomization test. “SUTVA”, “exposure1”, and “exposure2” correspond to the null hypotheses a, b, and c in Table 2 of Hoshino and Yanagi (2023). Default is “SUTVA”. If hypothesis = "exposure2", the user can specify the argument kappa.
method: A character specifying how to find focal units. Options are “3-net”, “2-net”, “random”, and “manual”, which stand for the 3-net (MIS-G) method, the 2-net (MIS-A) method, the random selection, and the manual selection. Default is “3-net”. If method = "random", the user can specify the argument num_focal_unit. If method = "manual", the user must specify the argument focal_unit.
design: A character specifying the randomization design of the original experiment. Options are “Bernoulli”, “complete”, “stratified”, and “other”, which stand for Bernoulli randomization, complete randomization, stratified randomization, and other experimental designs. If design = "Bernoulli", the user must specify the argument prob. If design = "stratified",the user must specify the argument strata. If design = "other", the user must specify the argument zmatrix.
prob: NULL, a scalar indicating the homogeneous probability of being treatment eligible, or an \(n\)-dimensional vector indicating the heterogeneous probabilities for treatment eligibility (i.e., the \(i\)-th element of prob indicates \(\Pr(Z_i = 1)\)). This argument is used only when design = "Bernoulli". Default is NULL.
focal_unit: NULL or an \(n\)-dimensional logical vector specifying whether each unit is focal. This argument is used only when method = "manual". Default is NULL.
num_focal_unit: NULL or a positive scalar specifying the number of focal units. This argument is used only when method = "random". Default is NULL.
num_randomization: A large positive integer specifying the number of randomization. Default is 9999. This argument is not used when design = "other".
strata: NULL or an \(n\)-dimensional numerical vector indicating the stratum to which each unit belongs. This argument is used only when design = "stratified". Default is NULL.
zmatrix: NULL or a large matrix of realizations of treatment assignments. This argument must be specified by the user only when design = "other". The number of rows must equal to the sample size \(n\). The number of columns is the number of realizations given by the user. Default is NULL.
kappa: NULL or a positive integer no less than 2. This argument is used only when hypothesis = "exposure2". Default is NULL. If kappa = NULL, kappa is automatically chosen to maximize the number of focal units.
cores: A positive integer specifying the number of cores to use for parallel computing. Default is 1.

Returns

The testinterference() function returns a list containing the following elements:

pval: The vector of p-values obtained from Kruskal-Wallis (KW), average cross difference (ACD), and ordinary least squares (OLS) test statistics.
Simes: The vector of testing results by Simes’ correction under significance levels 10%, 5%, and 1%. TRUE (resp. FALSE) indicates the rejection (resp. acceptance) of the null hypothesis.
stat: A matrix of test statistics computed with focal assignments.
focal_unit: The logical vector indicating whether each unit is focal.
focal_asgmt: The matrix of focal assignments.
num_focal_unit: The number of focal units.
num_focal_asgmt: The number of focal assignments.

Example 1: A Single Large Network

We begin by generating artificial data using the datageneration() function. In the datageneration() function, there are two options to generate the adjacency matrix: (1) Erdos-Renyi model and (2) pairs. In addition, there are three options for the experimental design: (a) Bernoulli randomization, (b) complete randomization, and (c) stratified randomization. Here we consider the case (1)-(a) with:

# Load the package
library(testinterference)

# Sample size
n <- 200

# Generate artificial data (Erdos-Renyi model)
set.seed(1)
data1 <- datageneration(n      = n,
                        design = "Bernoulli",
                        A      = "Erdos-Renyi",
                        beta_s = 1)

Here, beta_s specifies the value of a coefficient for spillover effects in the outcome equation. beta_s = 0 means that there are no spillovers. In this example, we consider the case where SUTVA does not hold.

Run the testinterference() function to test SUTVA:

set.seed(1)
RT1 <- testinterference(Y                 = data1$Y,
                        Z                 = data1$Z,
                        A                 = data1$A,
                        hypothesis        = "SUTVA",
                        method            = "3-net",
                        design            = "Bernoulli",
                        prob              = 0.5,
                        focal_unit        = NULL,
                        num_focal_unit    = NULL,
                        num_randomization = 999,
                        strata            = NULL,
                        zmatrix           = NULL,
                        kappa             = NULL,
                        cores             = 1)

Here, we set num_randomization = 999 to reduce the computation time, but in realistic situations the number of randomization should be larger (e.g., num_randomization = 99999).

The p-values obtained from Kruskal-Wallis (KW), average cross difference (ACD), and ordinary least squares (OLS) test statistics are:

RT1$pval
#>    KW   ACD   OLS 
#> 0.012 0.019 0.012

The null hypothesis is rejected at the standard significance level. In other words, we find statistical evidence that SUTVA is implausible.

The results of Simes’ correction are:

RT1$Simes
#>   10%    5%    1% 
#>  TRUE  TRUE FALSE

Here, TRUE indicates the rejection of the null hypothesis. Again, the results suggest that SUTVA does not hold.

We can confirm the numbers of focal units and assignments with:

RT1$num_focal_unit
#> [1] 36
RT1$num_focal_asgmt
#> [1] 1000

Next, we turn to test whether exposure 2 is correct.

set.seed(1)
RT2 <- testinterference(Y                 = data1$Y,
                        Z                 = data1$Z,
                        A                 = data1$A,
                        hypothesis        = "exposure2",
                        method            = "3-net",
                        design            = "Bernoulli",
                        prob              = 0.5,
                        focal_unit        = NULL,
                        num_focal_unit    = NULL,
                        num_randomization = 999,
                        strata            = NULL,
                        zmatrix           = NULL,
                        kappa             = NULL,
                        cores             = 1)

The testing results are:

RT2$pval
#>    KW   ACD   OLS 
#> 0.001 0.001 0.005
RT2$Simes
#>  10%   5%   1% 
#> TRUE TRUE TRUE
RT2$num_focal_unit
#> [1] 34
RT2$num_focal_asgmt
#> [1] 1000

We find statistical evidence that exposure 2 is incorrect.

Example 2: Pairs

We turn to the case with possible pairwise interactions and complete randomization. To consider the case without interference, we specify beta_s = 0.

# Generate artificial data (Erdos-Renyi model)
set.seed(1)
data2 <- datageneration(n      = n,
                        design = "complete",
                        A      = "pairs",
                        beta_s = 0)

Perform the randomization test:

set.seed(1)
RT3 <- testinterference(Y                 = data2$Y,
                        Z                 = data2$Z,
                        A                 = data2$A,
                        hypothesis        = "SUTVA",
                        method            = "3-net",
                        design            = "complete",
                        prob              = NULL,
                        focal_unit        = NULL,
                        num_focal_unit    = NULL,
                        num_randomization = 999,
                        strata            = NULL,
                        zmatrix           = NULL,
                        kappa             = NULL,
                        cores             = 1)

The testing results are:

RT3$pval
#>    KW   ACD   OLS 
#> 0.206 0.238 0.500
RT3$Simes
#>   10%    5%    1% 
#> FALSE FALSE FALSE
RT3$num_focal_unit
#> [1] 100
RT3$num_focal_asgmt
#> [1] 1000

The null hypothesis is not rejected at the standard significance level. For the results of Simes’ correction, FALSE indicates the acceptance of the null hypothesis. Thus, the testing results suggest that SUTVA might hold.

In the case with pairwise interactions, it is impossible to perform randomization tests for hypothesis = "exposure2". This is because each unit has only one peer so that \(\mathcal{N}(\kappa)\) is empty for any \(\kappa \ge 2\). See the vignette vignette("exposure2", package = "testinterference") for more details.

References

Hoshino, T. and Yanagi, T., 2023. Randomization test for the specification of interference structure arXiv preprint arXiv:2301.05580. Link

Tadao Hoshino (thoshino@waseda.jp), Takahide Yanagi (yanagi@econ.kyoto-u.ac.jp)