Getting Started with the latenetwork Package

Introduction

The latenetwork package provides tools for causal inference under noncompliance with treatment assignment and network interference of unknown form. The package implements instrumental variables (IV) estimation of local average treatment effect (LATE) type parameters via inverse probability weighting (IPW), using the concept of instrumental exposure mapping (IEM) and the framework of approximate neighborhood interference (ANI).

The parameters of interest are as follows.

The average direct effect (ADE) parameters:
- The ADE of the IV on the outcome.
- The ADE of the IV on the treatment receipt.
- The local average direct effect (LADE).
The average indirect effect (AIE) parameters:
- The AIE of the IV on the outcome.
- The AIE of the IV on the treatment receipt.
- The local average indirect effect (LAIE).
The average overall effect (AOE) parameters:
- The AOE of the IV on the outcome.
- The AOE of the IV on the treatment receipt.
- The local average overall effect (LAOE).
The average spillover effect (ASE) parameters:
- The ASE of the IV on the outcome.
- The ASE of the IV on the treatment receipt.
- The local average spillover effect (LASE).

For more details on the identification and estimation methods, see the “Review of Causal Inference with Noncompliance and Unknown Interference” vignette with: vignette("review", package = "latenetwork").

Installation

Get the package from CRAN:

install.packages("latenetwork")

or from GitHub:

# install.packages("devtools") # if needed
devtools::install_github("tkhdyanagi/latenetwork", build_vignettes = TRUE)

Functions

The latenetwork package provides the following functions:

direct(): Estimation and statistical inference for the ADE parameters.
indirect(): Estimation and statistical inference for the AIE parameters.
overall(): Estimation and statistical inference for the AOE parameters.
spillover(): Estimation and statistical inference for the ASE parameters.

Arguments

All package functions have the following arguments:

Y: An n-dimensional outcome vector.
D: An n-dimensional binary treatment vector. To perform an intention-to-treat analysis only, pass the same vector as Z.
Z: An n-dimensional binary instrumental vector.
S: An n-dimensional logical vector indicating whether each unit belongs to the subpopulation on which the parameters of interest are defined.
A: An n times n symmetric binary adjacency matrix whose diagonal elements are 0.
K: A scalar indicating the range of the neighborhood used for constructing interference sets. Default is 1.
bw: A scalar bandwidth used for the HAC estimation and the wild bootstrap. If bw = NULL, the rule-of-thumb bandwidth proposed by Leung (2022) is used. Default is NULL.
B: The number of bootstrap repetitions. If B = NULL, the wild bootstrap is skipped. Default is NULL.
alp: The significance level. Default is 0.05.
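
The requirements on Z, S, and A listed above can be checked before calling any of the package functions. The following is a minimal sketch in base R. The toy ring network and the helper name check_inputs() are illustrative assumptions and are not part of latenetwork.

```r
# Hypothetical toy inputs: a ring network with n = 6 units.
n <- 6
A <- matrix(0, n, n)
for (i in 1:n) {
  j <- if (i == n) 1 else i + 1  # next unit on the ring
  A[i, j] <- 1
  A[j, i] <- 1                   # keep A symmetric
}
Z <- c(1, 0, 0, 1, 0, 0)
S <- rep(TRUE, n)

# check_inputs() is an illustrative helper, not a latenetwork function.
# It stops with an error if any stated requirement is violated.
check_inputs <- function(Z, S, A) {
  n <- length(Z)
  stopifnot(
    all(Z %in% c(0, 1)),            # binary instrument
    is.logical(S), length(S) == n,  # logical subpopulation indicator
    is.matrix(A), all(dim(A) == n), # n times n matrix
    all(A %in% c(0, 1)),            # binary entries
    isSymmetric(A),                 # symmetric
    all(diag(A) == 0)               # zero diagonal
  )
  invisible(TRUE)
}

check_inputs(Z, S, A)
```

Running such a check first tends to give clearer error messages than a failure inside the estimation routine.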

The direct() function has the following additional arguments:

IEM: An n-dimensional instrumental exposure vector. If t = NULL, the constant IEM is used. Default is NULL.
t: A scalar evaluation point of the IEM. If t = NULL, the constant IEM is used. Default is NULL.

The spillover() function has the following additional arguments:

IEM: An n-dimensional instrumental exposure vector.
z: A scalar evaluation point of the IV.
t0: A scalar evaluation point of the IEM (from).
t1: A scalar evaluation point of the IEM (to).
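
To make the roles of IEM, z, t0, and t1 concrete, here is a small sketch in base R. It builds the exposure "at least one neighbor has Z = 1" on a hypothetical 8-unit ring network and counts the units in each (Z, IEM) cell. The toy network and the reading of t0 and t1 as the two exposure levels compared among units with Z = z are assumptions made for illustration.

```r
# Hypothetical toy data: ring network with n = 8 units.
n <- 8
A <- matrix(0, n, n)
A[cbind(1:n, c(2:n, 1))] <- 1  # link each unit to the next one on the ring
A <- A + t(A)                  # symmetrize; the diagonal stays 0

Z <- c(1, 1, 0, 0, 0, 1, 0, 0)

# Exposure indicator: 1 if at least one neighbor is assigned Z = 1.
# as.vector() drops the n x 1 matrix shape returned by A %*% Z.
IEM <- as.vector(ifelse(A %*% Z > 0, 1, 0))

# Cell counts for every combination of own IV and exposure.
cells <- table(Z = Z, IEM = IEM)
cells

# Units available at the two evaluation points for z = 1, t0 = 0, t1 = 1.
z  <- 1
t0 <- 0
t1 <- 1
n_from <- sum(Z == z & IEM == t0)
n_to   <- sum(Z == z & IEM == t1)
c(from = n_from, to = n_to)
```

Because estimation relies on inverse probability weighting, it seems prudent to confirm that the cells at the chosen evaluation points are not empty.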

Returns

Each function returns a data.frame with the following elements:

est: The estimate of each parameter.
HAC_SE: The standard error computed by the network HAC estimation.
HAC_CI_L: The lower bound of the confidence interval computed by the network HAC estimation.
HAC_CI_U: The upper bound of the confidence interval computed by the network HAC estimation.
wild_SE: The standard error computed by the wild bootstrap.
wild_CI_L: The lower bound of the confidence interval computed by the wild bootstrap.
wild_CI_U: The upper bound of the confidence interval computed by the wild bootstrap.
bw: The bandwidth used for the HAC estimation and the wild bootstrap.
size: The size of the subpopulation S.
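
As a reading aid for est, HAC_SE, HAC_CI_L, and HAC_CI_U, the sketch below rebuilds a confidence interval from an estimate and its standard error. It assumes the interval is the usual normal-approximation interval at level 1 - alp. That assumption is consistent with the ADEY row of result_direct1 printed in the Example section, but it is not a statement about the package's internals.

```r
# Values copied from the ADEY row of result_direct1 in the Example section.
est    <- 0.4008916
HAC_SE <- 0.09871458
alp    <- 0.05

# Assumed construction: est -/+ z_{1 - alp/2} * SE (normal approximation).
crit <- qnorm(1 - alp / 2)
ci <- c(lower = est - crit * HAC_SE,
        upper = est + crit * HAC_SE)

# Compare with the printed HAC_CI_L = 0.2074146 and HAC_CI_U = 0.5943686.
round(ci, 7)
```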

Example

To run the following example, install the igraph package if needed.

# if needed --------------------------------------------------------------------
install.packages("igraph")

Generate artificial data with the datageneration() function.

# Generate artificial data from a ring network----------------------------------
set.seed(1)
n <- 2000
data <- latenetwork::datageneration(n = n)

Perform causal inference with:

# Arguments --------------------------------------------------------------------
Y   <- data$Y
D   <- data$D
Z   <- data$Z
A   <- data$A
IEM <- ifelse(A %*% Z > 0, 1, 0)
S   <- rep(TRUE, n)
K   <- 1
z   <- 1
t   <- 0
t0  <- 0
t1  <- 1
bw  <- NULL
B   <- NULL
alp <- 0.05

# Causal inference -------------------------------------------------------------

# The ADE parameters defined by IEM = (A %*% Z > 0)
result_direct1 <- latenetwork::direct(Y = Y,
                                      D = D,
                                      Z = Z,
                                      IEM = IEM,
                                      S = S,
                                      A = A,
                                      K = K,
                                      t = t,
                                      bw = bw,
                                      B = B,
                                      alp = alp)

# The ADE parameters defined by the constant IEM
result_direct2 <- latenetwork::direct(Y = Y,
                                      D = D,
                                      Z = Z,
                                      IEM = NULL,
                                      S = S,
                                      A = A,
                                      K = K,
                                      t = NULL,
                                      bw = bw,
                                      B = B,
                                      alp = alp)

# The AIE parameters defined by K = 1
result_indirect <- latenetwork::indirect(Y = Y,
                                         D = D,
                                         Z = Z,
                                         S = S,
                                         A = A,
                                         K = K,
                                         bw = bw,
                                         B = B,
                                         alp = alp)

# The AOE parameters defined by K = 1
result_overall <- latenetwork::overall(Y = Y,
                                       D = D,
                                       Z = Z,
                                       S = S,
                                       A = A,
                                       K = K,
                                       bw = bw,
                                       B = B,
                                       alp = alp)

# The ASE parameters defined by IEM = (A %*% Z > 0)
result_spillover <- latenetwork::spillover(Y = Y,
                                           D = D,
                                           Z = Z,
                                           IEM = IEM,
                                           S = S,
                                           A = A,
                                           K = K,
                                           z = z,
                                           t0 = t0,
                                           t1 = t1,
                                           bw = bw,
                                           B = B,
                                           alp = alp)

You can see the estimation results with:

result_direct1
#>            est     HAC_SE  HAC_CI_L  HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> ADEY 0.4008916 0.09871458 0.2074146 0.5943686      NA        NA        NA  8
#> ADED 0.2499606 0.03485485 0.1816464 0.3182749      NA        NA        NA  8
#> LADE 1.6038190 0.36023112 0.8977789 2.3098590      NA        NA        NA  8
#>      size
#> ADEY 2000
#> ADED 2000
#> LADE 2000

result_direct2
#>            est     HAC_SE  HAC_CI_L  HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> ADEY 0.5632636 0.05254325 0.4602807 0.6662465      NA        NA        NA  8
#> ADED 0.3551812 0.02213500 0.3117974 0.3985650      NA        NA        NA  8
#> LADE 1.5858485 0.12418001 1.3424602 1.8292368      NA        NA        NA  8
#>      size
#> ADEY 2000
#> ADED 2000
#> LADE 2000

result_indirect
#>            est     HAC_SE  HAC_CI_L  HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> AIEY 0.2924892 0.08785062 0.1203051 0.4646732      NA        NA        NA  8
#> AIED 0.2897227 0.03205981 0.2268866 0.3525587      NA        NA        NA  8
#> ADED 0.3551812 0.02213500 0.3117974 0.3985650      NA        NA        NA  8
#> LAIE 0.8234928 0.25796895 0.3178830 1.3291027      NA        NA        NA  8
#>      size
#> AIEY 2000
#> AIED 2000
#> ADED 2000
#> LAIE 2000

result_overall
#>            est     HAC_SE  HAC_CI_L  HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> AOEY 0.8557528 0.09429867 0.6709308 1.0405748      NA        NA        NA  8
#> AOED 0.6449039 0.03744014 0.5715226 0.7182852      NA        NA        NA  8
#> ADED 0.3551812 0.02213500 0.3117974 0.3985650      NA        NA        NA  8
#> LAOE 2.4093413 0.27637076 1.8676646 2.9510181      NA        NA        NA  8
#>      size
#> AOEY 2000
#> AOED 2000
#> ADED 2000
#> LAOE 2000

result_spillover
#>            est     HAC_SE  HAC_CI_L  HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> ASEY 0.5750447 0.08065202 0.4169696 0.7331197      NA        NA        NA  8
#> ASED 0.3920457 0.03401795 0.3253718 0.4587197      NA        NA        NA  8
#> LASE 1.4667795 0.18557907 1.1030512 1.8305078      NA        NA        NA  8
#>      size
#> ASEY 2000
#> ASED 2000
#> LASE 2000
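
The printed estimates above are consistent with the local parameters being Wald-type ratios of an effect on the outcome to an effect on the treatment receipt. The sketch below checks this numerically using only the printed numbers. The choice of denominator in each row (ADED for LADE, LAIE, and LAOE, and ASED for LASE) is inferred from the output, so treat it as an observation rather than a description of the package's implementation.

```r
# Each row: ratio of two printed estimates vs. the printed local parameter.
checks <- rbind(
  LADE_iem   = c(ratio = 0.4008916 / 0.2499606, printed = 1.6038190), # ADEY / ADED (result_direct1)
  LADE_const = c(ratio = 0.5632636 / 0.3551812, printed = 1.5858485), # ADEY / ADED (result_direct2)
  LAIE       = c(ratio = 0.2924892 / 0.3551812, printed = 0.8234928), # AIEY / ADED
  LAOE       = c(ratio = 0.8557528 / 0.3551812, printed = 2.4093413), # AOEY / ADED
  LASE       = c(ratio = 0.5750447 / 0.3920457, printed = 1.4667795)  # ASEY / ASED
)
checks

# Largest discrepancy; differences of this size are rounding in the printout.
max_diff <- max(abs(checks[, "ratio"] - checks[, "printed"]))
max_diff
```

This would also explain why the ADED row appears in result_indirect and result_overall: it matches the denominator of LAIE and LAOE.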

References

Hoshino, T. and Yanagi, T. (2023). Causal inference with noncompliance and unknown interference. arXiv preprint arXiv:2108.07455.

Leung, M.P. (2022). Causal inference under approximate neighborhood interference. Econometrica, 90(1), pp.267-293.