Getting Started with the latenetwork Package
Introduction
The latenetwork package provides tools for causal inference under noncompliance with treatment assignment and network interference of unknown form. It implements instrumental variables (IV) estimation of local average treatment effect (LATE) type parameters via inverse probability weighting (IPW), using the concept of instrumental exposure mapping (IEM) and the framework of approximate neighborhood interference (ANI).
The parameters of interest are as follows.
- The average direct effect (ADE) parameters:
  - The ADE of the IV on the outcome.
  - The ADE of the IV on the treatment receipt.
  - The local average direct effect (LADE).
- The average indirect effect (AIE) parameters:
  - The AIE of the IV on the outcome.
  - The AIE of the IV on the treatment receipt.
  - The local average indirect effect (LAIE).
- The average overall effect (AOE) parameters:
  - The AOE of the IV on the outcome.
  - The AOE of the IV on the treatment receipt.
  - The local average overall effect (LAOE).
- The average spillover effect (ASE) parameters:
  - The ASE of the IV on the outcome.
  - The ASE of the IV on the treatment receipt.
  - The local average spillover effect (LASE).
For more details on the identification and estimation methods, see the “Review of Causal Inference with Noncompliance and Unknown Interference” vignette:
vignette("review", package = "latenetwork")
Installation
Get the package from CRAN:
install.packages("latenetwork")
or from GitHub:
# install.packages("devtools") # if needed
devtools::install_github("tkhdyanagi/latenetwork", build_vignettes = TRUE)
Functions
The latenetwork package provides the following functions:
- direct(): Estimation and statistical inference for the ADE parameters.
- indirect(): Estimation and statistical inference for the AIE parameters.
- overall(): Estimation and statistical inference for the AOE parameters.
- spillover(): Estimation and statistical inference for the ASE parameters.
Arguments
All package functions have the following arguments:
- Y: An n-dimensional outcome vector.
- D: An n-dimensional binary treatment vector. Pass the same vector as Z to perform only the intention-to-treat analysis.
- Z: An n-dimensional binary instrumental vector.
- S: An n-dimensional logical vector indicating whether each unit belongs to the subpopulation on which the parameters of interest are defined.
- A: An n times n symmetric binary adjacency matrix whose diagonal elements are 0.
- K: A scalar indicating the range of the neighborhood used for constructing interference sets. Default is 1.
- bw: A scalar bandwidth used for the HAC estimation and the wild bootstrap. If bw = NULL, the rule-of-thumb bandwidth proposed by Leung (2022) is used. Default is NULL.
- B: The number of bootstrap repetitions. If B = NULL, the wild bootstrap is skipped. Default is NULL.
- alp: The significance level. Default is 0.05.
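The requirements on A and S listed above can be checked before calling any package function. The following is a minimal sketch in base R only; the six-unit ring network is a toy construction made up for illustration and is not taken from the package.

```r
# Toy inputs for illustration only; the network below is hypothetical.
n <- 6

# Ring network: unit i is linked to the next unit, wrapping around at n.
A <- matrix(0, nrow = n, ncol = n)
for (i in seq_len(n)) {
  j <- i %% n + 1   # index of the next unit on the ring
  A[i, j] <- 1
  A[j, i] <- 1      # mirror the link so that A stays symmetric
}

# Checks that mirror the stated requirements for A:
# symmetric, binary, and zero on the diagonal.
stopifnot(
  isSymmetric(A),
  all(A %in% c(0, 1)),
  all(diag(A) == 0)
)

# S is a logical vector of length n; here every unit is in the subpopulation.
S <- rep(TRUE, n)
stopifnot(is.logical(S), length(S) == n)
```

Running such checks on real data first makes it easier to tell an input problem from an estimation problem.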
The direct() function has the following additional arguments:
- IEM: An n-dimensional instrumental exposure vector. If IEM = NULL, the constant IEM is used. Default is NULL.
- t: A scalar evaluation point of the IEM. If t = NULL, the constant IEM is used. Default is NULL.
The spillover() function has the following additional arguments:
- IEM: An n-dimensional instrumental exposure vector.
- z: A scalar of the evaluation point of the IV.
- t0: A scalar of the evaluation point of the IEM (from).
- t1: A scalar of the evaluation point of the IEM (to).
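To make the IEM argument concrete, here is a small sketch that builds the exposure "at least one neighbor is assigned Z = 1", the same rule used in the example later in this vignette. The five-unit ring and the assignment vector are hypothetical toy inputs, and the as.vector() call is a choice made here to obtain a plain vector; the vignette's own example passes the result of ifelse(A %*% Z > 0, 1, 0) directly.

```r
# Toy five-unit ring network (hypothetical, for illustration only).
n <- 5
A <- matrix(0, nrow = n, ncol = n)
for (i in seq_len(n)) {
  j <- i %% n + 1
  A[i, j] <- 1
  A[j, i] <- 1
}

# Only unit 1 is assigned to the instrument.
Z <- c(1, 0, 0, 0, 0)

# A %*% Z counts, for each unit, how many of its neighbors have Z = 1.
treated_neighbors <- as.vector(A %*% Z)

# Binary exposure: 1 if at least one neighbor has Z = 1, and 0 otherwise.
IEM <- ifelse(treated_neighbors > 0, 1, 0)
IEM
#> [1] 0 1 0 0 1
```

Units 2 and 5 are the neighbors of unit 1, so they are the only exposed units. With a binary exposure like this one, t in direct() and t0 and t1 in spillover() would take values in {0, 1}.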
Returns
Each function returns a data.frame with the following elements:
- est: The estimate of each parameter.
- HAC_SE: The standard error computed by the network HAC estimation.
- HAC_CI_L: The lower bound of the confidence interval computed by the network HAC estimation.
- HAC_CI_U: The upper bound of the confidence interval computed by the network HAC estimation.
- wild_SE: The standard error computed by the wild bootstrap.
- wild_CI_L: The lower bound of the confidence interval computed by the wild bootstrap.
- wild_CI_U: The upper bound of the confidence interval computed by the wild bootstrap.
- bw: The bandwidth used for the HAC estimation and the wild bootstrap.
- size: The size of the subpopulation S.
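Because the return value is an ordinary data.frame with one row per parameter, individual entries can be pulled out by row and column name. The sketch below does not call the package: it rebuilds part of such a table by hand, copying the numbers from the result_direct1 output shown in the Example section. It then checks that, for those printed numbers, the HAC interval agrees with est plus or minus a standard normal quantile times HAC_SE. That agreement is an observation about the printed output, not a statement about how the package computes its intervals.

```r
# Hand-built stand-in for a returned table; the values are copied from the
# printed result_direct1 output in the Example section of this vignette.
res <- data.frame(
  est      = c(0.4008916, 0.2499606, 1.6038190),
  HAC_SE   = c(0.09871458, 0.03485485, 0.36023112),
  HAC_CI_L = c(0.2074146, 0.1816464, 0.8977789),
  HAC_CI_U = c(0.5943686, 0.3182749, 2.3098590),
  row.names = c("ADEY", "ADED", "LADE")
)

# Extract one estimate by row name and column name.
lade_hat <- res["LADE", "est"]

# Normal-approximation interval at significance level alp.
alp   <- 0.05
crit  <- qnorm(1 - alp / 2)
lower <- res$est - crit * res$HAC_SE
upper <- res$est + crit * res$HAC_SE

# The reconstructed bounds match the printed ones up to rounding.
stopifnot(
  all(abs(lower - res$HAC_CI_L) < 1e-5),
  all(abs(upper - res$HAC_CI_U) < 1e-5)
)
```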
Example
To run the following example, install the igraph package if needed.
# if needed --------------------------------------------------------------------
install.packages("igraph")
Generate artificial data with the datageneration() function.
# Generate artificial data from a ring network----------------------------------
set.seed(1)
n <- 2000
data <- latenetwork::datageneration(n = n)
Perform causal inference with:
# Arguments --------------------------------------------------------------------
Y <- data$Y
D <- data$D
Z <- data$Z
A <- data$A
IEM <- ifelse(A %*% Z > 0, 1, 0)
S <- rep(TRUE, n)
K <- 1
z <- 1
t <- 0
t0 <- 0
t1 <- 1
bw <- NULL
B <- NULL
alp <- 0.05
# Causal inference -------------------------------------------------------------
# The ADE parameters defined by IEM = (A %*% Z > 0)
result_direct1 <- latenetwork::direct(Y = Y,
                                      D = D,
                                      Z = Z,
                                      IEM = IEM,
                                      S = S,
                                      A = A,
                                      K = K,
                                      t = t,
                                      bw = bw,
                                      B = B,
                                      alp = alp)
# The ADE parameters defined by the constant IEM
result_direct2 <- latenetwork::direct(Y = Y,
                                      D = D,
                                      Z = Z,
                                      IEM = NULL,
                                      S = S,
                                      A = A,
                                      K = K,
                                      t = NULL,
                                      bw = bw,
                                      B = B,
                                      alp = alp)
# The AIE parameters defined by K = 1
result_indirect <- latenetwork::indirect(Y = Y,
                                         D = D,
                                         Z = Z,
                                         S = S,
                                         A = A,
                                         K = K,
                                         bw = bw,
                                         B = B,
                                         alp = alp)
# The AOE parameters defined by K = 1
result_overall <- latenetwork::overall(Y = Y,
                                       D = D,
                                       Z = Z,
                                       S = S,
                                       A = A,
                                       K = K,
                                       bw = bw,
                                       B = B,
                                       alp = alp)
# The ASE parameters defined by IEM = (A %*% Z > 0)
result_spillover <- latenetwork::spillover(Y = Y,
                                           D = D,
                                           Z = Z,
                                           IEM = IEM,
                                           S = S,
                                           A = A,
                                           K = K,
                                           z = z,
                                           t0 = t0,
                                           t1 = t1,
                                           bw = bw,
                                           B = B,
                                           alp = alp)
You can see the estimation results with:
result_direct1
#> est HAC_SE HAC_CI_L HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> ADEY 0.4008916 0.09871458 0.2074146 0.5943686 NA NA NA 8
#> ADED 0.2499606 0.03485485 0.1816464 0.3182749 NA NA NA 8
#> LADE 1.6038190 0.36023112 0.8977789 2.3098590 NA NA NA 8
#> size
#> ADEY 2000
#> ADED 2000
#> LADE 2000
result_direct2
#> est HAC_SE HAC_CI_L HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> ADEY 0.5632636 0.05254325 0.4602807 0.6662465 NA NA NA 8
#> ADED 0.3551812 0.02213500 0.3117974 0.3985650 NA NA NA 8
#> LADE 1.5858485 0.12418001 1.3424602 1.8292368 NA NA NA 8
#> size
#> ADEY 2000
#> ADED 2000
#> LADE 2000
result_indirect
#> est HAC_SE HAC_CI_L HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> AIEY 0.2924892 0.08785062 0.1203051 0.4646732 NA NA NA 8
#> AIED 0.2897227 0.03205981 0.2268866 0.3525587 NA NA NA 8
#> ADED 0.3551812 0.02213500 0.3117974 0.3985650 NA NA NA 8
#> LAIE 0.8234928 0.25796895 0.3178830 1.3291027 NA NA NA 8
#> size
#> AIEY 2000
#> AIED 2000
#> ADED 2000
#> LAIE 2000
result_overall
#> est HAC_SE HAC_CI_L HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> AOEY 0.8557528 0.09429867 0.6709308 1.0405748 NA NA NA 8
#> AOED 0.6449039 0.03744014 0.5715226 0.7182852 NA NA NA 8
#> ADED 0.3551812 0.02213500 0.3117974 0.3985650 NA NA NA 8
#> LAOE 2.4093413 0.27637076 1.8676646 2.9510181 NA NA NA 8
#> size
#> AOEY 2000
#> AOED 2000
#> ADED 2000
#> LAOE 2000
result_spillover
#> est HAC_SE HAC_CI_L HAC_CI_U wild_SE wild_CI_L wild_CI_U bw
#> ASEY 0.5750447 0.08065202 0.4169696 0.7331197 NA NA NA 8
#> ASED 0.3920457 0.03401795 0.3253718 0.4587197 NA NA NA 8
#> LASE 1.4667795 0.18557907 1.1030512 1.8305078 NA NA NA 8
#> size
#> ASEY 2000
#> ASED 2000
#> LASE 2000
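One way to read these tables is to compare each local parameter with the rows printed next to it. The sketch below uses only the numbers printed above (it does not call the package) and checks that each local estimate equals, up to rounding, the ratio of the outcome effect to a treatment-receipt effect from the same table. Note that in the indirect and overall tables the matching denominator is the ADED row rather than AIED or AOED. These ratios are observations about the printed output; see the review vignette for the formal definitions.

```r
# Point estimates copied from the printed results above.
direct1   <- c(ADEY = 0.4008916, ADED = 0.2499606, LADE = 1.6038190)
direct2   <- c(ADEY = 0.5632636, ADED = 0.3551812, LADE = 1.5858485)
indirect  <- c(AIEY = 0.2924892, AIED = 0.2897227, ADED = 0.3551812,
               LAIE = 0.8234928)
overall   <- c(AOEY = 0.8557528, AOED = 0.6449039, ADED = 0.3551812,
               LAOE = 2.4093413)
spillover <- c(ASEY = 0.5750447, ASED = 0.3920457, LASE = 1.4667795)

# Ratios of the outcome effect to the treatment-receipt effect.
ratios <- c(
  LADE1 = unname(direct1["ADEY"]   / direct1["ADED"]),
  LADE2 = unname(direct2["ADEY"]   / direct2["ADED"]),
  LAIE  = unname(indirect["AIEY"]  / indirect["ADED"]),
  LAOE  = unname(overall["AOEY"]   / overall["ADED"]),
  LASE  = unname(spillover["ASEY"] / spillover["ASED"])
)

# Local estimates as printed, in the same order.
printed <- c(
  LADE1 = unname(direct1["LADE"]),
  LADE2 = unname(direct2["LADE"]),
  LAIE  = unname(indirect["LAIE"]),
  LAOE  = unname(overall["LAOE"]),
  LASE  = unname(spillover["LASE"])
)

# The two agree up to the rounding of the printed values.
stopifnot(all(abs(ratios - printed) < 1e-5))
```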