| Title: | Factorial Difference-in-Differences |
|---|---|
| Description: | Implements the factorial difference-in-differences (FDID) framework for panel data settings where all units are exposed to a universal event but vary in a baseline factor G. Provides support for various estimators; supports robust, bootstrap, and jackknife variance; returns dynamic, pre/event/post aggregates and raw means; and includes helpers for data preparation and plotting. Methodology follows Xu, Zhao and Ding (2026) <doi:10.1080/01621459.2026.2628343>. |
| Authors: | Yiqing Xu [aut, cre], Rivka Lipkovitz [aut], Enhan Liu [aut] |
| Maintainer: | Yiqing Xu <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.2 |
| Built: | 2026-05-23 06:43:46 UTC |
| Source: | https://github.com/xuyiqing/fdid |
Performs factorial difference-in-differences (FDID) estimation using various methods and variance estimation techniques.
fdid(
  s,
  tr_period,
  ref_period,
  entire_period = NULL,
  method = "ols1",
  vartype = "robust",
  missing_data = c("listwise", "available"),
  nsims = 1000,
  parallel = FALSE,
  cores = 2,
  target.pop = c("all", "1", "0")
)
| Argument | Description |
|---|---|
| s | A data frame prepared using `fdid_prepare()`. |
| tr_period | A numeric vector specifying the treatment periods. |
| ref_period | A numeric scalar specifying the reference period. |
| entire_period | A numeric vector specifying the total range of time periods. Default is `NULL`. |
| method | A string specifying the estimation method. Default is `"ols1"`. |
| vartype | A string specifying the variance estimation type (robust, bootstrap, or jackknife). Default is `"robust"`. |
| missing_data | How to handle missing data. Two options: `"listwise"` and `"available"`. Default is `"listwise"`. |
| nsims | Number of simulations for bootstrap variance estimation. Default is `1000`. |
| parallel | Logical; whether to perform parallel computations. Default is `FALSE`. |
| cores | Number of cores for parallel computations. Default is `2`. |
| target.pop | Character; the target population for averaging: `"all"`, `"1"`, or `"0"`. |
A list with the following components:
| Component | Description |
|---|---|
| est | A list with three elements. |
| dynamic | Dynamic FDID estimates for each time period. |
| raw_means | Raw mean outcomes by group for each time period. |
| tr_period | Treatment periods used. |
| ref_period | Reference period used. |
| entire_period | All time periods for dynamic estimation. |
| method | Method used. |
| vartype | Variance type used. |
| times | All numeric time columns found. |
| G | Group indicator (0/1). |
| ps | Propensity scores, if available. |
| call | The matched call. |
| target.pop | Character indicating the target population used. |
Author(s): Rivka Lipkovitz, Enhan Liu
data(fdid)
mortality$uniqueid <- paste(mortality$provid, mortality$countyid, sep = "-")
mortality$G <- ifelse(mortality$pczupu >= median(mortality$pczupu, na.rm = TRUE), 1, 0)
s <- fdid_prepare(
  data = mortality,
  Y_label = "mortality",
  X_labels = c("avggrain", "lnpop"),
  G_label = "G",
  unit_label = "uniqueid",
  time_label = "year"
)
result <- fdid(s, tr_period = 1958:1961, ref_period = 1957)
summary(result)
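The basic contrast behind an FDID estimate can be illustrated without the package. The following is a minimal base-R sketch on simulated data: it takes each unit's outcome change between a reference period and an event period, then compares the mean change across the two levels of G. It is an illustration under stated assumptions, not the package's estimator, and it makes no claim about which `method` option it corresponds to; the names `y_ref`, `y_event`, `delta` and the simulated effect sizes are invented for the example.

```r
# Minimal base-R sketch of the basic double difference (not the package's code).
# Assumption: one row per unit, a 0/1 baseline factor G, and one outcome value
# in a reference period and one in an event period. All values are simulated.
set.seed(1)
n <- 200
G <- rbinom(n, 1, 0.5)
unit_effect <- rnorm(n)

# The event shifts every unit by 1 and shifts G = 1 units by an extra 2.
y_ref   <- unit_effect + rnorm(n, sd = 0.1)
y_event <- unit_effect + 1 + 2 * G + rnorm(n, sd = 0.1)

# First difference: within-unit change from the reference to the event period.
delta <- y_event - y_ref

# Second difference: compare mean changes across the two levels of G.
did_hat <- mean(delta[G == 1]) - mean(delta[G == 0])

# The same number is the slope from regressing the change on G.
ols_hat <- unname(coef(lm(delta ~ G))["G"])

print(c(did = did_hat, ols = ols_hat))
```

Because the unit effect cancels in the first difference, the contrast recovers the extra shift for G = 1 units up to sampling noise.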
Bundles multiple 'fdid' objects into a single list with class '"fdid_list"' for convenient collective handling.
fdid_list(..., validate = TRUE)
| Argument | Description |
|---|---|
| ... | One or more objects of class `"fdid"`, or a single list of them. |
| validate | Logical; if `TRUE` (default), verify that each element inherits from `"fdid"`. |
A list with classes 'c("fdid_list", "list")'.
Author(s): Rivka Lipkovitz
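The bundling behaviour described for `fdid_list()` can be sketched in a few lines of base R. The constructor below, `make_fdid_list()`, is a hypothetical stand-in written for illustration and is not the package source; the toy objects carry only a class attribute.

```r
# Hypothetical stand-in for the documented behaviour; not the package source.
make_fdid_list <- function(..., validate = TRUE) {
  objs <- list(...)
  # Accept a single plain list of objects as well as separate arguments.
  if (length(objs) == 1L && is.list(objs[[1L]]) &&
      !inherits(objs[[1L]], "fdid")) {
    objs <- objs[[1L]]
  }
  if (validate) {
    # Every element must inherit from class "fdid".
    ok <- vapply(objs, inherits, logical(1), what = "fdid")
    if (!all(ok)) stop("all elements must inherit from class 'fdid'")
  }
  structure(objs, class = c("fdid_list", "list"))
}

# Toy objects that only carry the class attribute.
a <- structure(list(method = "ols1"), class = "fdid")
b <- structure(list(method = "ols1"), class = "fdid")

bundle <- make_fdid_list(a, b)
class(bundle)
length(bundle)
```

Keeping `"list"` as the second class means ordinary list tools such as `lapply()` and `length()` still work on the bundle.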
Prepares a dataset for factorial difference-in-differences (FDID) analysis by reshaping the data into a wide format, averaging time-varying covariates, and renaming columns for consistency in subsequent analysis.
fdid_prepare(
  data,
  Y_label,
  X_labels = NULL,
  G_label,
  unit_label,
  time_label,
  cluster_label = NULL
)
| Argument | Description |
|---|---|
| data | A data frame containing the dataset to be processed. |
| Y_label | A string specifying the column name of the outcome variable. |
| X_labels | A character vector specifying the column names of the time-varying covariates. |
| G_label | A string specifying the column name of the group variable (e.g., treatment vs. control). |
| unit_label | A string specifying the column name of the unit identifier (e.g., individual or entity). |
| time_label | A string specifying the column name of the time variable. |
| cluster_label | An optional string specifying the column name of the clustering variable. Default is `NULL`. |
A data frame in wide format with the following:
- Outcome variable pivoted to wide format with time columns.
- Time-varying covariates averaged across time.
- Columns renamed:
  - Unit identifier -> 'unit'
  - Covariates -> 'x1', 'x2', ...
  - Group variable -> 'G'
  - Clustering variable (if provided) -> 'c'
Author(s): Rivka Lipkovitz
data <- data.frame(
  id = rep(1:3, each = 4),
  time = rep(1:4, times = 3),
  outcome = rnorm(12),
  covar1 = runif(12),
  covar2 = runif(12),
  group = c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1)
)
fdid_data <- fdid_prepare(
  data = data,
  Y_label = "outcome",
  X_labels = c("covar1", "covar2"),
  G_label = "group",
  unit_label = "id",
  time_label = "time"
)
head(fdid_data)
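For readers who want to see what the reshaping amounts to, the following base-R sketch reproduces the documented output layout by hand: the outcome is pivoted to one column per period, a time-varying covariate is averaged within unit, and columns are renamed to `unit`, `x1`, and `G`. It is an assumption-based illustration using `reshape()`, `aggregate()`, and `merge()`, not the implementation of `fdid_prepare()`; the toy data are simulated, and the group is held fixed within unit by assumption.

```r
# Base-R sketch of the documented reshaping; not the package source.
set.seed(42)
long <- data.frame(
  id      = rep(1:3, each = 4),
  time    = rep(1:4, times = 3),
  outcome = rnorm(12),
  covar1  = runif(12),
  group   = rep(c(0, 1, 0), each = 4)  # assumed time-invariant within unit
)

# Pivot the outcome to wide format: one row per unit, one column per period.
wide_y <- reshape(
  long[, c("id", "time", "outcome")],
  idvar = "id", timevar = "time", direction = "wide"
)
names(wide_y) <- sub("^outcome\\.", "", names(wide_y))
names(wide_y)[names(wide_y) == "id"] <- "unit"

# Average the time-varying covariate within unit.
x_bar <- aggregate(covar1 ~ id, data = long, FUN = mean)
names(x_bar) <- c("unit", "x1")

# Keep one group value per unit.
g <- unique(long[, c("id", "group")])
names(g) <- c("unit", "G")

wide <- merge(merge(wide_y, x_bar, by = "unit"), g, by = "unit")
wide
```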
A long-format panel dataset for demonstrating the fdid package.
mortality
A data frame with 11973 rows and 17 columns:
- `provid`: Province ID
- `countyid`: County ID
- Genealogy book count
- `pczupu`: Genealogy book density (per capita); 45% of counties have zero
- Log-transformed genealogy density: log(pczupu + 1); used as a continuous treatment in Xu, Zhao, and Ding (2026)
- Indicator: any genealogy book present
- `avggrain`: Average grain output
- Indicator: no grain data
- Urban population share
- Distance to Beijing
- Distance to provincial capital
- Rice cultivation indicator
- Minority population share
- Education level
- `lnpop`: Log population
- `year`: Year (1954–1966)
- `mortality`: Mortality rate
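The two derived treatment variables mentioned here, the log density log(pczupu + 1) and the median-split indicator used in the package examples, can be illustrated on toy values. The data frame and the column name `ln_density_sketch` below are hypothetical and chosen only for this sketch; they are not taken from the `mortality` dataset.

```r
# Toy values only; `pczupu` mimics the genealogy-density column described above.
toy <- data.frame(
  countyid = 1:6,
  pczupu   = c(0, 0, 0, 0.2, 0.5, 1.5)
)

# Log-transformed density, log(pczupu + 1); log1p() is accurate near zero.
toy$ln_density_sketch <- log1p(toy$pczupu)  # hypothetical column name

# Binary baseline factor via a median split, as in the package examples.
cutoff <- median(toy$pczupu, na.rm = TRUE)
toy$G <- ifelse(toy$pczupu >= cutoff, 1, 0)

# Inspect the split before estimation: if more than half the values were zero,
# the median would be zero and `>=` would place every unit in G = 1.
table(toy$G)
```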
Provides visualisations for FDID results, including raw means, dynamic
effects, and propensity-score overlap. The comparison plot of multiple
methods has been removed; use plot.fdid_list() for that.
## S3 method for class 'fdid'
plot(
  x,
  type = c("raw", "dynamic", "overlap"),
  connected = FALSE,
  ci = TRUE,
  shade_periods = x$tr_period,
  alpha_shade = 0.2,
  palette = "Set2",
  group_labels = c("Group 0", "Group 1"),
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ylim = NULL,
  ...
)
| Argument | Description |
|---|---|
| x | An `fdid` object. |
| type | One of `"raw"`, `"dynamic"`, or `"overlap"`. |
| connected | Logical; default is `FALSE`. |
| ci | Logical; default is `TRUE`. |
| shade_periods | Shaded intervals on the time axis. Default uses `x$tr_period`. |
| alpha_shade | Transparency for shading the treatment period. |
| palette | A palette name from RColorBrewer. Default `"Set2"`. |
| group_labels | Labels for the two groups. |
| xlab, ylab, main | Axis labels and main title. |
| ylim | Y-axis limits. Default `NULL`. |
| ... | Additional graphics parameters. |
Produces a plot; invisibly returns NULL.
Author(s): Rivka Lipkovitz, Enhan Liu
data(fdid)
mortality$uniqueid <- paste(mortality$provid, mortality$countyid, sep = "-")
mortality$G <- ifelse(mortality$pczupu >= median(mortality$pczupu, na.rm = TRUE), 1, 0)
s <- fdid_prepare(
  data = mortality,
  Y_label = "mortality",
  X_labels = c("avggrain", "lnpop"),
  G_label = "G",
  unit_label = "uniqueid",
  time_label = "year"
)
result <- fdid(s, tr_period = 1958:1961, ref_period = 1957)
plot(result, type = "raw")
plot(result, type = "dynamic")
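To show what a raw-means plot with a shaded event window involves, the following base-graphics sketch computes mean outcomes by group and period on simulated data and shades an assumed event window with a semi-transparent rectangle. It is an illustration only and does not reproduce the package's plotting code; the data, colours, and window boundaries are invented.

```r
# Base-graphics sketch of a raw-means plot; not the package's plotting code.
set.seed(7)
years <- 1954:1966
G <- rep(c(0, 1), each = 50)

# Simulated outcomes: a common bump in 1958-1961, larger for G = 1.
event <- years %in% 1958:1961
Y <- sapply(seq_along(years), function(j) {
  5 + event[j] * (2 + 3 * G) + rnorm(length(G))
})
colnames(Y) <- years

# Mean outcome by group and period (row 1: G = 0, row 2: G = 1).
raw_means <- rbind(
  colMeans(Y[G == 0, , drop = FALSE]),
  colMeans(Y[G == 1, , drop = FALSE])
)

pdf(NULL)  # null device so the sketch also runs non-interactively
matplot(years, t(raw_means), type = "b", pch = 16, lty = 1,
        col = c("grey40", "firebrick"), xlab = "Year", ylab = "Mean outcome")

# Shade the event window; alpha.f plays the role of a transparency setting.
usr <- par("usr")
rect(1957.5, usr[3], 1961.5, usr[4],
     col = adjustcolor("grey", alpha.f = 0.2), border = NA)
legend("topleft", legend = c("Group 0", "Group 1"),
       col = c("grey40", "firebrick"), pch = 16, lty = 1, bty = "n")
dev.off()
```

Remove the `pdf(NULL)` and `dev.off()` lines to draw on the active graphics device instead.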
Creates a comparison plot of point estimates and confidence intervals for every element of an 'fdid_list'.
## S3 method for class 'fdid_list'
plot(
  x,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ylim = NULL,
  vertical = TRUE,
  show_vartype = TRUE,
  ...
)
| Argument | Description |
|---|---|
| x | An object of class `"fdid_list"`. |
| xlab, ylab, main | Axis labels and title. If `NULL`, sensible defaults are used. |
| ylim | Optional numeric vector of length two giving the *estimate-axis* limits. (Backward compatible: for horizontal plots this is the x-limit; for vertical plots this is the y-limit.) |
| vertical | Logical; default is `TRUE`. |
| show_vartype | Logical; include vartype in labels. Default is `TRUE`. |
| ... | Additional graphics parameters passed on to the plotting call. |
Invisibly returns 'x'; called for its side-effect of drawing a plot.
Author(s): Rivka Lipkovitz, Enhan Liu
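A comparison plot of this kind reduces to drawing point estimates with interval bars side by side. The sketch below does so in base graphics with hypothetical estimates and standard errors and normal-approximation 95% intervals; the numbers and labels are invented, and nothing here is output from, or the implementation of, `plot.fdid_list()`.

```r
# Hypothetical estimates and standard errors; not output from the package.
est <- c(spec_a = 1.8, spec_b = 2.1, spec_c = 2.4)
se  <- c(0.30, 0.25, 0.40)

# Normal-approximation 95% confidence limits.
z <- qnorm(0.975)
lower <- est - z * se
upper <- est + z * se

pdf(NULL)  # null device so the sketch also runs non-interactively
pos <- seq_along(est)
plot(pos, est, pch = 16, xaxt = "n",
     xlim = c(0.5, length(est) + 0.5),
     ylim = range(0, lower, upper),
     xlab = "", ylab = "Estimate")
axis(1, at = pos, labels = names(est))
segments(pos, lower, pos, upper)  # vertical interval bars
abline(h = 0, lty = 2)            # reference line at zero
dev.off()
```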
Print Method for FDID Objects
## S3 method for class 'fdid'
print(x, ...)
| Argument | Description |
|---|---|
| x | An object of class `fdid`. |
| ... | Additional arguments (not used). |
Prints a brief overview of the 'fdid' object.
Author(s): Rivka Lipkovitz
Summary Method for FDID Objects
## S3 method for class 'fdid'
summary(object, ...)
| Argument | Description |
|---|---|
| object | An object of class `fdid`. |
| ... | Additional arguments (not used). |
Prints a summary of the fdid object.
Author(s): Rivka Lipkovitz, Enhan Liu
data(fdid)
mortality$uniqueid <- paste(mortality$provid, mortality$countyid, sep = "-")
mortality$G <- ifelse(mortality$pczupu >= median(mortality$pczupu, na.rm = TRUE), 1, 0)
s <- fdid_prepare(
  data = mortality,
  Y_label = "mortality",
  X_labels = c("avggrain", "lnpop"),
  G_label = "G",
  unit_label = "uniqueid",
  time_label = "year"
)
result <- fdid(s, tr_period = 1958:1961, ref_period = 1957)
summary(result)