Impute Missing Exposure and Omics Data in a MultiAssayExperiment
Source: R/run_impute_missing.R
Performs missing data imputation on both exposure variables
(from colData) and omics datasets (from experiments) within
a MultiAssayExperiment object.
Usage
run_impute_missing(
  expomicset,
  exposure_impute_method = "median",
  exposure_cols = NULL,
  omics_impute_method = NULL,
  omics_to_impute = NULL
)
Arguments
- expomicset
 A MultiAssayExperiment object containing exposures and omics data.
- exposure_impute_method
 Character. Imputation method to use for exposure variables. Defaults to "median".
- exposure_cols
 Character vector. Names of columns in colData to impute. If NULL, all numeric columns are used.
- omics_impute_method
 Character. Imputation method to use for omics data. Defaults to "knn".
- omics_to_impute
 Character vector. Names of omics datasets to impute. If NULL, all omics datasets are included.
Details
For exposures, numeric columns in colData are imputed using
the selected method. For omics data, assays are selected and
imputed individually.
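As a rough illustration of what column-wise median imputation of exposures amounts to, the sketch below uses base R only. The data frame `exposures` and the helper `impute_median_cols()` are hypothetical stand-ins for colData and for the package's internals; this is not the package's actual implementation, which relies on the naniar functions listed under the supported methods.

```r
# Hypothetical stand-in for colData(): a small data frame with missing values.
exposures <- data.frame(
  id = c("s1", "s2", "s3", "s4"),
  pm25 = c(10, NA, 14, 12),
  no2 = c(NA, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace NAs in the selected columns with the column
# median. When cols is NULL, all numeric columns are used, mirroring the
# documented behaviour of exposure_cols.
impute_median_cols <- function(df, cols = NULL) {
  if (is.null(cols)) {
    cols <- names(df)[vapply(df, is.numeric, logical(1))]
  }
  for (col in cols) {
    x <- df[[col]]
    x[is.na(x)] <- stats::median(x, na.rm = TRUE)
    df[[col]] <- x
  }
  df
}

imputed <- impute_median_cols(exposures)
```

Non-numeric columns such as `id` are left untouched, and passing `cols = "pm25"` would restrict imputation to that single column.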
Supported imputation methods include:
- "median": Median imputation using naniar::impute_median_all
- "mean": Mean imputation using naniar::impute_mean_all
- "knn": k-nearest neighbor imputation using impute::impute.knn
- "mice": Multiple imputation using chained equations (mice::mice)
- "dep": MinProb imputation for proteomics using DEP::impute
- "missforest": Random forest-based imputation using missForest::missForest
- "lod_sqrt2": Substitution of missing values with LOD/sqrt(2), where LOD is the smallest non-zero value per variable
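The "lod_sqrt2" rule can be sketched in base R as follows. The helper `impute_lod_sqrt2()` is a hypothetical name used for illustration only, and the sketch assumes non-negative measurements, so that the smallest non-zero observed value is a sensible proxy for the limit of detection. It is not taken from the package source.

```r
# Hypothetical helper: substitute NAs in one variable with LOD / sqrt(2),
# where LOD is taken as the smallest non-zero observed value.
# Assumes non-negative measurements.
impute_lod_sqrt2 <- function(x) {
  observed <- x[!is.na(x) & x != 0]
  if (length(observed) == 0) {
    return(x)  # nothing to estimate the LOD from; leave the vector unchanged
  }
  lod <- min(observed)
  x[is.na(x)] <- lod / sqrt(2)
  x
}

feature <- c(4, NA, 0, 2, NA, 8)
filled <- impute_lod_sqrt2(feature)
```

Here the smallest non-zero value is 2, so both missing entries are replaced with 2 / sqrt(2), while the observed zero is left as it is.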
Examples
# Create example data
mae <- make_example_data(
    n_samples = 20,
    return_mae = TRUE
)
#> Ensuring all omics datasets are matrices with column names.
#> Creating SummarizedExperiment objects.
#> Creating MultiAssayExperiment object.
#> MultiAssayExperiment created successfully.
# Introduce some missingness
MultiAssayExperiment::colData(mae)$exposure_pm25[sample(1:20, 5)] <- NA
# Impute missing exposure values using the median
mae <- run_impute_missing(
    expomicset = mae,
    exposure_impute_method = "median"
)