Impute Missing Exposure and Omics Data in a MultiAssayExperiment
Source: R/run_impute_missing.R
Performs missing data imputation on both exposure variables (from colData) and omics datasets (from experiments) within a MultiAssayExperiment object.
Usage
run_impute_missing(
  expomicset,
  exposure_impute_method = "median",
  exposure_cols = NULL,
  omics_impute_method = NULL,
  omics_to_impute = NULL
)
Arguments
- expomicset
  A MultiAssayExperiment object containing exposures and omics data.
- exposure_impute_method
  Character. Imputation method to use for exposure variables. Defaults to "median".
- exposure_cols
  Character vector. Names of columns in colData to impute. If NULL, all numeric columns are used.
- omics_impute_method
  Character. Imputation method to use for omics data. Defaults to "knn".
- omics_to_impute
  Character vector. Names of omics datasets to impute. If NULL, all omics datasets are included.
Details
For exposures, numeric columns in colData are imputed using the selected method. For omics data, assays are selected and imputed individually.
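The exposure step can be illustrated with a minimal base R sketch. It is not the package's implementation: the data frame exposures is a hypothetical stand-in for colData, and impute_numeric_median is a hypothetical helper that mimics the documented behavior of selecting all numeric columns when no columns are named.

```r
# Hypothetical stand-in for colData: two numeric exposures and one
# character column that should be left untouched.
exposures <- data.frame(
  pm25 = c(10.2, NA, 8.7, 12.1),
  no2  = c(NA, 21.0, 19.5, 23.5),
  site = c("A", "B", NA, "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): replace missing values
# in the chosen columns with each column's median.
impute_numeric_median <- function(df, cols = NULL) {
  # Mirror the documented default: NULL means every numeric column.
  if (is.null(cols)) {
    cols <- names(df)[vapply(df, is.numeric, logical(1))]
  }
  for (col in cols) {
    missing <- is.na(df[[col]])
    df[[col]][missing] <- median(df[[col]], na.rm = TRUE)
  }
  df
}

imputed <- impute_numeric_median(exposures)
print(imputed)
```

In this sketch the character column site keeps its missing value, because only numeric columns are selected.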
Supported imputation methods include:
- "median": Median imputation using naniar::impute_median_all
- "mean": Mean imputation using naniar::impute_mean_all
- "knn": k-nearest neighbor imputation using impute::impute.knn
- "mice": Multiple imputation using chained equations (mice::mice)
- "dep": MinProb imputation for proteomics using DEP::impute
- "missforest": Random forest-based imputation using missForest::missForest
- "lod_sqrt2": Substitution of missing values with LOD/sqrt(2), where LOD is the smallest non-zero value per variable
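The "lod_sqrt2" rule can be sketched in base R as follows. This is an illustration under assumptions rather than the package's code: impute_lod_sqrt2 and the measurements data frame are hypothetical, variables are assumed to be stored as columns, and values are assumed to be non-negative.

```r
# Hypothetical helper (not part of the package): replace missing values in
# one variable with LOD / sqrt(2), taking LOD as the smallest non-zero
# observed value of that variable.
impute_lod_sqrt2 <- function(x) {
  nonzero <- x[!is.na(x) & x != 0]
  # With no non-zero observations there is no LOD estimate; return x as is.
  if (length(nonzero) == 0) {
    return(x)
  }
  lod <- min(nonzero)
  x[is.na(x)] <- lod / sqrt(2)
  x
}

# Hypothetical measurements, one variable per column.
measurements <- data.frame(
  metab_a = c(4, NA, 2, 0, 8),
  metab_b = c(NA, 0.5, 1.5, NA, 3)
)

# Apply the rule to each variable independently.
imputed <- as.data.frame(lapply(measurements, impute_lod_sqrt2))
print(imputed)
```

Here the LOD for metab_a is 2 and for metab_b is 0.5, so missing entries become 2/sqrt(2) and 0.5/sqrt(2) respectively. Observed zeros are left unchanged.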