Extract Top Contributing Features for Factors — extract_top_factor

Identifies the most influential features for specified factors using multiomics integration results. Features are selected based on either a percentile cutoff or an absolute loading threshold.

Usage

extract_top_factor_features(
  expomicset,
  factors = NULL,
  pval_col = "p_adjust",
  pval_thresh = 0.05,
  method = "percentile",
  percentile = 0.9,
  threshold = 0.3,
  action = "add"
)

Arguments

expomicset: A MultiAssayExperiment object containing integration results.
factors: A character vector specifying the factors of interest. If NULL, factors are automatically selected from the association results using the pval_col and pval_thresh criteria.
pval_col: A string specifying the column name of the p-value or adjusted p-value used for factor selection if factors is NULL. Default is "p_adjust".
pval_thresh: A numeric value specifying the significance threshold for selecting factors from association results when factors is NULL. Default is 0.05.
method: A character string specifying the feature selection method ("percentile" or "threshold"). Default is "percentile".
percentile: A numeric value between 0 and 1 indicating the percentile threshold for feature selection when method = "percentile". Default is 0.9.
threshold: A numeric value specifying the absolute loading cutoff for feature selection when method = "threshold". Default is 0.3.
action: A character string indicating whether to return results ("get") or add them to metadata ("add"). Default is "add".

Value

If action = "add", returns the modified expomicset with selected features stored in metadata. If action = "get", returns a data frame containing:

feature: The selected feature contributing to the factor.
factor: The factor to which the feature contributes.
loading: The factor loading value of the feature.
exp_name: The experiment from which the feature originated.

Details

The function extracts factor loadings from metadata(expomicset), applies filtering based on the selected method, and identifies top contributing features for each specified factor.

If factors is not provided, the function will automatically select statistically significant factors from metadata(expomicset)$association$assoc_factors$results_df using the specified pval_col and pval_thresh as criteria.

Features can be selected using:

Percentile-based filtering (method = "percentile"): Selects features with absolute loadings above a specified percentile.
Threshold-based filtering (method = "threshold"): Selects features with absolute loadings exceeding a fixed value.

Examples

# create example data
mae <- make_example_data(
    n_samples = 20,
    return_mae = TRUE
)
#> Ensuring all omics datasets are matrices with column names.
#> Creating SummarizedExperiment objects.
#> Creating MultiAssayExperiment object.
#> MultiAssayExperiment created successfully.

# perform multiomics integration
mae <- run_multiomics_integration(
    mae,
    method = "MCIA",
    n_factors = 3
)
#> Scaling each assay in MultiAssayExperiment.
#> Running multi-omics integration using MCIA...
#> Applying MCIA with `nipalsMCIA`
#> Performing column-level pre-processing...
#> Column pre-processing completed.
#> Performing block-level preprocessing...
#> Block pre-processing completed.
#> Computing order 1 scores
#> Computing order 2 scores
#> Computing order 3 scores

top_feats <- extract_top_factor_features(
    mae,
    factors = c("V1", "V2", "V3"),
    method = "percentile",
    percentile = 0.9,
    action = "get"
)
#> Extracting top contributing features for specified factors.
#> Using MCIA block loadings.
#> Applying percentile-based filtering (>90%).
#> Selected 63 features contributing to specified factors.