Performs hierarchical clustering of samples using exposure data from colData(expomicset).
Usage
run_cluster_samples(
  expomicset,
  exposure_cols = NULL,
  dist_method = NULL,
  user_k = NULL,
  cluster_method = "ward.D",
  clustering_approach = "diana",
  action = "add"
)
Arguments
- expomicset
A MultiAssayExperiment object containing omics and exposure data.
- exposure_cols
A character vector of column names in colData(expomicset) to use for clustering.
- dist_method
A character string specifying the distance metric ("euclidean", "gower", etc.). If NULL, the metric is chosen automatically ("gower" for mixed data, "euclidean" for numeric data).
- user_k
An integer specifying the number of clusters. If NULL, an optimal k is determined using clustering_approach.
- cluster_method
A character string specifying the hierarchical clustering method. Default is "ward.D".
- clustering_approach
A character string specifying the method for determining k ("diana", "gap", "elbow", "dynamic", or "density"). Default is "diana".
- action
A character string specifying "add" (store results in metadata) or "get" (return clustering results). Default is "add".
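The automatic choice of distance metric when dist_method is NULL can be pictured with a small sketch. This assumes the rule stated under Details ("gower" for mixed data, "euclidean" for numeric data); the helper pick_dist_method() and the example columns are hypothetical and are not part of the package.

```r
# Hypothetical helper, NOT the package's internal code: choose a metric
# from the column types of the selected exposure data.
pick_dist_method <- function(df) {
  all_numeric <- all(vapply(df, is.numeric, logical(1)))
  if (all_numeric) "euclidean" else "gower"
}

# Invented example data: one all-numeric table, one with a factor column.
numeric_df <- data.frame(pm25 = c(8.1, 12.4, 9.7), no2 = c(21, 35, 28))
mixed_df <- data.frame(
  pm25 = c(8.1, 12.4, 9.7),
  smoker = factor(c("yes", "no", "no"))
)

pick_dist_method(numeric_df)
pick_dist_method(mixed_df)
```

Passing dist_method explicitly, as in the Examples section, bypasses any such automatic choice.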
Value
If action="add", returns the updated expomicset.
If action="get", returns a list with:
- sample_cluster
A hierarchical clustering object (hclust).
- sample_groups
A named vector of sample cluster assignments.
- heatmap
A ComplexHeatmap object visualizing sample clustering.
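To show how the first two list elements might be used downstream, the sketch below builds a stand-in list with base R only. The sample IDs and exposure values are invented, the heatmap element is omitted because it requires ComplexHeatmap, and the list is an imitation of the documented structure rather than actual output of run_cluster_samples().

```r
# Invented exposure matrix: six samples, two exposures.
exposures <- matrix(
  c(1.0, 1.2, 0.9, 6.0, 6.3, 5.8,
    20, 21, 19, 40, 42, 39),
  ncol = 2,
  dimnames = list(paste0("S", 1:6), c("pm25", "no2"))
)

# Stand-in for the documented list (heatmap omitted).
sample_cluster <- hclust(dist(scale(exposures)), method = "ward.D")
res <- list(
  sample_cluster = sample_cluster,
  sample_groups = cutree(sample_cluster, k = 2)
)

# Number of samples per cluster, and the sample IDs in each cluster.
table(res$sample_groups)
split(names(res$sample_groups), res$sample_groups)

# The hclust object can be re-cut at a different k without re-clustering.
groups_k3 <- cutree(res$sample_cluster, k = 3)
```

Keeping the hclust object alongside the assignments is what makes the last step possible: a different number of clusters can be explored without recomputing distances.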
Details
This function:
- Extracts numeric exposure data from colData(expomicset).
- Computes a distance matrix ("gower" for mixed data, "euclidean" for numeric).
- Determines the optimal number of clusters (k) using the specified method.
- Performs hierarchical clustering (hclust) and assigns samples to clusters.
- Generates a heatmap of scaled exposure values.
- Stores results in metadata(expomicset)$sample_clustering when action="add".
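The steps above can be approximated with base R as a rough sketch. The data are invented, and the rule used here to pick k (the largest jump between successive merge heights) is a simple stand-in chosen for illustration; it is not one of the package's clustering_approach options, and the package's internals may differ.

```r
# Invented numeric exposure data for eight samples in two obvious groups.
exposures <- data.frame(
  pm25 = c(1.0, 1.2, 0.9, 1.1, 10.0, 10.3, 9.8, 10.1),
  no2 = c(5.0, 5.1, 4.9, 5.2, 20.0, 20.5, 19.7, 20.2),
  row.names = paste0("S", 1:8)
)

# Scale the exposures, then compute a Euclidean distance matrix.
# (For mixed data, cluster::daisy(x, metric = "gower") is one option.)
scaled <- scale(exposures)
d <- dist(scaled, method = "euclidean")

# Hierarchical clustering with the documented default linkage.
hc <- hclust(d, method = "ward.D")

# Stand-in heuristic for k: cut where the merge height jumps the most.
jump <- which.max(diff(hc$height))
k <- nrow(exposures) - jump

# Assign samples to clusters; the result is a named integer vector.
groups <- cutree(hc, k = k)
groups
```

With data this well separated the heuristic selects two clusters; on real exposure data a method such as the gap statistic is likely to be more robust.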
Examples
if (FALSE) { # \dontrun{
expom <- run_cluster_samples(
  expomicset = expom,
  exposure_cols = c("PM2.5", "NO2"),
  dist_method = "gower",
  clustering_approach = "gap"
)
} # }