Generate a Jaccard-Overlap Network of GO Terms with Leiden Clustering
Constructs an undirected, weighted network where nodes are enriched GO terms
and edge weights are Jaccard coefficients between their DE gene
sets. Edges with a coefficient below overlap_threshold are removed. Leiden community
detection is then applied to identify clusters of functionally related terms.
The function produces two PNG plots (a plain up/down-regulation view and a
Leiden-clustered view) and exports CSV summaries of node-level and
cluster-level statistics.
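The steps described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the function's actual implementation: the toy gene sets, the jaccard() helper, and the variable names are hypothetical, and the igraph part runs only if igraph is installed.

```r
# Toy GO-term gene sets, assumed to be already restricted to DE genes
# (hypothetical data for illustration only).
gene_sets <- list(
  term_a = c("G1", "G2", "G3"),
  term_b = c("G2", "G3", "G4"),
  term_c = c("G5", "G6")
)

# Jaccard coefficient: size of the intersection divided by size of the union.
jaccard <- function(a, b) {
  u <- length(union(a, b))
  if (u == 0) return(0)
  length(intersect(a, b)) / u
}

# Pairwise overlap matrix (before thresholding).
terms <- names(gene_sets)
overlap <- matrix(0, length(terms), length(terms),
                  dimnames = list(terms, terms))
for (i in seq_along(terms)) {
  for (j in seq_along(terms)) {
    overlap[i, j] <- jaccard(gene_sets[[i]], gene_sets[[j]])
  }
}

# Zero out weak edges and self-loops before building the graph.
overlap_threshold <- 0.1
adj <- overlap
adj[adj < overlap_threshold] <- 0
diag(adj) <- 0

# Optional: build the weighted graph and cluster it (requires igraph).
# The argument name resolution_parameter is the one used in igraph 1.x;
# check ?igraph::cluster_leiden for your installed version.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_adjacency_matrix(
    adj, mode = "undirected", weighted = TRUE, diag = FALSE
  )
  cl <- igraph::cluster_leiden(g, resolution_parameter = 0.03)
  print(igraph::membership(cl))
}
```

In this toy case term_a and term_b share two of four genes and stay connected, while term_c shares nothing and ends up isolated.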
Usage
generate_overlap_network(
  results_set,
  collection,
  de_genes,
  collection_name = "network",
  overlap_threshold = 0.1,
  leiden_resolution = 0.03,
  outdir = NULL,
  plot_to_screen = TRUE,
  file_stem = NULL
)
Arguments
- results_set
Data frame with row names equal to GO term descriptions and at least one column avg_log2FC (mean fold-change across member genes).
- collection
Named list of character vectors. Each element is a set of gene symbols belonging to one GO term. Names must match rownames(results_set).
- de_genes
Character vector of all differentially expressed gene symbols. Gene sets in collection are intersected with this vector before computing overlaps.
- collection_name
Character label used in plot titles and console output. Typically formatted as "<comparison>_<celltype>" (e.g., "CD206_Car_vs_Bleo_Fibroblast_1").
- overlap_threshold
Numeric in [0, 1]. Jaccard coefficient below which edges are set to zero. Default 0.1.
- leiden_resolution
Numeric. Resolution parameter for igraph::cluster_leiden. Default 0.03.
- outdir
Character path to the output directory. If NULL, no files are written. The directory is created if it does not exist.
- plot_to_screen
Logical. If TRUE (default), the Leiden-clustered network is also drawn to the active graphics device (e.g., the RStudio Plots pane).
- file_stem
Character string used as the filename prefix for all exported files. If NULL or empty, collection_name is used.
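As an illustration of the expected input shapes, the sketch below builds toy arguments and checks the stated name-matching requirement. The GO terms, gene symbols, and fold-change values are made up for the example, and the setequal() check is an assumption about what "must match" means in practice. The call itself runs only if generate_overlap_network is available in the session.

```r
# Hypothetical toy inputs matching the argument descriptions.
results_set <- data.frame(
  avg_log2FC = c(1.2, -0.8, 0.5),
  row.names = c("collagen fibril organization",
                "immune response",
                "cell adhesion")
)
collection <- list(
  "collagen fibril organization" = c("COL1A1", "COL3A1", "LOX"),
  "immune response" = c("CD74", "C1QA", "LOX"),
  "cell adhesion" = c("COL1A1", "ITGB1", "FN1")
)
de_genes <- c("COL1A1", "COL3A1", "LOX", "CD74", "ITGB1")

# Names of the collection should match the row names of results_set.
stopifnot(setequal(names(collection), rownames(results_set)))

# Restrict each gene set to DE genes, as described for de_genes.
de_sets <- lapply(collection, intersect, de_genes)
print(lengths(de_sets))

# Call the function only if it is available in the current session.
if (exists("generate_overlap_network", mode = "function")) {
  res <- generate_overlap_network(
    results_set, collection, de_genes,
    collection_name = "toy_example",
    outdir = NULL,          # write no files
    plot_to_screen = FALSE  # draw nothing to the graphics device
  )
}
```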
Value
A list with six elements:
- network
An igraph graph object with vertex attributes fc, regulation, size, cluster, color, and shape.
- clustering
The communities object returned by igraph::cluster_leiden.
- overlap_matrix
Square numeric matrix of pairwise Jaccard coefficients (before thresholding).
- go_term_df
Data frame with columns go_term, cluster, degree, regulation, avg_log2FC, and leiden_resolution.
- cluster_stats
Data frame with columns cluster, avg_degree, and n_terms.
- leiden_resolution
The resolution value used.
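To make the relationship between the two data frames concrete, the sketch below derives a cluster_stats-shaped table from a go_term_df-shaped table. It assumes that avg_degree is the mean node degree per cluster and n_terms the number of terms per cluster; the toy rows and the "up"/"down" regulation labels are hypothetical, and the function may compute these summaries differently.

```r
# Hypothetical node table with the columns listed for go_term_df.
go_term_df <- data.frame(
  go_term = c("term_a", "term_b", "term_c", "term_d"),
  cluster = c(1, 1, 2, 2),
  degree = c(2, 4, 1, 1),
  regulation = c("up", "up", "down", "up"),
  avg_log2FC = c(1.1, 0.7, -0.9, 0.3),
  leiden_resolution = 0.03
)

# Per-cluster mean degree and term count (assumed definitions).
avg_degree <- tapply(go_term_df$degree, go_term_df$cluster, mean)
n_terms <- table(go_term_df$cluster)

cluster_stats <- data.frame(
  cluster = as.numeric(names(avg_degree)),
  avg_degree = as.numeric(avg_degree),
  n_terms = as.integer(n_terms)
)
print(cluster_stats)
```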
Side effects
When outdir is non-NULL, the following files are written:
- <file_stem>_network_summary.csv – node/edge/cluster counts.
- <file_stem>-cluster-stats.csv – per-cluster average degree.
- <file_stem>-plain.png – network coloured by regulation direction (red = up, blue = down).
- <file_stem>-clustered-leiden.png – network coloured by Leiden cluster membership.
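The naming scheme can be sketched as follows. This is an assumption-based illustration of how the output paths could be assembled from outdir, file_stem, and collection_name as documented; the variable names stem and out_files are hypothetical, and a temporary directory stands in for a real output location.

```r
collection_name <- "CD206_Car_vs_Bleo_Fibroblast_1"
file_stem <- NULL
outdir <- file.path(tempdir(), "network_output")

# Fall back to collection_name when file_stem is NULL or empty.
stem <- if (is.null(file_stem) || !nzchar(file_stem)) {
  collection_name
} else {
  file_stem
}

# Create the output directory if it does not exist.
if (!dir.exists(outdir)) dir.create(outdir, recursive = TRUE)

# Assemble the four documented output paths.
suffixes <- c("_network_summary.csv", "-cluster-stats.csv",
              "-plain.png", "-clustered-leiden.png")
out_files <- file.path(outdir, paste0(stem, suffixes))
print(basename(out_files))
```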