
Constructs an undirected, weighted network in which nodes are enriched GO terms and edge weights are Jaccard coefficients between the terms' DE gene sets. Edges with a weight below overlap_threshold are removed, and Leiden community detection is then applied to identify clusters of functionally related terms. The function produces two PNG plots (a plain up/down-regulation view and a Leiden-clustered view) and exports CSV summaries of node-level and cluster-level statistics.
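The overlap step described above can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's internal code: the gene symbols, term names, and the helper jaccard() are hypothetical and exist only for this example.

```r
# Toy inputs (hypothetical term names and gene symbols, for illustration only)
collection <- list(
  "term A" = c("G1", "G2", "G3", "G4"),
  "term B" = c("G2", "G3", "G5"),
  "term C" = c("G6", "G7")
)
de_genes <- c("G1", "G2", "G3", "G6")

# Restrict every gene set to the differentially expressed genes
de_sets <- lapply(collection, intersect, de_genes)

# Jaccard coefficient: |A intersect B| / |A union B| (hypothetical helper)
jaccard <- function(a, b) {
  u <- length(union(a, b))
  if (u == 0) 0 else length(intersect(a, b)) / u
}

# Fill a square matrix of pairwise coefficients
n <- length(de_sets)
overlap_matrix <- matrix(
  0, n, n,
  dimnames = list(names(de_sets), names(de_sets))
)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    overlap_matrix[i, j] <- jaccard(de_sets[[i]], de_sets[[j]])
  }
}

overlap_matrix
```

After intersection, "term A" and "term B" share two of three DE genes, so their coefficient is 2/3, while "term C" shares nothing with either.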

Usage

generate_overlap_network(
  results_set,
  collection,
  de_genes,
  collection_name = "network",
  overlap_threshold = 0.1,
  leiden_resolution = 0.03,
  outdir = NULL,
  plot_to_screen = TRUE,
  file_stem = NULL
)

Arguments

results_set

Data frame with row names equal to GO term descriptions and at least one column, avg_log2FC (mean log2 fold change across member genes).

collection

Named list of character vectors. Each element is a set of gene symbols belonging to one GO term. Names must match rownames(results_set).

de_genes

Character vector of all differentially expressed gene symbols. Gene sets in collection are intersected with this vector before computing overlaps.

collection_name

Character label used in plot titles and console output. Typically formatted as "<comparison>_<celltype>" (e.g., "CD206_Car_vs_Bleo_Fibroblast_1").

overlap_threshold

Numeric in [0, 1]. Jaccard coefficient below which an edge is removed (its weight is set to zero). Default 0.1.
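As a rough illustration of the thresholding rule, the sketch below zeroes every entry strictly below the threshold. The matrix values are made up, and clearing the diagonal (no self-loops) is an assumption about the function's behaviour rather than something stated here.

```r
overlap_threshold <- 0.1

# Hypothetical Jaccard matrix for three terms
terms <- c("term A", "term B", "term C")
overlap_matrix <- matrix(
  c(1.00, 0.40, 0.05,
    0.40, 1.00, 0.10,
    0.05, 0.10, 1.00),
  nrow = 3, byrow = TRUE,
  dimnames = list(terms, terms)
)

# Entries strictly below the threshold are set to zero;
# an entry equal to the threshold is kept
adjacency <- overlap_matrix
adjacency[adjacency < overlap_threshold] <- 0

# Assumption: self-overlaps are not drawn as edges
diag(adjacency) <- 0

adjacency
```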

leiden_resolution

Numeric. Resolution parameter for igraph::cluster_leiden. Default 0.03.
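For orientation, a direct call to igraph::cluster_leiden on a small weighted graph might look like the sketch below. The adjacency values are hypothetical, and the objective function is left at igraph's default because the one used by generate_overlap_network is not stated here. The resolution argument is spelled resolution_parameter in this sketch; depending on the installed igraph version it may instead be named resolution, so check ?igraph::cluster_leiden.

```r
if (requireNamespace("igraph", quietly = TRUE)) {
  # Hypothetical thresholded Jaccard matrix for four terms
  terms <- c("term A", "term B", "term C", "term D")
  adjacency <- matrix(
    c(0.0, 0.6, 0.0, 0.0,
      0.6, 0.0, 0.2, 0.0,
      0.0, 0.2, 0.0, 0.5,
      0.0, 0.0, 0.5, 0.0),
    nrow = 4, byrow = TRUE,
    dimnames = list(terms, terms)
  )

  # Undirected weighted graph; non-zero entries become edge weights
  g <- igraph::graph_from_adjacency_matrix(
    adjacency, mode = "undirected", weighted = TRUE, diag = FALSE
  )

  # Leiden clustering with the documented default resolution of 0.03
  cl <- igraph::cluster_leiden(
    g,
    weights = igraph::E(g)$weight,
    resolution_parameter = 0.03
  )

  # One cluster id per vertex
  memb <- igraph::membership(cl)
  print(memb)
}
```

Lower resolution values generally favour fewer, larger clusters, so it is worth trying a few values on real data before settling on one.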

outdir

Character path to the output directory. If NULL, no files are written. The directory is created if it does not exist.

plot_to_screen

Logical. If TRUE (default), the Leiden-clustered network is also drawn to the active graphics device (e.g., the RStudio Plots pane).

file_stem

Character string used as the filename prefix for all exported files. If NULL or empty, collection_name is used.
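The sketch below assembles inputs in the shapes the arguments describe and then calls the function. All term names, gene symbols, and fold-change values are invented for illustration; the call itself is guarded so the snippet still runs when the package providing generate_overlap_network has not been loaded. Whether three terms are enough for a meaningful network on real data is not something this toy example establishes.

```r
# Hypothetical enrichment results: row names are GO term descriptions
results_set <- data.frame(
  avg_log2FC = c(1.2, 0.8, -0.9),
  row.names = c("term A", "term B", "term C")
)

# Hypothetical gene sets; names must match rownames(results_set)
collection <- list(
  "term A" = c("G1", "G2", "G3", "G4"),
  "term B" = c("G2", "G3", "G5"),
  "term C" = c("G6", "G7")
)

# Hypothetical vector of differentially expressed gene symbols
de_genes <- c("G1", "G2", "G3", "G6")

# Check the naming requirement before calling the function
stopifnot(setequal(names(collection), rownames(results_set)))

# Only run the call if the function is available in this session
if (exists("generate_overlap_network")) {
  res <- generate_overlap_network(
    results_set = results_set,
    collection = collection,
    de_genes = de_genes,
    collection_name = "toy_example",
    overlap_threshold = 0.1,
    leiden_resolution = 0.03,
    outdir = NULL,          # no files are written
    plot_to_screen = FALSE
  )
}
```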

Value

A list with six elements:

network

An igraph graph object with vertex attributes fc, regulation, size, cluster, color, and shape.

clustering

The communities object returned by igraph::cluster_leiden.

overlap_matrix

Square numeric matrix of pairwise Jaccard coefficients (before thresholding).

go_term_df

Data frame with columns go_term, cluster, degree, regulation, avg_log2FC, and leiden_resolution.

cluster_stats

Data frame with columns cluster, avg_degree, and n_terms.

leiden_resolution

The resolution value used.
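To show how the node-level and cluster-level tables relate, the sketch below derives a table with the cluster_stats columns from a table shaped like go_term_df. The values are hypothetical and the aggregation is an assumption about how such a summary could be computed, not the package's own code.

```r
# Hypothetical node-level table with a subset of the go_term_df columns
go_term_df <- data.frame(
  go_term = c("term A", "term B", "term C"),
  cluster = c(1, 1, 2),
  degree  = c(2, 1, 0)
)

# Per-cluster mean degree and number of terms
cluster_stats <- data.frame(
  cluster    = sort(unique(go_term_df$cluster)),
  avg_degree = as.numeric(tapply(go_term_df$degree, go_term_df$cluster, mean)),
  n_terms    = as.integer(table(go_term_df$cluster))
)

cluster_stats
```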

Side effects

When outdir is non-NULL, the following files are written:

  • <file_stem>_network_summary.csv – node/edge/cluster counts.

  • <file_stem>-cluster-stats.csv – per-cluster average degree.

  • <file_stem>-plain.png – network coloured by regulation direction (red = up, blue = down).

  • <file_stem>-clustered-leiden.png – network coloured by Leiden cluster membership.
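The file names listed above can be reconstructed as follows, for example to check after a run that everything was written. This is a sketch of the naming rules only: the output directory is a hypothetical temporary path, and the fallback from file_stem to collection_name follows the argument description.

```r
collection_name <- "CD206_Car_vs_Bleo_Fibroblast_1"
file_stem <- NULL

# Hypothetical output directory under the session's temp folder
outdir <- file.path(tempdir(), "overlap_network")

# A NULL or empty file_stem falls back to collection_name
stem <- if (is.null(file_stem) || !nzchar(file_stem)) collection_name else file_stem

# Suffixes exactly as listed under "Side effects"
suffixes <- c(
  "_network_summary.csv",
  "-cluster-stats.csv",
  "-plain.png",
  "-clustered-leiden.png"
)
expected_files <- file.path(outdir, paste0(stem, suffixes))

basename(expected_files)

# After a real run with this outdir, check which files exist
file.exists(expected_files)
```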