    Statistical Method mcRigor Enhances the Rigor of Metacell Partitioning in Single-Cell Data Analysis

    By ProfitlyAI · October 17, 2025 · 7 min read


    This article was co-written with Pan Liu, postdoctoral researcher at UCLA and Fred Hutchinson Cancer Center. Pan is the first author of the mcRigor Nature Communications article.

    Single-cell sequencing technologies have advanced rapidly in recent years, providing unprecedented opportunities to uncover cellular diversity, dynamic changes in cell states, and underlying gene regulatory mechanisms. In addition to the widely used single-cell RNA sequencing (scRNA-seq) 1,2, new modalities such as single-cell chromatin accessibility sequencing (scATAC-seq) 3,4 and joint profiling of transcriptome and chromatin accessibility (scMultiome) 5 have enabled the dissection of cellular heterogeneity at single-cell resolution across multiple omics layers. However, the data generated by these technologies are often highly sparse, primarily due to limited sequencing depth per cell, as well as imperfect reverse transcription and nonlinear amplification, which cause highly expressed genes to dominate sequencing capacity and make lowly expressed genes difficult to detect 6.

    Fig. 1. mcRigor publication.

    To alleviate data sparsity and noise, researchers proposed the “metacell” concept, in which cells with similar expression profiles are aggregated into a single representative unit, a metacell, whose expression is defined by the mean expression of its constituent cells, thereby enhancing signal and reducing noise. Yet existing metacell construction methods often yield substantially different metacell partitions and are highly sensitive to hyperparameter settings, particularly the average metacell size 7. Such lack of consistency makes it difficult for users to determine which metacell partition is more trustworthy and to what extent the resulting metacell profiles preserve true biological signals. Consequently, the robustness of downstream analyses is compromised, and the potential of metacells as a general data preprocessing framework across diverse tasks and omics modalities remains limited.
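    As a rough illustration of the aggregation step described above, the following sketch averages the expression vectors of the cells assigned to each metacell and compares the fraction of zeros before and after aggregation. The count matrix, the labels, and the helper names `aggregate_metacells` and `zero_fraction` are made up for this example; this is not code from the mcRigor package or from any metacell construction method.

```python
from collections import defaultdict


def aggregate_metacells(counts, labels):
    """Average the expression vectors of the cells assigned to each metacell.

    counts: one expression vector (list of numbers) per cell
    labels: one metacell id per cell (a hypothetical, pre-computed partition)
    """
    members = defaultdict(list)
    for cell, label in zip(counts, labels):
        members[label].append(cell)

    profiles = {}
    for label, cells in members.items():
        n_genes = len(cells[0])
        profiles[label] = [
            sum(cell[g] for cell in cells) / len(cells) for g in range(n_genes)
        ]
    return profiles


def zero_fraction(vectors):
    """Fraction of entries that are exactly zero."""
    values = [v for vec in vectors for v in vec]
    return sum(1 for v in values if v == 0) / len(values)


# Toy data: 5 cells x 4 genes, grouped into two metacells (illustrative only).
counts = [
    [0, 3, 0, 1],
    [1, 0, 0, 2],
    [0, 2, 1, 0],
    [5, 0, 0, 0],
    [4, 0, 1, 0],
]
labels = ["mc1", "mc1", "mc1", "mc2", "mc2"]

profiles = aggregate_metacells(counts, labels)
sparsity_cells = zero_fraction(counts)
sparsity_metacells = zero_fraction(list(profiles.values()))
```

    In this toy case, averaging lowers the share of zero entries, which is the intended benefit; the same averaging would blur real differences if the grouped cells were not actually similar.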

    Our Nature Communications paper 8 provides a rigorous statistical definition of a metacell based on a two-layer model of single-cell sequencing data: the upper layer captures the biological variation in true expression, while the lower layer models the sequencing process that generates measured expression from the true expression. Building on this definition, we develop mcRigor, a statistical framework for detecting dubious metacells within a given partition and selecting the optimal metacell partitioning method and hyperparameter across candidate method-hyperparameter configurations.

    mcRigor detects and removes dubious metacells, thereby improving the reliability of downstream analyses such as gene co-expression and enhancer–gene regulation. Its extended version, mcRigor two-step, further disassembles dubious metacells into single cells and re-assembles them into smaller, more reliable ones. mcRigor also enables data-driven selection of the most suitable metacell partitioning strategy for each dataset. Owing to its flexible compatibility, mcRigor can be readily applied to single-cell transcriptomic, chromatin accessibility, and multi-omic data (Fig. 2). In addition, mcRigor provides a unified evaluation criterion for benchmarking different metacell construction methods, offering reliable guidance for researchers in method selection.

    In the first part of our paper 8, we introduce mcRigor’s method for detecting dubious metacells. Specifically, mcRigor quantifies the internal heterogeneity of each metacell using a feature-correlation-based statistic, mcDiv, which measures the deviation of feature–feature correlations from independence. The rationale is that if all member cells share the same true expression levels and the observed variation among them arises purely from the measurement process, the features should be approximately independent. mcRigor then constructs a null distribution for mcDiv using a novel double permutation procedure and identifies metacells that significantly deviate from this null as dubious (Fig. 2a).
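    The rationale can be illustrated with a small simulation. The sketch below is a simplified stand-in, not mcDiv and not the paper's double permutation procedure: it uses the mean squared feature-feature Pearson correlation as the statistic and a plain per-feature permutation across cells as the null. The Poisson measurement model, the two true expression profiles, and all function names are assumptions made for this example.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's method (fine for small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def corr_divergence(columns):
    """Mean squared off-diagonal feature-feature correlation (toy statistic)."""
    g = len(columns)
    pairs = [(i, j) for i in range(g) for j in range(i + 1, g)]
    return sum(pearson(columns[i], columns[j]) ** 2 for i, j in pairs) / len(pairs)


def permutation_pvalue(columns, rng, n_perm=200):
    """Shuffle each feature across cells independently to build a null."""
    observed = corr_divergence(columns)
    exceed = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if corr_divergence(shuffled) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


def simulate(rng, true_profiles, cells_per_profile):
    """Measured counts for one metacell, returned as per-feature columns."""
    rows = [
        [poisson(rng, lam) for lam in profile]
        for profile in true_profiles
        for _ in range(cells_per_profile)
    ]
    return [list(col) for col in zip(*rows)]


rng = random.Random(0)
profile_a = [20, 20, 20, 1, 1, 1]  # hypothetical true expression, cell state A
profile_b = [1, 1, 1, 20, 20, 20]  # hypothetical true expression, cell state B

# Homogeneous metacell: every cell shares one true profile.
homogeneous = simulate(rng, [profile_a], 40)
# Heterogeneous metacell: two cell states merged into one metacell.
heterogeneous = simulate(rng, [profile_a, profile_b], 20)

hom_stat, hom_p = permutation_pvalue(homogeneous, rng)
het_stat, het_p = permutation_pvalue(heterogeneous, rng)
```

    When two cell states are merged, the features vary together across cells, so the statistic sits far above its permutation null; when all cells share one true profile, the features are independent by construction and the statistic stays near the null.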

    In both semi-simulated and real PBMC datasets, mcRigor accurately distinguishes trustworthy metacells from dubious ones (Fig. 2b–c). We further demonstrate mcRigor’s effectiveness in improving the reliability of multiple downstream analyses. In cell-line data analyses, removing dubious metacells markedly increases the signal-to-noise ratio of cell-cycle marker genes (Fig. 2d). In COVID-19 versus healthy control data analyses, mcRigor eliminates spurious gene correlations caused by dubious metacells and reveals stronger co-expression within adaptive immune response modules (Fig. 2e). In scMultiome data analyses, mcRigor enhances the detectability of enhancer–gene associations, filtering out weakly supported false positives while preserving signals consistent with those observed at the single-cell level (Fig. 2f).

    Fig. 2. mcRigor detects dubious metacells and rectifies downstream analysis for both scRNA-seq and multiome (RNA+ATAC) data. a, Schematic of the mcRigor method for dubious metacell detection. b, mcRigor effectively assesses metacell heterogeneity and detects dubious metacells within the MetaCell method’s partitioning on semi-synthetic data. c, Dubious metacells identified by mcRigor exhibit internal heterogeneity and may occasionally appear as outliers, while trustworthy metacells remain internally homogeneous. d, mcRigor enhances cell-cycle marker gene expression within cell lines. e, mcRigor reveals enriched co-expression of an adaptive immune response gene module (highlighted in yellow) in COVID-19 samples (bottom row) compared to healthy controls (top row). f, Applying mcRigor to the original metacell partition from the SEACells paper empowers gene regulatory inference (left) and produces reliable discoveries (right).

    Fig. 3. mcRigor optimizes metacell method and hyperparameter selection for various single-cell data analyses. a, Schematic of the mcRigor method for optimizing metacell partitioning, using Score as the optimization criterion to balance DubRate and ZeroRate, illustrated with the optimization of MetaCell partitions on semi-synthetic data. b, Line plots showing the zero proportions in metacell partitions generated by three methods (MetaCell, SEACells, and SuperCell) across varying granularity levels (γ). The optimized metacell partitions (triangles) closely align with the zero proportion observed in smRNA-FISH data (pink line). c, mcRigor optimizes metacell method and hyperparameter selection for differential gene expression analysis. In (b) and (c), the colored triangles indicate the optimal γ values selected by mcRigor for the three methods. d, mcRigor’s optimized metacell partition better reveals temporal immune cell trajectories compared to the original metacell partition from the Zman-seq study 9.

    In the second part of our paper 8, we present mcRigor’s method for evaluating metacell partitions and optimizing hyperparameters. By balancing metacell trustworthiness against data sparsity, mcRigor assigns an overall evaluation score to each candidate partition and automatically selects the optimal method–parameter configuration among all candidates, thereby transforming the empirical process of method and parameter tuning into data-driven automated decision-making (Fig. 3a).
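    The selection step can be pictured as scoring every candidate configuration and keeping the best one. The sketch below assumes a simple score of the form 1 - DubRate - ZeroRate; that exact form, the `Candidate` type, the placeholder method names, and all numbers are assumptions made for illustration and should not be read as the formula or results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    method: str        # placeholder name of a metacell construction method
    granularity: int   # target average metacell size (hypothetical values)
    dub_rate: float    # assumed: share of the partition flagged as dubious
    zero_rate: float   # assumed: share of zeros in the metacell profiles


def score(c):
    """Assumed trade-off: penalize dubious metacells and remaining sparsity."""
    return 1.0 - c.dub_rate - c.zero_rate


# Illustrative numbers only: small metacells stay sparse, while large ones
# are more likely to mix distinct cell states and be flagged as dubious.
candidates = [
    Candidate("method_A", 10, 0.02, 0.60),
    Candidate("method_A", 50, 0.15, 0.25),
    Candidate("method_A", 100, 0.45, 0.10),
    Candidate("method_B", 50, 0.20, 0.30),
]

best = max(candidates, key=score)
ranking = sorted(candidates, key=score, reverse=True)
```

    The point of the sketch is the shape of the trade-off: neither the finest nor the coarsest partition wins, because each one is penalized on a different term.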

    We illustrate the utility of this optimization functionality across diverse downstream tasks. For instance, the zero proportion of mcRigor-optimized metacells closely matches the gold-standard zero proportion measured by smRNA-FISH, demonstrating its ability to distinguish technical zeros from biological zeros (Fig. 3b). In differential expression analysis, results based on mcRigor-optimized metacells align more closely with those obtained from bulk RNA-seq data, indicating improved reliability (Fig. 3c). In time-course data, mcRigor-optimized metacells improve trajectory resolution and reveal clearer gene-expression dynamics consistent with experimental evidence (Fig. 3d).

    The mcRigor R package and online tutorials are available at https://jsb-ucla.github.io/mcRigor/

    The full paper is available at https://www.nature.com/articles/s41467-025-63626-5

    References:

    1. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).

    2. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).

    3. Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).

    4. Cusanovich, D. A. et al. Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348, 910–914 (2015).

    5. Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).

    6. Jiang, R., Sun, T., Song, D. & Li, J. J. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol. 23, 31 (2022).

    7. Bilous, M., Hérault, L., Gabriel, A. A., Teleman, M. & Gfeller, D. Building and analyzing metacells in single-cell genomics data. Mol. Syst. Biol. 20, 744–766 (2024).

    8. Liu, P. & Li, J. J. mcRigor: a statistical method to enhance the rigor of metacell partitioning in single-cell data analysis. bioRxiv (2024) doi:10.1101/2024.10.30.621093.

    9. Kirschenbaum, D. et al. Time-resolved single-cell transcriptomics defines immune trajectories in glioblastoma. Cell 187, 149–165.e23 (2024).


