    Estimating Disease Rates Without Diagnosis

    By ProfitlyAI | July 18, 2025 | 7 min read


    HLA genes are so important for triggering the immune system that we can use them to predict a person's immune response. Here I'll show how to estimate disease rates just from immune gene frequencies. All the steps, from getting the immune gene data to identifying high-risk countries and assessing the limitations of the model, are discussed, and the full code is available at github.com/DAWells/HLA_spondylitis_rate.

    HLA genes are associated with a person's response to infection and vaccination, and are often very strongly linked to autoimmune diseases. So strongly linked, in fact, that in large groups we can predict disease rates from HLA gene frequencies. HLA frequencies are widely studied and so often available, allowing us to estimate rates of autoimmune conditions that are missing or inaccurate due to the challenges of diagnosis. In this post we'll combine studies to generate accurate estimates of immune gene frequencies and use these to predict national rates of ankylosing spondylitis.

    allelefrequencies.net is a database of human immune gene frequency data from across the world which is an open access, free and public resource (Gonzalez-Galarza et al 2020). However, it can be difficult to download and combine data from multiple projects, which makes it hard to take advantage of all this data. Luckily HLAfreq is a Python package which makes it easy to get the latest data from allelefrequencies.net and prepare them for our analysis. (Full disclosure, I'm one of the authors of HLAfreq!)

    Ankylosing spondylitis is a type of arthritis, and 90% of patients have a specific version of the HLA-B gene. To get the frequency of this version in different countries, I downloaded all available frequency data for this gene and combined studies of the same country, weighting by sample size. In brief, the combination is based on the Dirichlet distribution, and we can use a Bayesian approach to estimate uncertainty too. Singapore is used as an example in the figure below (all figures in this article are generated by the author). Different HLA-B gene versions (also known as alleles) are shown on the y axis, with their frequency in Singapore on the x axis. Data from the original Singapore studies are shown in color, and combined estimates in black. I focused on the weighted average in this analysis, which is shown by the black circles. HLAfreq also calculates a Bayesian estimate with uncertainty, which is indicated by the black bars.

    Frequency of HLA-B alleles in Singapore. Each individual study has its own color. Black shows the combined estimate with uncertainty.

    The code used to download, combine, and plot the HLA-B allele frequency data for Singapore is below.

    import pandas as pd
    import HLAfreq
    # Bayesian estimates live in a separate module (requires PyMC)
    from HLAfreq import HLAfreq_pymc as HLAhdi

    # Download raw data
    base_url = HLAfreq.makeURL("Singapore", standard="g", locus="B")
    aftab = HLAfreq.getAFdata(base_url)
    # Prepare data: keep complete studies and reduce alleles to one-field resolution
    aftab = HLAfreq.only_complete(aftab)
    aftab = HLAfreq.decrease_resolution(aftab, 1)
    # Combine data from multiple studies
    caf = HLAfreq.combineAF(aftab)
    # Bayesian estimate with a 95% credible interval
    hdi = HLAhdi.AFhdi(aftab, credible_interval=0.95)
    caf = pd.merge(caf, hdi, how="left", on="allele")
    # Plot gene frequencies
    HLAfreq.plotAF(caf, aftab.sort_values("allele_freq"), hdi=hdi, compound_mean=hdi)
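
To make "weighting by sample size" concrete, here is a minimal pure-Python sketch of a sample-size weighted average of allele frequencies. It is not HLAfreq's implementation, and the study data, allele names, and the `combine_weighted` helper are all hypothetical, made up for illustration.

```python
# Hypothetical studies: sample sizes and allele frequencies are invented for illustration.
studies = [
    {"sample_size": 100, "freqs": {"B*27": 0.05, "B*40": 0.20, "other": 0.75}},
    {"sample_size": 300, "freqs": {"B*27": 0.03, "B*40": 0.24, "other": 0.73}},
]


def combine_weighted(studies):
    """Return the sample-size weighted mean frequency of each allele."""
    total_n = sum(s["sample_size"] for s in studies)
    alleles = sorted({allele for s in studies for allele in s["freqs"]})
    combined = {}
    for allele in alleles:
        # Assumption: an allele not reported by a study has frequency 0 there.
        weighted_sum = sum(
            s["freqs"].get(allele, 0.0) * s["sample_size"] for s in studies
        )
        combined[allele] = weighted_sum / total_n
    return combined


combined = combine_weighted(studies)
for allele, freq in combined.items():
    print(f"{allele}: {freq:.3f}")
```

The larger study pulls the combined estimate towards its own values, which is the behaviour we want when studies differ a lot in size.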
    

    Now that we have the national allele frequencies, we can pair them with national disease rates to study the correlation. I've used the disease rates reported in Dean et al 2014. I log transformed the disease rate to make it normally distributed so I could fit an ordinary least squares linear regression. As expected, there was a significant positive correlation; countries with higher frequencies of HLA-B*27 had higher rates of ankylosing spondylitis. The exception to this was Finland, which had an unusually high frequency of HLA-B*27 but a middling rate of disease. I removed Finland from the model as an outlier, a decision which was supported by "statistical leverage". (Leverage means this one point had too large an influence on the overall model; we want the model to tell us about countries in general, not any one country in particular.)
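
As a rough illustration of the regression and leverage check described above, here is a minimal pure-Python sketch. The country labels and numbers are hypothetical, and the `2p/n` cutoff is a common rule of thumb that I am assuming here; it is not necessarily the criterion used in the original analysis.

```python
import math

# Hypothetical data: country -> (HLA-B*27 frequency, disease rate). Not real values.
countries = {
    "A": (0.02, 5.0),
    "B": (0.04, 10.0),
    "C": (0.05, 14.0),
    "D": (0.07, 22.0),
    "E": (0.15, 12.0),  # high frequency but middling rate
}


def fit_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def leverages(x):
    """Leverage of each point in a simple linear regression on x."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]


names = list(countries)
freqs = [countries[name][0] for name in names]

# Flag points whose leverage exceeds 2p/n, with p = 2 fitted parameters.
lev = dict(zip(names, leverages(freqs)))
threshold = 2 * 2 / len(names)
flagged = [name for name in names if lev[name] > threshold]

# Refit on the log-transformed rates without the flagged points.
kept = [name for name in names if name not in flagged]
intercept, slope = fit_ols(
    [countries[name][0] for name in kept],
    [math.log(countries[name][1]) for name in kept],
)
print("flagged:", flagged)
print(f"log(rate) = {intercept:.2f} + {slope:.2f} * frequency")
```

Leverage depends only on the predictor values, so a country with an extreme HLA-B*27 frequency is flagged regardless of its disease rate.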

    We can use our linear regression model to predict rates of ankylosing spondylitis in countries where we know the HLA-B*27 frequency. This tells us that countries like Austria and Croatia have high predicted ankylosing spondylitis rates. Using these predictions increases the number of countries with disease rate estimates from 16 to 52 and can help identify countries that could benefit from more surveillance. In the world map below, countries with low known or predicted rates of ankylosing spondylitis are plotted in blue and high rates in yellow. Countries with known rates are outlined in black and those with predicted rates are outlined in cyan or orange. Cyan is used for countries within the range of our model and orange is used for countries outside our model's range; see below for why this is important.

    Known or predicted rate of ankylosing spondylitis by country. Countries with black outlines have known rates, cyan outlines have predicted rates, orange outlines have predicted rates with unusual HLA-B*27 frequencies.

    We should be cautious about predicting disease rates for countries with HLA-B*27 frequencies outside the range of our model. Of the 36 countries we have predicted disease rates for, 10 have HLA-B*27 frequencies higher or lower than any country we used in our model. Therefore, we can't be sure the model will give accurate predictions for these countries. In particular, predictions may be unreliable for countries with high HLA-B*27 frequencies; we already know that Finland didn't fit our model. This could be because of a non-linear trend, but we do not have enough data to explore these high frequencies.
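
The range check described above can be sketched in a few lines of Python. The coefficients, training frequencies, and country labels below are hypothetical placeholders, not values from the fitted model.

```python
import math

# Hypothetical coefficients for log(rate) = INTERCEPT + SLOPE * frequency.
INTERCEPT = 1.0
SLOPE = 30.0

# Hypothetical HLA-B*27 frequencies of the countries used to fit the model.
training_freqs = [0.02, 0.04, 0.05, 0.07]

# Hypothetical countries with a known frequency but no known disease rate.
new_countries = {"X": 0.03, "Y": 0.01, "Z": 0.12}


def predict_rate(freq):
    """Back-transform the log-scale prediction to a rate."""
    return math.exp(INTERCEPT + SLOPE * freq)


def in_model_range(freq, training):
    """True if freq lies within the frequencies seen when fitting the model."""
    return min(training) <= freq <= max(training)


predictions = {
    name: {
        "rate": predict_rate(freq),
        "in_range": in_model_range(freq, training_freqs),
    }
    for name, freq in new_countries.items()
}

for name, result in predictions.items():
    label = "within range" if result["in_range"] else "extrapolated"
    print(f"{name}: predicted rate {result['rate']:.1f} ({label})")
```

Predictions marked as extrapolated correspond to the orange outlines on the map: the model still returns a number, but nothing in the training data supports it.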

    Correlation between HLA-B*27 frequency and rate of ankylosing spondylitis. Black points are countries with known rates. Predicted rates are cyan and orange circles; orange for countries with unusual HLA-B*27 frequencies. The outlier Finland is in purple.

    The countries with known disease rates are plotted with filled points. Finland, which was omitted from the model, is plotted in purple. The predicted disease rates are plotted as open circles, cyan for countries within the model's range and orange outside of it. The confidence intervals of the model are shown as dashed lines, and the prediction intervals are shown as a gray ribbon. A quick reminder about the difference: we expect the true relationship to fall within the confidence intervals 95% of the time, and we expect 95% of data points to fall within the prediction intervals.
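
For reference, the textbook forms of these two intervals for a simple linear regression are given below. This assumes the standard ordinary least squares intervals on the log scale; the article does not state exactly how its intervals were computed. Here $x_0$ is the HLA-B*27 frequency of interest, $\hat{y}_0$ the predicted log rate, $n$ the number of countries in the model, and $s$ the residual standard error.

```latex
% 95% confidence interval for the mean response at x_0
\hat{y}_0 \pm t_{0.975,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}

% 95% prediction interval for a new observation at x_0
\hat{y}_0 \pm t_{0.975,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}

% residual standard error
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
```

The extra 1 under the square root accounts for the scatter of individual countries around the line, which is why the prediction interval is always wider than the confidence interval. Both widen as $x_0$ moves away from $\bar{x}$.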

    It's worth taking a moment to remind ourselves that despite this correlation, there are many other factors influencing disease rates. Clearly an individual's chance of developing ankylosing spondylitis is also affected by their environment and other genetic factors. So if we wanted really accurate disease rate predictions we would need to consider these other variables. But given how easy it is to get HLA frequency data, it's a pretty impressive predictor for a disease that can take years to diagnose.

    Conclusion

    HLA genes have a strong impact on human health through infection, vaccination, autoimmune diseases, and organ transplants. Because of these strong relationships, we can use widely available HLA frequency data to study these health traits indirectly. Resources like allelefrequencies.net and HLAfreq make it easier to study these relationships, either by looking at these correlations directly or by using allele frequencies as a proxy when other data is missing. I hope this post has got you thinking about questions to ask using HLA frequency data.

    References

    Gonzalez-Galarza, F. F., McCabe, A., Santos, E. J. M. D., Jones, J., Takeshita, L., Ortega-Rivera, N. D., … & Jones, A. R. (2020). Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data and new query tools. Nucleic Acids Research, 48(D1), D783-D788.

    Dean, L. E., Jones, G. T., MacDonald, A. G., Downham, C., Sturrock, R. D., & Macfarlane, G. J. (2014). Global prevalence of ankylosing spondylitis. Rheumatology, 53(4), 650-657.

    Wells, D. A., & McAuley, M. (2023). HLAfreq: Download and combine HLA allele frequency data. bioRxiv, 2023-09. https://doi.org/10.1101/2023.09.15.557761


