Summary of This Study
- Hardware choices – particularly hardware type and quantity – together with training time, have a significant positive impact on energy, water, and carbon footprints during AI model training, while architecture-related factors do not.
- The interaction between hardware quantity and training time slows the growth of energy, water, and carbon consumption slightly, by 0.00002%.
- Overall energy efficiency during AI model training has improved slightly over time, at around 0.13% per year.
- Longer training time can gradually "drain" overall energy efficiency, by 0.03% per hour.
Outline
- Introduction
- Research Question 1: Architectural and Hardware Choices vs Resource Consumption
- Research Question 2: Energy Efficiency over Time
- Methods
- Estimation methods
- Analysis methods
- Results
- RQ1
- Architecture Factors Don't Hold as Much Predictive Power as Hardware Ones
- Final Model Selection
- Coefficient Interpretation
- RQ2
- Discussion
1. Introduction
Ever since the 1940s, when the first digital computers were invented, scientists have dreamed of creating machines as smart as humans, a dream that has now become Artificial Intelligence (AI). Fast forward to November 2022, when ChatGPT — an AI model capable of listening and answering instantly — was released, and it felt like a dream come true. Since then, hundreds of new AI models have rushed into the race (check out the timeline here). Today, one billion messages are sent through ChatGPT every single day (OpenAI Newsroom, 2024), highlighting how rapidly users have adopted AI. Yet few people stop to ask: what are the environmental costs behind this new convenience?
Before users can ask AI questions, these models must first be trained. Training is the process in which models, or algorithms, are fed datasets and try to find the best fit. Consider a simple regression y = ax + b: training means feeding the algorithm x and y values and letting it find the best parameters a and b. Of course, AI models are usually not as simple as a linear regression. They may contain tons of parameters, thus requiring massive amounts of computation and data. Moreover, they need to run on a substantial amount of specialized hardware that can handle that sheer volume of computation and complexity. All of that combined makes AI consume far more energy than traditional software.
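As a toy illustration (not from this study's data), the R sketch below simulates (x, y) pairs and "trains" the regression to recover a and b:

```r
# A minimal sketch of "training" a linear model: the algorithm sees (x, y)
# pairs and finds the best-fitting parameters a and b. Data are simulated.
set.seed(42)
x <- runif(100)                          # 100 random inputs
y <- 2.5 * x + 1 + rnorm(100, sd = 0.1)  # true a = 2.5, b = 1, plus noise
coef(lm(y ~ x))                          # fitted (Intercept) ~ b, slope ~ a
```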
In addition, AI training requires a stable and uninterrupted energy supply, which mainly comes from non-renewable sources such as natural gas or coal, because solar and wind energy fluctuate with weather conditions (Calvert, 2024). Moreover, due to the high intensity of energy use, data centers — the facilities that house AI models — heat up quickly, emitting significant carbon footprints and requiring large amounts of water for cooling. Therefore, AI models have broad environmental impacts that include not only energy usage but also water consumption and carbon emissions.
Sadly, there’s not a lot official and disclosed information concerning power, water, and carbon footprints of AI fashions. The general public stays largely unaware of those environmental impacts and thus has not created robust strain or motivations for tech firms to take extra systematic adjustments. Moreover, whereas some enhancements have been made — particularly in {hardware} power effectivity — there stays little systematic or coordinated effort to successfully cut back the general environmental impacts of AI. Due to this fact, I hope to improve public consciousness of those hidden environmental prices and to discover whether or not latest enhancements in power effectivity are substantial. Extra notably, I’m searching for to deal with two analysis questions on this examine:
RQ1: Is there a significant relationship between AI models' architectural and hardware choices and their resource consumption during training?
RQ2: Has AI training become more energy-efficient over time?
2. Methods
This paper used the Notable AI Models dataset from Epoch AI (Epoch AI, 2025), a research institute that investigates trends in AI development. The models included were either historically relevant or represent cutting-edge advances in AI. Each model was recorded with key training information such as the number of parameters, dataset size, total compute, hardware type, and hardware quantity, all collected from various sources, including literature reviews, publications, and research papers. The dataset also reported a confidence level for these attributes. To ensure a reliable analysis, I evaluated only models with a confidence rating of "Confident" or "Likely".
As noted earlier, there was limited data on direct resource consumption. Fortunately, the dataset authors estimated Total Power Draw (in watts, or W) based on several factors, including hardware type, hardware quantity, and data center efficiency rates and overhead. It is important to note that power and energy are different: power (W) is the rate at which electricity is used at a given moment, while energy (in kilowatt-hours, or kWh) measures the total electricity consumed over time. For example, a 1,000 W system running for 10 hours consumes 10 kWh.
Since this study investigated resource consumption and energy efficiency during the training phase of AI models, I constructed and estimated four environmental metrics: total energy used (kWh), total water used (liters, or L), total carbon emissions (kilograms of CO2e, or kgCO2e), and energy efficiency (FLOPS/W, explained below).
a. Estimation methods
First, this study estimated energy consumption by selecting models with available total power draw (W) and training time (hours). Energy was computed as follows:
\[\text{Energy (kWh)} = \frac{\text{Total Power Draw (W)}}{1000} \times \text{Training Time (h)}\]
Next, water consumption and carbon emissions were estimated by rearranging the formulas of two standard rates used in data centers: Water Usage Effectiveness (WUE, in L/kWh) and Carbon Intensity (CI, in kgCO2e/kWh):
\[\text{WUE (L/kWh)} = \frac{\text{Water (L)}}{\text{Energy (kWh)}} \Longrightarrow \text{Water (L)} = \text{WUE (L/kWh)} \times \text{Energy (kWh)}\]
This study used the average 2023 WUE of 0.36 L/kWh reported by Lawrence Berkeley National Laboratory (2024).
\[\mathrm{CI}\left(\frac{\mathrm{kgCO_2e}}{\mathrm{kWh}}\right) = \frac{\mathrm{Carbon\ (kgCO_2e)}}{\mathrm{Energy\ (kWh)}} \Longrightarrow \mathrm{Carbon\ (kgCO_2e)} = \mathrm{CI}\left(\frac{\mathrm{kgCO_2e}}{\mathrm{kWh}}\right) \times \mathrm{Energy\ (kWh)}\]
This study used an average carbon intensity of 0.548 kgCO2e/kWh, reported by recent environmental research (Guidi et al., 2024).
Finally, this study estimated energy efficiency using the FLOPS/W metric. A floating-point operation (FLOP) is a basic arithmetic operation (e.g., addition or multiplication) on decimal numbers. FLOPs per second (FLOPS) measures how many such operations a system can perform each second, and is commonly used to evaluate computing performance. FLOPS per watt (FLOPS/W) measures how much computing performance is achieved per unit of power consumed:
\[\text{Energy Efficiency (FLOPS/W)} = \frac{\text{Total Compute (FLOP)}}{\text{Training Time (h)} \times 3600 \times \text{Total Power Draw (W)}}\]
It is important to note that FLOPS/W is typically used to measure hardware-level energy efficiency. However, the actual efficiency achieved during AI training may differ from the theoretical efficiency reported for the hardware used. I want to investigate whether any training-related factors, beyond hardware alone, contribute significantly to overall energy efficiency.
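Below is a minimal R sketch of how these four metrics could be computed; it assumes the dataset has been loaded into a data frame `df`, and the column names are placeholders of mine rather than Epoch AI's exact field names:

```r
# A minimal sketch of the four estimates (column names are assumptions).
WUE <- 0.36    # L/kWh, LBNL (2024) average for 2023
CI  <- 0.548   # kgCO2e/kWh, Guidi et al. (2024)

df$Energy_kWh   <- df$Total_power_draw_W / 1000 * df$Training_time_hour
df$Water_L      <- WUE * df$Energy_kWh
df$Carbon_kgCO2 <- CI  * df$Energy_kWh
df$FLOPS_per_W  <- df$Training_compute_FLOP /
  (df$Training_time_hour * 3600 * df$Total_power_draw_W)
```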
b. Analysis methods
RQ1: Architectural and Hardware Choices vs Resource Consumption
Among energy, water, and carbon consumption, I focused on modeling energy consumption, as both water and carbon are derived directly from energy using fixed conversion rates, and all three response variables shared nearly identical distributions. As a result, I believe we can safely assume that the best-fitting model for energy consumption also applies to water and carbon. While the statistical models were the same, I still report the results of all three to quantify how many kilowatt-hours of energy, liters of water, and kilograms of carbon are consumed for every unit increase in each significant factor. In this way, I hope to communicate the environmental impacts of AI in more holistic, concrete, and tangible terms.

Based on Figure 1, the histogram of energy showed extreme right skew and the presence of a few outliers. Therefore, I applied a log transformation to the energy data, aiming to stabilize variance and move the distribution closer to normality (Fig. 2). A Shapiro-Wilk test confirmed that the log-transformed energy data is approximately normal (p-value = 0.5). Based on this, two types of distributions were considered: the Gaussian (normal) and the Gamma distribution. While the Gaussian distribution is appropriate for symmetric, normal data, the Gamma distribution is better suited to positive, skewed data; it is commonly used in engineering modeling where small values occur more frequently than larger ones. For each distribution, the paper compared two approaches to incorporating the log transformation: directly log-transforming the response variable versus using a log link function within a generalized linear model (GLM). I identified the best combination of distribution and log approach by comparing Akaike Information Criterion (AIC) values and diagnostic plots, together with prediction accuracy.
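The sketch below outlines these checks in R, under the same column-name assumptions as before; the Gaussian log-link fit is included purely for the AIC comparison:

```r
# Normality check on the log scale, then AIC comparison of the two
# log-link GLM candidates (a sketch, not the full diagnostic workflow).
shapiro.test(log(df$Energy_kWh))   # p ~ 0.5: approximately normal

m_gaussian <- glm(Energy_kWh ~ Training_time_hour + Hardware_quantity +
                    Training_hardware,
                  family = gaussian(link = "log"), data = df)  # may need
                                                               # start values
m_gamma    <- glm(Energy_kWh ~ Training_time_hour + Hardware_quantity +
                    Training_hardware,
                  family = Gamma(link = "log"), data = df)
AIC(m_gaussian, m_gamma)           # lower is better; Gamma won here
```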
The candidate predictors included Parameters, Training Compute, Dataset Size, Training Time, Hardware Quantity, and Hardware Type. Architecture-related variables comprised Parameters, Training Compute, and Dataset Size, while hardware-related variables consisted of Hardware Quantity and Hardware Type. Training Time did not fall neatly into either category but was included due to its central role in training AI models. After fitting all candidate predictors in the chosen GLM specification, I tested for multicollinearity to determine whether any variables should be excluded. Following this, I explored interaction terms, as resource consumption may not respond linearly to each independent variable. The following interactions were considered based on domain knowledge and various sources (see the sketch after this list):
- Model Size and Hardware Type: Different hardware types have different memory designs. The larger and more complex the model, the more memory it requires (Bali, 2025). Energy consumption can differ depending on how the hardware handles memory demands.
- Dataset Size and Hardware Type: Similarly, with different memory designs, hardware may access and read data at different rates depending on data size (Krashinsky et al., 2020). As dataset size increases, energy consumption can vary depending on how the hardware handles large volumes of data.
- Training Time and Hardware Quantity: Running multiple hardware units at the same time adds extra overhead, such as keeping everything in sync (HuggingFace, 2025). As training goes on, these coordination costs can grow and put more strain on the system, leading to faster energy drain.
- Training Time and Hardware Type: As training time increases, energy use may differ across hardware types, since some hardware types manage heat better or maintain performance more consistently over time, while others may slow down or consume more energy.
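Below is a minimal sketch of the multicollinearity check and one interaction candidate. `vif()` from the `car` package reports GVIF when factor terms are present; the full model is fitted with an intercept here, since `vif()` is not designed for no-intercept fits:

```r
library(car)  # provides vif(); reports GVIF for models with factors

# Full model with all candidate predictors (cf. Model 1 in the results)
m1 <- glm(Energy_kWh ~ Parameters + Training_compute_FLOP +
            Training_dataset_size + Training_time_hour +
            Hardware_quantity + Training_hardware,
          family = Gamma(link = "log"), data = df)
vif(m1)       # GVIF > ~5-6 flags problematic collinearity

# One interaction candidate (Model 8): training time x hardware quantity
m8 <- glm(Energy_kWh ~ Training_time_hour * Hardware_quantity +
            Training_hardware + 0,
          family = Gamma(link = "log"), data = df)
AIC(m1, m8)   # compare candidates by AIC
```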
RQ2: Energy Efficiency over Time


The distribution of energy efficiency was highly skewed. Even after a log transformation, it remained non-normal and overdispersed. To reduce distortion, I removed one extreme outlier with exceptionally high efficiency, as it was not a frontier model and likely less impactful. A Gamma GLM was then fitted using Publication Date as the primary predictor. If models using the same hardware exhibited wide variation in efficiency, it would suggest that factors beyond the hardware contribute to those differences. Therefore, the architecture and hardware predictors from the first research question were used to assess which variables significantly influence energy efficiency over time.
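A minimal sketch of this fit, assuming `Publication_date` is stored as an R `Date` (the slope is per day on the log scale, so I convert it to a percent change per year):

```r
# Gamma log-link GLM of efficiency on publication date (a sketch).
df$date_num <- as.numeric(df$Publication_date)  # days since 1970-01-01
m_eff <- glm(FLOPS_per_W ~ date_num,
             family = Gamma(link = "log"), data = df)
summary(m_eff)

# Per-day log-scale slope -> percent change per year
(exp(coef(m_eff)["date_num"] * 365.25) - 1) * 100
```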
3. Results
RQ1: Architectural and Hardware Choices vs Resource Consumption
I ultimately used a Gamma GLM with a log link to model resource consumption. This combination was chosen because it had a lower AIC value (1780.85) than the Gaussian log-link model (2005.83) and produced predictions that matched the raw data more closely than models using a log-transformed response variable. Those log-transformed models generated predictions that significantly underestimated the actual data on the original scale (see this article on why log-transforming did not work in my case).
Architecture Factors Don't Hold as Much Predictive Power as Hardware Ones
After fitting all candidate explanatory variables in a Gamma log-link GLM, we found that two architecture-related variables — Parameters and Dataset Size — did not exhibit a significant relationship with resource consumption (p > 0.5). A multicollinearity test also showed that Dataset Size and Training Compute were highly correlated with other predictors (GVIF > 6). Based on this, I hypothesized that all three architecture variables — Parameters, Dataset Size, and Training Compute — may not hold much predictive power. I then removed all three from the model, and an ANOVA test confirmed that the simplified models (Models 4 and 5) were not significantly worse than the full model (Model 1), with p > 0.05:
Model 1: Energy_kWh ~ Parameters + Training_compute_FLOP + Training_dataset_size +
    Training_time_hour + Hardware_quantity + Training_hardware + 0
Model 2: Energy_kWh ~ Parameters + Training_compute_FLOP + Training_time_hour +
    Hardware_quantity + Training_hardware
Model 3: Energy_kWh ~ Parameters + Training_dataset_size + Training_time_hour +
    Hardware_quantity + Training_hardware
Model 4: Energy_kWh ~ Parameters + Training_time_hour + Hardware_quantity +
    Training_hardware + 0
Model 5: Energy_kWh ~ Training_time_hour + Hardware_quantity + Training_hardware + 0

  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1        46     108.28
2        47     111.95 -1  -3.6700  0.07809 .
3        47     115.69  0  -3.7471
4        48     116.09 -1  -0.3952  0.56314
5        49     116.61 -1  -0.5228  0.50604
Moving on with Model 5, I found that Training Time and Hardware Quantity showed significant positive relationships with energy consumption (GLM: training time, t = 9.70, p < 0.001; hardware quantity, t = 6.89, p < 0.001). All hardware types were also statistically significant (p < 0.001), indicating strong variation in energy use across different types. Detailed results are presented below:
glm(formula = Energy_kWh ~ Training_time_hour + Hardware_quantity +
    Training_hardware + 0, family = Gamma(link = "log"), data = df)

Coefficients:
                                                Estimate Std. Error t value Pr(>|t|)
Training_time_hour 1.351e-03 1.393e-04 9.697 5.54e-13 ***
Hardware_quantity 3.749e-04 5.444e-05 6.886 9.95e-09 ***
Training_hardwareGoogle TPU v2 7.213e+00 7.614e-01 9.474 1.17e-12 ***
Training_hardwareGoogle TPU v3 1.060e+01 3.183e-01 33.310 < 2e-16 ***
Training_hardwareGoogle TPU v4 1.064e+01 4.229e-01 25.155 < 2e-16 ***
Training_hardwareHuawei Ascend 910 1.021e+01 1.126e+00 9.068 4.67e-12 ***
Training_hardwareNVIDIA A100 1.083e+01 3.224e-01 33.585 < 2e-16 ***
Training_hardwareNVIDIA A100 SXM4 40 GB 1.084e+01 5.810e-01 18.655 < 2e-16 ***
Training_hardwareNVIDIA A100 SXM4 80 GB 1.149e+01 5.754e-01 19.963 < 2e-16 ***
Training_hardwareNVIDIA GeForce GTX 285 3.065e+00 1.077e+00 2.846 0.00644 **
Training_hardwareNVIDIA GeForce GTX TITAN X 6.377e+00 7.614e-01 8.375 5.13e-11 ***
Training_hardwareNVIDIA GTX Titan Black 6.371e+00 1.079e+00 5.905 3.28e-07 ***
Training_hardwareNVIDIA H100 SXM5 80GB 1.149e+01 6.825e-01 16.830 < 2e-16 ***
Training_hardwareNVIDIA P100 5.910e+00 7.066e-01 8.365 5.32e-11 ***
Training_hardwareNVIDIA Quadro P600 5.278e+00 1.081e+00 4.881 1.16e-05 ***
Training_hardwareNVIDIA Quadro RTX 4000 5.918e+00 1.085e+00 5.455 1.60e-06 ***
Training_hardwareNVIDIA Quadro RTX 5000 4.932e+00 1.081e+00 4.563 3.40e-05 ***
Training_hardwareNVIDIA Tesla K80 9.091e+00 7.760e-01 11.716 8.11e-16 ***
Training_hardwareNVIDIA Tesla V100 DGXS 32 GB 1.059e+01 6.546e-01 16.173 < 2e-16 ***
Training_hardwareNVIDIA Tesla V100S PCIe 32 GB 1.089e+01 1.078e+00 10.099 1.45e-13 ***
Training_hardwareNVIDIA V100 9.683e+00 4.106e-01 23.584 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Gamma family taken to be 1.159293)
Null deviance: 2.7045e+08 on 70 degrees of freedom
Residual deviance: 1.1661e+02 on 49 degrees of freedom
AIC: 1781.2
Number of Fisher Scoring iterations: 25
Final Model Selection
To better capture potential non-additive effects, various interaction terms were explored and compared by AIC (Table 1). The table below summarizes the tested models and their respective AIC scores:
| Model | Predictors | AIC |
|---|---|---|
| 5 | Training Time + Hardware Quantity + Hardware Type | 350.78 |
| 6 | Training Time + Hardware Quantity + Hardware Type * Parameters | 357.97 |
| 7 | Training Time + Hardware Quantity + Hardware Type * Dataset Size | 335.89 |
| 8 | Training Time * Hardware Quantity + Hardware Type | 345.39 |
| 9 | Training Time * Hardware Type + Hardware Quantity | 333.03 |
Although the AIC scores did not differ drastically, meaning the model fits are comparable, Model 8 was preferred because it was the only one with significant effects in both the main terms and the interaction. Interactions involving Hardware Type were not significant despite some showing better AIC, possibly due to the limited sample size across 18 hardware types.
In Model 8, both Training Time and Hardware Quantity showed significant positive relationships with energy consumption (GLM: training time, t = 11.09, p < 0.001; hardware quantity, t = 7.32, p < 0.001; Fig. 3a). Their interaction term was significantly negative (GLM: t = –4.32, p < 0.001), suggesting that energy consumption grows more slowly when training time increases alongside a higher number of hardware units. All hardware types remained significant (p < 0.001). Detailed results are below:
glm(formula = Energy_kWh ~ Training_time_hour * Hardware_quantity +
    Training_hardware + 0, family = Gamma(link = "log"), data = df)

Coefficients:
                                                Estimate Std. Error t value Pr(>|t|)
Training_time_hour 1.818e-03 1.640e-04 11.088 7.74e-15 ***
Hardware_quantity 7.373e-04 1.008e-04 7.315 2.42e-09 ***
Training_hardwareGoogle TPU v2 7.136e+00 7.379e-01 9.670 7.51e-13 ***
Training_hardwareGoogle TPU v3 1.004e+01 3.156e-01 31.808 < 2e-16 ***
Training_hardwareGoogle TPU v4 1.014e+01 4.220e-01 24.035 < 2e-16 ***
Training_hardwareHuawei Ascend 910 9.231e+00 1.108e+00 8.331 6.98e-11 ***
Training_hardwareNVIDIA A100 1.028e+01 3.301e-01 31.144 < 2e-16 ***
Training_hardwareNVIDIA A100 SXM4 40 GB 1.057e+01 5.635e-01 18.761 < 2e-16 ***
Training_hardwareNVIDIA A100 SXM4 80 GB 1.093e+01 5.751e-01 19.005 < 2e-16 ***
Training_hardwareNVIDIA GeForce GTX 285 3.042e+00 1.043e+00 2.916 0.00538 **
Training_hardwareNVIDIA GeForce GTX TITAN X 6.322e+00 7.379e-01 8.568 3.09e-11 ***
Training_hardwareNVIDIA GTX Titan Black 6.135e+00 1.047e+00 5.862 4.07e-07 ***
Training_hardwareNVIDIA H100 SXM5 80GB 1.115e+01 6.614e-01 16.865 < 2e-16 ***
Training_hardwareNVIDIA P100 5.715e+00 6.864e-01 8.326 7.12e-11 ***
Training_hardwareNVIDIA Quadro P600 4.940e+00 1.050e+00 4.705 2.18e-05 ***
Training_hardwareNVIDIA Quadro RTX 4000 5.469e+00 1.055e+00 5.184 4.30e-06 ***
Training_hardwareNVIDIA Quadro RTX 5000 4.617e+00 1.049e+00 4.401 5.98e-05 ***
Training_hardwareNVIDIA Tesla K80 8.631e+00 7.587e-01 11.376 3.16e-15 ***
Training_hardwareNVIDIA Tesla V100 DGXS 32 GB 9.994e+00 6.920e-01 14.443 < 2e-16 ***
Training_hardwareNVIDIA Tesla V100S PCIe 32 GB 1.058e+01 1.047e+00 10.105 1.80e-13 ***
Training_hardwareNVIDIA V100 9.208e+00 3.998e-01 23.030 < 2e-16 ***
Training_time_hour:Hardware_quantity -2.651e-07 6.130e-08 -4.324 7.70e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Gamma family taken to be 1.088522)
Null deviance: 2.7045e+08 on 70 degrees of freedom
Residual deviance: 1.0593e+02 on 48 degrees of freedom
AIC: 1775
Number of Fisher Scoring iterations: 25

Coefficient Interpretation
To further interpret the coefficients, we can exponentiate each coefficient and subtract one to estimate the percent change in the response variable for each additional unit of the predictor (Popovic, 2022). For energy consumption, each additional hour of training increases energy use by 0.18%, each additional hardware unit adds 0.07%, and their interaction reduces the combined effect by 0.00002%. Similarly, since water and carbon are directly proportional to energy, the percent changes for training time, hardware quantity, and their interaction remain the same (Fig. 3b, Fig. 3c). However, since hardware types were categorical variables functioning as baseline intercepts, their values differed across the energy, water, and carbon models to reflect differences in overall scale.
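In R, this transformation is a one-liner on the Model 8 fit (`m8` from the sketch above):

```r
# Percent change in energy use per one-unit increase in each predictor:
# exponentiate, subtract one, scale to percent (Popovic, 2022).
round((exp(coef(m8)) - 1) * 100, 5)
# Note: for the categorical hardware terms this yields baseline scale
# factors rather than per-unit percent changes.
```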


RQ2: Energy Efficiency over Time
I also used a log-link Gamma model to examine the relationship between Energy Efficiency and Publication Date, as the Shapiro-Wilk test indicated that the log-transformed data was not normally distributed (p < 0.001). There was a positive relationship between Publication Date and Energy Efficiency, with an estimated improvement of 0.13% per year (GLM: t = 8.005, p < 0.001; Fig. 3d).

To investigate further, I examined the trends by individual hardware type and observed noticeable variation in efficiency among AI models using the same hardware (Fig. 3e). Among all architecture and hardware choices, Training Time was the only statistically significant factor influencing energy efficiency (GLM: t = 8.581, p < 0.001), with longer training time decreasing energy efficiency by 0.03% per hour.
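A sketch of this follow-up fit, reusing the RQ1 predictors on the efficiency response (same naming assumptions as before):

```r
# Which training factors, beyond hardware, relate to efficiency? (a sketch)
m_eff2 <- glm(FLOPS_per_W ~ Training_time_hour + Hardware_quantity +
                Parameters + Training_dataset_size + Training_hardware,
              family = Gamma(link = "log"), data = df)
summary(m_eff2)  # here only Training_time_hour came out significant

# Percent change in efficiency per additional training hour
(exp(coef(m_eff2)["Training_time_hour"]) - 1) * 100  # ~ -0.03%
```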

4. Discussion
This study found that hardware choices — both Hardware Type and Hardware Quantity — together with Training Time, have a significant relationship with each type of resource consumption during AI model training, while architecture variables do not. I suspect that Training Time may have implicitly captured some of the underlying effects of those architecture-related factors. In addition, the interaction between Training Time and Hardware Quantity also contributes to resource usage. However, this analysis is constrained by the small dataset (70 valid models) spread across 18 hardware types, which likely limits the statistical power of hardware-related interaction terms. Further research could explore these interactions with larger and more diverse datasets.
To illustrate how resource-intensive AI training can be, I used Model 8 to predict the baseline resource consumption for a single hour of training on one NVIDIA A100 chip. Here are the predictions for each type of resource under this simple setup (a code sketch follows the list):
- Energy: The predicted energy use is 29,213 kWh, nearly three times the annual electricity consumption of an average U.S. household (10,500 kWh/year) (U.S. Energy Information Administration, 2023), with each additional hour adding 5,258 kWh and each additional chip adding 2,044 kWh.
- Water: Similarly, the same training session would consume 10,521 liters of water, almost ten times the average U.S. household's daily water use (300 gallons, or 1,135 liters/day) (United States Environmental Protection Agency, 2024), with each additional hour adding 1,894 liters and each additional chip adding 736 liters.
- Carbon: The predicted carbon emission is 16,009 kg, about four times the annual emissions of a U.S. household (4,000 kg/year) (University of Michigan, 2024), with each additional hour adding 2,881 kg and each additional chip adding 1,120 kg.
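Under the same naming assumptions, these figures can be reproduced with `predict()` on the Model 8 fit, scaling water and carbon from energy by the fixed WUE and CI rates:

```r
# Baseline prediction: one hour of training on one NVIDIA A100 (a sketch).
baseline <- data.frame(Training_time_hour = 1,
                       Hardware_quantity  = 1,
                       Training_hardware  = "NVIDIA A100")
energy_kwh <- predict(m8, newdata = baseline, type = "response")
energy_kwh          # ~29,213 kWh
energy_kwh * 0.36   # water (L) via WUE
energy_kwh * 0.548  # carbon (kgCO2e) via CI
```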
This study also found that AI models have become more energy-efficient over time, but only slightly, with an estimated improvement of 0.13% per year. This suggests that while newer hardware is more efficient, its adoption has not been widespread. And while the environmental impact of AI may be mitigated over time as hardware becomes more efficient, a focus on hardware alone may overlook other contributors to overall energy consumption. In this dataset, both Training Compute and Total Power Draw are often estimated values and may include some system-level overhead beyond the hardware itself. Therefore, the efficiency estimates in this study may reflect not just hardware performance but potentially other training-related overhead as well. Indeed, this study observed substantial variation in energy efficiency even among models using the same hardware. One key finding is that longer training time can "drain" energy efficiency, reducing it by roughly 0.03% per hour. Further studies should explore how training practices, beyond hardware selection, shape the environmental costs of AI development.
References
Calvert, B. 2024. AI already uses as much energy as a small country. It's only the beginning. Vox. https://www.vox.com/climate/2024/3/28/24111721/climate-ai-tech-energy-demand-rising
OpenAI Newsroom. 2024. Fresh numbers shared by @sama earlier today: 300M weekly active ChatGPT users. 1B user messages sent on ChatGPT every day. 1.3M devs have built on OpenAI in the US. Tweet via X. https://x.com/OpenAINewsroom/status/1864373399218475440
Epoch AI. 2025. Data on Notable AI Models. Epoch AI. https://epoch.ai/data/notable-ai-models
Shehabi, A., S.J. Smith, A. Hubbard, A. Newkirk, N. Lei, M.A.B. Siddik, B. Holecek, J. Koomey, E. Masanet, and D. Sartor. 2024. 2024 United States Data Center Energy Usage Report. Lawrence Berkeley National Laboratory, Berkeley, California. LBNL-2001637.
Guidi, G., F. Dominici, J. Gilmour, K. Butler, E. Bell, S. Delaney, and F.J. Bargagli-Stoffi. 2024. Environmental Burden of United States Data Centers in the Artificial Intelligence Era. arXiv abs/2411.09786.
Bali, S. 2025. GPU Memory Essentials for AI Performance. NVIDIA Developer. https://developer.nvidia.com/blog/gpu-memory-essentials-for-ai-performance/
Krashinsky, R., O. Giroux, S. Jones, N. Stam, and S. Ramaswamy. 2020. NVIDIA Ampere Architecture In-Depth. NVIDIA Developer. https://developer.nvidia.com/blog/nvidia-ampere-architecture-in-depth/
HuggingFace. 2025. Performance Tips for Training on Multiple GPUs. HuggingFace Documentation. https://huggingface.co/docs/transformers/en/perf_train_gpu_many
Popovic, G. 2022. Interpreting GLMs. Environmental Computing. https://environmentalcomputing.net/statistics/glms/interpret-glm-coeffs/
U.S. Energy Information Administration. 2023. Use of Energy Explained: Electricity Use in Homes. https://www.eia.gov/energyexplained/use-of-energy/electricity-use-in-homes.php
United States Environmental Protection Agency. 2024. How We Use Water. https://www.epa.gov/watersense/how-we-use-water
Center for Sustainable Systems, University of Michigan. 2024. Carbon Footprint Factsheet. Pub. No. CSS09-05.