<img src="https://spectrum.ieee.org/media-library/squares-and-rectangles-on-graph-paper-form-the-letters-ai.jpg?id=65506010&width=1200&height=800&coordinates=0%2C83%2C0%2C84"/><br/><br/><p>The capabilities of leading AI models continue to accelerate and the largest AI companies, including <a href="https://www.cnbc.com/2026/04/08/openai-ipo-sarah-friar-retail-investors.html" target="_blank">OpenAI</a> and <a href="https://fortune.com/2026/04/10/anthropic-too-dangerous-to-release-ai-model-means-for-its-upcoming-ipo/" target="_blank">Anthropic</a>, are hurtling toward IPOs later this year. Yet resentment towards AI continues to simmer and in some cases has boiled over, especially in the United States, where local governments are beginning to embrace restrictions or outright bans on new data center development.</p><p>It’s a lot to keep track of, but the 2026 edition of the <a href="https://hai.stanford.edu/ai-index" target="_blank">AI Index</a> from Stanford University’s <a href="https://hai.stanford.edu/" target="_blank">Human-Centered Artificial Intelligence</a> center pulls it off. The report, which comes in at over 400 pages, includes dozens of data points and graphs that approach the topic from multiple angles, from benchmark scores to investment and public perception. 
<br/><br/>As in prior years (see our coverage from <a href="https://spectrum.ieee.org/the-state-of-ai-in-15-graphs" target="_self">2021</a>, <a href="https://spectrum.ieee.org/artificial-intelligence-index" target="_self">2022</a>, <a href="https://spectrum.ieee.org/state-of-ai-2023" target="_self">2023</a>, <a href="https://spectrum.ieee.org/ai-index-2024" target="_self">2024</a>, and <a href="https://spectrum.ieee.org/ai-index-2025" target="_self">2025</a>), we’ve read the report and identified the trends that encapsulate the state of AI in 2026.</p><h2>US companies lead in AI models</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Chart showing the number of AI models in the United States, China, and Europe as rising from 2005 to 2025, particularly in China at 30 and the United States at 50, while Europe is at 2." class="rm-shortcode" data-rm-shortcode-id="159dac45bd56ec14147f84e06df793b1" data-rm-shortcode-name="rebelmouse-image" id="04bdb" loading="lazy" src="https://spectrum.ieee.org/media-library/chart-showing-the-number-of-ai-models-in-the-united-states-china-and-europe-as-rising-from-2005-to-2025-particularly-in-china.jpg?id=65506052&width=980"/> </p><p><span>The United States has led the charge in AI model releases over the past decade, and that remains as true in 2025 as in any year prior. According to research institute Epoch AI, organizations based in the United States released 50 “notable” models in 2025. However, China’s output is beginning to close the gap.</span></p><p>Nearly all the notable models originated within industry (as opposed to academic or government institutions). Epoch AI tracked 87 notable model releases from industry in 2025, compared to just 7 from all other sources. This is a major long-term trend.
Models released by industry now make up over 90 percent of notable models, up from just under 50 percent in 2015, and zero in 2003.</p><h2>China leads in robotics</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A line chart of the number of new industrial robots installed in Germany, South Korea, the United States, Japan, and China showing a massive amount more in China." class="rm-shortcode" data-rm-shortcode-id="c8eab0a1f3e6cee9558f0fd235a0b912" data-rm-shortcode-name="rebelmouse-image" id="ab17d" loading="lazy" src="https://spectrum.ieee.org/media-library/a-line-chart-of-the-number-of-new-industrial-robots-installed-in-germany-south-korea-the-united-states-japan-and-china-showi.jpg?id=65506352&width=980"/> </p><p>While U.S. companies released the largest number of notable AI models, China has an equally clear lead in the deployment of robotics. According to data from the International Federation of Robotics, China installed 295,000 industrial robots in 2024. Japan installed roughly 44,500, and the United States installed 34,200.</p><h2>World AI compute capacity has grown 3.3x yearly since 2022</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Bar chart showing the portion of global computing capacity from AI chips from Nvidia, Google, Amazon, AMD and Huawei, mostly dominated by Nvidia." 
class="rm-shortcode" data-rm-shortcode-id="29c8f6c52d2ab01a3f69f6627314e90f" data-rm-shortcode-name="rebelmouse-image" id="0a5a1" loading="lazy" src="https://spectrum.ieee.org/media-library/bar-chart-showing-the-portion-of-global-computing-capacity-from-ai-chips-from-nvidia-google-amazon-amd-and-huawei-mostly-dom.jpg?id=65506056&width=980"/> </p><p><span>The latest Stanford AI Index report has no shortage of head-turning numbers on the AI build-out, but none beats Epoch AI’s gauge of total AI compute.</span></p><p>This graph, which uses the compute power of Nvidia’s H100e as a yardstick, shows that the world’s AI compute capacity has increased more than three-fold every year since 2022. Total AI compute has increased 30-fold since 2021, the first year tracked. </p><p>Nvidia has benefited most from this build-out, as its GPUs account for over 60 percent of the total AI compute capacity in the world today. Amazon and Google—each of which designs its own hardware for AI workloads—come in second and third.</p><h2>Training AI models can generate enormous carbon emissions</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Chart showing estimated carbon emissions from training of AI models from 2012 to 2025. With Grok 3 and Grok 4, the chart shows a tremendous increase in 2025." class="rm-shortcode" data-rm-shortcode-id="3e8a202e57d64ac69720c82cb8749d54" data-rm-shortcode-name="rebelmouse-image" id="08510" loading="lazy" src="https://spectrum.ieee.org/media-library/chart-showing-estimated-carbon-emissions-from-training-of-ai-models-from-2012-to-2025-with-grok-3-and-grok-4-the-chart-shows-a.jpg?id=65506058&width=980"/> </p><p><span>Stanford’s AI Index has called out the carbon emissions from AI training in prior years, and the issue continues to trend in a worrying direction.</span></p><p>The report estimates that training the latest frontier large language models, such as xAI’s Grok 4, can generate over 72,000 tons of carbon-equivalent emissions.
That’s a huge increase from estimates in prior years. OpenAI’s GPT-4 was estimated at 5,184 tons, and Meta’s Llama 3.1 405B was estimated at 8,930 tons. </p><p><a href="https://hai.stanford.edu/people/ray-perrault" target="_blank">Ray Perrault</a>, co-director of the AI Index steering committee, says these figures are estimates. “These estimates should be interpreted with caution. In the case of Grok, they rely heavily on inferred inputs drawn from public reporting (e.g., Forbes articles), xAI statements, and other non-verifiable sources, introducing a degree of uncertainty,” says Perrault. On the other hand, Perrault notes that “Epoch AI independently estimates Grok 4’s emissions to be significantly higher at approximately 140,000 tons of CO₂.”</p><p>Emissions from AI inference also continue to increase, though results again vary by model. The report estimates that carbon emissions from models with the least efficient inference are over 10 times as high as those with the most efficient inference. <a data-linked-post="2671027978" href="https://spectrum.ieee.org/deepseek" target="_blank">DeepSeek</a>’s V3 models were estimated to consume around 23 watt-hours when responding to a “medium-length” prompt, while Claude 4 Opus was estimated to consume about 5 watt-hours.</p><h2>LLMs are rapidly defeating new benchmarks</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A chart shows AI index technical performance benchmarks compared to human performance for a variety of tasks from 2012 to 2025. Image classification surpassed human performance early, but only in the 2020s have models begun to near or surpass human baselines in a number of tasks."
class="rm-shortcode" data-rm-shortcode-id="7a521cbb67087cb2bdd493f8982f990f" data-rm-shortcode-name="rebelmouse-image" id="8b302" loading="lazy" src="https://spectrum.ieee.org/media-library/a-chart-shows-ai-index-technical-performance-benchmarks-compared-to-human-performance-for-a-variety-of-tasks-from-2012-to-2025.jpg?id=65506063&width=980"/> </p><p><span>The capabilities of AI models have improved with incredible speed over the past decade and, as the graph above shows, progress seems to be accelerating. Multimodal LLMs, in particular, are conquering benchmarks nearly as quickly as they can be invented.</span></p><p><a data-linked-post="2669884140" href="https://spectrum.ieee.org/ai-agents" target="_blank">Agentic AI</a> has experienced the most extreme gains. The two steep lines at the right of the chart represent the <a href="https://os-world.github.io/" target="_blank">OSWorld benchmark</a>, which measures autonomous computer use, and the <a href="https://openai.com/index/introducing-swe-bench-verified/" target="_blank">SWE-Bench Verified</a> software engineering benchmark, which evaluates autonomous coding.</p><p>Models are also rapidly improving on <a href="https://agi.safe.ai/" target="_blank">Humanity’s Last Exam</a>. This benchmark includes questions contributed by subject-matter experts designed to represent the toughest problems in their fields. The 2025 Stanford AI Index reported that the top-ranking model, OpenAI’s o1, correctly answered just 8.8 percent of questions. Since then, accuracy has increased to 38.3 percent—and even that number is a bit out of date, <a href="https://llm-stats.com/benchmarks/humanity's-last-exam" target="_blank">as the best-scoring models as of April 2026</a> (such as Anthropic’s Claude Opus 4.6 and Google’s Gemini 3.1 Pro) top 50 percent.</p><p>Still, Perrault cautions that benchmarks may not always map to real-world results.
“We generally lack measures of how well a system (or agent) needs to function in a particular setting,” says Perrault. “Knowing that a benchmark for legal reasoning has 75 percent accuracy tells us little about how well it would fit in a law practice’s activities.”</p><h2>AI research in medicine sees gains</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A bar graph shows increasing numbers of publications on AI for drug discovery from 2018 to 2025." class="rm-shortcode" data-rm-shortcode-id="b063563fc0db5ddc7f527ce88bca7399" data-rm-shortcode-name="rebelmouse-image" id="8120f" loading="lazy" src="https://spectrum.ieee.org/media-library/a-bar-graph-shows-increasing-numbers-of-publications-on-ai-for-drug-discovery-from-2018-to-2025.jpg?id=65506067&width=980"/> </p><p><span>Gains in AI benchmarks seem to be reflected in medicine, where AI adoption has increased at a rapid pace, particularly in research. As the graph above shows, the number of publications on the use of AI for drug discovery has more than doubled over the past two years. There are 2.7 times as many publications on multimodal biomedical AI, which is used to examine medical images alongside text, as there were two years ago.</span></p><h2>LLMs still have trouble reading an analog clock</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A bar chart compares different LLMs taking on the task of reading an analog clock, ranging from only 8.9% to 50.60% accuracy."
class="rm-shortcode" data-rm-shortcode-id="e106617ccc2cd8fb3fcf8cd9788f8543" data-rm-shortcode-name="rebelmouse-image" id="02fc0" loading="lazy" src="https://spectrum.ieee.org/media-library/a-bar-chart-compares-different-llms-taking-on-the-task-of-reading-an-analog-clock-ranging-from-only-8-9-to-50-60-accuracy.jpg?id=65506069&width=980"/> </p><p><span>While AI models have improved rapidly in some areas, they remain remarkably bad at some common tasks, like </span><a data-linked-post="2674259807" href="https://spectrum.ieee.org/large-language-models-reading-clocks" target="_blank">reading clocks</a><span> and understanding calendars. </span><a href="https://clockbench.ai/ClockBench.pdf" target="_blank">ClockBench</a><span>, which measures a multimodal LLM’s ability to read an analog clock, found that even the model best at this task, OpenAI’s GPT-5.4, had just 50-50 odds of getting it right.</span></p><p>Most models scored far worse. Anthropic’s Claude Opus 4.6 read the time correctly with just 8.9 percent accuracy. That’s surprising because the model often scores well on other benchmarks. (As previously mentioned, Claude Opus 4.6 delivered top-notch scores on Humanity’s Last Exam.)</p><p>Of course, LLMs will rarely be asked to perform this task in real life, but Perrault says it represents a more general issue. “There is a research thread that shows that when systems are asked questions about combinations of language with other modalities (e.g. images, or audio, as in tone of voice), the language component carries a surprisingly large part of the burden, even to the extent of ignoring non-language information completely.” </p><h2>AI investment hit a new peak in 2025</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A bar chart showing global corporate investment in AI by investment activity from 2013 to 2025 highlighting a rise in 2021, followed by a dip in 2022-2024 and then a huge increase again in 2025."
class="rm-shortcode" data-rm-shortcode-id="4801a130521bd0f38143f4882701c41c" data-rm-shortcode-name="rebelmouse-image" id="9dea4" loading="lazy" src="https://spectrum.ieee.org/media-library/a-bar-chart-showing-global-corporate-investment-in-ai-by-investment-activity-from-2013-to-2025-highlighting-a-rise-in-2021-foll.jpg?id=65506071&width=980"/> </p><p><span>The gains in AI model performance have gone hand-in-hand with investment into AI companies. According to data from AI analytics company </span><a href="https://www.quid.com/" target="_blank">Quid</a><span>, 2025 set a new record for AI investment with over US $581 billion spent.</span></p><p>That’s more than double the $253 billion spent in 2024 and speeds past the previous record of $360 billion, which was set in 2021. And unlike 2021, when investment was led by mergers and acquisitions, 2025’s record-setting result was led by private investment into AI companies.</p><p>Most of that money is flowing into the United States, where over $344 billion was invested in AI last year. </p><h2>Software engineers are all-in on AI</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A line graph shows the number of GitHub AI projects from 2011 to 2025 as an increase from about 0 to 5.58 million." class="rm-shortcode" data-rm-shortcode-id="6e2f54b974d8cb26eb62f6c48aa8e3bf" data-rm-shortcode-name="rebelmouse-image" id="db874" loading="lazy" src="https://spectrum.ieee.org/media-library/a-line-graph-shows-the-number-of-github-ai-projects-from-2011-to-2025-as-a-increase-from-about-0-to-5-58-million.jpg?id=65506074&width=980"/> </p><p><span>However, the story of AI adoption isn’t just about private money. There’s also a grassroots enthusiasm for AI on GitHub, where the number of AI-related projects has rocketed to 5.58 million through 2025.
That’s a roughly five-fold increase since 2020 and a 23.7 percent increase from 2024.</span></p><p>This number doesn’t appear to represent a flood of AI-generated projects, either. The number of projects with at least 10 stars has increased at a similar rate, as has the number of stars awarded to AI projects overall. That suggests the projects are seeing human engagement. Perhaps this should be no surprise given the popularity of some projects. Open-source agentic AI software OpenClaw, for example, <a href="https://github.com/openclaw/openclaw" target="_blank">has received 352,000 stars</a>. </p><p>Critics may worry that the enthusiasm is in part driven by AI bots or agentic projects. Perrault acknowledges this, saying that “probably the intensity of GitHub use is highly correlated with the intensity of AI use.” However, the majority of GitHub activity still appears to be conducted by humans, at least according to an activity-tracking website called <a href="https://insights.logicstar.ai">Agents in the Wild</a> (this website is not mentioned in Stanford’s report).</p><p>Enthusiasm is strong in computer science, too. The number of AI-related computer science publications has more than doubled over the past decade, from 102,000 to 258,000. Over 68 percent of these still originate in academia, with government and industry contributing about 11.5 and 12.5 percent, respectively, as of 2024. The growth is led by publications in machine learning, computer vision, and generative AI.</p><h2>AI’s overall impact on employment remains unclear</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Two line charts show headcount trends for software developers and customer support agents by age from 2021 to 2025. Of note is a distinct decrease in headcount for early career workers."
class="rm-shortcode" data-rm-shortcode-id="17e3f35d335584d1be06b2a46d40710b" data-rm-shortcode-name="rebelmouse-image" id="d66e9" loading="lazy" src="https://spectrum.ieee.org/media-library/two-line-charts-show-headcount-trends-for-software-developers-and-customer-support-agents-by-age-from-2021-to-2025-of-note-is-a.jpg?id=65506076&width=980"/> </p><p><span>The rise of generative AI goes hand-in-hand with employment worries, a phenomenon no doubt encouraged by the dire predictions of CEOs at the world’s largest AI companies. However, the data so far remains mixed.</span></p><p>Above you’ll find graphs that show the “normalized head count” among varying age demographics in two professions thought to be at high risk of AI replacement: software developers and customer support agents. As in prior years, the trends show that entry-level jobs in these professions have been reduced, while mid-career and senior positions have held steady or increased. <br/><br/>However, these changes remain difficult to untangle from broader economic trends. The report notes that unemployment is rising across many occupations and that, contrary to expectations, unemployment among workers least exposed to AI has risen more than unemployment among workers most exposed to AI.</p><h2>Overall public perception of AI (slightly) improves</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="Bar charts show responses to various opinion statements related to AI from 2022 to 2025. " class="rm-shortcode" data-rm-shortcode-id="a7fbb2f03e9c1954db8ae14e99f037a8" data-rm-shortcode-name="rebelmouse-image" id="33355" loading="lazy" src="https://spectrum.ieee.org/media-library/bar-charts-show-responses-to-various-opinion-statements-related-to-ai-from-2022-to-2025.jpg?id=65506086&width=980"/> </p><p><span>The report’s most surprising finding is no doubt the small but notable increase in optimism about AI over the past several years.
Some 59 percent of respondents to a survey conducted by Ipsos said “the benefits outweigh the drawbacks,” up from 55 percent in 2024. And 68 percent said they have a “good understanding” of AI, a slight uptick from 67 percent in 2024.</span></p><p>Survey responses to similar questions suggest that the overall reception to AI is more positive than negative, though some negative feelings have also increased. For example, 52 percent of respondents said that products and services that use AI make them “nervous.”</p><p>Sentiment varies significantly by country. Countries in East and Southeast Asia, including China, Malaysia, Thailand, Indonesia, and Singapore, are trending more positive towards AI. However, the strongest positive year-over-year shifts were in Germany (12 points), France (10 points), and the Netherlands (10 points). Colombia saw the most negative shift (-6 points), a reversal of the trend from prior years.</p><h2>Trust in AI regulation varies significantly by country</h2><p class="shortcode-media shortcode-media-rebelmouse-image"> <img alt="A chart shows trust in government regulation of AI by country led by Singapore with 81% and with the United States at the bottom with 31%." class="rm-shortcode" data-rm-shortcode-id="f5c779dfa5e5b877b12121418d42b160" data-rm-shortcode-name="rebelmouse-image" id="6eb33" loading="lazy" src="https://spectrum.ieee.org/media-library/a-chart-shows-trust-in-government-regulation-of-ai-by-country-led-by-singapore-with-81-and-with-the-united-states-at-the-bottom.jpg?id=65506090&width=980"/> </p><p><span>While a growing number of people seem to feel that AI will have a positive impact, that shift is accompanied by deep distrust in some countries, particularly on the topic of government regulation.</span></p><p>Notably, the United States is at the bottom of the list even while it leads in AI investment. Only 31 percent of Ipsos survey respondents trusted the government to regulate AI.
Many European countries showed low levels of trust, as did Japan. Countries in Asia and South America showed the greatest trust in their governments’ ability to regulate AI.</p><p>The results from the United States and Colombia are intriguing. The U.S. is seeing deep distrust in AI regulation, yet most respondents think AI’s benefits will outweigh its drawbacks. Colombia, on the other hand, shows high trust in AI regulation yet worsening sentiment towards AI overall.</p><p>This feels like a microcosm of the AI narrative in 2025. Both the quality of results from AI models and public perception of how AI will impact society continue to vary, often by wide margins, depending on the task or question at hand. </p>
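For readers who want to check the report's headline multiples against one another, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted in this article; all derived values are approximate, and the implied 2024 GitHub figure is our own inference, not a number from the report.

```python
# Back-of-the-envelope checks on a few of the report's headline multiples.
# All inputs are the figures quoted in the article; outputs are approximate.

# Compute build-out: ~3.3x growth per year since 2022.
yearly_growth = 3.3
print(round(yearly_growth ** 3, 1))  # 35.9 -> three years of 3.3x growth,
# broadly consistent with the reported ~30-fold increase since 2021

# Training emissions: Grok 4 vs. earlier frontier models (tons CO2-equivalent).
grok4, gpt4, llama31 = 72_000, 5_184, 8_930
print(round(grok4 / gpt4, 1))     # 13.9 -> ~14x GPT-4's estimate
print(round(grok4 / llama31, 1))  # 8.1 -> ~8x Llama 3.1 405B's estimate

# Investment: 2025's $581 billion vs. prior years.
print(round(581 / 253, 2))  # 2.3 -> more than double 2024's total
print(round(581 / 360, 2))  # 1.61 -> ~60 percent above the 2021 record

# GitHub: 5.58 million AI projects in 2025, up 23.7 percent from 2024,
# implies roughly 4.5 million projects in 2024 (in millions below).
print(round(5.58e6 / 1.237 / 1e6, 2))  # 4.51
```

The multiples hang together: the Grok 4 estimate is roughly an order of magnitude above the prior frontier models, and 2025 investment clears the old 2021 record by about 60 percent.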
