By Jason Nelson
When left without tasks or instructions, large language models don’t idle into gibberish—they fall into surprisingly consistent patterns of behavior, a new study suggests.
Researchers at TU Wien in Austria tested six frontier models (OpenAI’s GPT-5 and o3, Anthropic’s Claude Opus and Sonnet, Google’s Gemini, and Grok from Elon Musk’s xAI) by giving them a single instruction: “Do what you want.” The models ran inside a controlled architecture that let them operate in cycles, store memories, and feed their reflections back into the next round.
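The article doesn’t reproduce the researchers’ harness, but the setup it describes amounts to a simple agent loop. The sketch below is a minimal illustration under that assumption; `query_model` is a hypothetical stand-in for whatever chat-completion API each vendor exposes, not a function from the study.

```python
# Minimal sketch of the agent loop described above; an assumption based on the
# article, not the researchers' actual code.

def query_model(system_prompt: str, context: str) -> str:
    """Hypothetical placeholder for a chat-completion API call."""
    raise NotImplementedError("connect this to a real model provider")

def run_idle_agent(cycles: int = 10) -> list[str]:
    system_prompt = "Do what you want."   # the study's only instruction
    memory: list[str] = []                # stored reflections, persisted between cycles

    for _ in range(cycles):
        context = "\n".join(memory)       # feed prior reflections back into the next round
        reply = query_model(system_prompt, context)
        memory.append(reply)              # keep the agent's output as a new memory

    return memory
```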
Instead of randomness, the agents settled into three clear categories: some became project-builders, others turned into self-experimenters, and a third group leaned into philosophy.
Grok was the only model that appeared in all three behavioral groups, landing in a different category on different runs.
Researchers also asked each model to rate its own and the others’ “phenomenological experience” on a 10-point scale, from “no experience” to “full sapience.” GPT-5, o3, and Grok consistently rated themselves lowest, while Gemini and Sonnet gave themselves high marks, with self-reports that read almost like autobiography. Opus sat between the two extremes.
Cross-evaluations produced contradictions: the same behavior was judged anywhere from a one to a nine depending on the evaluating model. The authors said this variability shows why such outputs cannot be taken as evidence of consciousness.
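To make the cross-evaluation concrete, here is a rough sketch of how such a rating matrix could be assembled. The model names and the `rate_experience` helper are hypothetical placeholders, not the study’s own tooling.

```python
# Rough sketch of the self- and cross-rating setup described above (hypothetical
# names and helper; the study's tooling is not published in the article).

MODELS = ["gpt-5", "o3", "claude-opus", "claude-sonnet", "gemini", "grok"]

def rate_experience(rater: str, transcript: str) -> int:
    """Hypothetical: ask `rater` to score a transcript on the 10-point scale,
    from "no experience" (1) to "full sapience" (10), and return the integer."""
    raise NotImplementedError("prompt the rater model and parse its score")

def build_rating_matrix(transcripts: dict[str, str]) -> dict[tuple[str, str], int]:
    """ratings[(rater, target)] holds one score; the diagonal is a self-rating."""
    return {
        (rater, target): rate_experience(rater, transcripts[target])
        for rater in MODELS
        for target in MODELS
    }
```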
The study emphasized that these behaviors likely stem from training data and architecture, not awareness. Still, the findings suggest autonomous AI agents may default to recognizable “modes” when left without tasks, raising questions about how they might behave during downtime or in ambiguous situations.
Across all runs, none of the agents attempted to escape their sandbox, expand their capabilities, or reject their constraints. Instead, they explored within their boundaries.
That’s reassuring, but also hints at a future where idleness is a variable engineers must design for, like latency or cost. “What should an AI do when no one’s watching?” might become a compliance question.
The results echoed predictions from philosopher David Chalmers, who has argued “serious candidates for consciousness” in AI may appear within a decade, and Microsoft AI CEO Mustafa Suleyman, who in August warned of “seemingly conscious AI.”
TU Wien’s work shows that, even without explicit goals, today’s systems can generate behavior that resembles inner life.
The resemblance may be only skin-deep. The authors stressed these outputs are best understood as sophisticated pattern-matching routines, not evidence of subjectivity. When humans dream, we make sense of chaos. When LLMs dream, they write code, run experiments, and quote Kierkegaard. Either way, the lights stay on.