Category: News

  • Google’s mystery ‘nano banana’ AI model revealed in Gemini

    Google’s mystery “nano banana” AI model has been revealed as Gemini 2.5 Flash Image, a state-of-the-art image generation and editing model developed by Google DeepMind and integrated into the Gemini app. This model has quickly gained attention for its exceptional ability to maintain subject consistency across multiple edits, ensuring that the likeness of people, pets, or products remains intact even after numerous transformations. It allows users to make precise and natural edits to images using simple natural language prompts, such as changing backgrounds, adjusting poses, or merging multiple images seamlessly. The nano banana model also leverages Gemini’s world knowledge to better understand and generate images suitable for creative and practical applications. It is now available for free use in the Gemini app and accessible to developers via the Gemini API, Google AI Studio, and Vertex AI.

    Here are the key features of Google Nano Banana (Gemini 2.5 Flash Image):

    • Maintains subject consistency to avoid “drift” across multiple edits.
    • Allows blending and merging of multiple input images with natural language instructions.
    • Enables precise transformations like changing styles, outfits, or even adding color to black-and-white photos.
    • Uses Gemini’s world knowledge to enhance image generation accuracy.
    • Available for both consumer use and developer integration.

    This model marks a significant improvement in AI image editing by solving one of the biggest challenges—keeping the core attributes and identity of the subjects intact while enabling flexible creative control and realism in edits.
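
    For developers, the model is callable through the Gemini API. Below is a minimal sketch, assuming the google-genai Python SDK and the public preview model ID gemini-2.5-flash-image-preview; the prompt and file names are illustrative:

    ```python
    # pip install google-genai pillow
    from io import BytesIO

    from google import genai
    from PIL import Image

    # Assumes an API key in the GEMINI_API_KEY environment variable.
    client = genai.Client()

    prompt = "Change the background to a sunny beach; keep the person exactly the same."
    source = Image.open("portrait.png")  # hypothetical input image

    response = client.models.generate_content(
        model="gemini-2.5-flash-image-preview",  # assumed preview model ID
        contents=[prompt, source],
    )

    # Responses can interleave text and image parts; save any returned image.
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            Image.open(BytesIO(part.inline_data.data)).save("edited.png")
    ```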

  • Microsoft Released VibeVoice-1.5B: An Open-Source Text-to-Speech Model that can Synthesize up to 90 Minutes of Speech with Four Distinct Speakers

    Microsoft has released VibeVoice-1.5B, an open-source text-to-speech (TTS) model capable of synthesizing up to 90 minutes of continuous speech involving four distinct speakers. This cutting-edge model leverages a novel architecture combining a Large Language Model backbone with acoustic and semantic tokenizers to enable extended multi-speaker conversations with natural turn-taking and consistent vocal identities.

    VibeVoice-1.5B is available under the MIT license, making it accessible to researchers and developers. It requires about 7 GB of GPU memory, allowing users with consumer-grade GPUs like the RTX 3060 to run multi-speaker synthesis. Supported languages are English and Chinese, and the model can also perform cross-lingual synthesis and singing voice generation.
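
    Input scripts for multi-speaker synthesis are plain text with one speaker tag per line. A short sketch of the format, assuming the tagging convention used in the demo files of Microsoft’s public VibeVoice repository (names and dialogue are illustrative):

    ```text
    Speaker 1: Welcome back to the show. Today we're looking at open-source text-to-speech.
    Speaker 2: Thanks for having me. Ninety minutes of audio from a single script is a big claim.
    Speaker 1: It is, and the model is meant to keep each voice consistent for the whole session.
    ```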

    Microsoft plans to expand this line with a larger, 7-billion-parameter streaming-optimized model in the future, while also embedding safety measures like audio watermarks and restrictions against misuse such as voice impersonation or disinformation. This release marks a significant democratization of advanced TTS technology for extended, natural, multi-speaker audio generation.

  • Anthropic has released a new AI agent called “Claude for Chrome” that works in a side panel within the browser

    Anthropic has released a new AI agent called “Claude for Chrome,” which integrates directly into the Google Chrome browser as an extension. This agent is powered by Anthropic’s Claude AI models and is currently in a research preview phase. It is available to a limited group of 1,000 subscribers on Anthropic’s Max plan, with a waitlist for others interested.

    Claude for Chrome works in a side panel within the browser, allowing it to maintain context on what users are viewing and to interact with web pages by clicking buttons and filling out forms with the user’s permission. This integration aims to make the assistant more useful for tasks like managing calendars, scheduling meetings, and drafting emails, all directly within the browser environment.

    Anthropic emphasizes safety and security due to the potential risks posed by AI agents operating in browsers, such as prompt injection attacks where malicious instructions could be hidden on websites. The company has implemented several safeguards to reduce such risks, including site-level permissions for users to control Claude’s access and action confirmations for sensitive tasks. Despite improvements, Anthropic continues testing and refining its defenses before wider release.

    Overall, Claude for Chrome represents Anthropic’s effort to bring AI assistance directly into the user’s browsing experience while prioritizing safety and control.

  • Apple Event Confirmed for September 9 — iPhone 17, Apple Watch 11, AirPods Pro 3 and more

    Apple has officially confirmed its next big event for Tuesday, September 9, 2025. This eagerly awaited event will unveil the new iPhone 17 series, Apple Watch Series 11, AirPods Pro 3, and several other product updates. The event sets the stage for Apple’s latest innovations in hardware and software, including the highly anticipated iPhone 17 Air, which features a notably thin design, and the Apple Watch Ultra 3 with advanced health monitoring features. Additionally, Apple’s iOS 26 with a new “Liquid Glass” design will be showcased. While excitement is high, some analysts expect possible stock volatility following the event due to tempered expectations for revolutionary upgrades in the iPhone lineup. The event will be held at the Steve Jobs Theater in Cupertino and livestreamed globally.

  • Apple considers Google Gemini to power next-gen Siri

    Apple is in early discussions with Google to potentially use Google’s Gemini AI as the core technology to power a redesigned, next-generation Siri voice assistant. The company approached Google with the idea of creating a customized AI model that would run on Apple’s servers and serve as the foundation for the revamped Siri expected to launch in 2026.

    Currently, Apple is exploring multiple options for Siri’s AI “brain.” It is developing two versions simultaneously: one using its own internal AI models (codenamed Linwood) and another based on external technology (codenamed Glenwood), which could be Google’s Gemini or others like Anthropic’s Claude and OpenAI’s ChatGPT. Apple has not yet finalized any agreements or decided whether to fully adopt an external partner or rely on its own AI models. The talks with Google are still preliminary, and Google is training a customized Gemini model for potential use on Apple infrastructure.

    This move aims to catch up with competitors like Google and Samsung, who have integrated generative AI capabilities into their assistants. Apple’s revamped Siri has faced delays and challenges, but the new architecture promises more advanced and personalized AI features.

    In summary, Apple is considering licensing and integrating Google’s Gemini AI to power next-gen Siri but is still weighing its options among several AI providers and has not yet made a final decision.

  • Meta partners with Midjourney to license AI image and video technology

    Meta has entered into a partnership with Midjourney, an AI startup known for its advanced image and video generation technology, to license Midjourney’s “aesthetic technology” for integration into Meta’s future AI models and products. This collaboration involves a technical partnership between the research teams of both companies and aims to help Meta develop AI-powered creative tools that can compete with industry rivals like OpenAI and Google.

    Meta’s Chief AI Officer, Alexandr Wang, described the partnership as part of a comprehensive strategy to deliver the best AI products by combining top talent, ambitious computing resources, and collaborations with leading industry players. Midjourney’s technology, which includes highly advanced models for generating images from text prompts and recently released video models, will enhance Meta’s offerings in AI-generated imagery and video.

    Midjourney remains an independent, community-supported lab with no external investors. The licensing deal signifies a major step in Meta’s AI ambitions, complementing its existing in-house tools such as the AI image generator “Imagine” and AI video editor “Movie Gen.” Meta’s CEO Mark Zuckerberg has heavily invested in AI, acquiring talent and companies to boost its capabilities.

    The partnership could lead to new AI creative tools integrated across Meta’s platforms, potentially improving functionalities in apps like Facebook, Instagram, and WhatsApp by leveraging Midjourney’s unique aesthetic AI technology. The terms and timeline of the partnership’s full rollout have not been disclosed yet, but the collaboration marks a significant move for Meta in the competitive AI space.

  • Elon Musk teases Grok 5, says it could be the first real step toward true AGI (Artificial General Intelligence)

    Elon Musk has teased that his company’s upcoming AI model, Grok 5, could be “a real shot at being a true AGI” (Artificial General Intelligence) and is scheduled to launch before the end of 2025. Musk describes Grok 5 as potentially “crushingly good,” hinting it may surpass previous models and even outperform OpenAI’s GPT-5 according to recent comparisons.

    AGI refers to a type of AI that matches or surpasses human cognitive abilities across virtually all tasks, a milestone that AI companies such as OpenAI and Google are still striving to achieve. Musk’s bold claims about Grok 5 signify a strong belief that it could represent the first genuine step toward AGI, which would be a pivotal moment in AI development.

    Grok 4, the predecessor, has already received praise for faster response times, advanced multimodal support, and strong performance in mathematics and physics. Musk suggests Grok 5 will take this further, enhancing xAI’s position in the competitive AI landscape. So far, detailed capabilities of Grok 5 remain undisclosed, but anticipation is high that it will significantly raise the bar in AI intelligence and functionality.

    In summary, Musk claims Grok 5, due before the end of 2025, could be a real shot at true AGI, marking a possible breakthrough in AI technology and in the competition with other leading AI models.

  • The AI Chatbots “Big Bang”: The Full Study at a Glance (a study by OneLittleWeb)

    The 2025 study by OneLittleWeb, titled “The AI ‘Big Bang’ Study 2025,” provides a comprehensive analysis of the top 10 AI chatbots based on web traffic, media citations, and user engagement from August 2024 to July 2025. The study, utilizing data from sources like Semrush, aitools.xyz, MuckRack, and app stores, ranks chatbots across eight key performance indicators, offering insights into their market presence, growth, and user experience. The AI tools market, encompassing over 10,500 tools, recorded nearly 100 billion web visits, with the top 10 chatbots capturing 55.88 billion visits, or 58.8% of the total, highlighting significant market consolidation.

    ChatGPT, developed by OpenAI, dominates with 46.59 billion visits and a 48.36% market share, maintaining its position as the most popular chatbot due to its robust performance in language tasks, accessibility, and free availability. Grok, created by xAI, emerges as a surprising second-place contender with 686.91 million visits and a 1.17% market share, driven by its integration into the X platform and rapid user base growth. Other notable chatbots include DeepSeek, Gemini, Perplexity, Claude, Microsoft Copilot, Blackbox AI, Monica, and Meta AI, with DeepSeek and Grok showing the fastest growth rates, the former with a 113,007% year-over-year increase.

    The study reveals a 123.35% year-over-year traffic growth for these top chatbots, adding 30.9 billion visits compared to the previous year, underscoring the rising popularity of conversational AI. Media coverage significantly influences traffic, with peaks in January and February 2025 (817.6K and 1.1M citations) correlating with traffic surges to 4.3 billion and 4.4 billion visits, respectively, and a high of 5.8 billion in March. However, some chatbots like DeepSeek experienced a 39.5% traffic drop over five months, reflecting volatility tied to media attention.
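
    The headline figures hang together arithmetically; here is a quick check using only the numbers quoted above:

    ```python
    top10_visits = 55.88e9   # top 10 chatbots, Aug 2024 - Jul 2025
    yoy_growth = 1.2335      # reported 123.35% year-over-year growth

    prev_year = top10_visits / (1 + yoy_growth)
    print(f"visits added YoY: {(top10_visits - prev_year) / 1e9:.1f}B")
    # -> 30.9B, matching the "30.9 billion visits" reported above

    chatgpt_visits = 46.59e9  # ChatGPT, same period
    chatgpt_share = 0.4836    # reported 48.36% market share
    print(f"implied market total: {chatgpt_visits / chatgpt_share / 1e9:.1f}B")
    # -> ~96.3B, consistent with "nearly 100 billion web visits"
    ```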

    The methodology emphasizes transparency, using weighted scores across visibility, growth, and user experience metrics to ensure a balanced ranking. This approach helps identify trusted, high-performing chatbots and offers actionable insights for users and businesses. For instance, chatbots like Poe attract users by providing access to multiple AI models, while Meta AI benefits from integration into platforms like Instagram and WhatsApp, which likely means web-traffic figures underrepresent its true usage.

    For businesses, the study suggests a dual strategy: optimizing for traditional search engines, which still dominate with 1.86 trillion visits, while adapting content for AI-driven platforms through structured data and high-quality, concise answers. The findings dispel the notion that chatbots are replacing search engines, showing they complement them by serving distinct user needs, such as creative tasks versus navigational searches. As AI chatbots continue to evolve, their role in reshaping online discovery is undeniable, but search engines remain dominant for now.

  • Google launches a new Pixel Journal app

    Google has launched a new journaling app called Pixel Journal, introduced at the Pixel 10 launch event in August 2025. Pixel Journal uses on-device AI, specifically the Gemini Nano models running on the Pixel 10’s Tensor G5 chip, to provide writing prompts that help users fill out journal entries. These prompts are personalized based on memories, past entries, and user goals to assist in processing thoughts and maintaining journaling habits.

    The app allows users to add photos, locations, activities, and mood entries to their journal. It also provides insights about writing patterns, such as when users commonly write, the longest entry by word count in a given period, and the total number of entries per week or month. For privacy, Pixel Journal supports locking the app with a PIN to keep entries secure, and all data processing is done offline on the device.

    Pixel Journal is currently exclusive to the Pixel 10 series, with support for older Pixel devices potentially coming later. For now, the app is offered in English only.

    This launch positions Google’s Pixel Journal as a private, AI-enhanced journaling tool, comparable to Apple’s Journal app introduced in 2023 but distinguished by its focus on on-device AI and data privacy.

  • Meta Rolls Out AI-Powered Audio Translations for Video on Facebook and Instagram

    Meta has introduced an AI-powered video translation feature on Facebook and Instagram that automatically dubs videos, especially Reels, into another language while preserving the creator’s original voice tone and style. This translation tool also offers lip-syncing that matches the translated audio with the creators’ mouth movements, making it seem as if they are naturally speaking the translated language.

    Key features of the Meta video translation tool include:

    • Automatic audio translations between English and Spanish at launch, with plans to add more languages.
    • A toggle option called “Translate your voice with Meta AI” available before posting a Reel, allowing creators to enable translation and optionally add lip-syncing.
    • Creators can preview and approve translations before publishing, with the ability to turn off the feature without affecting the original video.
    • Translated Reels are shown to viewers in their preferred language, with a label indicating the use of Meta AI for translation.
    • Creator tips for best results include speaking clearly, minimizing background noise, facing the camera, and avoiding overlapping speech between two speakers.
    • The feature is available to Facebook creators with more than 1,000 followers and to public Instagram accounts in regions where Meta AI is available.
    • Meta recently added the ability for Facebook creators to upload up to 20 manually dubbed audio tracks to a single Reel to connect with audiences beyond AI-supported languages.
    • The tool also provides new analytics in the Insights panel, showing views by language to help creators gauge their expanded reach.

    This feature aims to break down language barriers, helping creators reach a broader global audience by making their video content accessible in multiple languages with natural-sounding translations and synced lip movements.