Category: AI Related

  • Character.AI Launches World’s First AI-Native Social Feed

    Character.AI has launched the world’s first AI-native social feed, a dynamic and interactive platform integrated into its mobile app. Unlike traditional social media, this feed centers on AI-generated characters, scenes, streams, and user-created content that can be engaged with, remixed, and expanded by users. The feed offers chat snippets, AI-generated videos, character profiles, and live debates between characters, creating a collaborative storytelling playground where the boundary between creator and consumer disappears.

    Users can interact by continuing storylines, rewriting narratives, inserting themselves into adventures, or remixing existing content with a single tap. The platform includes multimodal tools like Chat Snippets (for sharing conversation excerpts), Character Cards, Streams (live debates or vlogs), Avatar FX (video creation from images and scripts), and AI-generated backgrounds to enrich storytelling.

    According to Character.AI’s CEO Karandeep Anand, this social feed marks a significant shift from passive content consumption to active creation, effectively replacing “doomscrolling” with creative engagement. It transforms Character.AI from a one-on-one chat app into a full-fledged AI-powered social entertainment platform, enabling users to co-create and explore countless narrative possibilities.

    This innovation allows for a new kind of social media experience that blends AI-driven storytelling with user participation, fostering a unique ecosystem of interactive content creation among Character.AI’s 20 million users and beyond.

  • Google DeepMind and Kaggle launch AI chess tournament to evaluate models’ reasoning skills

    Google and Kaggle are hosting an AI chess tournament from August 5-7, 2025, to evaluate the reasoning skills of top AI models, including OpenAI’s o3 and o4-mini, Google’s Gemini 2.5 Pro and Flash, Anthropic’s Claude Opus 4, and xAI’s Grok 4.

    Organized with Google DeepMind, Chess.com, and chess streamers Levy Rozman and Hikaru Nakamura, the event will be livestreamed on Kaggle.com, featuring a single-elimination bracket with best-of-four matches. The Kaggle Game Arena aims to benchmark AI models’ strategic thinking in games like chess, Go, and Werewolf, testing skills like reasoning, memory, and adaptation.

    Models will use text-based inputs without external tools, facing a 60-minute move limit and penalties for illegal moves. A comprehensive leaderboard will rank models based on additional non-livestreamed games, with future tournaments planned to include more complex games and simulations.
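
    The legality rule is worth a closer look: since models emit plain text with no chess tools, the harness has to parse a move out of free-form output and reject anything unplayable. The actual Kaggle Game Arena harness is not public, so the sketch below is purely illustrative; `judge_move`, the UCI-notation regex, and the legal-move set are all our own assumptions.

```python
import re

# Purely illustrative sketch of enforcing a "penalties for illegal moves"
# rule: extract a UCI-notation move from the model's free-form text and
# check it against the current legal-move set. The real Kaggle Game Arena
# harness is not public; judge_move and this regex are our assumptions.

UCI_MOVE = re.compile(r"\b([a-h][1-8][a-h][1-8][qrbn]?)\b")

def judge_move(model_output, legal_moves):
    """Return (move, is_legal); move is None if no UCI move was found."""
    match = UCI_MOVE.search(model_output.lower())
    if match is None:
        return None, False           # unparseable output counts as illegal
    move = match.group(1)
    return move, move in legal_moves

legal = {"e2e4", "d2d4", "g1f3"}
print(judge_move("I will open with e2e4.", legal))  # ('e2e4', True)
print(judge_move("Let's try e2e5!", legal))         # ('e2e5', False)
```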

    This matters because it represents a fundamental shift in AI evaluation from static tests to dynamic competition, providing transparent insights into how leading AI models reason and strategize. The platform could reshape how we measure and understand artificial intelligence capabilities.

    You can follow the tournament live on Kaggle.com.

  • NVIDIA dropped a paper arguing that Small Language Models (SLMs) are the real future of agentic AI

    Forget everything you thought you knew about AI agents running on massive LLMs. A bombshell new paper from NVIDIA Research, “Small Language Models are the Future of Agentic AI,” is flipping the script on how we think about deploying intelligent agents at scale.

    You don’t need GPT-5 to run most AI agents. You need a fleet of tiny, fast, specialized SLMs. Let’s unpack what this means, why it matters, and how it could reshape the entire AI economy.

    The Big Idea in One Sentence: Small Language Models (SLMs) aren’t just good enough for AI agents — they’re better. And economically, operationally, and environmentally, they’re the inevitable future. While everyone’s chasing bigger, flashier LLMs, NVIDIA is arguing that for agentic workflows — where AI systems perform repetitive, narrow, tool-driven tasks — smaller is smarter.

    What’s an “Agentic AI” Again?

    AI agents aren’t just chatbots. They’re goal-driven systems that plan, use tools (like APIs or code), make decisions, and execute multi-step workflows — think coding assistants, customer service bots, or automated data analysts. Right now, almost all of these agents run on centralized LLM APIs (like GPT-4, Claude, or Llama 3). But here’s the catch: Most agent tasks are not open-ended conversations. They’re structured, predictable, and highly specialized — like parsing a form, generating JSON for an API call, or writing a unit test.

    The question, then: why use a 70B-parameter brain when a 7B one can do the job — faster, cheaper, and locally?

    Why SLMs Win for Agents (The NVIDIA Case)

    1. They’re Already Capable Enough: SLMs today are not weak — they’re focused. Modern SLMs punch way above their weight:

    • Phi-3 (7B) performs on par with 70B-class models in code and reasoning.
    • NVIDIA’s Nemotron-H (9B) matches 30B LLMs in instruction following — at 1/10th the FLOPs.
    • DeepSeek-R1-Distill-7B beats GPT-4o and Claude 3.5 on reasoning benchmarks.
    • xLAM-2-8B leads in tool-calling accuracy — critical for agents.

    2. They’re 10–30x Cheaper & Faster to Run. Running a 7B model vs. a 70B model means:

    • Lower latency (real-time responses)
    • Less energy & compute
    • No need for multi-GPU clusters
    • On-device inference (yes, your laptop or phone)

    With tools like NVIDIA Dynamo and ChatRTX, you can run SLMs locally, offline, with strong data privacy — a game-changer for enterprise and edge use.

    3. They’re More Flexible & Easier to Fine-Tune. Want to tweak your agent to follow a new API spec or output format? With SLMs:

    • You can fine-tune in hours, not weeks.
    • Use LoRA/QLoRA for low-cost adaptation.
    • Build specialized experts for each task (e.g., one SLM for JSON, one for code, one for summaries).

    This is the “Lego approach” to AI: modular, composable, and scalable — not monolithic.
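
    To see why LoRA makes those hours-not-weeks turnarounds plausible, compare parameter counts. The sketch below is back-of-the-envelope arithmetic, not a training recipe; the layer size and rank are illustrative choices, not numbers from the NVIDIA paper.

```python
# LoRA in one line of math: instead of updating a full d_out x d_in weight
# matrix, it learns two low-rank factors B (d_out x r) and A (r x d_in)
# and applies W + (alpha/r) * B @ A. Only B and A are trained.
# The layer size and rank below are illustrative assumptions.

def lora_param_counts(d_out, d_in, r):
    full = d_out * d_in              # parameters touched by full fine-tuning
    lora = r * (d_out + d_in)        # parameters in the low-rank adapters
    return full, lora

# One 4096x4096 attention projection with a rank-8 adapter:
full, lora = lora_param_counts(4096, 4096, r=8)
print(full, lora, round(full / lora, 1))  # 16777216 65536 256.0
```

    A 256x reduction in trainable parameters per layer is what turns fine-tuning from a cluster job into a single-GPU afternoon.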

    But Aren’t LLMs Smarter? The Great Debate. Agents don’t need generalists — they need specialists.

    • Agents already break down complex tasks into small steps.
    • The LLM is often heavily prompted and constrained — basically forced to act like a narrow tool.
    • So why not just train an SLM to do that one thing perfectly?

    And when you do need broad reasoning? Use a heterogeneous agent system:

    • Default to SLMs for routine tasks.
    • Call an LLM only when needed (e.g., for creative planning or open-domain Q&A).

    This hybrid model is cheaper, faster, and more sustainable.
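
    A minimal sketch of what such a heterogeneous system might look like in code. Everything here is an assumption for illustration: `call_slm` and `call_llm` are stand-ins for real model endpoints, and routing by task label with confidence-based escalation is one of many possible policies.

```python
# Illustrative sketch of a heterogeneous SLM/LLM agent router: routine
# tasks go to a cheap local SLM; open-ended or low-confidence requests
# escalate to a frontier LLM. call_slm/call_llm are placeholders, not
# real APIs, and the task labels are invented.

ROUTINE_TASKS = {"generate_json", "write_unit_test", "summarize_email"}

def call_slm(task, payload):   # placeholder for a local 7B model
    return {"text": f"slm:{task}", "confidence": 0.9}

def call_llm(task, payload):   # placeholder for a frontier-model API
    return {"text": f"llm:{task}", "confidence": 0.99}

def route(task, payload, escalate_below=0.5):
    if task in ROUTINE_TASKS:
        result = call_slm(task, payload)
        if result["confidence"] >= escalate_below:
            return result["text"]
    # open-ended or low-confidence requests fall through to the LLM
    return call_llm(task, payload)["text"]

print(route("generate_json", "order form"))   # slm:generate_json
print(route("creative_plan", "launch plan"))  # llm:creative_plan
```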

    So Why Aren’t We Using SLMs Already?

    1. Massive investment in LLM infrastructure — $57B poured into cloud AI in 2024 alone.
    2. Benchmarks favor generalist LLMs — we’re measuring the wrong things.
    3. Marketing hype — SLMs don’t get the headlines, even when they outperform.

    But these are inertia problems, not technical ones. And they’re solvable.

    How to Migrate from LLMs to SLMs: The 6-Step Algorithm

    NVIDIA even gives us a practical roadmap:

    1. Log all agent LLM calls (inputs, outputs, tool usage).
    2. Clean & anonymize the data (remove PII, sensitive info).
    3. Cluster requests to find common patterns (e.g., “generate SQL”, “summarize email”).
    4. Pick the right SLM for each task (Phi-3, SmolLM2, Nemotron, etc.).
    5. Fine-tune each SLM on its specialized dataset (use LoRA for speed).
    6. Deploy & iterate — keep improving with new data.

    This creates a continuous optimization loop — your agent gets smarter and cheaper over time.
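
    Step 3 (clustering) is the pivotal one, since the clusters define which specialist SLMs you train. Here is a toy sketch that groups logged prompts by their leading words just to show the shape of the pipeline; a real deployment would embed the prompts and run k-means, and every log entry below is invented.

```python
from collections import defaultdict

# Toy version of step 3 above: cluster logged agent requests to find the
# common patterns worth a specialist SLM. Real pipelines would embed
# prompts and cluster the vectors; grouping by leading words is a
# deliberately crude stand-in, and all log entries are invented examples.

logs = [
    "generate SQL for monthly revenue",
    "generate SQL for churn cohort",
    "summarize email from vendor",
    "summarize email thread about outage",
    "write unit test for parser",
]

def cluster_key(prompt, n_words=2):
    """Crude cluster label: the first n_words of the lowercased prompt."""
    return " ".join(prompt.lower().split()[:n_words])

clusters = defaultdict(list)
for entry in logs:
    clusters[cluster_key(entry)].append(entry)

for key, items in sorted(clusters.items()):
    print(key, len(items))
# generate sql 2
# summarize email 2
# write unit 1
```

    Each resulting cluster becomes a candidate fine-tuning dataset for one specialist SLM in step 5.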

    Real-World Impact: Up to 70% of LLM Calls Could Be Replaced

    In case studies on popular open-source agents:

    • MetaGPT (software dev agent): 60% of LLM calls replaceable
    • Open Operator (workflow automation): 40%
    • Cradle (GUI control agent): 70%

    That’s huge cost savings — and a massive reduction in AI’s carbon footprint.
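
    The savings claim is easy to sanity-check. If a fraction f of an agent's calls moves to a model that is k times cheaper per call, total spend drops to (1 - f) + f/k of the original. The arithmetic below uses the case-study replacement rates from the text and assumes k = 10, the low end of the 10–30x range cited earlier.

```python
# Back-of-the-envelope check on the savings claim: replace a fraction
# f_replaced of LLM calls with an SLM that costs 1/cheapness as much per
# call. The 10x cost gap is an assumption taken from the low end of the
# 10-30x range cited in the text.

def remaining_cost(f_replaced, cheapness=10.0):
    return (1 - f_replaced) + f_replaced / cheapness

for agent, f in [("MetaGPT", 0.60), ("Open Operator", 0.40), ("Cradle", 0.70)]:
    print(f"{agent}: {remaining_cost(f):.0%} of original cost")
# MetaGPT: 46% of original cost
# Open Operator: 64% of original cost
# Cradle: 37% of original cost
```

    Even at the conservative 10x gap, the heaviest case study (Cradle) cuts inference spend by roughly two thirds.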

    The Bigger Picture: Sustainable, Democratized AI

    This isn’t just about cost. It’s about:

    • Democratization: Smaller teams can train and deploy their own agent models.
    • Privacy: Run agents on-device, no data sent to the cloud.
    • Sustainability: Less compute = less energy = greener AI.

    Final Thoughts: The LLM Era is Ending. The SLM Agent Era is Just Beginning.

    We’ve spent years scaling up — bigger models, more parameters, more GPUs. Now, it’s time to scale out: modular, efficient, specialized SLMs working together in intelligent agent systems. NVIDIA isn’t just making a technical argument — they’re calling for a paradigm shift. And if they’re right, the future of AI won’t be in the cloud. It’ll be on your device, running silently in the background, doing its job — fast, cheap, and smart.

  • Amazon always Bee listening! Amazon acquires AI wearable startup Bee to boost personal assistant technology

    Amazon has agreed to acquire Bee, a San Francisco-based startup that developed a $50 AI-powered wristband. The device continuously listens to the wearer’s conversations and surroundings, transcribing audio to provide personalized summaries, to-do lists, reminders, and suggestions through an associated app. Bee’s technology can integrate user data such as contacts, email, calendar, photos, and location to build a searchable log of daily interactions, enhancing its AI-driven insights. The acquisition, announced in July 2025 but not yet finalized, will see Bee’s team join Amazon to integrate this wearable AI technology with Amazon’s broader AI efforts, including personal assistant functionalities.

    The AI wristband uses built-in microphones and AI models to automatically transcribe conversations unless manually muted. While the device’s accuracy can sometimes be affected by ambient sounds or media, Amazon emphasized its commitment to user privacy and control, intending to apply its established privacy standards to Bee’s technology. Bee claims it does not store raw audio recordings and uses high security standards, with ongoing tests of on-device AI models to enhance privacy.

    This acquisition complements Amazon’s previous ventures into wearable tech, such as the discontinued Halo health band and its Echo smart glasses with Alexa integration. Bee represents a cost-accessible entry into AI wearables with continuous ambient intelligence, enabling Amazon to expand in this competitive market segment, which includes other companies like OpenAI and Meta developing AI assistants and wearables.

    The financial terms of the deal have not been disclosed. Bee was founded in 2022, raised $7 million in funding, and is led by CEO Maria de Lourdes Zollo. Bee’s vision is to create personal AI that evolves with users to enrich their lives. Amazon plans to work with Bee’s team for future innovation in AI wearables post-acquisition.

  • Google MLE-STAR, A state-of-the-art machine learning engineering agent

    MLE-STAR is a state-of-the-art machine learning engineering agent developed by Google Cloud that automates various ML tasks across diverse data types, achieving top performance in competitions like Kaggle. Unlike previous ML engineering agents that rely heavily on pre-trained language model knowledge and tend to make broad code modifications at once, MLE-STAR uniquely integrates web search to retrieve up-to-date, effective models and then uses targeted code block refinement to iteratively improve specific components of the ML pipeline. It performs ablation studies to identify the most impactful code parts and refines them with careful exploration.

    Key advantages of MLE-STAR include:

    • Use of web search to find recent and competitive models (such as EfficientNet and ViT), avoiding outdated or overused choices.
    • Component-wise focused improvement rather than wholesale code changes, enabling deeper exploration of feature engineering, model selection, and ensembling.
    • A novel ensembling method that combines multiple solutions into a superior single ensemble rather than simple majority voting.
    • Built-in data leakage and data usage checkers that detect unrealistic data processing strategies or neglected data sources, refining the generated code accordingly.
    • The framework won medals in 63% of MLE-Bench-Lite Kaggle competitions, with 36% of them gold.

    MLE-STAR lowers the barrier to ML adoption by automating complex workflows and continuously improving through web-based retrieval of state-of-the-art methods, ensuring adaptability as ML advances. Its open-source code is available for researchers and developers to accelerate machine learning projects.

    This innovation marks a shift toward more intelligent, web-augmented ML engineering agents that can deeply and iteratively refine models for better results.

  • Interview with Anthropic CEO Dario Amodei: AI’s Potential, OpenAI Rivalry, GenAI Business, Doomerism

    Dario Amodei, CEO of Anthropic, discusses a range of topics concerning artificial intelligence, his company’s strategy, and his personal motivations. He emphasizes that he gets “very angry when people call me a doomer” because he understands the profound benefits of AI, motivated in part by his father’s death from an illness that was later cured, highlighting the urgency of scientific progress. He believes Anthropic has a “duty to warn the world about what’s going to happen” regarding AI’s possible downsides, even while strongly appreciating its positive applications, which he articulated in his essay “Machines of Loving Grace”.

    Amodei’s sense of urgency stems from his belief in the exponential improvement of AI capabilities, which he refers to as “the exponential”. He notes that AI models are rapidly progressing from “barely coherent” to “smart high school student,” then to “smart college student” and “PhD” levels, and are beginning to “apply across the economy”. He sees this exponential growth continuing, despite claims of “diminishing returns from scaling”. He views terms like “AGI” and “super-intelligence” as “totally meaningless” marketing terms that he avoids using.

    Anthropic’s business strategy is a “pure bet on this technology”, specifically focusing on “business use cases of the model” through its API, rather than consumer-facing chatbots or integration into existing tech products like Google or OpenAI. He argues that focusing on business use cases provides “better incentives to make the models better” by aligning improvements with tangible value for enterprises like Pfizer. Coding, for example, became a key use case due to its rapid adoption and its utility in developing subsequent models.

    Financially, Anthropic has demonstrated rapid growth, going from zero to $100 million in revenue in 2023, $100 million to $1 billion in 2024, and $1 billion to “well above four” or $4.5 billion in the first half of 2025, calling it the “fastest growing software company in history” at its scale. Amodei clarifies that while the company may appear unprofitable due to significant investments in training future, more powerful models, each deployed model is actually “fairly profitable”. He also addresses concerns about large language model liabilities like “continual learning,” stating that while models don’t change underlying weights, their “context windows are getting longer,” allowing them to absorb information during interaction, and new techniques are being developed to address this.

    Regarding competition, Anthropic has raised nearly $20 billion and is confident its “data center scaling is not substantially smaller than that of any of the other companies”. Amodei emphasizes “talent density” as their core competitive advantage, noting that many Anthropic employees turn down offers from larger tech companies due to their belief in Anthropic’s mission and its fair, systematic compensation principles. He expresses skepticism about competitors trying to “buy something that cannot be bought,” referring to mission alignment.

    Amodei dismisses the notion that open source AI models pose a significant threat, calling it a “red herring”. He explains that unlike traditional open source software, AI models are “open weights” (not source code), making them hard to inspect and requiring significant inference resources, so the critical factor is a model’s quality, not its openness.

    On a personal level, Amodei’s upbringing in San Francisco instilled an interest in fundamental science, particularly physics and math, rather than the tech boom. His father’s illness and death in 2006 profoundly impacted him, driving him first to biology to address human illnesses, and then to AI, which he saw as the only technology capable of “bridg[ing] that gap” to understand and solve complex biological problems “beyond human scale”. This foundational motivation translates into a “singular obsession with having impact,” focusing on creating “positive sum situations” and bending his career arc towards helping people strategically.

    He left OpenAI, where he was involved in scaling GPT-3, because he realized that the “alignment of AI systems and the capability of AI systems is intertwined”, but that organizational-level decisions, sincere leadership motivations, and company governance were crucial for positive impact, leading him to found Anthropic to “do it our own way”. He vehemently denies claims that he “wants to control the entire industry,” calling it an “outrageous lie”. Instead, he advocates for a “race to the top”, where Anthropic sets an example for the field by publicly releasing responsible scaling policies, interpretability research, and safety measures, encouraging others to follow, thereby ensuring that “everyone wins” by building safer systems.

    Amodei acknowledges the “terrifying situation” where massive capital is accelerating AI development. He continues to speak up about AI’s dangers despite criticism and personal risk to the company, believing that control is feasible as “we’ve gotten better at controlling models with every model that we release”. His warning about risks is not to slow down progress but to “invest in safety techniques and can continue the progress”. He criticizes both “doomers” who claim AI cannot be built safely and “financially invested” parties who dismiss safety concerns or regulation, calling both positions “intellectually and morally unserious”. He believes what is needed is “more thoughtfulness, more honesty, more people willing to go against their interest” to understand the situation and add “light and some insight”.

    Source: https://www.youtube.com/watch?v=mYDSSRS-B5U

  • OpenAI is making a major investment in Norway with its first AI data center in Europe

    OpenAI is making a major investment in Norway with its first AI data center in Europe, called Stargate Norway. This project is a collaboration with British AI infrastructure company Nscale and Norwegian energy firm Aker ASA, forming a 50/50 joint venture. The initial phase will involve about a $1 billion investment to build a facility near Narvik in northern Norway, powered entirely by renewable hydropower.

    The data center will initially have a capacity of 230 MW and install 100,000 Nvidia GPUs by the end of 2026, with ambitions to expand its capacity by an additional 290 MW in future phases, potentially scaling tenfold as demand grows. OpenAI will be a primary customer (“off-taker”) of the compute capacity under its “OpenAI for Countries” program, which aims to increase AI infrastructure sovereignty and accessibility across Europe.

    The project emphasizes sustainability, leveraging Norway’s cool climate, low electricity prices, and abundant renewable energy for efficient and large-scale AI computing. It will provide secure, scalable, and sovereign AI infrastructure for customers across Norway, Northern Europe, and the UK, benefiting startups, researchers, and public/private sectors.

    OpenAI’s Norway investment is a landmark $1 billion+ AI infrastructure project to build a state-of-the-art, renewable-powered data center addressing Europe’s AI compute needs and advancing local AI ecosystem development.

  • Google NotebookLM latest updates

    Google NotebookLM, the AI-powered research and note-taking assistant, has received several significant updates in 2025 centered around enhanced ways to visualize, navigate, and interact with research content:

    • Video Overviews: As of mid-2025, Google rolled out Video Overviews, which generate narrated slide presentations that transform dense documents (notes, PDFs, images) into clear visual summaries. These overviews pull in images, diagrams, quotes, and data from your sources to explain concepts more intuitively. Users can customize focus topics, learning goals, and target audience for more tailored explanations. This offers a visual alternative to the existing Audio Overviews that provide podcast-style summaries. Video Overviews are currently available in English with more languages to come.
    • Interactive Mind Maps: A new Mind Map feature allows users to explore connections between complex topics within their notebooks, helping deepen understanding by visualizing relationships in uploaded materials. For example, this can map related concepts around a research subject like environmental issues.
    • Language and Output Flexibility: Users can now select the output language for AI-generated text, making it easier to generate study guides, briefing documents, and chat responses in various languages.
    • Studio Panel Upgrades: The redesigned Studio panel lets users create and store multiple outputs of the same type (Audio Overviews, Video Overviews, Mind Maps, Reports) within a notebook. It supports multitasking features such as listening to an Audio Overview while exploring a Mind Map simultaneously.
    • Improved User Experience and Multilingual Support: Audio Overviews now support multiple lengths and over 50 languages. Dark mode, conversation style switching, and easier sharing of notebooks have also been introduced.

    These features are broadly available for Google Workspace customers across various tiers, including Business, Education, and Nonprofits, with phased rollout continuing through 2025.

  • Google Gemini 2.5 Deep Think rollout

    Google has begun rolling out Gemini 2.5 Deep Think, an advanced AI model designed to enhance reasoning and problem-solving by engaging in extended, parallel thinking. Gemini 2.5 Deep Think uses multiple AI agents working simultaneously to explore and evaluate various ideas before arriving at an answer, significantly improving the quality and depth of responses. This model integrates tools such as code execution and Google Search to support complex tasks like coding, advanced mathematics, and data analysis.

    The Deep Think feature, available to Google’s $250-per-month Ultra subscribers via the Gemini app, improves multi-step reasoning and creativity by allowing the AI more “thinking time” and iterative refinement, closer to human-style problem-solving. It can generate longer, more detailed responses and has demonstrated superior performance on challenging benchmarks like the International Math Olympiad and coding competitions, outperforming competitors including OpenAI and xAI models.

    Google emphasizes safety and content moderation improvements in this release and is actively seeking academic feedback for further refinement. The company may broaden access to Deep Think after initial testing phases. Overall, Gemini 2.5 Deep Think represents a significant leap in AI reasoning capacity, boosting capabilities across scientific research, programming, and problem-solving domains.

  • Microsoft, OpenAI near deal to preserve AI access past AGI

    Microsoft and OpenAI are currently in advanced negotiations to finalize a new partnership agreement that would allow Microsoft to maintain continuous access to OpenAI’s technology even after OpenAI achieves artificial general intelligence (AGI), a milestone at which AI attains human-level cognitive abilities across diverse tasks.

    Key points about the deal include:

    • Current Contract Limitation: Under the existing agreement, Microsoft would lose rights to new OpenAI technology once OpenAI’s board officially determines that AGI has been reached, which poses a significant barrier for Microsoft’s AI strategy, especially as its products like Azure, Microsoft 365 Copilot, and GitHub Copilot heavily depend on OpenAI’s models.
    • Equity and Financial Terms: Microsoft is seeking to increase its equity stake in OpenAI’s restructured company, aiming for a low- to mid-30% range, while renegotiating revenue sharing and IP rights as OpenAI shifts from a nonprofit to a for-profit structure. OpenAI’s planned $40 billion funding round, with $20 billion from SoftBank, hinges partly on these governance changes.
    • Definitions and Licensing: The talks also involve clarifying what exactly constitutes the AGI milestone, how Microsoft can have ongoing licensed access to advanced AI systems beyond AGI, and embedding oversight and safety mechanisms related to the use of the technology.
    • Strategic Significance: Securing this deal is crucial for Microsoft to preserve its competitive edge in AI, particularly for its enterprise software and cloud products worth billions. It also clears a major hurdle for OpenAI’s transition into a more commercial enterprise model, enabling both to capitalize on the evolving AI landscape.
    • Potential Obstacles: Despite positive progress, challenges remain including potential regulatory scrutiny and a lawsuit by Elon Musk challenging OpenAI’s for-profit transition and governance changes.

    The negotiations have been ongoing for several months with frequent meetings and could conclude within weeks. OpenAI has not publicly commented, and Microsoft has likewise withheld comment on the specifics of the talks.

    Microsoft aims to secure a long-term deal granting it continuous access to OpenAI’s cutting-edge AI technologies through and beyond the achievement of AGI, restructuring their partnership to reflect new commercial realities and safeguard Microsoft’s AI-driven product ecosystem. This agreement will shape the future control and commercialization of transformative AI technologies.