• World’s first open-weight, large-scale hybrid-attention reasoning model: MiniMax-M1

    MiniMax, a Shanghai-based Chinese AI company, has recently released MiniMax-M1, the world’s first open-weight, large-scale hybrid-attention reasoning model. This model represents a significant breakthrough in AI reasoning capabilities and efficiency.

    • Scale and Architecture: MiniMax-M1 is built on a massive 456 billion parameter foundation, with 45.9 billion parameters activated per token. It employs a hybrid Mixture-of-Experts (MoE) architecture combined with a novel “lightning attention” mechanism that replaces traditional softmax attention in many transformer blocks (a minimal sketch of the underlying linear-attention idea follows this list). This design enables the model to efficiently handle very long contexts of up to 1 million tokens, 8 times the context size of DeepSeek R1, a leading competitor.
    • Efficiency: The lightning attention mechanism significantly reduces computational cost during inference. For example, MiniMax-M1 consumes only 25% of the floating-point operations (FLOPs) required by DeepSeek R1 when generating 100,000 tokens, making it much more efficient for long-context reasoning tasks.
    • Training and Reinforcement Learning: The model was trained using large-scale reinforcement learning (RL) on diverse tasks ranging from mathematical reasoning to complex software engineering environments. MiniMax introduced a novel RL algorithm called CISPO, which clips importance-sampling weights rather than token updates, improving training stability and performance (an illustrative CISPO sketch also follows this list). Two versions of the model were trained with thinking budgets of 40,000 and 80,000 tokens, respectively.
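
    Lightning attention is, at its core, an efficient implementation of linear attention. As a rough illustration of why that matters for long contexts, the NumPy sketch below shows only the generic (non-causal) linear-attention identity, which avoids the quadratic score matrix of softmax attention; it is not MiniMax’s actual kernel, and the feature map phi is an arbitrary choice for the example.

    ```python
    import numpy as np

    def softmax_attention(Q, K, V):
        """Standard attention: materializes an (n, n) score matrix, so cost is O(n^2)."""
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
        """Kernelized attention: phi(Q) @ (phi(K)^T V) never forms the (n, n) matrix,
        so cost grows linearly with sequence length n (non-causal form for brevity)."""
        KV = phi(K).T @ V          # (d, d_v) summary, independent of n
        Z = phi(K).sum(axis=0)     # (d,) normalizer
        return (phi(Q) @ KV) / (phi(Q) @ Z)[:, None]

    n, d = 1024, 64
    Q, K, V = (np.random.randn(n, d) for _ in range(3))
    out = linear_attention(Q, K, V)   # same shape as softmax_attention(Q, K, V), O(n) cost
    ```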
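
    The CISPO idea can also be shown in a few lines. The sketch below is one reading of the published description (clip the importance-sampling weight, detach it, and keep an update for every token), contrasted with standard PPO clipping; tensor shapes and the eps values are assumptions, and this is not MiniMax’s reference implementation.

    ```python
    import torch

    def cispo_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=5.0):
        """Illustrative CISPO-style objective (assumed shapes: [batch, seq_len];
        eps_low / eps_high are placeholders, not the published hyperparameters).

        PPO-style clipping can zero the gradient of tokens whose importance ratio
        leaves the trust region; CISPO instead clips the importance-sampling weight
        itself, detaches it, and keeps a weighted policy-gradient term for every token."""
        ratio = torch.exp(logp_new - logp_old)                    # importance-sampling weight
        clipped_w = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high).detach()
        per_token = clipped_w * advantages * logp_new             # gradient flows through logp_new only
        return -per_token.mean()

    def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
        """Standard PPO clipped objective, shown for contrast."""
        ratio = torch.exp(logp_new - logp_old)
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
        return -torch.minimum(unclipped, clipped).mean()
    ```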

    Performance and Benchmarking

    MiniMax-M1 outperforms other strong open-weight models such as DeepSeek-R1 and Qwen3-235B across a variety of challenging benchmarks including:

    • Extended mathematical reasoning (e.g., AIME 2024 and 2025)
    • General coding and software engineering tasks
    • Long-context understanding benchmarks (handling up to 1 million tokens)
    • Agentic tool use tasks
    • Reasoning and knowledge benchmarks

    For instance, on the AIME 2024 math benchmark, the MiniMax-M1-80K model scored 86.0%, competitive with or surpassing other top models. It also shows superior performance on long-context and software engineering benchmarks compared to DeepSeek-R1 and several commercial models.

    Strategic and Industry Impact

    MiniMax-M1 is positioned as a next-generation reasoning AI model that challenges the dominance of DeepSeek, a leading Chinese developer of reasoning-capable large language models. MiniMax’s innovation highlights the rapid advancement and growing sophistication of China’s AI industry, especially in developing models capable of advanced cognitive functions like step-by-step logical reasoning and extensive contextual understanding.

    The model’s release underscores the strategic importance China places on AI reasoning capabilities for applications across manufacturing, healthcare, finance, and military technology. MiniMax’s approach, combining large-scale hybrid architectures with efficient reinforcement learning and long-context processing, sets a new benchmark for open-weight models worldwide.

    In summary

    • MiniMax-M1 is the world’s first open-weight, large-scale hybrid-attention reasoning model with 456 billion parameters.
    • It supports extremely long context lengths (up to 1 million tokens) and is highly efficient, using only 25% of the compute of comparable models like DeepSeek R1 at long generation lengths.
    • The model excels in complex reasoning tasks, software engineering, and tool use benchmarks.
    • It is trained with a novel reinforcement learning algorithm (CISPO) that enhances training efficiency and stability.
    • MiniMax-M1 represents a major step forward in China’s AI capabilities, challenging established players and advancing the global state of reasoning AI.
  • Apple Intelligence 2.0 as of June 2025

    Apple Intelligence gains several new capabilities:

    • Live Translation in Messages, FaceTime, and phone calls for real-time multilingual communication.
    • Enhanced Visual Intelligence with screenshot support, deeper app integration, and the ability to ask questions with ChatGPT.
    • ChatGPT-powered enhancements to Image Playground and Genmoji for more expressive content creation.
    • Apple Intelligence Actions integrated into Shortcuts, enabling AI-driven automation.
    • A new Workout Buddy feature for Apple Watch.
    • A Foundation Models framework that lets developers build on Apple Intelligence’s on-device large language model.

    For the first time, developers can directly access Apple Intelligence’s on-device large language model to create private, fast, and offline-capable intelligent experiences within their apps. This move is expected to spark a wave of new AI-powered app features that respect user privacy.

    Since its initial launch in 2024, Apple Intelligence has expanded to support multiple languages including Chinese, French, German, Japanese, Korean, Portuguese, Spanish, and Vietnamese, and is available on iPhone, iPad, Mac, Apple Watch, and Apple Vision Pro. The latest updates continue broadening device and language support.

    Apple Intelligence is deeply integrated into iOS 26, iPadOS 26, macOS Tahoe 26, watchOS 26, and visionOS 26, bringing a more expressive design and intelligent features system-wide, including improvements in Phone, Messages, CarPlay, Apple Music, Maps, Wallet, and the new Apple Games app.

    Apple is evolving Siri into a more natural, context-aware assistant that can understand screen content and provide personalized responses. Apple also reportedly plans to integrate third-party AI models such as Google’s Gemini and Perplexity, expanding AI capabilities beyond its own models.

    As a result, Apple Intelligence 2.0 in 2025 brings powerful AI features like live translation, enhanced visual intelligence, and developer access to on-device LLMs, alongside expanded language and device support and deeper system integration, all while emphasizing privacy and offline functionality. Siri is becoming more intelligent and integrated, and Apple is opening the platform to third-party AI models, signaling a major evolution in its AI ecosystem.

  • Claude Code for VSCode (June 2025)

    Anthropic officially released the Claude Code for VSCode extension on June 19, 2025. This extension integrates Anthropic’s AI coding assistant Claude directly into Visual Studio Code, allowing developers to leverage Claude’s advanced reasoning and code generation capabilities inside their IDE.

    The extension requires VS Code version 1.98.0 or later and supports features such as:

    • Automatically adding selected text in the editor to Claude’s context for more accurate suggestions.
    • Displaying code changes directly in VS Code’s diff viewer instead of the terminal.
    • Keyboard shortcuts to push selected code to Claude prompts.
    • Awareness of open tabs/files in the editor.

    However, Windows is not natively supported yet; Windows users need to run it via WSL or similar workarounds.

    • Anthropic recently announced new features for Claude Code users (June 18, 2025), enhancing AI-powered coding with improved code generation, debugging, and integrations, positioning Claude Code as a leading enterprise AI coding platform.
    • There is a known bug reported (June 18, 2025) where the “Fix with Claude Code” action in VSCode does not apply code edits as expected. This issue is under investigation.
    • A security advisory was issued on June 23, 2025, for Claude Code IDE extensions including the VSCode plugin. Versions below 1.0.24 are vulnerable to unauthorized WebSocket connections that could expose files, open tabs, and selection events. Users are urged to update to version 1.0.24 or later to mitigate this risk.
    • The March 2025 VS Code update (v1.99) introduced “Agent mode,” which can be extended with Model Context Protocol (MCP) tools, potentially complementing AI coding assistants like Claude Code.
    • Anthropic’s Claude Code now supports background tasks via GitHub Actions and native integrations with VS Code and JetBrains, displaying edits directly in files for seamless pair programming. The latest Claude 4 models power these capabilities, offering advanced coding and reasoning.

    Claude Code for VSCode is a newly released, actively developed AI coding assistant extension that integrates deeply with VS Code, offering advanced AI coding features but still maturing, with some bugs and security patches being addressed. It significantly enhances developer productivity by embedding Claude’s AI capabilities directly into the IDE workflow.

  • Claude Artifacts and how to use them

    Claude Artifacts are substantial, self-contained pieces of content generated by Anthropic’s Claude AI that appear in a dedicated window separate from the main chat. They are typically over 15 lines long and designed for users to edit, iterate on, reuse, or reference later without needing additional conversation context.

    Key characteristics of Claude Artifacts:

    • Significant, standalone content such as documents (Markdown or plain text), code snippets, single-page websites (HTML), SVG images, diagrams, flowcharts, and interactive React components.
    • Editable and version-controlled; users can ask Claude to modify artifacts, with changes reflected directly in the artifact window. Multiple versions can be managed and switched between without affecting Claude’s memory of the original content.
    • Accessible via a dedicated artifacts space in the sidebar for Free, Pro, and Max plan users; Claude for Work users create artifacts directly in chat but lack a sidebar space.
    • Artifacts can be published publicly on separate websites, shared, and remixed by other users who can build upon and modify them in new Claude conversations.
    • Some artifacts can embed AI capabilities, turning them into interactive apps powered by Claude’s text-based API, allowing users to interact with AI-powered features without needing API keys or incurring costs for the creator.

    How to create and manage artifacts:

    • Users describe what they want, and Claude generates the artifact content.
    • Artifacts can be created from scratch or by customizing existing ones.
    • Users can view underlying code, copy content, download files, and work with multiple artifacts simultaneously within a conversation.

    Editing artifacts:

    • With the analysis tool enabled, Claude supports targeted updates (small changes to specific sections) or full rewrites (major restructuring), each creating new versions accessible via a version selector.

    Use cases:

    • Generating and iterating on code snippets, documents, websites, visualizations, and interactive components.
    • Building shareable apps, tools, and content that can be collaboratively developed and reused.

    Claude Artifacts transform AI interactions into a dynamic, collaborative workspace where users can generate, edit, publish, and share complex, reusable content efficiently within the Claude AI environment.

  • Gemini CLI brings Gemini directly into developers’ terminals

    Google has just released Gemini CLI, an open-source AI agent that brings the power of Gemini 2.5 Pro directly into developers’ terminals across Windows, macOS, and Linux. This tool enables developers to interact with Gemini models locally via the command line, allowing natural language requests to explain code, write new features, debug, run shell commands, manipulate files, and automate workflows within the terminal environment.

    Key highlights of Gemini CLI include:

    • Massive 1 million token context window, enabling it to query and edit large codebases or entire repositories, surpassing typical AI coding assistants.
    • Multimodal capabilities, such as generating new applications from PDFs or sketches and integrating media generation tools like Imagen for images and Veo for video directly from the terminal.
    • Built-in integration with Google Search for real-time context grounding and support for the Model Context Protocol (MCP) to extend functionality.
    • Free usage tier offering up to 60 model requests per minute and 1,000 requests per day with a personal Google account, which is among the industry’s largest allowances for free AI coding tools.
    • Open-source availability under the Apache 2.0 license, hosted on GitHub, inviting community contributions, bug reports, and feature requests.

    Developers can install Gemini CLI with a single command and authenticate using their Google account to start using it immediately. For professional or enterprise use, higher limits and additional features are available via Google AI Studio or Vertex AI keys and Gemini Code Assist licenses.

    This release marks a significant step by Google to embed its Gemini AI models directly into developers’ existing workflows, competing with tools like OpenAI’s Codex CLI and Anthropic’s Claude Code, while transforming the terminal into an AI-powered workspace beyond just code completion.

  • Gemini Robotics On-Device

    Gemini Robotics On-Device is a new AI model from Google DeepMind designed to run directly on robots without needing an internet connection. It is a compact, efficient version of the earlier Gemini Robotics model, optimized to perform complex tasks locally on a robot’s built-in hardware, enabling low-latency, reliable operation even in environments with weak or no connectivity.

    Key features include:

    • Ability to perform detailed physical tasks such as folding clothes, unzipping bags, and assembling parts, with high precision.
    • Runs entirely on-device, processing all data locally, which enhances privacy and security—important for sensitive fields like healthcare and industrial automation.
    • Can adapt to new tasks quickly after as few as 50 to 100 demonstrations, showing flexibility and responsiveness in unfamiliar situations.
    • Originally trained on Google’s ALOHA robot but successfully adapted to other robots including the bi-arm Franka FR3 and the humanoid Apollo by Apptronik.
    • While it lacks built-in semantic safety tools present in the cloud-based Gemini Robotics model, developers are encouraged to implement their own safety controls.

    Google has also released a software development kit (SDK) to allow developers to test and fine-tune the model on various robotic platforms.

    Gemini Robotics On-Device enables advanced, privacy-conscious, and reliable robotic AI functionality locally on devices, marking a significant step for robotics in offline or connectivity-limited environments.

  • New AI model embedded in Windows 11: Microsoft Mu

    Microsoft has recently launched a new artificial intelligence model called “Mu,” which is embedded in Windows 11 to power an AI agent within the Settings app. This model represents a significant advancement in on-device AI technology for operating systems. Here are the key details about Mu:

    Overview of Mu:

    • Mu is a small language model (SLM) with 330 million parameters, designed specifically for efficient, local operation on Windows 11 devices equipped with a Neural Processing Unit (NPU), such as Microsoft’s Copilot+ PCs.
    • It uses an encoder-decoder transformer architecture, which improves efficiency by separating input token encoding from output token decoding. This design reduces latency and memory overhead, resulting in about 47% lower first-token latency and 4.7 times faster decoding speed compared to similarly sized decoder-only models (an illustrative sketch of the encoder-decoder split follows this list).
    • Mu runs entirely on the device (offline), ensuring user privacy and low-latency responses, with the ability to process over 100 tokens per second. It delivers response times under 500 milliseconds, enabling seamless natural language interaction with Windows system settings.
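
    The first-token-latency benefit of an encoder-decoder split is easiest to see structurally: the query is encoded once into a fixed memory, and each decoding step attends to that memory plus the short output generated so far. The PyTorch sketch below illustrates only this structural split; the layer counts, dimensions, missing tokenizer, and missing causal mask are simplifications for the example, not Mu’s actual configuration.

    ```python
    import torch
    import torch.nn as nn

    d_model, nhead = 256, 4   # placeholder sizes, not Mu's real configuration

    # Encoder: processes the whole query ("turn on dark mode") in parallel, once.
    enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
    encoder = nn.TransformerEncoder(enc_layer, num_layers=4)

    # Decoder: generates the short output while attending to the cached encoder memory.
    dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
    decoder = nn.TransformerDecoder(dec_layer, num_layers=2)

    query_embeddings = torch.randn(1, 32, d_model)   # stand-in for an embedded settings query
    memory = encoder(query_embeddings)               # computed a single time per request

    generated = torch.randn(1, 1, d_model)           # stand-in for a start-of-output embedding
    for _ in range(8):                               # causal masking omitted for brevity
        step = decoder(generated, memory)            # every step reuses the fixed `memory`
        generated = torch.cat([generated, step[:, -1:, :]], dim=1)
    ```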

    Functionality and Use Case:

    • The primary use of Mu is to power an AI assistant integrated into the Windows 11 Settings app, allowing users to control hundreds of system settings through natural language commands. For example, users can say “turn on dark mode” or “increase screen brightness,” and Mu will directly execute these commands without manual navigation through menus (a simplified, hypothetical sketch of this query-to-action mapping follows this list).
    • The AI agent is integrated into the existing search box in Settings, providing a smooth and intuitive user experience. It understands complex queries and maps them accurately to system functions, substantially simplifying system configuration for users.
    • Mu is particularly optimized for multi-word queries and more complex input-output relationships, while shorter or partial-word inputs still rely on traditional lexical and semantic search results within the Settings app.
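
    As a purely hypothetical illustration of what mapping a query to a settings function can look like, the sketch below turns free-form text into a structured action and then applies it. Every name here is invented for the example and is unrelated to Windows internals or Mu’s real implementation.

    ```python
    from dataclasses import dataclass

    @dataclass
    class SettingsAction:          # hypothetical structured output
        setting: str               # e.g. "appearance.theme"
        value: str                 # e.g. "dark"

    def model_predict(query: str) -> SettingsAction:
        """Stand-in for the language model: maps free-form text to a structured action."""
        q = query.lower()
        if "dark mode" in q:
            return SettingsAction("appearance.theme", "dark")
        if "brightness" in q:
            return SettingsAction("display.brightness", "increase")
        raise ValueError("fall back to ordinary lexical/semantic settings search")

    def apply_action(action: SettingsAction) -> None:
        """Stand-in for the Settings app executing the mapped action."""
        print(f"set {action.setting} = {action.value}")

    apply_action(model_predict("turn on dark mode"))   # -> set appearance.theme = dark
    ```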

    Development and Training:

    • Mu was trained on NVIDIA A100 GPUs using Azure Machine Learning, starting with pre-training on hundreds of billions of high-quality educational tokens to learn language syntax, semantics, and world knowledge.
    • The model was further refined through distillation from Microsoft’s larger Phi models, enabling Mu to achieve comparable performance to a Phi-3.5-mini model despite being only one-tenth its size (a generic sketch of this kind of distillation loss follows this list).
    • Extensive fine-tuning was performed with over 3.6 million training samples covering hundreds of Windows settings, using advanced techniques like synthetic data labeling, prompt tuning, noise injection, and smart sampling to meet precision and latency targets.
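
    Distillation of the kind mentioned above, where a small student is trained against a larger teacher such as Phi, is commonly implemented as a blend of a KL term on the teacher’s temperature-softened token distribution and ordinary cross-entropy on the labels. The snippet below is that generic recipe, not Microsoft’s training code; the temperature and weighting are placeholder choices.

    ```python
    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        """Generic soft-label distillation (assumed shapes: logits [batch, vocab], labels [batch])."""
        soft_teacher = F.softmax(teacher_logits / T, dim=-1)
        log_soft_student = F.log_softmax(student_logits / T, dim=-1)
        kl = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kl + (1.0 - alpha) * ce
    ```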

    Strategic Importance:

    • Mu exemplifies Microsoft’s push toward privacy-preserving, local AI processing, reducing reliance on cloud connectivity and enhancing user data security by keeping all processing on-device.
    • This approach also improves responsiveness and usability, making Windows 11 more accessible and user-friendly, especially for those who may find traditional settings menus complex or cumbersome.
    • Mu builds on Microsoft’s earlier on-device AI efforts, such as the Phi Silica model, and signals a broader strategy to embed efficient AI capabilities directly into hardware-equipped PCs, particularly those with dedicated NPUs.

    Availability:

    • The AI-powered Settings agent powered by Mu is currently available for testing to Windows Insiders in the Dev Channel on Copilot+ PCs running Windows 11 Build 26120.3964 (KB5058496) or later.

    Microsoft’s Mu is a cutting-edge, compact AI language model embedded in Windows 11 that enables natural language control of system settings with high efficiency, privacy, and responsiveness. It marks a significant step forward in integrating intelligent, local AI agents into mainstream operating systems.

  • PROSE, a new AI technique introduced by Apple

    Apple PROSE is a new AI technique introduced by Apple researchers that enables AI to learn and mimic a user’s personal writing style by analyzing past emails, notes, and documents. The goal is to make AI-generated text—such as emails, messages, or notes—sound more natural, personalized, and consistent with the user’s unique tone, vocabulary, and sentence structure.

    How PROSE Works

    • PROSE (Preference Reasoning by Observing and Synthesising Examples) builds a detailed writing profile by studying a user’s historical writing samples.
    • It iteratively refines AI-generated drafts by comparing them against the user’s real writing, adjusting tone and style until the output closely matches the user’s natural voice (see the sketch after this list).
    • The system uses a new benchmark called PLUME to measure how well it captures and reproduces individual writing styles.
    • PROSE can adapt to different writing contexts, whether formal, casual, emoji-rich, or precise.
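
    The observe, draft, compare, refine loop described above can be pictured with a short sketch. Everything below is hypothetical scaffolding: generate stands in for any LLM call and style_score for some measure of similarity to the user’s writing; it illustrates the loop only, not Apple’s actual PROSE implementation or the PLUME benchmark.

    ```python
    from typing import Callable, List

    def refine_to_style(
        task: str,
        user_samples: List[str],
        generate: Callable[[str], str],                   # stand-in for any LLM call
        style_score: Callable[[str, List[str]], float],   # crude similarity to the user's writing
        max_rounds: int = 3,
        good_enough: float = 0.8,
    ) -> str:
        """Hypothetical observe -> draft -> compare -> refine loop."""
        profile = "Observed style notes: " + " | ".join(s[:80] for s in user_samples)
        draft = generate(f"{profile}\nWrite in this style: {task}")
        for _ in range(max_rounds):
            if style_score(draft, user_samples) >= good_enough:
                break
            draft = generate(
                f"{profile}\nRevise the draft so it better matches the user's usual "
                f"tone, vocabulary, and sentence length:\n{draft}"
            )
        return draft
    ```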

    Integration and Impact

    • Apple plans to integrate PROSE into apps like Mail and Notes, allowing AI-assisted writing to feel more authentic and personalized.
    • When combined with foundation models like GPT-4o, PROSE has shown significant improvements in personalization accuracy—outperforming previous personalization methods by a notable margin.
    • This approach aligns with Apple’s broader AI strategy focused on privacy, on-device intelligence, and delivering AI experiences that feel personal and user-centric.

    Apple PROSE represents a shift from generic AI writing toward truly personalized AI assistance that writes “like you.” By learning from your own writing style, it promises to make AI-generated text more natural, consistent, and reflective of individual personality—enhancing everyday communication while maintaining Apple’s strong privacy standards.

  • Veo 3 coming directly to YouTube Shorts in summer 2025

    Google is set to integrate its advanced AI video generation tool, Veo 3, directly into YouTube Shorts later this summer (2025). This integration will allow creators to generate full-fledged short-form videos—including both visuals and audio—using only text prompts, significantly lowering the barrier to content creation on Shorts.

    Key Details of the Integration

    • Launch Timing: Expected over the summer of 2025.
    • Capabilities: Veo 3 can create complete short videos with sound based on text inputs, advancing beyond Google’s earlier Dream Screen feature that only generated AI backgrounds.
    • Impact on Creators: This will empower more creators, including those without extensive video production skills, to produce engaging Shorts content quickly and creatively.
    • YouTube Shorts Growth: Shorts currently averages over 200 billion daily views, making it a crucial platform for short-form video content and a prime target for AI-powered content creation tools.
    • Access and Cost: Currently, Veo 3 video generation requires subscription plans like Google AI Pro or AI Ultra. It is not yet clear if or how this will affect cost or accessibility for typical Shorts users.
    • Content Concerns: The integration has sparked debate about originality, content quality, and the potential flood of AI-generated videos, which some critics call “AI slop.” YouTube is reportedly working on tools to prevent misuse, such as unauthorized deepfakes of celebrities.
    • Technical Adjustments: Veo 3’s output is being adapted to fit Shorts’ vertical video format and length constraints (up to 60 seconds).

    Google’s Veo 3 and YouTube Shorts will indeed be integrated, creating a powerful synergy where creators can produce AI-generated short videos directly within the Shorts platform. This move aims to unlock new creative possibilities and democratize video content creation, while also raising questions about content authenticity and platform dynamics.

  • Google Experiments with AI Audio Search Results

    Google has recently launched an experimental feature called Audio Overviews in its Search Labs, which uses its latest Gemini AI models to generate short, conversational audio summaries of certain search queries.

    Key Features of Google Audio Overviews

    • Audio Summaries: The feature creates 30 to 45-second podcast-style audio explanations that provide a quick, hands-free overview of complex or unfamiliar topics, helping users “get a lay of the land” without reading through multiple pages.
    • Conversational Voices: The audio is generated as a dialogue between two AI-synthesized voices, often in a question-and-answer style, making the summary engaging and easy to follow.
    • Playback Controls: Users can control playback with play/pause, volume, mute, and adjust the speed from 0.25x up to 2x, enhancing accessibility and convenience.
    • Integration with Search Results: While listening, users see relevant web pages linked within the audio player, allowing them to easily explore sources or fact-check information.
    • Availability: Currently, Audio Overviews are available only to users in the United States who opt into the experiment via Google Search Labs. The feature supports English-language queries and appears selectively for search topics where Google’s system determines it would be useful.
    • Generation Time: After clicking the “Generate Audio Overview” button, it typically takes up to 40 seconds for the audio clip to be created and played.

    Use Cases and Benefits

    • Ideal for multitasking users who want to absorb information hands-free.
    • Helps users unfamiliar with a topic quickly understand key points.
    • Provides an alternative to reading long articles or search result snippets.
    • Enhances accessibility for users who prefer audio content.

    Audio Overviews first appeared in Google’s AI note-taking app, NotebookLM, and were later integrated into the Gemini app with additional interactive features. The current Search Labs version is a simpler implementation focused on delivering quick audio summaries directly within Google Search.

    Google’s Audio Overviews in Search Labs represent a novel AI-powered approach to transforming search results into engaging, podcast-like audio summaries, leveraging Gemini models for natural conversational delivery and offering a convenient, hands-free way to consume information.