Category: News

  • OpenAI is developing its own AI inference chip

    Reports confirm OpenAI is advancing its first custom AI chip, focused on inference (running trained models for predictions and decisions), in collaboration with Broadcom for design and intellectual property (IP) and TSMC for manufacturing on a 3nm process node. Mass production is targeted for 2026. The project is led by former Google engineer Richard Ho, who heads a team of about 40 specialists, many with experience from Google’s Tensor Processing Units (TPUs). This initiative aims to reduce OpenAI’s heavy reliance on Nvidia GPUs, which dominate the AI hardware market but face shortages and high costs.

    Key Developments from Recent Reports (September 2025)

    • Partnership Confirmation and $10B Deal: On September 5, 2025, the Financial Times and Reuters reported that OpenAI is finalizing the chip design in the coming months, with Broadcom providing engineering support and TSMC handling fabrication. Broadcom’s CEO Hock Tan disclosed a $10 billion order from a new AI client (widely identified as OpenAI) during an earnings call, boosting Broadcom’s AI revenue projections for fiscal 2026. This deal focuses on custom “XPUs” (AI processors) for internal use, not commercial sale, emphasizing inference workloads with potential for scaled training. OpenAI has scaled back earlier ambitions to build its own foundries due to costs exceeding hundreds of millions per iteration, opting instead for this partnership model.
    • Team and Technical Specs: Led by Richard Ho (ex-Google TPU head), the team includes engineers like Thomas Norrie. The chip features a systolic array architecture (similar to Google’s TPUs for efficient matrix computations; a toy dataflow sketch follows this list), high-bandwidth memory (HBM, possibly HBM3E or HBM4), and integrated networking. It’s optimized for OpenAI’s models like GPT-4 and beyond, with initial small-scale deployment for inference to test viability. Analysts note risks, including potential delays or underperformance on the first tape-out (design finalization for production), as seen in other custom chip efforts by Microsoft and Meta.
    • Market Impact: Broadcom shares surged over 10% on September 5, reaching a $1.7 trillion market cap, while Nvidia and AMD dipped ~2-3% amid concerns over custom silicon eroding Nvidia’s 80%+ market share. HSBC analysts predict the custom AI chip market could surpass Nvidia’s GPU business by 2026. OpenAI’s move ties into broader AI infrastructure pushes, including the $500B Stargate project (with Oracle) and collaborations like Microsoft’s Maia chips.
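
    For readers unfamiliar with the term, a systolic array multiplies matrices by pulsing operands through a grid of multiply-accumulate cells, so every value fetched from memory is reused across many operations, which is what makes TPU-style chips efficient at inference. Below is a toy Python sketch of the generic output-stationary dataflow; it illustrates the technique only and is not based on any disclosed detail of OpenAI’s chip.

    ```python
    import numpy as np

    def systolic_matmul(A, B):
        """Toy output-stationary systolic array computing C = A @ B.

        Cell (i, j) keeps a running sum. A streams in from the left (skewed
        by row) and B from the top (skewed by column), so operand pair k
        reaches cell (i, j) at time step t = i + j + k, and each cell does
        at most one multiply-accumulate per step.
        """
        n, k_dim = A.shape
        _, m = B.shape
        C = np.zeros((n, m))
        for t in range(n + m + k_dim):  # enough steps for all operands to drain
            for i in range(n):
                for j in range(m):
                    k = t - i - j  # which operand pair reaches cell (i, j) now
                    if 0 <= k < k_dim:
                        C[i, j] += A[i, k] * B[k, j]
        return C

    A, B = np.random.rand(4, 3), np.random.rand(3, 5)
    assert np.allclose(systolic_matmul(A, B), A @ B)
    ```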

    Broader Context and Challenges

    OpenAI’s compute costs are massive—projected $5B loss in 2024 on $3.7B revenue—driving this diversification. The company is also integrating AMD’s MI300X chips via Azure for training, complementing Nvidia. Geopolitical risks (e.g., TSMC’s Taiwan base) and high development costs (~$500M+ per chip version, plus software) loom, but success could enhance bargaining power and efficiency. No official OpenAI statement yet, but industry sources indicate tape-out soon, with prototypes possible by late 2025.

    This positions OpenAI alongside Google, Amazon, and Meta in the custom silicon race, potentially reshaping AI hardware dynamics. Updates could emerge from upcoming tech conferences or earnings.

  • Qwen3-Max-Preview, the preview release of Alibaba’s Qwen3-Max, is now live on OpenRouter

    Qwen3-Max-Preview is the preview release of Alibaba’s Qwen3-Max, the flagship model in the Qwen3 series developed by Alibaba Cloud’s Qwen team. It’s a massive Mixture-of-Experts (MoE) large language model with over 1 trillion parameters, designed for advanced reasoning, instruction following, and multimodal tasks. Key features include:

    • Improvements over prior versions: Major gains in math, coding, logic, science accuracy; better multilingual support (100+ languages, including strong Chinese/English handling); reduced hallucinations; higher-quality open-ended responses for Q&A, writing, and conversation.
    • Optimizations: Excels in retrieval-augmented generation (RAG), tool calling, and long-context understanding (up to 256K tokens, extendable to 1M). It lacks a dedicated “thinking” mode but focuses on efficient, reliable outputs.
    • Architecture: Built on Qwen3’s MoE framework (a toy routing sketch follows this list), pretrained on trillions of tokens with Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). It’s positioned as a high-capacity model for complex, multi-step tasks, competing with top closed-source LLMs like GPT-4 or Claude 3.5.
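
    To make the MoE idea concrete, the sketch below shows generic top-k routing in Python/NumPy: a gating network scores the experts for each token, only the top-k experts execute, and their outputs are combined with normalized gate weights. The expert count, dimensions, and gating here are illustrative placeholders, not Qwen3-Max’s actual (undisclosed) configuration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    d_model, n_experts, top_k = 8, 4, 2

    # Illustrative placeholders: each "expert" is a tiny linear layer and the
    # gate is a linear scorer. Real MoE experts are full feed-forward blocks.
    experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
    W_gate = rng.normal(size=(d_model, n_experts))

    def moe_layer(x):
        """Route one token vector through its top-k experts (toy forward pass)."""
        logits = x @ W_gate                # gate score for every expert
        idx = np.argsort(logits)[-top_k:]  # indices of the top-k experts
        weights = np.exp(logits[idx])
        weights /= weights.sum()           # softmax over the selected experts only
        # Only the chosen experts run, so most parameters sit idle per token;
        # this is how a 1T-parameter MoE keeps per-token compute affordable.
        return sum(w * (x @ experts[i]) for w, i in zip(weights, idx))

    token = rng.normal(size=d_model)
    print(moe_layer(token).shape)  # (8,)
    ```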

    This preview allows early testing before full release, emphasizing production usability over experimental features.

    News: Now Live on OpenRouter

    As of September 5, 2025, Qwen3-Max-Preview became available on OpenRouter, a unified API platform for 400+ AI models. Alibaba’s official Qwen account confirmed the launch, highlighting the model’s strengths in reasoning and tool use. OpenRouter integration enables easy access via OpenAI-compatible APIs, with tiered, token-based pricing (rates vary by provider and by input/output volume, starting low for previews). Users can route requests through OpenRouter for vendor-agnostic setups, avoiding lock-in.

    • Access Details: Available at openrouter.ai/models (search “Qwen3-Max”) or directly via the API endpoint. Free tiers may have limits; paid starts at ~$1.60/M input tokens. It’s also accessible via Qwen Chat (interactive UI) and Alibaba Cloud (enterprise IAM). A minimal API-call sketch follows this list.
    • Community Buzz: Early X posts praise its potential for coding/programming (e.g., “saves my programmer life?”), with calls for benchmarks. No major issues reported yet, but expect high compute costs due to scale.
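
    Since OpenRouter exposes an OpenAI-compatible API, the standard openai Python SDK works once the base URL is overridden. A minimal sketch, assuming the model slug qwen/qwen3-max (check openrouter.ai/models for the exact identifier) and an OpenRouter key in place of the placeholder:

    ```python
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
        api_key="sk-or-...",                      # placeholder: your OpenRouter key
    )

    # The model slug is an assumption; search openrouter.ai/models for "Qwen3-Max".
    response = client.chat.completions.create(
        model="qwen/qwen3-max",
        messages=[{"role": "user", "content": "Summarize the Qwen3-Max-Preview release."}],
    )
    print(response.choices[0].message.content)
    ```

    Swapping base_url back to any other compatible provider is all it takes to move the same code between vendors, which is the lock-in avoidance noted above.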

    This rollout positions Qwen3-Max-Preview as a key player in the frontier LLM race (unlike smaller Qwen3 releases, the Max tier is offered via API rather than as open weights), with full Qwen3 updates (e.g., thinking modes) expected soon.

  • Rumors About Gemini 3.0 on OpenRouter (Sonoma Alpha and Sonoma Sky Alpha)

    As of September 6, 2025, there’s active speculation in the AI community that Google’s upcoming Gemini 3.0 model (or an early version of it) has been quietly released on OpenRouter under disguised names. OpenRouter, a platform aggregating access to hundreds of AI models via a unified API, announced two new “stealth models” yesterday: Sonoma Dusk Alpha and Sonoma Sky Alpha. These are free to use, support a massive 2 million token context window, and are described as “maximally intelligent,” with prompts logged by the creator for training—features that align closely with expected Gemini 3 specs.

    Key Details from the Announcement and Speculation

    • Announcement: OpenRouter posted about these models on September 5, 2025, calling them “stealth” (implying anonymity to avoid direct attribution). They emphasize high intelligence, 2M context (double the 1M seen in Gemini 2.5 Pro), and free access, but note that the provider handles logging for improvement. (A quick spot-check against OpenRouter’s public model catalog follows this list.)
    • Why Gemini 3.0?
      • Leaks from July-August 2025 referenced “gemini-beta-3.0-pro” and “gemini-beta-3.0-flash” in Google’s internal code (e.g., Gemini CLI repo), hinting at variants with enhanced reasoning (“Deep Think”) and multi-million token contexts—matching the 2M here.
      • Community tests and posts suggest strong performance in reasoning, speed, and multimodal tasks, outperforming current Gemini 2.5 models but falling short of full Gemini-like polish in some outputs (e.g., one user called it “disappointing” compared to known Gemini quality).
      • The “Sonoma” naming (evoking California’s wine country, near Google’s HQ) fuels the theory, as does the free tier—Google has previously offered experimental Gemini models for free on OpenRouter to gather data (e.g., Gemini 2.5 Pro Experimental in March 2025).
    • Alternative Theories: Not everyone agrees—some speculate it’s an xAI Grok variant (due to “maximally intelligent” phrasing echoing xAI’s ethos) or a new Chinese model. However, the 2M context and free logging point more toward Google testing pre-release.
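
    The listing itself is easy to spot-check: OpenRouter publishes its catalog at a public /api/v1/models endpoint. A small sketch that searches for the Sonoma entries and prints their advertised context windows; the field names follow OpenRouter’s documented response shape and should be treated as assumptions if the schema has changed:

    ```python
    import requests

    # OpenRouter's public model catalog; no API key needed for listing.
    resp = requests.get("https://openrouter.ai/api/v1/models", timeout=30)
    resp.raise_for_status()

    for model in resp.json()["data"]:
        if "sonoma" in model["id"].lower():
            # context_length should read 2,000,000 if the 2M-token claim holds.
            print(model["id"], model.get("context_length"))
    ```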

    Broader Gemini 3.0 Context

    Google hasn’t officially announced Gemini 3.0, but rumors from mid-2025 predict a late 2025 release (preview in December, full in early 2026), building on Gemini 2.5’s “thinking” mode with:

    • Trillion-parameter scale for superior reasoning in code, math, and multimodality (text, images, video, 3D).
    • Integrated self-correction to reduce hallucinations.
    • On-device variants like Gemini Nano 3 for Pixels.

    These stealth models could be betas, allowing Google to benchmark against rivals like GPT-5 or Grok 4 without fanfare. OpenRouter’s history (e.g., hosting “Quasar Alpha” in April 2025, speculated as GPT-5/Gemini 3) supports this pattern of anonymous drops.

  • OpenAI for Science: Pioneering AI-Driven Scientific Discovery

    OpenAI announced the launch of OpenAI for Science, an ambitious initiative to accelerate scientific discovery through artificial intelligence, as revealed in a post by Chief Product Officer Kevin Weil on X. The program aims to build an AI-powered platform described as “the next great scientific instrument,” leveraging the advanced reasoning capabilities of GPT-5 to assist researchers in formulating hypotheses, designing experiments, and analyzing data. While a specific timeline for the platform’s release remains undisclosed, OpenAI plans to share more details in the coming months, signaling a transformative step toward automating scientific processes.

    The initiative builds on OpenAI’s prior successes in applying AI to science, notably its collaboration with Retro Biosciences, where a custom model, GPT-4b micro, achieved a 50x improvement in stem cell reprogramming markers, outperforming human-designed proteins. This work, published on August 22, 2025, demonstrated AI’s potential to accelerate breakthroughs in biology, with applications in longevity research. OpenAI for Science extends this vision, targeting fields like physics, chemistry, and mathematics, where GPT-5 has shown promise, such as suggesting proof ideas in theoretical physics. The platform will integrate tools like the “deep research” model, launched in February 2025, which synthesizes cited, multi-page reports from web data, aiding literature reviews and niche information retrieval.

    OpenAI is recruiting “AI-pilled” academics to join the effort, emphasizing interdisciplinary collaboration to tackle high-impact challenges. The initiative complements existing programs like NextGenAI, launched in March 2025, which provided $50 million to institutions like MIT and Harvard for AI-driven research in healthcare, education, and more. Unlike NextGenAI’s focus on institutional partnerships, OpenAI for Science prioritizes a unified platform to streamline scientific workflows, potentially reducing the 45% of researcher time spent on grant writing.

    Sentiment on X is optimistic, with users like @BorisMPower praising the initiative’s potential to revolutionize science, though some express concerns about GPT-5’s mixed reception, citing its inconsistent performance compared to GPT-4o. Critics also highlight OpenAI’s shift from safety-focused commitments, referencing a July 2025 lawsuit in Hawaii alleging inadequate safeguards in ChatGPT’s deployment. Despite these concerns, the initiative’s focus on verifiable outputs with clear citations aims to address transparency issues.

    OpenAI for Science positions the company as a leader in AI-driven discovery, competing with Google’s DeepMind, whose AlphaFold won a Nobel Prize. By harnessing GPT-5’s reasoning and integrating it into a dedicated platform, OpenAI aims to empower researchers globally, though its success hinges on addressing technical and ethical challenges in the rapidly evolving AI landscape.

  • Tencent’s HunyuanWorld-Voyager: Open-Source AI Turns Images into 3D Worlds

    Tencent’s Hunyuan team released HunyuanWorld-Voyager, an open-source AI model that transforms single images into explorable 3D worlds, marking a breakthrough in generative AI. Announced on X by @TencentHunyuan, the model generates 3D-consistent RGB-D video sequences and point clouds, enabling users to navigate virtual environments with user-defined camera paths. Available on GitHub and Hugging Face, HunyuanWorld-Voyager has topped the WorldScore benchmark with a score of 77.62, surpassing competitors like WonderWorld (72.69) and CogVideoX-I2V (62.15), excelling in style consistency (84.89) and object control (66.92).

    The model’s core innovation lies in its ability to create geometry-consistent 3D scenes from a single image, bypassing traditional modeling pipelines. It uses a video diffusion framework with synchronized RGB and depth outputs, supported by a world-caching system and autoregressive sampling to maintain spatial coherence over long camera trajectories. This enables applications in game development, virtual reality (VR), and augmented reality (AR), allowing developers to prototype immersive worlds or generate cinematic fly-throughs rapidly. For instance, a user can upload an image of a forest and explore it as a 3D environment with accurate depth and perspective, exportable as meshes for Unity or Unreal Engine.
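
    The RGB-D pairing is what makes the 3D reconstruction tractable: given a depth map and pinhole camera intrinsics, every pixel back-projects to a 3D point. The sketch below shows that standard back-projection step in Python; it is generic textbook math, not Voyager’s actual pipeline.

    ```python
    import numpy as np

    def depth_to_point_cloud(depth, rgb, fx, fy, cx, cy):
        """Back-project an RGB-D frame into an N x 6 (XYZ + RGB) point cloud.

        depth: (H, W) metric depth; rgb: (H, W, 3); fx, fy, cx, cy: pinhole intrinsics.
        """
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth
        x = (u - cx) * z / fx  # standard pinhole back-projection
        y = (v - cy) * z / fy
        points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
        colors = rgb.reshape(-1, 3)
        valid = points[:, 2] > 0  # drop pixels with no depth estimate
        return np.hstack([points[valid], colors[valid]])

    # Toy 2x2 frame with unit depth and centered principal point.
    cloud = depth_to_point_cloud(np.ones((2, 2)), np.zeros((2, 2, 3)), 1.0, 1.0, 0.5, 0.5)
    print(cloud.shape)  # (4, 6)
    ```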

    HunyuanWorld-Voyager builds on Tencent’s HunyuanWorld 1.0, released in July 2025, which focused on static 3D mesh generation from text or images. Voyager extends this by offering dynamic, long-range exploration with real-time depth estimation, ideal for VR experiences and robotic navigation. However, its high computational demands—requiring at least 60GB of GPU memory for 540p resolution—limit accessibility to well-equipped labs or enterprises. Licensing restrictions also prohibit use in the EU, UK, and South Korea, and commercial applications with over 100 million monthly users require separate approval.

    X users, like @Hathibel, have shared demos, such as a 3D Alaskan town generated from a text prompt, praising its visual quality despite high VRAM usage (33GB). Critics note that the model produces 2D video frames mimicking 3D movement rather than true 3D models, with each generation limited to 49 frames (about two seconds), though clips can be chained for longer sequences. Compared to Google’s Genie 3 or Dynamics Lab’s Mirage 2, Voyager’s open-source nature and direct 3D reconstruction set it apart, though it lags slightly in camera control (85.95 vs. WonderWorld’s 92.98).

    Tencent’s open-source strategy, including code, weights, and documentation, aims to democratize 3D content creation, fostering collaboration in gaming, VR, and simulation. As the first open-source model of its kind, HunyuanWorld-Voyager challenges proprietary systems, but its hardware demands and regional restrictions may hinder widespread adoption.

  • DeepSeek’s Advanced AI Agent Set to Challenge OpenAI by Q4 2025

    Bloomberg reported that Chinese AI startup DeepSeek is developing an advanced artificial intelligence agent model aimed at rivaling U.S. giants like OpenAI, with a planned release in the fourth quarter of 2025. The Hangzhou-based company, founded by Liang Wenfeng in July 2023, is designing this model to perform complex, multi-step tasks with minimal human input, learning and improving from past actions. This move positions DeepSeek at the forefront of the global race to create autonomous AI agents, considered the next evolution of AI technology.

    Unlike traditional chatbots, DeepSeek’s new model will execute sophisticated tasks such as researching travel plans or debugging code, aligning with industry trends seen in recent agent-focused releases from OpenAI, Anthropic, and Microsoft. The model builds on DeepSeek’s January 2025 release of DeepSeek-R1, a reasoning-focused model that matched OpenAI’s o1 in benchmarks like MATH-500, costing just $6 million to train compared to over $100 million for OpenAI’s GPT-4. DeepSeek’s efficiency stems from innovative techniques like mixture-of-experts (MoE) layers and optimized chip use, despite U.S. export controls limiting access to advanced Nvidia chips. The company leveraged a stockpile of 10,000 A100 chips and lower-power H800 chips to achieve this.

    The upcoming agent model, not yet named, is expected to enhance DeepSeek’s reputation for cost-effective, high-performing AI. X posts reflect excitement, with users like @zijing_wu noting China’s push to triple AI chip production to support DeepSeek’s ambitions, including native support for the UE8M0 FP8 format for faster processing. However, some skepticism persists, with posts citing DeepSeek’s relatively slow pace of updates compared to rivals like Alibaba’s Qwen. The company has also implemented strict policies, mandating visible and hidden markers like “AI-generated” labels to prevent misuse, backed by Chinese regulations.
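
    For context on UE8M0: in OCP microscaling terminology, E8M0 is an unsigned, exponent-only 8-bit scale format, so each code represents a power-of-two scale factor (value = 2^(code − 127)) shared by a block of low-precision values. The toy encoder/decoder below assumes those semantics; DeepSeek has not published exactly how its models use the format.

    ```python
    import math

    BIAS = 127  # assumed exponent bias, matching IEEE-754 single precision

    def ue8m0_encode(scale: float) -> int:
        """Round a positive scale factor to the nearest power of two (8-bit code)."""
        exp = round(math.log2(scale))
        return max(0, min(255, exp + BIAS))  # clamp into the unsigned 8-bit range

    def ue8m0_decode(code: int) -> float:
        return 2.0 ** (code - BIAS)

    # A block scale of ~0.3 snaps to the nearest representable power of two, 0.25.
    code = ue8m0_encode(0.3)
    print(code, ue8m0_decode(code))  # 125 0.25
    ```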

    DeepSeek’s focus on AI agents aligns with a broader industry shift toward automation, though current agents often require significant oversight. The Q4 2025 release could intensify competition, especially as OpenAI faces scrutiny over high development costs. If successful, DeepSeek’s model may further disrupt the AI landscape, building on its R1 success that triggered a $1 trillion tech stock sell-off in January 2025, including a record $593 billion single-day loss for Nvidia. As DeepSeek advances, its open-source approach and efficiency could redefine global AI innovation.

  • OpenAI to Launch Jobs Platform and Certification Program Focused on AI Skills

    OpenAI unveiled plans to launch an AI-powered jobs platform in mid-2026, designed to connect employers with candidates skilled in artificial intelligence, alongside a certification program to train workers in AI fluency. Announced by Fidji Simo, OpenAI’s CEO of Applications, during a White House task force meeting on AI and education hosted by First Lady Melania Trump, these initiatives aim to address the growing demand for AI expertise while mitigating job market disruptions. OpenAI’s goal is to certify 10 million Americans by 2030, partnering with major organizations like Walmart, John Deere, Boston Consulting Group, and Indeed to ensure relevance and impact.

    The OpenAI Jobs Platform will go beyond traditional job boards, using AI to match candidates with businesses and government agencies based on verified AI skills. It includes a dedicated track for small businesses and local governments, fostering inclusive access to AI talent. Unlike LinkedIn, which OpenAI’s backer Microsoft owns, this platform emphasizes AI-specific competencies, potentially positioning OpenAI as a direct competitor in professional networking. The certification program, an extension of the free OpenAI Academy launched in 2024, offers training from basic AI usage to advanced skills like prompt engineering. Candidates can prepare using ChatGPT’s Study Mode, and companies can integrate certifications into their learning programs. Walmart will provide free training to its 1.6 million U.S. employees, who already use AI for scheduling and inventory management.

    The announcement aligns with the White House’s push for AI literacy, with OpenAI committing to support broader economic opportunities. Simo acknowledged AI’s disruptive potential, noting it could eliminate up to 50% of entry-level white-collar jobs by 2030, as per Anthropic’s CEO Dario Amodei. However, she emphasized that AI will also create new roles, with studies from Lightcast showing AI-skilled workers earn higher salaries. X posts reflect enthusiasm, with users like @fidjissimo praising the initiative’s potential to empower workers, though some express skepticism about accessibility and certification costs for non-partnered organizations.

    OpenAI’s move comes amid challenges, including a lawsuit over ChatGPT’s safety and accusations of talent poaching by Meta. By focusing on upskilling and job matching, OpenAI aims to shape the AI-driven economy, but questions remain about the platform’s scalability and global reach. As the company strengthens ties with Washington, including a $200 million Department of Defense contract, these initiatives signal a strategic expansion beyond ChatGPT, aiming to define how workers navigate an AI-transformed workplace.

  • Google Photos Enhances Photo-to-Video with Veo 3 AI Model for Stunning Clips

    Google announced a significant upgrade to Google Photos’ photo-to-video feature, integrating its advanced Veo 3 AI model to deliver higher-quality video clips. The update, rolling out to U.S. users on Android and iOS, enhances the existing tool that transforms still images into short videos, now producing sharper, more realistic four-second clips without audio. Accessible via the new Create tab in the Google Photos app, the feature is free with a limited number of daily generations, while Google AI Pro and Ultra subscribers enjoy higher limits. With over 1.5 billion monthly active users as of May 2025, Google Photos is leveraging Veo 3 to solidify its position as a creative powerhouse.

    The photo-to-video tool, first introduced in July 2025 with the Veo 2 model, allows users to select a photo and choose between two prompts: “Subtle movement” for realistic animations or “I’m feeling lucky” for dynamic effects like dancing subjects or confetti showers. Veo 3, unveiled at Google’s I/O conference in May, improves resolution and realism, outpacing competitors like OpenAI’s Sora, according to TechRadar. All generated videos include a visible “Veo” watermark and an invisible SynthID digital watermark to ensure transparency about their AI-generated origin. The Create tab also houses other AI-driven tools, such as Remix for transforming photos into styles like anime or 3D animations, collage creation, cinematic 3D photos, and GIF-making.

    X posts reflect excitement about the update, with users like @heyshrutimishra praising the ease of creating animated clips for social media or presentations. However, some express frustration over the U.S.-only rollout and the lack of audio support, hoping for global expansion soon, as seen with Google’s NotebookLM now supporting 80 languages. Google’s strategy to embed Veo 3 across platforms like YouTube Shorts and Google Vids, as noted by @GoogleWorkspace, underscores its push to make AI accessible, though free-tier limitations may nudge users toward paid subscriptions.

    The Veo 3 integration transforms Google Photos from a storage app into a creative suite, enabling users to reimagine memories dynamically. While the four-second clip duration and daily generation caps for free users pose constraints, the enhanced realism and centralized Create tab make it a compelling tool for casual and professional creators alike. As Google continues to refine its AI offerings, this update signals a broader vision to integrate generative AI into everyday consumer experiences, setting the stage for further innovations.

  • Apple’s AI Search Tool: Google Partnership Fuels Siri Overhaul

    On September 3, 2025, Apple announced plans to launch an AI-powered web search tool in 2026, internally dubbed “World Knowledge Answers,” intensifying competition with OpenAI and Perplexity AI. The tool will be integrated into Siri, with potential expansion to Safari and iPhone’s Spotlight search, marking Apple’s boldest move into AI-driven search. A key element of this initiative is a formal agreement with Google, signed this week, allowing Apple to test Google’s Gemini AI model to power parts of the revamped Siri. This partnership, reported by Bloomberg, leverages Google’s expertise in generative AI while Apple maintains control over user data through its Private Cloud Compute servers.

    The new search system aims to transform Siri into an “answer engine,” offering text, photo, video, and local point-of-interest results with AI-powered summarization for faster, more accurate responses. Unlike the current Siri, which handles basic queries, this overhaul—codenamed Linwood and LLM Siri—will tap web and personal data for contextual answers and improved device navigation. Apple is also exploring Anthropic’s Claude and its own Apple Foundation Models for specific functions, ensuring privacy for user data searches. The initiative follows a May 2025 disclosure by Apple’s Eddy Cue, who noted a dip in Safari searches due to growing AI tool usage, hinting at partnerships with AI providers like OpenAI and Perplexity.

    This move comes amid a shifting relationship with Google. Apple’s $20 billion annual deal to make Google the default Safari search engine faced scrutiny in a U.S. Justice Department antitrust lawsuit, but a September 2 ruling preserved the agreement, easing Apple’s urgency to develop a fully in-house solution. X posts reflect mixed sentiment: some users, like @amitisinvesting, see the Google partnership as a bullish sign for both companies, while others, like @ns123abc, speculate it signals Apple’s lag in AI development. Critics argue Apple’s reliance on external models could compromise its privacy-first ethos, though its on-device processing aims to mitigate this.

    The search tool’s launch, expected with iOS 26.4 in spring 2026, aligns with a broader Siri redesign, including a visual overhaul and plans for a health AI agent in 2026. Apple’s stock rose 3.8% to $238.47 on the news, marking its biggest single-day gain in a month. As Apple races to catch up in AI, this Google partnership underscores a pragmatic approach to bolster its ecosystem, but questions remain about balancing innovation with privacy and reducing dependency on external tech.

  • OpenAI Enhances ChatGPT Free Tier with Projects, File Uploads, Customization Tools, and More

    On September 3, 2025, OpenAI announced a significant expansion of features for ChatGPT’s free tier, making advanced tools previously exclusive to paid plans accessible to all users. The update includes access to Projects, larger file upload limits, new customization options, and project-specific memory, aligning with OpenAI’s mission to democratize AI. These enhancements, detailed in a post by OpenAI on X, aim to improve organization, productivity, and personalization for students, researchers, and casual users alike.

    Projects for All: The Projects feature, initially launched for paid subscribers, is now available to free-tier users. Projects act as smart workspaces, allowing users to group related chats, upload files, and set custom instructions to maintain context for long-term tasks like research or writing. Free users can create unlimited projects, with a limit of five file uploads per project, compared to 25 for Plus and 40 for Pro/Business/Enterprise users. This feature ensures ChatGPT stays on-topic, referencing only project-specific chats and files, making it ideal for tasks like creating an “AP Biology study guide” with attached PDFs.

    Larger File Uploads: Free-tier users can now upload up to five files per project, a step up from previous restrictions, enabling analysis of documents, spreadsheets, or images. While paid tiers support more uploads (25 for Plus, 40 for Pro), this change allows free users to leverage GPT-4o’s multimodal capabilities for tasks like summarizing PDFs or analyzing charts, though with stricter rate limits.

    Customization Tools: New customization options let users personalize projects with colors and icons, enhancing organization and navigation. This feature, available across all tiers, helps users visually distinguish projects, streamlining workflows for recurring tasks like weekly research or content drafting.

    Project-Specific Memory: A standout addition is project-specific memory, which allows ChatGPT to reference previous chats and files within a project for contextually relevant responses. Unlike global memory, which personalizes responses based on user preferences, project-specific memory is isolated, ensuring external conversations don’t influence project interactions. This is particularly useful for sensitive or focused work, though it requires the Personal Memory setting to be enabled. Currently, this feature is limited to the ChatGPT website and Windows app, with mobile support planned soon.