Author: admin

  • Google Pixel 10: Reserving 3.5GB RAM for AI Features (permanently allocated to the AICore service and Tensor Processing Unit (TPU))

    Google’s Pixel 10 series, launched at the 2025 Made by Google event, introduces a bold shift in smartphone design by reserving approximately 3.5GB of its 12GB RAM exclusively for AI tasks. This decision, driven by the new Tensor G5 chip and Gemini Nano model, prioritizes on-device AI performance but has sparked debate about its impact on long-term usability.

    The Pixel 10, priced at $799, comes with 12GB of RAM, but only about 8.5GB is available for apps and games. The remaining 3.5GB is permanently allocated to the AICore service and Tensor Processing Unit (TPU), ensuring AI features like Magic Cue, Voice Translate, and Pixel Journal launch instantly. Magic Cue, for instance, proactively pulls data from apps like Gmail and Calendar to suggest actions, such as sharing flight details during a call. Voice Translate offers real-time call translation in languages like Spanish, Hindi, and Japanese, mimicking the user’s voice for seamless communication. These features rely on the Gemini Nano model, which demands significant memory to stay resident in RAM for quick access.

    This approach marks a departure from last year’s Pixel 9, where the base model left all 12GB of RAM available for general use, loading AI models only when needed. The Pixel 9 Pro, with 16GB of RAM, reserved 2.6GB for AI, a strategy now extended to the base Pixel 10. Google’s decision reflects its focus on making AI a core part of the Pixel experience, leveraging the Tensor G5’s 60% faster TPU and 34% improved CPU performance. The result is snappy, responsive AI tools that enhance daily tasks, from photo editing to contextual suggestions.

    However, reserving more than a quarter of the Pixel 10’s RAM raises concerns about future-proofing. Google promises seven years of OS and security updates, meaning the Pixel 10 must remain capable through 2032. As apps and Android versions grow more resource-intensive, 8.5GB of usable RAM may feel limiting for heavy multitaskers or gamers. In contrast, the Pixel 10 Pro and Pro XL, with 16GB of RAM, retain 12.5GB for general use after the same 3.5GB AI allocation, offering more flexibility.
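
    The memory split described above is simple arithmetic, and a short sketch makes the proportions easier to compare. The snippet below is only an illustration: the function names are hypothetical, and the inputs are the figures quoted in this article (12GB and 16GB of total RAM, 3.5GB reserved for AI), not values read from any device.

```python
# Illustrative arithmetic only; figures are the ones quoted in the article.
AI_RESERVED_GB = 3.5  # RAM reported as reserved for AICore and the TPU


def usable_ram_gb(total_gb: float, reserved_gb: float = AI_RESERVED_GB) -> float:
    """RAM left for apps and games after the AI reservation."""
    return total_gb - reserved_gb


def reserved_share_pct(total_gb: float, reserved_gb: float = AI_RESERVED_GB) -> float:
    """Reserved RAM as a percentage of total RAM."""
    return reserved_gb / total_gb * 100


for model, total in [("Pixel 10", 12), ("Pixel 10 Pro / Pro XL", 16)]:
    print(
        f"{model}: {usable_ram_gb(total):.1f}GB usable, "
        f"{reserved_share_pct(total):.1f}% reserved for AI"
    )
```

    Under these figures, the reservation takes roughly 29% of the base model's memory but about 22% of the Pro models', which is why the same 3.5GB allocation weighs more heavily on the 12GB phone.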

    Critics argue Google’s marketing could be clearer, as the “12GB RAM” spec implies full availability, not a partitioned 8.5GB. A transparent framing, like “8.5GB for apps plus 3.5GB for AI,” might better set expectations. For casual users, 8.5GB is sufficient for now, but power users who rarely use AI may see the reserved RAM as wasted potential.

    Google’s gamble prioritizes instant AI responsiveness over maximizing system memory. Whether this trade-off pays off depends on how users value AI features versus traditional performance over the phone’s lifespan. As AI becomes central to smartphones, the Pixel 10’s approach may set a precedent, but its long-term success hinges on balancing innovation with practicality.

  • Google NotebookLM’s Video Overviews are now available in 80 languages

    Google has recently announced a significant update to its AI-powered note-taking platform, NotebookLM, expanding its Video Overviews feature to support 80 languages worldwide. This development, revealed on August 25, 2025, marks a major step toward making educational and research tools more accessible to a global audience. Alongside this, Google has enhanced its Audio Overviews, particularly for non-English languages, to provide more comprehensive and detailed summaries. These updates are designed to cater to students, researchers, and professionals who rely on NotebookLM to distill complex information into digestible formats.

    Introduced last month, NotebookLM’s Video Overviews feature transforms user-uploaded notes, PDFs, and images into concise, AI-narrated video presentations. These presentations incorporate visuals such as images, diagrams, quotes, and data to simplify complex topics. Initially available only in English, the feature now supports a diverse range of languages, including French, German, Spanish, Japanese, Arabic, Chinese, Hindi, and several Indian regional languages like Tamil, Telugu, and Kannada. This expansion allows non-English speakers to generate visual summaries in their native languages, broadening the platform’s reach and utility.

    The update also enhances NotebookLM’s Audio Overviews, which previously offered only brief summaries in non-English languages. Now, users can access full-length audio discussions in over 80 languages, matching the depth and nuance of the English versions. For those who prefer quick insights, shorter summary options remain available. This flexibility ensures that users can choose between in-depth explorations or concise highlights, depending on their needs. The updates are rolling out globally and will be fully available to all users within a week.

    NotebookLM’s multilingual expansion is particularly significant for global learners. Students preparing for exams can now review lecture notes in their preferred language, while researchers can generate summaries of dense academic papers in languages like Portuguese or Russian. Professionals in multilingual teams can share AI-generated video or audio summaries, streamlining collaboration across linguistic boundaries. By grounding summaries in user-uploaded content, NotebookLM ensures accuracy and relevance, distinguishing it from generic AI tools that rely on web-based data.

    To use the Video Overviews feature, users can upload their sources to a notebook, select the Video Overview option in the Studio panel, and customize the output language or focus via prompts. The process is intuitive, making it accessible to users with varying technical expertise. Google’s commitment to inclusivity through this update aligns with its broader mission to make information universally accessible.

    This expansion positions NotebookLM as a powerful tool for global education and research. By supporting 80 languages, Google is breaking down language barriers, enabling users worldwide to engage with complex material in a more engaging and understandable format. As the platform continues to evolve, it promises to further empower learners and professionals in diverse linguistic and cultural contexts.

  • Google’s mystery ‘nano banana’ AI model revealed in Gemini

    Google’s mystery “nano banana” AI model has been revealed as Gemini 2.5 Flash Image, a state-of-the-art image generation and editing model developed by Google DeepMind and integrated into the Gemini app. This model has quickly gained attention for its exceptional ability to maintain subject consistency across multiple edits, ensuring that the likeness of people, pets, or products remains intact even after numerous transformations. It allows users to make precise and natural edits to images using simple natural language prompts, such as changing backgrounds, adjusting poses, or merging multiple images seamlessly. The nano banana model also leverages Gemini’s world knowledge to better understand and generate images suitable for creative and practical applications. It is now available for free use in the Gemini app and accessible to developers via the Gemini API, Google AI Studio, and Vertex AI.

    Key features of Google’s nano banana (Gemini 2.5 Flash Image):

    • Maintains subject consistency to avoid “drift” across multiple edits.
    • Allows blending and merging of multiple input images with natural language instructions.
    • Enables precise transformations like changing styles, outfits, or even adding color to black-and-white photos.
    • Uses Gemini’s world knowledge to enhance image generation accuracy.
    • Available for both consumer use and developer integration.

    This model marks a significant improvement in AI image editing by solving one of the biggest challenges—keeping the core attributes and identity of the subjects intact while enabling flexible creative control and realism in edits.

  • Microsoft Released VibeVoice-1.5B: An Open-Source Text-to-Speech Model that can Synthesize up to 90 Minutes of Speech with Four Distinct Speakers

    Microsoft has released VibeVoice-1.5B, an open-source text-to-speech (TTS) model capable of synthesizing up to 90 minutes of continuous speech involving four distinct speakers. This cutting-edge model leverages a novel architecture combining a Large Language Model backbone with acoustic and semantic tokenizers to enable extended multi-speaker conversations with natural turn-taking and consistent vocal identities.

    VibeVoice-1.5B is available under the MIT license, making it accessible to researchers and developers. It requires about 7 GB of GPU memory, allowing users with consumer-grade GPUs like the RTX 3060 to run multi-speaker synthesis. Supported languages are English and Chinese, and the model can also perform cross-lingual synthesis and singing voice generation.

    Microsoft plans to expand this line with a larger, 7-billion-parameter streaming-optimized model in the future, while also embedding safety measures like audio watermarks and restrictions against misuse such as voice impersonation or disinformation. This release marks a significant democratization of advanced TTS technology for extended, natural, multi-speaker audio generation.

  • Anthropic has released a new AI agent called “Claude for Chrome” that works in a side panel within the browser

    Anthropic has released a new AI agent called “Claude for Chrome,” which integrates directly into the Google Chrome browser as an extension. This agent is powered by Anthropic’s Claude AI models and is currently in a research preview phase. It is available to a limited group of 1,000 subscribers on Anthropic’s Max plan, with a waitlist for others interested.

    Claude for Chrome works in a side panel within the browser, allowing it to maintain context on what users are viewing and interact with web pages by clicking buttons and filling out forms upon user permission. This integration aims to make the AI assistant more useful by helping with tasks like managing calendars, scheduling meetings, drafting emails, and more directly within the browser environment.

    Anthropic emphasizes safety and security due to the potential risks posed by AI agents operating in browsers, such as prompt injection attacks where malicious instructions could be hidden on websites. The company has implemented several safeguards to reduce such risks, including site-level permissions for users to control Claude’s access and action confirmations for sensitive tasks. Despite improvements, Anthropic continues testing and refining its defenses before wider release.

    Overall, Claude for Chrome represents Anthropic’s effort to bring AI assistance directly into the user’s browsing experience while prioritizing safety and control.

  • Apple Event Confirmed for September 9 — iPhone 17, Apple Watch 11, AirPods Pro 3 and more

    Apple has officially confirmed its next big event for Tuesday, September 9, 2025. This eagerly awaited event will unveil the new iPhone 17 series, Apple Watch Series 11, AirPods Pro 3, and several other product updates. The event sets the stage for Apple’s latest innovations in hardware and software, including the highly anticipated iPhone 17 Air, which features a notably thin design, and the Apple Watch Ultra 3 with advanced health monitoring features. Additionally, Apple’s iOS 26 with a new “Liquid Glass” design will be showcased. While excitement is high, some analysts expect possible stock volatility following the event due to tempered expectations for revolutionary upgrades in the iPhone lineup. The event will be held at the Steve Jobs Theater in Cupertino and livestreamed globally.

  • Apple considers Google Gemini to power next-gen Siri

    Apple is in early discussions with Google to potentially use Google’s Gemini AI as the core technology to power a redesigned, next-generation Siri voice assistant. The company approached Google with the idea of creating a customized AI model that would run on Apple’s servers and serve as the foundation for the revamped Siri expected to launch in 2026.

    Currently, Apple is exploring multiple options for Siri’s AI “brain.” It is developing two versions simultaneously: one using its own internal AI models (codenamed Linwood) and another based on external technology (codenamed Glenwood), which could be Google’s Gemini or others like Anthropic’s Claude and OpenAI’s ChatGPT. Apple has not yet finalized any agreements or decided whether to fully adopt an external partner or rely on its own AI models. The talks with Google are still preliminary, and Google is training a customized Gemini model for potential use on Apple infrastructure.

    This move aims to catch up with competitors like Google and Samsung, who have integrated generative AI capabilities into their assistants. Apple’s revamped Siri has faced delays and challenges, but the new architecture promises more advanced and personalized AI features.

    In summary, Apple is considering licensing and integrating Google’s Gemini AI to power next-gen Siri but is still weighing its options among several AI providers and has not yet made a final decision.

  • Meta partners with Midjourney to license AI image and video technology

    Meta has entered into a partnership with Midjourney, an AI startup known for its advanced image and video generation technology, to license Midjourney’s “aesthetic technology” for integration into Meta’s future AI models and products. This collaboration involves a technical partnership between the research teams of both companies and aims to help Meta develop AI-powered creative tools that can compete with industry rivals like OpenAI and Google.

    Meta’s Chief AI Officer, Alexandr Wang, described the partnership as part of a comprehensive strategy to deliver the best AI products by combining top talent, ambitious computing resources, and collaborations with leading industry players. Midjourney’s technology, which includes highly advanced models for generating images from text prompts and recently released video models, will enhance Meta’s offerings in AI-generated imagery and video.

    Midjourney remains an independent, community-supported lab with no external investors. The licensing deal signifies a major step in Meta’s AI ambitions, complementing its existing in-house tools such as the AI image generator “Imagine” and AI video editor “Movie Gen.” Meta’s CEO Mark Zuckerberg has heavily invested in AI, acquiring talent and companies to boost its capabilities.

    The partnership could lead to new AI creative tools integrated across Meta’s platforms, potentially improving functionalities in apps like Facebook, Instagram, and WhatsApp by leveraging Midjourney’s unique aesthetic AI technology. The terms and timeline of the partnership’s full rollout have not been disclosed yet, but the collaboration marks a significant move for Meta in the competitive AI space.

  • Elon Musk teases Grok 5, says it could be the first real step toward true AGI (Artificial General Intelligence)

    Elon Musk has teased that his company’s upcoming AI model, Grok 5, could be “a real shot at being a true AGI” (Artificial General Intelligence) and is scheduled to launch before the end of 2025. Musk describes Grok 5 as potentially “crushingly good,” hinting it may surpass previous models and even outperform OpenAI’s GPT-5 according to recent comparisons.

    AGI refers to a type of AI that matches or surpasses human cognitive abilities across virtually all tasks, a milestone that AI companies such as OpenAI and Google are still striving to achieve. Musk’s bold claims about Grok 5 signify a strong belief that it could represent the first genuine step toward AGI, which would be a pivotal moment in AI development.

    Grok 4, the predecessor, has already received praise for faster response times, advanced multimodal support, and strong performance in mathematics and physics. Musk suggests Grok 5 will take this further, enhancing xAI’s position in the competitive AI landscape. So far, detailed capabilities of Grok 5 remain undisclosed, but anticipation is high that it will significantly raise the bar in AI intelligence and functionality.

    In summary, Musk claims Grok 5 AI could be true AGI by year-end 2025, marking a possible breakthrough in AI technology and competition with other leading AI models.

  • The AI Chatbots Big Bang: The Full Study at a Glance, study by OneLittleWeb

    The 2025 study by OneLittleWeb, titled “The AI ‘Big Bang’ Study 2025,” provides a comprehensive analysis of the top 10 AI chatbots based on web traffic, media citations, and user engagement from August 2024 to July 2025. The study, utilizing data from sources like Semrush, aitools.xyz, MuckRack, and app stores, ranks chatbots across eight key performance indicators, offering insights into their market presence, growth, and user experience. The AI tools market, encompassing over 10,500 tools, recorded nearly 100 billion web visits, with the top 10 chatbots capturing 55.88 billion visits, or 58.8% of the total, highlighting significant market consolidation.

    ChatGPT, developed by OpenAI, dominates with 46.59 billion visits and a 48.36% market share, maintaining its position as the most popular chatbot due to its robust performance in language tasks, accessibility, and free availability. Grok, created by xAI, emerges as a surprising second-place contender with 686.91 million visits and a 1.17% market share, driven by its integration into the X platform and rapid user base growth. Other notable chatbots include DeepSeek, Gemini, Perplexity, Claude, Microsoft Copilot, Blackbox AI, Monica, and Meta AI, with DeepSeek and Grok showing the fastest growth rates, the former with a 113,007% year-over-year increase.

    The study reveals a 123.35% year-over-year traffic growth for these top chatbots, adding 30.9 billion visits compared to the previous year, underscoring the rising popularity of conversational AI. Media coverage significantly influences traffic, with peaks in January and February 2025 (817.6K and 1.1M citations) correlating with traffic surges to 4.3 billion and 4.4 billion visits, respectively, and a high of 5.8 billion in March. However, some chatbots like DeepSeek experienced a 39.5% traffic drop over five months, reflecting volatility tied to media attention.
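
    The headline numbers quoted from the study can be cross-checked with a few lines of arithmetic. The sketch below is not part of the study's methodology; the variable names are hypothetical and the inputs are the rounded figures cited in this article, so small differences from the reported percentages are to be expected.

```python
# Cross-check of figures quoted in the article (rounded inputs, so the
# results are approximations, not the study's own calculations).
top10_visits_bn = 55.88   # visits to the top 10 chatbots, in billions
top10_share = 0.588       # their reported share of all AI-tool traffic
added_visits_bn = 30.9    # reported year-over-year increase, in billions

# Total AI-tool traffic implied by the top-10 share.
implied_total_bn = top10_visits_bn / top10_share

# Previous-year traffic and growth rate implied by the reported increase.
previous_year_bn = top10_visits_bn - added_visits_bn
implied_growth_pct = added_visits_bn / previous_year_bn * 100

print(f"Implied total AI-tool traffic: {implied_total_bn:.2f} billion visits")
print(f"Implied previous-year top-10 traffic: {previous_year_bn:.2f} billion visits")
print(f"Implied year-over-year growth: {implied_growth_pct:.1f}%")
```

    With these inputs, the implied total is about 95 billion visits, consistent with the study's "nearly 100 billion," and the implied growth of about 123.7% is close to the reported 123.35%; the gap is plausibly down to rounding in the published figures.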

    The methodology emphasizes transparency, using weighted scores across visibility, growth, and user experience metrics to ensure a balanced ranking. This approach helps identify trusted and high-performing chatbots, offering actionable insights for users and businesses. For instance, chatbots like Poe attract users by providing access to multiple AI models, while Meta AI benefits from integration into platforms like Instagram and WhatsApp, so web-traffic figures likely underrepresent its true usage.

    For businesses, the study suggests a dual strategy: optimizing for traditional search engines, which still dominate with 1.86 trillion visits, while adapting content for AI-driven platforms through structured data and high-quality, concise answers. The findings dispel the notion that chatbots are replacing search engines, showing they complement them by serving distinct user needs, such as creative tasks versus navigational searches. As AI chatbots continue to evolve, their role in reshaping online discovery is undeniable, but search engines remain dominant for now.
