• Google Gemini 2.5 Deep Think rollout

    Google has rolled out Gemini 2.5 Deep Think, an advanced AI model designed to enhance reasoning and problem-solving by engaging in extended, parallel thinking. Gemini 2.5 Deep Think uses multiple AI agents working simultaneously to explore and evaluate various ideas before arriving at an answer, significantly improving the quality and depth of responses. The model integrates tools such as code execution and Google Search to support complex tasks like coding, advanced mathematics, and data analysis.

    The Deep Think feature, available to Google’s $250-per-month Ultra subscribers via the Gemini app, improves multi-step reasoning and creativity by allowing the AI more “thinking time” and iterative refinement, closer to human-style problem-solving. It can generate longer, more detailed responses and has demonstrated superior performance on challenging benchmarks like the International Math Olympiad and coding competitions, outperforming competitors including OpenAI and xAI models.

    Google emphasizes safety and content moderation improvements in this release and is actively seeking academic feedback for further refinement. The company may broaden access to Deep Think after initial testing phases. Overall, Gemini 2.5 Deep Think represents a significant leap in AI reasoning capacity, boosting capabilities across scientific research, programming, and problem-solving domains.

  • Microsoft, OpenAI near deal to preserve AI access past AGI

    Microsoft and OpenAI are currently in advanced negotiations to finalize a new partnership agreement that would allow Microsoft to maintain continuous access to OpenAI’s technology even after OpenAI achieves artificial general intelligence (AGI), a milestone at which AI attains human-level cognitive abilities across diverse tasks.

    Key points about the deal:

    • Current Contract Limitation: Under the existing agreement, Microsoft would lose rights to new OpenAI technology once OpenAI’s board officially determines that AGI has been reached, which poses a significant barrier for Microsoft’s AI strategy, especially as its products like Azure, Microsoft 365 Copilot, and GitHub Copilot heavily depend on OpenAI’s models.
    • Equity and Financial Terms: Microsoft is seeking to increase its equity stake in OpenAI’s restructured company, aiming for a low- to mid-30% range, while renegotiating revenue sharing and IP rights as OpenAI shifts from a nonprofit to a for-profit structure. OpenAI’s planned $40 billion funding round, with $20 billion from SoftBank, hinges partly on these governance changes.
    • Definitions and Licensing: The talks also involve clarifying what exactly constitutes the AGI milestone, how Microsoft can have ongoing licensed access to advanced AI systems beyond AGI, and embedding oversight and safety mechanisms related to the use of the technology.
    • Strategic Significance: Securing this deal is crucial for Microsoft to preserve its competitive edge in AI, particularly for its enterprise software and cloud products worth billions. It also clears a major hurdle for OpenAI’s transition into a more commercial enterprise model, enabling both to capitalize on the evolving AI landscape.
    • Potential Obstacles: Despite positive progress, challenges remain including potential regulatory scrutiny and a lawsuit by Elon Musk challenging OpenAI’s for-profit transition and governance changes.

    The negotiations have been ongoing for several months with frequent meetings and could conclude within weeks. OpenAI has not publicly commented, and Microsoft has likewise withheld comment on the specifics of the talks.

    Microsoft aims to secure a long-term deal granting it continuous access to OpenAI’s cutting-edge AI technologies through and beyond the achievement of AGI, restructuring their partnership to reflect new commercial realities and safeguard Microsoft’s AI-driven product ecosystem. This agreement will shape the future control and commercialization of transformative AI technologies.

  • OpenAI pulls ChatGPT feature after private chats go public

    OpenAI recently removed a controversial ChatGPT feature that allowed users to make their private conversations publicly discoverable and searchable via search engines like Google. This feature was an opt-in “Make this chat discoverable” checkbox included in the chat’s sharing option. Users who selected this made their conversations indexable by search engines, which then showed up publicly in search results.

    The removal happened after widespread reports emerged of private, sensitive, and even confidential conversations appearing publicly on Google Search. Despite requiring explicit user consent, many unintentionally shared private information by ticking the checkbox, often without fully understanding the risks. Examples found online included personal topics, corporate data, emotional reflections, and even confessions.

    OpenAI’s Chief Information Security Officer Dane Stuckey said this was a “short-lived experiment” intended to help people discover useful conversations, but it “introduced too many opportunities for folks to accidentally share things they didn’t intend to.” The feature was quickly disabled and OpenAI is actively working to remove previously indexed content from search engines. Conversations made public were anonymized but could still reveal identifiable information if users mentioned names or specific details.

    OpenAI pulled the ChatGPT discoverability feature to address significant privacy concerns after private chats became unexpectedly public through search indexing.

  • Google backtracks on plan to shut down all goo.gl links

    Google has backtracked on its original plan to completely shut down all goo.gl shortened URLs by August 25, 2025. Instead of deactivating all links, Google will only disable those goo.gl URLs that showed no activity in late 2024. All other actively used goo.gl links will be maintained and continue to function normally. This reversal comes after Google received significant user feedback highlighting that many goo.gl links are still embedded and actively used across numerous documents, videos, posts, and more.

    Originally, Google had stopped creating new goo.gl short links in 2019 and announced in mid-2024 that all goo.gl links would stop working completely on August 25, 2025, citing diminishing traffic with over 99% of links showing no recent activity. Beginning August 23, 2024, links with no activity started showing a warning message about their impending shutdown. Following reconsideration, Google confirmed it will preserve all goo.gl URLs that still have activity, meaning those links without the warning message will keep working beyond August 25, 2025.

    To summarize:

    • Inactive goo.gl URLs (no activity late 2024) will be deactivated as originally planned on August 25, 2025.
    • Actively used goo.gl URLs will continue to operate normally.
    • Warnings about deactivation are shown only on inactive links.
    • Users are advised to check their links by clicking on them—if no warning appears, the link will remain functional.

    This change reflects Google’s acknowledgment of the importance of these active links embedded widely across the web and is a partial reversal of their initial full shutdown plan.

  • Reddit has transformed into an SEO powerhouse, now ranking among the top search traffic sources on Google

    Reddit holds several advantages in the search landscape, notably content authenticity, user engagement, and the first-hand nature of its information. As a result, Reddit has become a major SEO powerhouse and a critical platform for AI-driven search visibility, particularly for B2B SaaS marketers.

    Key points include:

    • Reddit ranks as the #2 most-visited site via Google searches in the US, driven by Google’s algorithm favoring authentic, user-generated content (UGC) that Reddit excels at providing.

    • Google’s 2024 partnership with Reddit, reportedly worth $60 million annually, gives Google real-time access to Reddit’s dynamic content via an API to train AI models (like Google’s Vertex AI). This partnership also means Reddit content gets significantly higher visibility in Google search results (up to a 400% increase), making Reddit threads dominate top search slots.

    • AI search engines and large language models (LLMs) frequently cite Reddit as a trusted source, impacting the answers generated by assistants like ChatGPT and Google Bard.

    • For marketers, leveraging Reddit SEO by engaging authentically within relevant subreddits can boost organic Google traffic, improve brand visibility, and influence AI-generated search outcomes.

    • The guide lays out a detailed Reddit SEO and LLM optimization playbook including creating a dedicated brand Reddit account, selecting the right subreddits, doing Reddit-specific keyword research, contributing helpful and authentic content, planning consistent posting, using paid Reddit ads to amplify good content, and monitoring performance metrics.

    • Reddit acts not only as a traffic source but also as an “always-on focus group” for real customer insights and engagement.

    • The intersection of Reddit and LLM search means that shaping positive conversations on Reddit can translate into favorable AI output about brands or industries.

    • The overall message advises marketers not to ignore Reddit in 2025 SEO strategies, as it is a key driver for both traditional search and AI answer generation.

    In essence, the playbook emphasizes integrating Reddit participation with AI optimization to win visibility on both search engines and AI-powered assistants, underscoring Reddit’s transformed role in the modern search ecosystem.

  • Kombai, the first AI agent built for frontend development

    Kombai is characterized as the first AI agent purpose-built specifically for real-world frontend development tasks. It specializes in converting UX designs from sources like Figma, images, or text prompts into high-fidelity, clean frontend code such as HTML, CSS, and React components. Kombai significantly outperforms generic coding AI agents and the latest frontier models in building user interfaces from design specs. It understands your existing codebase, works inside your integrated development environment (IDE), and delivers backend-agnostic code optimized for frontend stacks and repositories.

    Key capabilities include deep-learning models tailored for frontend fidelity, specialized tools for indexing and searching frontend codebases to accurately and efficiently reuse code, and the ability to generate editable, task-optimized plans with previews before code changes. It supports projects of all sizes and complexities, from small components to entire app UIs. Importantly, Kombai does not alter database or backend logic, isolating its focus to frontend development.

    For enterprise customers, Kombai offers custom context engines to accommodate complex technology stacks. It is SOC 2 certified, ensuring data security and that user data is not used for further model training or improvements.

    Overall, Kombai fills a unique niche as the first domain-specific AI coding agent built exclusively for the frontend development domain, delivering unmatched code quality, developer velocity, and accuracy compared to generalist AI coding tools.

  • CEO Tim Cook says Apple ready to open its wallet to catch up in AI

    Apple CEO Tim Cook has recently confirmed that Apple is now “very open” to making bigger acquisitions in the AI space to accelerate its AI development roadmap. This marks a significant shift from Apple’s historically cautious approach to acquisitions. Cook emphasized that Apple is not constrained by the size of potential acquisition targets but focuses on whether a company can help speed up its AI efforts. While Apple has acquired about seven companies so far in 2025, those were relatively small deals; the company is open to much larger deals if they align with its AI acceleration goals.

    This move is in response to growing pressure from Wall Street and investors who view Apple as falling behind rivals like Microsoft, Google, and Meta in AI innovation. There are reports that Apple has had internal discussions about acquiring Perplexity AI, a conversational search startup valued around $14-18 billion, which would be Apple’s largest acquisition by a wide margin compared to its prior largest deal, the $3 billion Beats acquisition in 2014.

    In addition to considering large acquisitions, Apple plans to significantly grow its investments in AI, including reallocating resources internally and increasing capital expenditures on data centers, although it still uses a hybrid model that relies partially on third parties for infrastructure.

    In summary, Tim Cook’s latest statements reflect Apple’s readiness to “open its wallet” for major AI acquisitions and ramp up investments to catch up with competitors, signaling a strategic acceleration of its AI ambitions in 2025.

  • Ollama’s new app delivers a user-friendly way to interact with large language models on both macOS and Windows

    Ollama’s new app, released on July 30, 2025, delivers a user-friendly way to interact with large language models on both macOS and Windows. Here’s an overview of the standout features and capabilities:

    Core features:

    • Download and Chat with Models Locally: The app provides an intuitive interface to download, run, and chat with a wide range of AI models, including advanced options like Google DeepMind’s Gemma 3.

    • Chat with Files: Users can easily drag and drop text or PDF files into the chat. The app processes the file’s contents, enabling meaningful conversations or question answering about the document. For handling large documents, you can increase the context length in the app’s settings, though higher values require more system memory.

    • Multimodal Support: Thanks to Ollama’s new multimodal engine, the app lets you send images to models that can process them, such as Google’s Gemma 3. This enables use cases like image analysis and visual question answering alongside typical text-based interactions. Gemma 3 in particular boasts a context window of up to 128,000 tokens and can process both text and images in over 140 languages.

    • Documentation Writing and Code Understanding: The app lets you submit code files for analysis by the models, making it easier to generate documentation or understand complex code snippets. Developers can automate workflows such as summarizing codebases or generating documentation directly from source files.

    Additional Improvements

    • Optimized for Desktop: The latest macOS and Windows versions feature improved performance, reduced installation footprint, and a model directory that users can change to save disk space or use external storage.

    • Network Access & Automation: Ollama can be accessed over the network, allowing headless operation or connecting to the app from other devices. Through Python and CLI support, users can easily integrate Ollama-powered AI features into their own workflows or automation scripts.

    In summary:

    • Drag-and-drop files → chat with text/PDFs; increase context length for large documents
    • Multimodal support → send images to vision-capable models (e.g., Gemma 3)
    • Documentation writing → analyze code and generate documentation
    • Model downloads → choose and run a large selection of LLMs locally
    • Network/API access → expose Ollama for remote or automated workflows
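    As a minimal sketch of the network/API access described above, the snippet below targets Ollama's local REST endpoint (`/api/chat` on port 11434, Ollama's documented default); the model name `gemma3` and the use of the `num_ctx` option to raise the context length are illustrative assumptions:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_payload(model, prompt, context_length=None):
    """Build a JSON payload for Ollama's /api/chat endpoint.

    context_length maps to the num_ctx option, mirroring the app's
    "increase context length" setting for large documents.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete response instead of a token stream
    }
    if context_length is not None:
        payload["options"] = {"num_ctx": context_length}
    return payload

def chat(model, prompt, context_length=None):
    """POST the payload to a locally running Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(model, prompt, context_length)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (requires `ollama serve` running and a pulled model, e.g. `ollama pull gemma3`):
# print(chat("gemma3", "Summarize this document in one sentence.", context_length=8192))
```

    The same payload shape works from any device on the network once Ollama is exposed beyond localhost, which is what enables the headless and automation workflows mentioned above.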

  • International Olympiad in Artificial Intelligence (IOAI) 2025, in Beijing, China, from August 2nd to August 9th

    The International Olympiad in Artificial Intelligence (IOAI) is an international science competition focused on artificial intelligence, designed for high school students. Each participating country or territory can send up to two teams, with each team consisting of up to four students supported by one leader.

    The International Olympiad in Artificial Intelligence (IOAI) 2025 will be held in Beijing, China, from August 2nd to August 9th, 2025. This will be the second edition of the IOAI, following the inaugural event in 2024.

    The competition has two main rounds:

    • Scientific round: This comprises a lower-weighted at-home portion, for which participants have a month to solve problems, and an on-site portion with an 8-hour time limit. The scientific round tests theoretical and problem-solving knowledge in AI.
    • Practical round: Conducted on-site, lasting four hours, where participants solve two tasks related to AI applications like image and video generation, using existing AI tools to produce results.

    Awards are distributed so that roughly half the participants receive gold, silver, and bronze medals in a 1:2:3 ratio for the scientific round, with corresponding awards in the practical round. The top three teams receive honorary trophies.
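    The 1:2:3 split works out as a quick calculation; the sketch below is illustrative, and the actual IOAI rounding rules may differ:

```python
def medal_counts(num_participants):
    """Split medals for roughly half the participants in a 1:2:3
    gold:silver:bronze ratio (illustrative; official rounding may differ)."""
    medalists = num_participants // 2
    unit = medalists // 6               # 1 + 2 + 3 = 6 ratio units
    gold, silver = unit, 2 * unit
    bronze = medalists - gold - silver  # bronze absorbs any rounding remainder
    return gold, silver, bronze

# e.g., 240 participants -> 120 medalists -> 20 gold, 40 silver, 60 bronze
```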

    Beyond competition, IOAI fosters discussion on ethical AI issues and aims to engage the broader community, with activities such as involving local celebrities for promotion and hosting conferences where teams attend lectures and practical sessions on current AI topics.

    IOAI is a growing global event, with over 85 countries involved as of 2025. The Olympiad encourages international collaboration and talent development, supporting educational initiatives and national AI Olympiads through the Global AI Talent Empowerment (GAITE) program to promote equal participation worldwide.

    The official syllabus covers both theoretical foundations (“how it works”) and practical coding skills (“what it does and how to implement”), focusing on areas such as machine learning, natural language processing, and computer vision, ensuring students develop a balanced understanding and proficiency in AI.

  • China’s AI startup Zhipu releases GLM-4.5 and GLM-4.5 Air

    Zhipu AI (also known as Z.ai or 智谱AI) is a leading Chinese AI company specializing in large language models and other artificial intelligence technologies. Originating from Tsinghua University, Zhipu AI has attracted major investment from top Chinese tech firms and international backers. By 2024, it was regarded as one of the “AI Tiger” companies in China and is a significant player in the global AI landscape. The company is known for rapidly developing innovative LLMs, releasing open-source models, and building tools focused on agentic and reasoning capabilities.

    GLM-4.5 and GLM-4.5 Air: Overview

    Both GLM-4.5 and its compact sibling, GLM-4.5 Air, are foundation large language models designed for advanced reasoning, coding, and agentic tasks. They mark Zhipu AI’s push to unify general cognitive capabilities and serve as powerful backbones for intelligent agent applications.

    GLM-4.5

    • Size: 355 billion total parameters, 32 billion active parameters at runtime.

    • Core Features:

      • Hybrid Reasoning: Supports a “thinking mode” for tool use and multi-step reasoning (e.g., solving math, code, and logical problems) and a “non-thinking mode” for instant responses.
      • Agent Readiness: Designed for agent-centric workflows, integrating tool-calling natively for seamless automation and coding.
      • Performance:
        • Ranks in top three across many industry benchmarks, comparable to leading models such as Claude 4 Opus and Gemini 2.5 Pro.
        • Particularly excels in mathematics, coding, data analysis, and scientific reasoning—achieving near or at state-of-the-art results in tests like MMLU Pro and AIME24.
        • Demonstrates a high tool-calling success rate (90.6%) and strong coding benchmark performance.
    • Context Window: 128,000 tokens.
    • Open source: Weights and implementation are available for research and commercial use under the MIT license.

    GLM-4.5 Air

    • Size: 106 billion total parameters, 12 billion active parameters during inference.
    • Design: Lightweight, mixture-of-experts architecture for optimal efficiency and deployment flexibility, including running locally on consumer-grade hardware.
    • Same 128K context window as GLM-4.5.
    • Hybrid Reasoning & Agentic Capabilities:

      • Maintains strong reasoning and tool-use abilities, a hallmark of the GLM-4.5 family.
      • Offers a balance of performance and resource consumption, making it well suited to cost-sensitive and high-throughput applications.
      • On benchmarks, it scores competitively with other industry-leading models while using far fewer compute resources.
    • Use cases: Efficient deployment for enterprise AI assistants, automation, coding support, customer service, and affordable large-scale deployments.

    Performance and Accessibility

    • Competitive Pricing: API costs are among the lowest on the market, reflecting Zhipu AI’s strategy to undercut competitors and democratize access to advanced AI.
    • Open Source Access: Both models are available for free testing and deployment through multiple platforms like Hugging Face, Zhipu AI Open Platform, and third-party APIs.
    • Community and Ecosystem: Zhipu AI encourages developer and research engagement, providing technical blogs, documentation, and standard model APIs.
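    The native tool-calling described above is typically exposed through an OpenAI-compatible chat-completions API. The sketch below shows the common request shape; the endpoint URL, model identifier, `thinking` flag, and tool schema are all assumptions for illustration, not verified details of Zhipu AI’s API:

```python
import json

# Hypothetical OpenAI-compatible endpoint for GLM models (assumption).
BASE_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

# A function-calling tool schema in the widely used OpenAI-compatible format.
tools = [{
    "type": "function",
    "function": {
        "name": "run_python",              # illustrative tool name
        "description": "Execute a Python snippet and return stdout.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]

request_body = {
    "model": "glm-4.5",                    # assumed model identifier
    "messages": [{"role": "user", "content": "Plot a sine wave."}],
    "tools": tools,
    "thinking": {"type": "enabled"},       # hypothetical switch for "thinking mode"
}

# The JSON body would be POSTed to BASE_URL with an
# "Authorization: Bearer <api-key>" header; the response either answers
# directly (non-thinking mode) or returns a tool call to execute.
serialized = json.dumps(request_body)
```

    A response containing a `tool_calls` entry would then be executed by the agent harness and its result fed back as a `tool` message, which is the loop that the tool-calling success-rate benchmark above measures.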

    In Summary

    • Zhipu AI is a dominant force in China’s rapidly growing AI industry, focusing on high-performance, open-source language models.
    • GLM-4.5 is a very large LLM targeting top-tier reasoning, agentic, and coding abilities.
    • GLM-4.5 Air offers similar power but much higher efficiency for wider, cost-effective deployment.

    These models are part of a new wave of AI technologies enabling more accessible, adaptable, and powerful agentic applications in both research and enterprise settings.