Category: Technology

  • NVIDIA Nemotron – Foundation Models for Agentic AI

    NVIDIA Nemotron is a family of multimodal foundation models designed specifically for building enterprise-grade agentic AI with advanced reasoning capabilities. These models enable AI agents that can perform complex tasks such as graduate-level scientific reasoning, advanced math, coding, instruction following, tool calling, and visual reasoning.

    Key features of NVIDIA Nemotron:

    • Agentic Reasoning: Nemotron models excel in reasoning tasks, enabling AI systems to understand, plan, and act autonomously with a level of cognitive reasoning close to human logic. They combine structured thinking with contextual awareness for dynamic and adaptable AI behaviors.

    • Multimodal Capabilities: These models handle both text and vision tasks, such as enterprise optical character recognition (OCR) and complex instruction or tool use.

    • Model Variants Optimized for Different Environments:

      • Nano: Optimized for cost-efficiency and edge deployment, suitable for RTX AI PCs and workstations.

      • Super: Balanced for high accuracy and compute efficiency on a single GPU.

      • Ultra: Designed for maximum accuracy and throughput in multi-GPU data center environments.

    • Open and Customizable: Built on popular open models (notably Llama), Nemotron models are post-trained on high-quality datasets to strengthen their reasoning. They are available under an open license so enterprises can customize them and keep control of their data, with models and training data openly published on platforms like Hugging Face.

    • Compute Efficiency: Using techniques such as pruning of larger models and NVIDIA’s TensorRT-LLM optimization, Nemotron achieves top compute efficiency, delivering high throughput and low latency across devices from edge to data center.

    • Integration and Deployment: Nemotron models are available as optimized NVIDIA NIM microservices, facilitating peak inference performance, flexible deployment, security, privacy, and portability. They are integrated with tools like NVIDIA NeMo for customizing agentic AI, NVIDIA Blueprints for accelerating development, and NVIDIA AI Enterprise for enterprise-grade production readiness.

    • Industry Adoption: NVIDIA collaborates with leading AI agent platform providers like SAP and ServiceNow to adopt Nemotron models for practical enterprise deployment.

    • Foundation for LLM-based AI Agents: An example in the Nemotron family is the “llama-3.1-nemotron-70b-instruct” large language model, which enhances LLM helpfulness and agentic task performance through specialization.

    NVIDIA Nemotron models provide a commercially viable, highly optimized, and open foundation modeling solution tailored for creating advanced agentic AI systems capable of reasoning, acting, and interacting with complex environments with human-like intelligence and scalability across hardware platforms.

  • A former OpenAI engineer describes what it’s really like to work there

    A former OpenAI engineer has publicly shared a detailed blog post reflecting on a tumultuous yet formative year at the company, describing it as one of both chaos and significant growth. The post sheds light on the intense challenges and rapid developments experienced internally as OpenAI scaled its AI research, deployment, and safety measures.

    Highlights from the Engineer’s Reflections:

    • Intense Work Environment: The engineer described a fast-paced, high-pressure atmosphere with frequent pivots in priorities and strategy to keep up with AI advancements and competitive pressures.

    • Rapid Technical Progress: Despite operational challenges, the team witnessed groundbreaking progress in large language models, multimodal AI, and deployment at scale.

    • Internal and External Challenges: The period was marked by balancing ambitious goals with safety and ethical concerns, managing resource constraints, and addressing coordination issues as the organization grew quickly.

    • Focus on AI Safety: Substantial attention was dedicated to safety research and iterative testing to mitigate AI risks before releasing models broadly.

    • Personal Growth and Team Dynamics: The engineer reflected on strong camaraderie mixed with the stress of meeting aggressive deadlines and expectations.

    This insider account aligns with the public narrative of AI companies racing to push the boundaries of capability while wrestling with the societal implications and operational complexities of deploying powerful AI systems. It also highlights the tensions between open collaboration and competitive secrecy that shape the AI research ecosystem.

    The former OpenAI engineer’s blog offers a candid, behind-the-scenes view of a landmark year characterized by both significant innovation and organizational growing pains, demonstrating the human side of building cutting-edge AI technology under intense scrutiny and expectations.

  • GPUHammer: New RowHammer Attack Variant Degrades AI Models on NVIDIA GPUs

    The GPUHammer attack is a newly demonstrated hardware-level exploit targeting NVIDIA GPUs, specifically those using GDDR6 memory like the NVIDIA A6000. It adapts the well-known RowHammer technique, which traditionally affected CPU DRAM, and applies it successfully to GPU memory for the first time.

    What is GPUHammer?

    • GPUHammer exploits physical vulnerabilities in GPU DRAM by repeatedly accessing (“hammering”) specific memory rows, causing electrical interference that flips bits in adjacent rows.

    • These bit flips can silently corrupt data in GPU memory without direct access, potentially altering critical information used by AI models or other computations running on the GPU.

    • The attack can degrade the accuracy of AI models drastically. For instance, an ImageNet-trained AI model’s accuracy was shown to drop from around 80% to under 1% after the attack corrupted its parameters.

    Technical Challenges Overcome

    • GPU memory architectures differ significantly from CPU DRAM, with higher refresh rates and higher access latency, which makes traditional RowHammer techniques ineffective.

    • The researchers reverse-engineered memory mappings and developed GPU-specific hammering techniques to bypass existing memory protections such as Target Row Refresh (TRR).

    Impact on AI and Data Integrity

    • A single bit flip caused by GPUHammer can poison training data or internal AI model weights, leading to catastrophic failures in model predictions.

    • The attack poses a specific risk in shared computing environments, such as cloud platforms or virtualized desktops, where multiple tenants share GPU resources, potentially enabling one user to corrupt another’s computations or data.

    • Unlike CPUs, GPUs often lack certain hardware security features like instruction-level access control or parity checking, increasing their vulnerability.
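
    To see why a single flipped bit can be so damaging to model weights, the sketch below flips one bit in the IEEE 754 encoding of a floating-point number. It is only an illustration of the arithmetic, not the researchers' attack code: the helper `flip_bit`, the example weight of 0.5, and the choice of 32-bit floats are assumptions made here for clarity.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its float32 encoding flipped.

    Bit 0 is the least significant mantissa bit, bits 23-30 are the
    exponent, and bit 31 is the sign.
    """
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5  # hypothetical model weight

# Flipping the most significant exponent bit turns 0.5 into 2**127 (about 1.7e38).
exponent_flip = flip_bit(weight, 30)

# Flipping the least significant mantissa bit barely changes the value.
mantissa_flip = flip_bit(weight, 0)

print(f"original:      {weight}")
print(f"exponent flip: {exponent_flip:.4e}")
print(f"mantissa flip: {mantissa_flip:.10f}")
```

    Which bit is hit matters far more than how many bits are hit: a flip in a high exponent bit inflates the weight by many orders of magnitude, while a flip in a low mantissa bit is practically invisible.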

    NVIDIA’s Response and Mitigations

    • NVIDIA has issued an advisory urging customers to enable system-level Error Correction Codes (ECC), which can detect and correct some memory errors caused by bit flips, reducing the risk of exploitation.

    • Users of affected GPUs, such as the A6000, may see a performance penalty (up to ~10%) when enabling ECC or other mitigations.

    • Newer NVIDIA GPUs like the H100 and RTX 5090 currently do not appear susceptible to this variant of the attack.

    The GPUHammer attack reveals a serious new hardware security threat to AI infrastructure and GPU-driven computing, highlighting the need for stronger hardware protections as GPUs become central to critical AI workloads.

  • Scientists create biological ‘artificial intelligence’ system, PROTEUS

    Australian scientists, primarily at the University of Sydney’s Charles Perkins Centre, have developed a groundbreaking biological artificial intelligence system named PROTEUS (PROTein Evolution Using Selection) that can design and evolve molecules with new or improved functions directly inside mammalian cells.

    How PROTEUS Works

    • Biological AI via Directed Evolution: PROTEUS harnesses the technique of directed evolution, which mimics natural evolution by iteratively selecting molecules with desired traits. Unlike traditional directed evolution that operates mainly in bacterial cells and takes years, PROTEUS accelerates this process drastically—from years to just weeks—directly within mammalian cells.

    • Problem-Solving Mode: Similar to how users input prompts to AI platforms, PROTEUS can be tasked with complex biological problems with uncertain solutions, for example, how to efficiently switch off a human disease gene in the body. It then explores millions of molecular sequences to find molecules highly adapted to solve that problem.

    • Mammalian Cell Environment: The ability to evolve molecules inside mammalian cells is unique and significant because it allows developing molecules that function well in the human body’s physiological context, improving therapeutic relevance.
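
    The mutate-and-select loop behind directed evolution can be sketched in a few lines. This is a toy simulation, not a model of how PROTEUS works inside cells: the `fitness` function (similarity to a known target sequence) is a hypothetical stand-in for the biological selection pressure, and the target string, mutation rate, and population size are arbitrary values chosen for illustration.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def fitness(seq: str, target: str) -> int:
    """Toy fitness: number of positions that match the target sequence."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq: str, rng: random.Random, rate: float = 0.1) -> str:
    """Replace each residue with a random one with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else residue
        for residue in seq
    )


def evolve(target: str, population_size: int = 50, generations: int = 40, seed: int = 0):
    """Run rounds of mutation and selection, keeping the fittest variant each round."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    history = [fitness(parent, target)]
    for _ in range(generations):
        variants = [mutate(parent, rng) for _ in range(population_size)]
        # Selection step: the current parent competes too, so fitness never decreases.
        parent = max(variants + [parent], key=lambda s: fitness(s, target))
        history.append(fitness(parent, target))
    return parent, history


best, history = evolve("MKTAYIAKQR")  # hypothetical 10-residue target
print(best, history[0], "->", history[-1])
```

    In a real experiment there is no known target; the cell itself provides the selection signal, and only variants that solve the posed problem survive to the next round.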

    Applications and Implications

    • Drug Development and Gene Therapies: PROTEUS can create highly specific research tools and gene therapies, including improving gene editing technologies like CRISPR by enhancing their effectiveness and precision.

    • Molecule Enhancement: Researchers have already used PROTEUS to develop better-regulated proteins and nanobodies (small antibody fragments) that detect DNA damage, which is critical in cancer.

    • Broad Potential: The technology is not limited to these examples and holds promise for designing virtually any protein or molecule with enhanced or new functions to solve biotech and medical challenges.

    This fusion of biological systems and AI represents a shift in bioengineering, enabling rapid, in vivo molecular evolution that was previously impossible. PROTEUS dramatically shortens development timelines for novel medicines and biological tools, potentially revolutionizing precision medicine and biotechnology.

    PROTEUS is a revolutionary AI-driven biological system that uses directed evolution inside mammalian cells to quickly discover and engineer molecules optimized for medical and biotech solutions. By combining AI-style problem-solving with accelerated biological evolution, this technology opens new frontiers in drug design, gene therapy, and molecular biology tailored to function effectively within the human body.

  • Broadcom launches new Tomahawk Ultra networking chip in AI battle against Nvidia

    Broadcom has launched the Tomahawk Ultra, a groundbreaking Ethernet switch chip designed specifically to accelerate high-performance computing (HPC) and artificial intelligence (AI) workloads. It aims to challenge Nvidia’s dominance in AI networking by providing an open, ultra-low latency, and high-throughput solution for tightly coupled AI clusters and HPC environments.

    Key features of Tomahawk Ultra:

    • Latency and Throughput: The chip delivers an ultra-low latency of 250 nanoseconds and a massive throughput of 51.2 terabits per second (Tbps) at 64-byte packet sizes, enabling rapid data transfer between numerous chips in close proximity, such as inside a server rack.
    • Lossless Ethernet Fabric: Implements advanced technologies like Link Layer Retry (LLR) and Credit-Based Flow Control (CBFC) to eliminate packet loss, creating a lossless network fabric, which is crucial for AI training workloads.
    • In-Network Compute: Supports in-network collective operations (e.g., AllReduce, Broadcast), offloading compute tasks from XPUs (accelerators) onto the switch itself, speeding up AI job completion and reducing network congestion.
    • Optimized Ethernet Headers: Reduces Ethernet header overhead from 46 bytes to as low as 10 bytes (6 bytes according to some sources) while maintaining full Ethernet compatibility, which significantly improves efficiency for small packets.
    • Topology Awareness: Supports complex HPC network topologies, including Dragonfly, Mesh, and Torus, via topology-aware routing.
    • Compatibility: The chip is pin-compatible with previous-generation Tomahawk switches, enabling straightforward upgrades for data centers already using Broadcom networking hardware.
    • Manufacturing: Produced using Taiwan Semiconductor Manufacturing Company’s 5-nanometer process technology.
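
    A rough back-of-the-envelope calculation shows why the header reduction and small-packet throughput figures above matter. The sketch uses only the numbers quoted in the list (51.2 Tbps, 64-byte packets, 46-byte versus 10-byte headers); treating 64 bytes as the payload size and ignoring preamble and inter-frame gaps are simplifying assumptions, so the results are upper bounds rather than vendor figures.

```python
def payload_efficiency(payload_bytes: int, header_bytes: int) -> float:
    """Fraction of transmitted bytes that carry payload."""
    return payload_bytes / (payload_bytes + header_bytes)


def max_packets_per_second(throughput_bps: float, packet_bytes: int) -> float:
    """Upper bound on packet rate, ignoring preamble and inter-frame gaps."""
    return throughput_bps / (packet_bytes * 8)


PAYLOAD = 64          # bytes, assumed small-message payload
THROUGHPUT = 51.2e12  # bits per second, as quoted for the chip

standard = payload_efficiency(PAYLOAD, 46)  # standard header overhead
optimized = payload_efficiency(PAYLOAD, 10)  # optimized header overhead
packet_rate = max_packets_per_second(THROUGHPUT, 64)

print(f"efficiency with 46-byte header: {standard:.1%}")
print(f"efficiency with 10-byte header: {optimized:.1%}")
print(f"upper bound on 64-byte packets per second: {packet_rate:.3e}")
```

    For small messages, which are common in tightly coupled AI and HPC traffic, the header is a large share of every packet, so trimming it recovers a meaningful fraction of link bandwidth.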

    Strategic Importance vs. Nvidia

    • Broadcom’s Tomahawk Ultra targets the scale-up AI computing market, where many processors must be linked to handle massive AI models. It competes directly with Nvidia’s NVLink Switch chip, with the key differentiator being the Tomahawk Ultra’s ability to connect four times as many chips using an enhanced Ethernet protocol rather than proprietary links.
    • The chip supports standard Ethernet infrastructure, fostering openness and potentially lower costs compared to Nvidia’s proprietary InfiniBand-based solutions, making it attractive to cloud providers and enterprise AI data centers.
    • The move reflects Broadcom’s broader push into AI infrastructure, leveraging its switching expertise to take on Nvidia’s dominance in GPU and AI interconnect technologies.

    Market Reception and Availability

    • Broadcom began shipping the Tomahawk Ultra in July 2025, with volume production and deployment expected in 2026. Leading cloud providers and networking partners like Quanta Cloud Technology and Arista are involved in sample testing and early adoption plans.
    • Market analysts see this launch as a significant escalation in competition against Nvidia in the AI data center networking segment, potentially giving customers more choice in scaling AI workloads efficiently.

    Broadcom’s Tomahawk Ultra Ethernet switch chip is a major innovation targeting the AI and HPC markets with exceptional latency, throughput, and lossless performance. It is built to rival Nvidia’s proprietary interconnects by leveraging advanced Ethernet to support next-generation AI scale-up, potentially reshaping the landscape of AI hardware networking.

  • ByteDance is reportedly working on mixed reality goggles

    ByteDance, the parent company of TikTok, is developing a new pair of lightweight mixed reality (MR) goggles that aim to compete directly with Meta and Apple’s leading devices in the spatial computing and augmented reality space. This move signals ByteDance’s ambitions to establish itself as a serious player in next-generation wearable technology.

    Key features of ByteDance’s MR goggles:

    • Developed by Pico: The MR goggles are being built by ByteDance’s virtual reality subsidiary, Pico, which previously produced the Pico 4 VR headset.

    • Lightweight Design: The upcoming device is expected to be compact and as lightweight as the Bigscreen Beyond VR headset (~0.28 pounds), making it more comfortable than bulkier headsets such as Meta Quest or Apple’s Vision Pro.

    • Tethered Compute ‘Puck’: Instead of containing all hardware in the headset, most processing is offloaded to a small “puck” connected by a wire. This puck manages computational tasks, similar to Meta’s latest prototype and reminiscent of Apple’s early patents for AR devices connected to iPhones or Macs.

    • Specialized Chips: Pico is reportedly working on custom chips specifically designed to minimize latency—reducing the delay between physical movement and what the user sees in AR, a feature critical for immersive experiences and inspired by Apple’s Vision Pro R1 chip.

    • Focus on Comfort and Portability: This approach prioritizes making MR glasses that are practical for everyday wear, addressing increasing consumer preference for more discreet, glasses-like wearables.

    Industry Context: Competing with Meta and Apple

    • Meta: With its Quest and forthcoming MR glasses (codenamed Phoenix), Meta is currently the biggest player in consumer mixed reality devices. It is shifting from traditional, bulky VR headsets toward lightweight smart glasses and wearable AI devices.

    • Apple: Apple’s Vision Pro, while powerful and feature-rich, is heavier and more expensive, targeting prosumers and developers rather than the mainstream. Apple had previously envisioned lighter AR glasses, but those efforts were paused in favor of the Vision Pro’s current form.

    • ByteDance’s strategy is to bridge the gap—offering a lighter, more affordable, and easier-to-wear device that could appeal to a larger base of consumers.

    Status and Market Impact

    • The MR goggles are currently in development, with no confirmed release date or target markets announced. Reports note that ByteDance’s previous forays into VR hardware, such as the Pico 5, saw mixed results, but the company is refocusing on lightweight, practical devices under the project codename “Swan”.

    • A significant advantage for ByteDance is its TikTok user base, which offers a massive potential audience for MR experiences integrated with social media and entertainment apps.

    • ByteDance could play a major role in shaping the market alongside Meta, Apple, and others such as Samsung, Google, and Snap, all racing to win consumer preference for smart eyewear and AR/MR glasses in the coming years.

    ByteDance is poised to intensify competition in the mixed reality market by developing lightweight, efficient MR goggles through its Pico division, directly rivaling products from Meta and Apple. With a significant focus on comfort and blending into everyday life, ByteDance’s approach could accelerate mainstream adoption of spatial computing technologies.

  • Windsurf’s leadership has moved to Google

    Windsurf’s leadership has moved to Google following the collapse of OpenAI’s planned $3 billion acquisition of the AI coding startup. Windsurf CEO Varun Mohan, co-founder Douglas Chen, and several key members of the research and development team have joined Google’s DeepMind division to work on advanced AI coding projects, particularly focusing on Google’s Gemini initiative.

    As part of the arrangement, Google is paying $2.4 billion in licensing fees for nonexclusive rights to use certain Windsurf technologies, but it has not acquired any ownership or controlling interest in Windsurf. The startup itself remains independent, with most of its approximately 250 employees staying on and Jeff Wang appointed as interim CEO to continue developing Windsurf’s enterprise AI coding solutions.

    This deal represents a strategic “reverse acquihire” where Google gains top AI coding talent and technology licenses without fully acquiring the company, allowing Windsurf to maintain its autonomy and license its technology to others. The move comes after OpenAI’s acquisition talks fell through due to disagreements, including concerns about Microsoft’s access to Windsurf’s intellectual property.

    The transition of Windsurf’s leadership to Google highlights the intense competition among AI companies to secure talent and technology in the rapidly evolving AI coding sector.

  • Intel spins out AI robotics company RealSense with $50 million raise

    Intel has officially spun out its RealSense computer vision division into an independent company, completing the transition in July 2025. Alongside the spinout, RealSense secured a $50 million Series A funding round led by investors including Intel Capital and the MediaTek Innovation Fund. This move aims to accelerate RealSense’s growth and innovation in the rapidly expanding fields of robotics, AI vision, and biometrics.

    RealSense, originally part of Intel’s Perceptual Computing division since 2013, specializes in depth-sensing cameras and AI-powered computer vision technologies that enable machines to perceive their environment in 3D. Its products are widely used in autonomous mobile robots and humanoid robots, with about 3,000 active customers globally. The company’s latest camera, the D555, features integrated AI and can transmit power and data through a single cable.

    The spinout allows RealSense to operate with greater independence and focus on expanding its product roadmap, including innovations in stereo vision, robotics automation, and biometric AI hardware and software. Nadav Orbach, a longtime Intel executive, has taken the role of CEO for the new entity. He highlighted the increasing market demand for physical AI and robotics solutions, noting that external financing was prudent to capitalize on these opportunities.

    This strategic separation follows Intel’s broader IDM 2.0 strategy to focus on core businesses while enabling RealSense to pursue growth in specialized AI and computer vision sectors. The company plans to scale manufacturing and enhance its global market presence to meet rising demand in robotics and automation industries.

  • Samsung is exploring new AI wearables such as earrings and necklaces

    Samsung is actively exploring the development of AI-powered wearable devices in new form factors such as earrings and necklaces, aiming to create smart accessories that users can wear comfortably without needing to carry traditional devices like smartphones.

    Won-joon Choi, Samsung’s chief operating officer for the mobile experience division, explained that the company envisions wearables that allow users to communicate and perform tasks more efficiently through AI, without manual interaction such as typing or swiping. These devices could include not only earrings and necklaces but also glasses, watches, and rings.

    The goal is to integrate AI capabilities into stylish, ultra-portable accessories that provide seamless, hands-free interaction with AI assistants, real-time voice commands, language translation, health monitoring, and notifications. This approach reflects Samsung’s strategy to supplement smartphones rather than replace them, offering users more natural and constant connectivity with AI.

    Currently, these AI jewelry concepts are in the research and development stage, with no official product launches announced yet. Samsung is testing prototypes and exploring possibilities as part of a broader push to expand AI use in daily life through innovative hardware.

    This initiative aligns with industry trends where companies like Meta have found success with AI-enabled smart glasses, indicating strong market interest in wearable AI devices that require less manual input than smartphones.

  • MedSigLIP, a lightweight, open-source medical image and text encoder developed by Google

    MedSigLIP is a lightweight, open-source medical image and text encoder developed by Google DeepMind and released in 2025 as part of the MedGemma AI model suite for healthcare. It has approximately 400 million parameters, making it much smaller and more efficient than larger models like MedGemma 27B, yet it is specifically trained to understand medical images in ways general-purpose models cannot.

    Key characteristics of MedSigLIP:

    • Architecture: Based on the SigLIP (Sigmoid Loss for Language Image Pre-training) framework, MedSigLIP links medical images and text into a shared embedding space, enabling powerful multimodal understanding.

    • Training Data: Trained on over 33 million image-text pairs, including 635,000 medical examples from diverse domains such as chest X-rays, histopathology, dermatology, and ophthalmology.

    • Capabilities:

      • Supports classification, zero-shot labeling, and semantic image retrieval of medical images.

      • Retains general image recognition ability alongside specialized medical understanding.

    • Performance: Demonstrates strong results in dermatology (AUC 0.881), chest X-ray analysis, and histopathology classification, often outperforming larger models on these tasks.

    • Use Cases: Ideal for medical imaging tasks that require structured outputs like classification or retrieval rather than free-text generation. It can also serve as the visual encoder foundation for larger MedGemma models.

    • Efficiency: Can run on a single GPU and is optimized for deployment on edge devices or mobile hardware, making it accessible for diverse healthcare settings.
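
    The idea of zero-shot labeling in a shared embedding space can be illustrated with a toy example. This is not MedSigLIP's API: the three-dimensional vectors, the label strings, and the `scale` and `bias` values are hypothetical stand-ins for what a real image and text encoder would produce. The scoring follows the SigLIP-style approach of passing a scaled similarity through a sigmoid for each image-text pair.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot(image_embedding, label_embeddings, scale=10.0, bias=-5.0):
    """Score each candidate label independently with a sigmoid.

    `scale` and `bias` are illustrative constants; in a trained model
    they would be learned parameters.
    """
    scores = {
        label: 1.0 / (1.0 + math.exp(-(scale * cosine(image_embedding, emb) + bias)))
        for label, emb in label_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical embeddings; a real encoder would output much larger vectors.
image = [0.9, 0.1, 0.2]
labels = {
    "chest X-ray with pleural effusion": [1.0, 0.0, 0.1],
    "normal chest X-ray": [0.0, 1.0, 0.0],
    "skin lesion photograph": [0.1, 0.1, 1.0],
}

best, scores = zero_shot(image, labels)
print(best)
for label, score in scores.items():
    print(f"{label}: {score:.3f}")
```

    Because each label is scored on its own, new candidate labels can be added at inference time by embedding their text, with no retraining of the encoder.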

    MedSigLIP is a featherweight yet powerful medical image-text encoder designed to bridge images and clinical text for tasks such as classification and semantic search. Its open-source availability and efficiency make it a versatile tool for medical AI applications, complementing the larger generative MedGemma models by focusing on embedding-based image understanding rather than text generation.