NVIDIA CEO Jensen Huang's Vision for the Future

TL;DR

This interview with NVIDIA CEO Jensen Huang explains how the shift from sequential to parallel computing, pioneered in GPUs and made broadly accessible by CUDA, sparked the modern AI era, and how that foundation is now being extended into robotics, digital biology, and climate science with tools like Omniverse and Cosmos. Jensen walks through the technical reasoning, the risky long-term bets NVIDIA made and funded, the current limits (chiefly energy), and practical advice for anyone preparing for the coming decade of AI-powered tools.

Questions & Answers

1. How did GPUs start and why were they such a big idea?

In the early 1990s, Jensen and his team observed that a small portion of a program's code (around 10%) does about 99% of the processing, and that much of that work can be done in parallel. Traditional CPUs were optimized for sequential processing; GPUs were designed to do a lot of parallel work at once. That combination, a system that can do both sequential and parallel processing, was the guiding insight that led NVIDIA to build modern GPUs and to tackle computing problems that ordinary computers couldn't. The GPU unlocked a new class of compute capability by enabling massive parallelism.
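A rough way to see why that observation matters is Amdahl's law, a standard result that Jensen does not derive in the interview: if a fraction p of a program's runtime can be parallelized, and that portion is accelerated by a factor s, the overall speedup is

$$ S = \frac{1}{(1 - p) + p/s} $$

With p = 0.99, even an arbitrarily large s caps S at 1/(1 - 0.99) = 100x, and the remaining sequential 1% becomes the bottleneck. That is exactly why the insight is a combination: a parallel GPU for the 99% plus a fast sequential CPU for the rest, rather than one replacing the other.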


2. Why did NVIDIA focus on video games first?

NVIDIA chose gaming because 3D graphics required extensive parallel processing (many similar calculations at once), and the gaming market was large enough to justify heavy R&D. Jensen explains that gaming was both personally appealing (simulating virtual worlds) and strategically useful: because the market could be very large, NVIDIA could invest heavily in R&D and iterate the technology, creating a flywheel in which market scale funded better technology, which in turn expanded uses beyond gaming.


3. What did Jensen mean when he called a GPU a “time machine”?

Calling the GPU a “time machine” refers to making computation so fast that you can simulate or predict outcomes that would otherwise take longer than a lifetime to reach. Jensen gives the example of a quantum chemistry researcher who, thanks to GPU acceleration, could complete his life's work within his lifetime. He also points to simulations, such as weather prediction and virtual cities for self-driving cars, as forms of time travel, because they let you see future outcomes far sooner than you otherwise could.


4. What is CUDA and why was it important for non-graphics uses of GPUs?

CUDA is the platform NVIDIA created to let programmers use GPUs for general-purpose parallel processing with familiar languages like C. Early adopters had to “trick” GPUs into solving non-graphics problems by phrasing them as graphics operations; CUDA made that work far easier and more accessible. Jensen describes CUDA as a deliberate company bet: by exposing GPU parallelism through a programming model people already knew, GPUs could become the highest-volume parallel processors in the world and unlock many new applications beyond graphics.
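For flavor, here is a minimal CUDA C sketch of the programming model Jensen describes; it is an illustrative example, not code from the interview. The loop a CPU would execute one iteration at a time becomes a kernel in which each of a million threads handles one element:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread computes one element: the parallel analogue of one loop iteration.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                 // ~1M elements
    const size_t bytes = n * sizeof(float);

    // Host-side input and output buffers.
    float* ha = (float*)malloc(bytes);
    float* hb = (float*)malloc(bytes);
    float* hc = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Device buffers and host-to-device copies.
    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(da, db, dc, n);

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %.1f\n", hc[0]);        // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

Everything outside the <<<blocks, threads>>> launch syntax is ordinary C, which is the accessibility point Jensen makes: parallelism exposed through a language programmers already knew.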


5. Why was the 2012 AlexNet moment so important, and how did GPUs factor into it?

AlexNet (2012) showed a deep neural network dramatically outperforming previous image-recognition techniques. Crucially, the researchers trained AlexNet on NVIDIA GPUs (two GeForce GTX 580 cards, programmed via CUDA), which supplied the necessary parallel compute. That performance leap signaled that deep neural networks could scale and that GPUs were the engines making large-scale training feasible, a turning point that helped spark the modern AI boom in vision, speech, language and more.


6. Why did it take around a decade between that breakthrough and the large-scale AI moment we’re in now?

Jensen frames the decade as the result of long-term commitment based on core beliefs. After building CUDA and seeing early successes like AlexNet, NVIDIA stayed invested because the engineering and scientific evidence supported scaling. He emphasizes that this commitment sometimes meant investing tens of billions of dollars long before broad adoption. The middle years were spent improving architectures, systems and the software stack; there were hard moments, but the core principles didn't change, so NVIDIA kept betting and iterating until the wider ecosystem caught up.


7. What core beliefs guided NVIDIA’s long-term bets?

Jensen summarizes several core beliefs: (1) accelerated computing, the pairing of parallel processors (GPUs) with sequential processors (CPUs), is essential; (2) deep neural networks (DNNs) learn more as models and data scale, and that scaling holds empirically; (3) data is digitized human experience, and models can learn across modalities (images, audio, text, molecular sequences); and (4) building an architecture and software stack that lets researchers and developers innovate is crucial. Those beliefs justified long-term investment and a re-engineering of the computing stack (e.g., DGX).


8. What is DGX and why was the 2016 DGX delivery significant?

DGX is NVIDIA’s family of AI systems/supercomputers. Jensen notes that the DGX-1, delivered to OpenAI in 2016, was a $250,000 AI supercomputer, marking the company’s move to offer full-stack, re-engineered systems for AI. He contrasts that early DGX-1 with a modern small form-factor “mini” DGX: the new versions are far more energy- and compute-efficient, and NVIDIA aims to make AI supercomputing accessible to broader audiences (eventually $3,000 versions for students and developers).


9. What are Omniverse and Cosmos, and how do they work together to train robots?

Omniverse is a 3D simulation platform that uses physics-based simulation (principled solvers and Newtonian physics) to create realistic virtual environments. Cosmos is described as a “world language model,” or world foundation model, analogous to how ChatGPT is a language model for text. Jensen explains the combination: Cosmos provides a world model with physical common sense (gravity, friction, object permanence, cause and effect), while Omniverse provides grounding via high-fidelity physics simulation. Together, Omniverse and Cosmos generate large, physically plausible datasets and scenario “stories” that robots can be trained on digitally, grounded in simulated physical truth.


10. How does training in simulation help robots learn faster or safer than real-world training?

Training in simulation lets robots experience vastly more repetitions and a wider variety of conditions (lighting, blockages, times of day, etc.) without real-world wear, damage or manual demonstration. Jensen gives the factory example: instead of manually guiding a robot through many routes (days of work and wear-and-tear), you can simulate all routes digitally much faster and with many variations. Because Omniverse uses physics-based simulation, those digital experiences can be grounded in physical laws, improving transfer to the real world.
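A toy sketch of why simulated variation scales so well on GPUs (illustrative only; Omniverse's physics solvers are far more sophisticated than this): one kernel launch can run tens of thousands of independent rollouts, each thread perturbing its own parameters, here a robot braking on surfaces with randomized friction:

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Each thread runs one independent rollout: a robot decelerating on a surface
// whose friction coefficient is randomly perturbed (crude domain randomization).
__global__ void simulate(float* stopDist, int n, unsigned long long seed) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id >= n) return;

    curandState rng;
    curand_init(seed, id, 0, &rng);
    float mu = 0.4f + 0.4f * curand_uniform(&rng);  // friction in (0.4, 0.8]

    float v = 2.0f;                 // initial speed, m/s
    float x = 0.0f;                 // distance traveled, m
    const float dt = 1e-3f, g = 9.81f;

    // Integrate simple Newtonian deceleration until the robot stops.
    while (v > 0.0f) {
        x += v * dt;
        v -= mu * g * dt;
    }
    stopDist[id] = x;
}

int main() {
    const int n = 1 << 16;          // 65,536 variations in a single launch
    float* d;
    cudaMalloc(&d, n * sizeof(float));
    simulate<<<(n + 255) / 256, 256>>>(d, n, 42ULL);

    float h[4];
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    printf("sample stopping distances: %.3f %.3f %.3f %.3f m\n",
           h[0], h[1], h[2], h[3]);
    cudaFree(d);
    return 0;
}
```

Each rollout here is trivially simple, but the structure is the point: variations that would take days of physical trials run in a single launch, the same economics Jensen describes for the factory routing example.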


11. What does Jensen predict the future will look like for robots and everyday life?

Jensen forecasts a future where "everything that moves will be robotic someday and it will be soon." He expects cars to be robotic, lawnmowers to be robotic, and humanoid robots to appear in the near future. He imagines personal robotic assistants (an R2-D2 metaphor) that accompany people across devices — in glasses, phones, PCs and physical embodiments at home. He sees robots doing dangerous and mundane tasks everywhere and expects robots to learn in simulation (Omniverse/Cosmos) before entering the physical world.


12. What are the main safety and misuse concerns Jensen highlights about AI and robots?

He outlines several risk categories: bias and toxicity, hallucination (confident but incorrect outputs), impersonation (AI convincingly pretending to be specific humans), and failures in engineering that cause harm (e.g., sensors failing in a self-driving car). Jensen stresses that some risks need deep research and engineering to ensure systems function correctly and safely. He argues for architecting safety with redundancy and community-wide safeguards — analogous to triple-redundant flight computers, pilots, and air traffic control — so that failures don’t put people in harm’s way.


13. How does NVIDIA think about hardware specialization (e.g., chips optimized for transformers) versus designing for flexibility?

NVIDIA’s stance, as explained by Jensen, is shaped by a core belief that current AI architectures (like transformers) are likely steps in an evolving sequence rather than the final form. Because software and algorithms change rapidly, NVIDIA prefers architectures that enable flexibility and invention so researchers can experiment with new ideas (different attention mechanisms, hierarchical methods, etc.). In short, they favor designing programmable, flexible hardware that supports ongoing innovation rather than locking too narrowly into one fixed design.


14. What are the fundamental technical limits NVIDIA worries about today, and how has energy/efficiency changed?

Jensen frames the fundamental limit as how much work you can do with the energy available: transporting and flipping bits costs energy, and that ultimately constrains what's possible. He says we are far from any fundamental limit, however, and that energy efficiency keeps improving. He cites a claimed 10,000x energy-efficiency improvement in AI computing since the 2016 DGX-1 delivery, noting modern systems are far more powerful and efficient. Because energy efficiency is what enables larger, smarter systems, improving it remains NVIDIA's top technical priority.
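A back-of-the-envelope version of that framing, with illustrative numbers rather than figures from the interview: for a fixed power budget P and an energy cost per operation E_op, the sustained throughput is

$$ \text{throughput} = \frac{P}{E_{\mathrm{op}}} $$

so a 10 MW facility at 10 pJ per operation sustains 10^7 W / 10^-11 J = 10^18 operations per second. Power budgets are hard to grow, which is why a 10,000x drop in energy per operation translates directly into a 10,000x gain in what the same facility can compute.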


15. How should individuals prepare for this moment — what practical steps does Jensen recommend?

Jensen’s practical advice is straightforward: learn to use AI. He emphasizes prompting and interacting with models (ChatGPT, Gemini Pro, Grok) as essential skills, akin to being good at asking questions. He recommends everyone get an AI tutor to learn and accelerate their development in any field (law, medicine, science, programming). His core message: rather than asking whether to use AI, ask “how can I use AI to do my job better?” and adopt the tools that empower you.


16. What bets is NVIDIA making now for the next decade?

Jensen lists several active bets: (1) the fusion of Omniverse and Cosmos to enable generative, physically grounded world models for robotics and beyond; (2) humanoid robotics tooling, training systems and demonstration systems; he expects the next five years to be very interesting for humanoid robots; (3) digital biology: learning the language of molecules and cells, and potentially building digital twins of human biology; (4) climate science and high-resolution regional weather prediction; and (5) continued democratization of AI compute, making powerful DGX-like systems far more accessible. He describes these as extensions of NVIDIA's time-machine instrument, letting scientists and engineers see and optimize future outcomes.

By wang

Published: January 3, 2026

Last updated: January 3, 2026