TelcoNews Asia - Telecommunications news for ICT decision-makers

NVIDIA launches Cosmos platform for AI & robotics

Yesterday

NVIDIA has introduced the Cosmos platform, which includes generative world foundation models, designed to advance the development of physical AI systems such as autonomous vehicles and robots. The platform consists of new state-of-the-art models, video tokenisers, and an accelerated data processing pipeline optimised for NVIDIA Data Centre GPUs.

To address the high cost and extensive data requirements of developing physical AI models, Cosmos world foundation models (WFMs) let developers generate large volumes of photoreal, physics-based synthetic data for model training and evaluation. Developers can also customise the models by fine-tuning the Cosmos WFMs.

"The ChatGPT moment for robotics is coming. Like large language models, world foundation models are fundamental to advancing robot and AV development, yet not all developers have the expertise and resources to train their own," stated Jensen Huang, Founder and CEO of NVIDIA. "We created Cosmos to democratise physical AI and put general robotics in reach of every developer."

The Cosmos WFMs are released under an open model license, which is aimed at accelerating progress within the robotics and AV community. Initial models are available for developers to preview on the NVIDIA API catalogue, or to download along with a fine-tuning framework from the NVIDIA NGC catalogue or Hugging Face.

NVIDIA Cosmos supports customisation of WFMs with datasets, such as recordings from AV journeys or robots in warehouses, catering to specific application needs. The models are tailored for physical AI R&D, enabling generation of physics-based videos from varying inputs.

During a recent keynote, Jensen Huang highlighted potential uses of Cosmos models, including video search and understanding, synthetic data generation, model development and evaluation, and foresight or multiverse simulation.

Developing physical AI models requires significant video data and compute hours. To mitigate associated costs, Cosmos features an AI and CUDA-accelerated data processing pipeline powered by NVIDIA NeMo Curator. This setup enables substantial processing efficiency gains over traditional CPU-only pipelines.

"Data scarcity and variability are key challenges to successful learning in robot environments," commented Pras Velagapudi, Chief Technology Officer at Agility. "Cosmos' text-, image- and video-to-world capabilities allow us to generate and augment photorealistic scenarios for a variety of tasks that we can use to train models without needing as much expensive, real-world data capture."

In addition to the technical advancements, NVIDIA is partnering with companies such as Waabi and Uber. Dara Khosrowshahi, CEO of Uber, stated, "Generative AI will power the future of mobility, requiring both rich data and very powerful compute. By working with NVIDIA, we are confident that we can help supercharge the timeline for safe and scalable autonomous driving solutions for the industry."

The development of Cosmos aligns with NVIDIA's trustworthy AI principles, focusing on privacy, safety, security, transparency, and the mitigation of bias. These principles are intended to foster innovation and maintain user trust. Cosmos models include guardrails and watermarking to enhance safety and authenticity.

With Cosmos WFMs now available under an open model license, NVIDIA is also offering supporting tools such as the NeMo Curator and DGX Cloud for accelerated processing and deployment. In the enterprise realm, NVIDIA introduced Llama Nemotron large language models and Cosmos Nemotron vision language models for various industry applications.
