NVIDIA Launches Blackwell GPUs and DGX SuperPOD for Generative AI
Good Morning! NVIDIA unveiled its new Blackwell GPU architecture and DGX SuperPOD built for generative AI workloads. A team of MIT and Stanford researchers and industry experts is developing DBOS, a database operating system that treats all system data and processes as part of the core database kernel. Vultr launched Cloud Inference, a serverless platform for simplified deployment and global scaling of AI inference models across its NVIDIA GPU infrastructure.
NVIDIA Launches Blackwell GPUs and DGX SuperPOD for Generative AI
At its GTC event, NVIDIA revealed its new Blackwell GPU architecture, built for generative AI workloads. The Blackwell platform introduces several new technologies:
A massive 208-billion-transistor GPU built on a custom TSMC 4NP process
A second-generation transformer engine with support for 4-bit precision AI inference
5th gen NVLink interconnect running at 1.8 TB/s to link multiple GPUs
A RAS (reliability, availability, and serviceability) engine for high availability
Hardware security for confidential computing
A dedicated engine to accelerate data decompression
According to NVIDIA, Blackwell can run real-time inference on models of up to 10 trillion parameters at one-fourth the cost of previous generations.
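To make the 4-bit inference figure concrete, here is a minimal sketch of symmetric 4-bit integer quantization, the basic idea behind low-precision inference. It is illustrative only: Blackwell's transformer engine uses an FP4 floating-point format with hardware-managed scaling, not this simple int4 scheme.

```typescript
// Minimal sketch of symmetric 4-bit integer quantization.
// Illustrative only: Blackwell's FP4 is a floating-point format
// with hardware-managed scaling, not this simple int4 scheme.

function quantize4bit(weights: number[]): { q: number[]; scale: number } {
  // Map the largest absolute weight onto the int4 range [-8, 7].
  const maxAbs = Math.max(...weights.map(Math.abs));
  const scale = maxAbs / 7 || 1; // avoid division by zero for all-zero input
  const q = weights.map((w) => Math.max(-8, Math.min(7, Math.round(w / scale))));
  return { q, scale };
}

function dequantize4bit(q: number[], scale: number): number[] {
  return q.map((v) => v * scale);
}

const weights = [0.12, -0.53, 0.9, -0.07];
const { q, scale } = quantize4bit(weights);
console.log(q, dequantize4bit(q, scale)); // 4-bit codes and approximate originals
```

Halving precision from 8 to 4 bits doubles how many parameters fit in the same memory and bandwidth, which is where much of the inference cost saving comes from.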
NVIDIA also debuted the DGX SuperPOD, a liquid-cooled, rack-scale system powered by GB200 Grace Blackwell Superchips. Each Superchip combines two Blackwell GPUs and one Grace CPU. A full SuperPOD delivers 11.5 exaflops of FP4 AI compute, 240 TB of fast memory, and built-in predictive maintenance.
Major cloud providers, including AWS, Google, and Microsoft, along with AI leaders like OpenAI, plan to offer Blackwell-based instances and services. NVIDIA says Blackwell will "power an industrial revolution" driving breakthroughs across industries.
Read More Here
Data Power-Up with Bright Data
Bright Data elevates businesses by collecting web data and turning it into actionable insights. Our global proxy network and web-unblocking tools enable businesses to build datasets in real time and at scale, providing a competitive edge in e-commerce, travel, finance, and beyond.
Tap into the value of clean, structured data for market research, ML/AI development, and strategic decision-making. With our scalable solutions, you can efficiently enhance your data strategy, ensuring you're always one step ahead. Start with our free trial and see the difference for yourself.
Database-Based Operating System 'DBOS' Does Things Linux Can't
A team from MIT and Stanford, including database pioneer Michael Stonebraker, is creating a new kind of operating system called DBOS (Database Operating System). Unlike traditional OS architectures that layer databases on top, DBOS flips the model and uses a distributed database as the core kernel managing all system data and processes.
The Key Innovation
DBOS treats all OS state, including memory, files, and messages, as data stored in the database kernel itself.
This approach enables some unusual capabilities:
Reliably restarting programs exactly where they left off after interruptions
Rewinding the full system state to debug issues ("time travel")
Built-in system metrics tracking and logging
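The restart capability above boils down to checkpointing every step's result in the database. Here is a minimal sketch of that pattern, not DBOS's actual API: step results are recorded under a workflow ID and step index, so a re-run after a crash returns recorded results instead of re-executing completed work. A Map stands in for the database kernel.

```typescript
// Sketch of checkpointed execution (hypothetical API, not DBOS's).
// Completed step results live in a durable store, so a restarted
// workflow resumes exactly where it left off.

const store = new Map<string, unknown>(); // stand-in for the database kernel

async function durableStep<T>(
  workflowId: string,
  step: number,
  fn: () => Promise<T>
): Promise<T> {
  const key = `${workflowId}:${step}`;
  if (store.has(key)) return store.get(key) as T; // already done: skip re-execution
  const result = await fn(); // run the step for real
  store.set(key, result); // persist the result before moving on
  return result;
}

async function chargeCard() { return "charged"; }
async function shipOrder() { return "shipped"; }

async function orderWorkflow(id: string) {
  const payment = await durableStep(id, 0, chargeCard);
  const shipment = await durableStep(id, 1, shipOrder);
  return { payment, shipment };
}

// If the process dies between steps 0 and 1, re-running
// orderWorkflow(id) skips the charge and resumes at shipping.
```

Because every step result lives in the database, the "time travel" debugging described above falls out naturally: the system's state at any past moment is just a query over historical records.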
DBOS is an open-source project that initially uses the FoundationDB data store at its core, with its kernel services written in TypeScript. The team claims it can match Linux performance while scaling to as many as one million cores, and that it reduces security vulnerabilities by shrinking the attack surface.
Early Offerings & The Backdrop: A startup is offering early commercial cloud versions of DBOS, billed as a database-centric serverless computing platform. While reminiscent of past database/OS combos like Pick and IBM's AS/400, DBOS aims to reinvent the concept for modern cloud computing needs.
Whether DBOS succeeds remains to be seen, but the pedigree of the MIT/Stanford team behind it makes this an intriguing project to follow in the operating systems space.
Read More Here
Vultr Launches Cloud Inference for Simplified AI Model Deployment
Vultr has launched a serverless Inference-as-a-Service offering called Cloud Inference, designed to simplify the deployment and global scaling of AI inference workloads. Cloud Inference lets users integrate AI models trained on any platform with Vultr's NVIDIA GPU-powered infrastructure, which spans 32 locations across six continents.
A major challenge has been optimizing AI models across distributed infrastructure while ensuring high availability and low latency. Cloud Inference addresses this through intelligent real-time model serving on optimized NVIDIA hardware, with automated global scaling handling peak demands. The serverless architecture eliminates infrastructure management overhead.
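From the client's perspective, a serverless inference platform reduces to an authenticated HTTPS call; placement, scaling, and GPU management happen behind the endpoint. The sketch below illustrates that shape; the URL, model name, and payload fields are assumptions for illustration, not Vultr's documented API.

```typescript
// Hypothetical client call to a serverless inference endpoint.
// The URL, model name, and response fields are illustrative
// assumptions, not Vultr's documented API.

async function infer(prompt: string): Promise<string> {
  const response = await fetch("https://inference.example.com/v1/generate", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.INFERENCE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "example-llm", prompt }),
  });
  if (!response.ok) throw new Error(`Inference failed: ${response.status}`);
  const data = await response.json();
  return data.output; // response field name assumed for illustration
}

infer("Summarize this week's GPU news.").then(console.log);
```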
Key benefits of Cloud Inference include:
Flexibility to use models trained anywhere
Reduced infrastructure complexity via serverless deployment
Automated scaling of inference-optimized GPU resources
Cost efficiency by only paying for resources used
Compliance with data residency regulations across global locations
Isolated environments for sensitive workloads
Vultr has partnered with NVIDIA to leverage its GPU technology. As AI moves from training to inference at scale, solutions like Cloud Inference cater to the growing need for optimized, globally distributed inference platforms. The launch expands Vultr's AI/ML offerings alongside its existing cloud compute, storage, and CDN services.
Read More Here
🔥 More Notes
Google DeepMind co-founder joins Microsoft as CEO of its new AI division
After raising $1.3B, Inflection got eaten alive by its biggest investor, Microsoft
How Tinder Scaled to 1.6 Billion Swipes per Day
YouTube Spotlight
Why Most Programmers DON'T Last
The video discusses strategies for longevity in a programming career: embrace imposter syndrome and keep adapting, simplify how you communicate about technology, buffer your estimates, avoid rigid career leveling and stagnation, pick your battles at work, network consistently, be honest about whether your role is code-centric or strategic, and evolve beyond raw coding proficiency to ensure lasting success in the industry.
Was this forwarded to you? Sign Up Here