Run:AI Creates First Fractional GPU Sharing for Kubernetes Deep Learning Workloads

TEL AVIV, Israel, May 6, 2020 /PRNewswire/ -- Run:AI, a company virtualizing AI infrastructure, today released the first fractional GPU sharing system for deep learning workloads on Kubernetes. Especially suited to lightweight AI tasks at scale, such as inference, the fractional GPU system transparently gives data science and AI engineering teams the ability to run multiple workloads simultaneously on a single GPU. This lets companies run more workloads, such as computer vision, voice recognition and natural language processing, on the same hardware, lowering costs.

Today's de facto standard for deep learning workloads is to run them in containers orchestrated by Kubernetes. However, Kubernetes can allocate only whole physical GPUs to containers; it lacks the isolation and virtualization capabilities needed to share GPU resources without memory overflows or processing clashes.
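For context, the sketch below shows how a GPU is requested in standard Kubernetes today, using the official Kubernetes Python client with the NVIDIA device plugin; the pod and image names are placeholders.

```python
# Minimal sketch: a pod requesting one whole GPU, the standard approach.
# Requires the `kubernetes` Python client and access to a cluster.
from kubernetes import client, config

config.load_kube_config()  # use the local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="inference-worker"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="model-server",
                image="my-registry/model-server:latest",  # placeholder image
                resources=client.V1ResourceRequirements(
                    # `nvidia.com/gpu` accepts only whole numbers; a value
                    # like "0.5" is rejected. This is the limitation the
                    # fractional GPU system addresses.
                    limits={"nvidia.com/gpu": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Because the `nvidia.com/gpu` resource is an integer count, even a job that uses a sliver of GPU memory and compute occupies the whole device.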

Run:AI's fractional GPU system effectively creates virtualized logical GPUs, with their own memory and computing space that containers can use and access as if they were self-contained processors. This enables several deep learning workloads to run in containers side-by-side on the same GPU without interfering with each other. The solution is transparent, simple and portable; it requires no changes to the containers themselves.
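The release does not describe how these logical GPUs are implemented. Purely as an analogy, and not Run:AI's mechanism, TensorFlow offers an in-process version of the same idea, carving one physical GPU into memory-capped logical devices:

```python
# Analogy only: splitting one physical GPU into two memory-capped logical
# GPUs inside a single process. Must run before TensorFlow initializes GPUs.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Split the first physical GPU into two logical GPUs of 4 GiB each.
    tf.config.set_logical_device_configuration(
        gpus[0],
        [
            tf.config.LogicalDeviceConfiguration(memory_limit=4096),
            tf.config.LogicalDeviceConfiguration(memory_limit=4096),
        ],
    )
    logical_gpus = tf.config.list_logical_devices("GPU")
    print(f"{len(gpus)} physical GPU -> {len(logical_gpus)} logical GPUs")
```

The key difference is scope: this split applies only inside a single process, while Run:AI's system isolates workloads across containers without any changes to the containers themselves.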

To create the fractional GPUs, Run:AI had to modify how Kubernetes handles them. "In Kubernetes, a GPU is handled as an integer," said Dr. Ronen Dar, co-founder and CTO of Run:AI. "You either have one or you don't. We had to turn GPUs into floats, allowing for fractions of GPUs to be assigned to containers." Run:AI also solved the problem of memory isolation, so each virtual GPU can run securely without memory clashes.
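From the user's side, a fractional request could look like the sketch below. The annotation key `gpu-fraction` and its handling are assumptions for illustration (Run:AI's actual interface may differ), and the pod and image names are placeholders.

```python
# Hedged sketch: expressing a half-GPU request via a pod annotation rather
# than the integer `nvidia.com/gpu` resource. The `gpu-fraction` key is
# illustrative, not an authoritative Run:AI API.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="light-inference",
        # Ask for half a GPU instead of a whole one (illustrative key).
        annotations={"gpu-fraction": "0.5"},
    ),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="model-server",
                image="my-registry/model-server:latest",  # placeholder image
                # No integer `nvidia.com/gpu` limit here; the fractional
                # annotation above replaces it.
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```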

A typical use case could see two to four jobs running on the same GPU, meaning companies could do up to four times the work with the same hardware. For some lightweight workloads, such as inference, more than eight containerized jobs can comfortably share the same physical chip.

The addition of fractional GPU sharing is a key component in Run:AI's mission to create a true virtualized AI infrastructure. It complements Run:AI's existing technology, which elastically stretches workloads across multiple GPUs and enables resource pooling and sharing.

"Some tasks, such as inference tasks, often don't need a whole GPU, but all those unused processor cycles and RAM go to waste because containers don't know how to take only part of a resource," said Run:AI co-founder and CEO Omri Geller. "Run:AI's fractional GPU system lets companies unleash the full capacity of their hardware so they can scale up their deep learning more quickly and efficiently."

About Run:AI

Run:AI has built the world's first virtualization layer for AI workloads. By abstracting workloads from the underlying infrastructure, Run:AI creates a shared pool of resources that can be dynamically provisioned, enabling full utilization of expensive GPU compute. IT teams retain control and gain real-time visibility into run time, queueing and GPU utilization, with the ability to provision resources from a single web-based UI. This virtual pool enables IT leaders to view and allocate compute resources across multiple sites, whether on premises or in the cloud. The Run:AI platform is built on top of Kubernetes, enabling simple integration with existing IT and data science workflows.

Media Contact
Lazer Cohen
lazer@westraycommunications
+1 347-753-8256

View original content: http://www.prnewswire.com/news-releases/runai-creates-first-fractional-gpu-sharing-for-kubernetes-deep-learning-workloads-301053864.html

SOURCE Run:AI