
November 26, 2024

Artificial Intelligence, Future Computing

Iterative Combinatorial Brain Surgeon: Scalable Pruning of Large Language and Vision Models (LLVMs)

Elton Zhu


FCAT collaborated with Amazon Quantum Solutions Lab to propose a new scalable pruning algorithm for large language and vision models.

The Challenge

State-of-the-art large language and vision models (LLVMs) have seen tremendous success, but their massive scale comes at a hefty computational cost. The need to balance performance and efficiency has led to growing interest in model compression techniques. By using methods like pruning, quantization, or distillation, researchers aim to streamline these models without sacrificing their impressive accuracy.

The Impact

With the integration of advanced methods — such as the one proposed below — and specialized hardware support for sparse models, we can significantly decrease the computational power and energy required to run AI models, all while maintaining their original performance. This can enable the deployment of smaller, more efficient models directly on devices, rather than relying on server-side processing — ultimately helping to enhance data privacy.

The Outcomes

We proposed the iterative Combinatorial Brain Surgeon (iCBS), a scalable iterative pruning algorithm that optimizes over small blocks of weights in neural networks using block coordinate descent. This blockwise approach allows iCBS to scale to very large models, including LLVMs with billions of parameters, while achieving higher performance than existing one-shot pruning techniques.
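To give a rough feel for the blockwise structure, the sketch below prunes one small block of weights at a time. It is not the paper's algorithm — iCBS solves a combinatorial optimization within each block using curvature information — and the function name, parameters, and the classic Optimal Brain Surgeon-style saliency score used here are simplifying assumptions for illustration only.

```python
import numpy as np

def blockwise_saliency_prune(weights, hess_diag, sparsity, block_size=4):
    """Illustrative blockwise pruning sketch (not the paper's iCBS).

    Within each block, the weights with the lowest saliency are zeroed.
    Saliency follows the classic Optimal Brain Surgeon heuristic: removing
    weight w_i increases the loss by roughly w_i**2 * H_ii / 2, where H_ii
    is the corresponding diagonal entry of the Hessian. iCBS instead solves
    a harder combinatorial problem per block and sweeps over blocks
    iteratively; only the blockwise structure is kept here.
    """
    pruned = weights.astype(float).copy()
    k = int(round(sparsity * block_size))  # weights to drop per block
    for start in range(0, pruned.size, block_size):
        blk = slice(start, min(start + block_size, pruned.size))
        # Estimated loss increase for removing each weight in this block
        saliency = pruned[blk] ** 2 * hess_diag[blk]
        drop = np.argsort(saliency)[:k]  # cheapest weights to remove
        pruned[blk][drop] = 0.0
    return pruned

# Example: 50% sparsity over blocks of 4 keeps the large-magnitude weights
w = np.array([1.0, 0.1, 2.0, 0.2, 3.0, 0.3, 4.0, 0.4])
h = np.ones_like(w)  # identity Hessian diagonal for simplicity
print(blockwise_saliency_prune(w, h, sparsity=0.5))
```

Because each block is small, the per-block decision stays tractable even when the full model has billions of parameters, which is the intuition behind the method's scalability.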

For further details on this project, read the full paper.

