What is it about?

The amazing progress in the performance of deep learning (DL) models and foundation models such as large language models (LLMs) comes at the price of significant training time, inference latency, energy consumption, and monetary cost. This paper shows how to orchestrate calls between powerful models and cheap proxies, guaranteeing performance while keeping cost in check.
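
Conceptually, the simplest form of such orchestration is a confidence-based cascade: send each query to the cheap proxy first, and fall back to the powerful model only when the proxy is unsure. The Python sketch below illustrates this general idea only; the proxy, oracle, and threshold names are hypothetical stand-ins, and the paper's actual algorithm and its statistical guarantees are more involved.

from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class RoutedAnswer:
    label: str
    used_oracle: bool  # True if the expensive model had to be invoked

def route_query(
    item: str,
    proxy: Callable[[str], Tuple[str, float]],  # cheap model: (label, confidence)
    oracle: Callable[[str], str],               # expensive, high-accuracy model
    threshold: float,                           # minimum proxy confidence to skip the oracle
) -> RoutedAnswer:
    # Try the cheap proxy first; invoke the oracle only on low-confidence items.
    label, confidence = proxy(item)
    if confidence >= threshold:
        return RoutedAnswer(label=label, used_oracle=False)
    return RoutedAnswer(label=oracle(item), used_oracle=True)

def toy_proxy(s: str) -> Tuple[str, float]:
    # Hypothetical cheap model: confident only on short inputs.
    return ("short", 0.9) if len(s) < 5 else ("long", 0.4)

def toy_oracle(s: str) -> str:
    # Hypothetical expensive model: always correct in this toy setup.
    return "short" if len(s) < 5 else "long"

if __name__ == "__main__":
    items = ["cat", "elephant", "dog"]
    answers = [route_query(x, toy_proxy, toy_oracle, threshold=0.8) for x in items]
    oracle_calls = sum(a.used_oracle for a in answers)
    print([a.label for a in answers], f"oracle calls: {oracle_calls}/{len(items)}")

Raising the threshold trades cost for quality: more queries fall through to the expensive model, which is exactly the knob a principled approach must tune to meet a performance guarantee at minimum cost.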

Why is it important?

Unchecked use of powerful ML models imposes a significant overhead in computational and monetary resources and energy consumption, and ultimately takes a toll on the environment. Distributing the load intelligently between powerful models and their weaker and cheaper counterparts, called proxies, can achieve comparable performance while keeping the various overall costs in check.
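
As a rough, purely illustrative calculation (the numbers here are assumptions, not figures from the paper): if the proxy costs 1 unit per query, the powerful model costs 50 units, and the proxy can confidently answer 80% of queries on its own, the expected cost per query is 0.8 × 1 + 0.2 × (1 + 50) = 11 units, versus 50 units for always calling the powerful model, a more than fourfold saving at whatever accuracy the confidence threshold preserves.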

Perspectives

Addressing practically relevant issues in a principled paradigm and being able to provide provable guarantees is cool.

Laks Lakshmanan, V.S.
University of British Columbia

Read the Original

This page is a summary of: On Efficient Approximate Queries over Machine Learning Models, Proceedings of the VLDB Endowment, December 2022, ACM (Association for Computing Machinery),
DOI: 10.14778/3574245.3574273.
