1st Workshop on Efficient Generative AI (EffGenAI)
Abstract
Generative AI has revolutionized fields such as language generation and visual content creation; however, the computational and storage costs of these models present significant challenges for both cloud and edge deployment. As generative models increase in complexity, a combination of algorithmic and system-level strategies becomes essential for deploying them. Moreover, practical applications of generative AI are increasingly multi-modal, encompassing text, images, and audio. Therefore, tackling the efficiency challenge of generative AI requires integrating algorithmic and system-level perspectives, while drawing on expertise from various application domains.
Background, Aims & Workshop Design
Generative AI algorithms have garnered significant attention and success across numerous applications, revolutionizing fields such as language generation (GPT, LLaMa, PanGu) and visual content creation (Stable Diffusion, SORA, and Flux). Their ability to generate high-quality, contextually relevant outputs has made them indispensable tools.
However, the computational and storage demands of these algorithms present significant challenges for both inference and training, whether deployed on cloud platforms or edge devices. Addressing these efficiency issues requires the collaborative efforts of researchers from diverse fields, specifically:
- As model sizes for generative AI applications continue to scale, we need to build larger and more complex systems to accommodate these models. Therefore, greater efforts at the system level are necessary to facilitate efficient and feasible deployment.
- Many researchers focus on accelerating generative models within specific modalities or paradigms. We aim to bring together experts from various modalities and generative paradigms to explore both the distinct efficiency challenges and shared efficiency techniques in these domains.
- Practical applications of generative AI are increasingly multi-modal, composing models in agentic workflows and unifying modalities within single generative models.
- Numerous open questions surround the topic of efficient generative AI, requiring collective wisdom from distinct research areas.
Open Questions
- How long a context is necessary for different domains and applications?
- How can agent frameworks be deployed efficiently?
- Do we need on-edge finetuning? If so, for which applications, and how should efficient training algorithms be designed?
- Is the paradigm of edge-cloud collaborative inference & training promising?
- Should we unify generative paradigms across different modalities to achieve efficient training and inference?
Workshop Goals
The EffGenAI workshop aims to create a platform for researchers from diverse fields to exchange innovative ideas and foster cross-domain collaboration, promoting impactful research in efficient generative AI.
Topics of Interest
- Efficient Inference and Sampling Methods for Generative Models
- Efficient Model Architecture Designs
- Model Compression Techniques
- Dynamic Inference Techniques
- Efficient Training Methods for Generative Models
- Efficient System-Level Optimization for Generative Models