Search:
Match:
1 results
Paper#llm🔬 ResearchAnalyzed: Jan 3, 2026 16:08

Splitwise: Adaptive Edge-Cloud LLM Inference with DRL

Published:Dec 29, 2025 08:57
1 min read
ArXiv

Analysis

This paper addresses the challenge of deploying large language models (LLMs) on edge devices, balancing latency, energy consumption, and accuracy. It proposes Splitwise, a novel framework using Lyapunov-assisted deep reinforcement learning (DRL) for dynamic partitioning of LLMs across edge and cloud resources. The approach is significant because it offers a more fine-grained and adaptive solution compared to static partitioning methods, especially in environments with fluctuating bandwidth. The use of Lyapunov optimization ensures queue stability and robustness, which is crucial for real-world deployments. The experimental results demonstrate substantial improvements in latency and energy efficiency.
Reference

Splitwise reduces end-to-end latency by 1.4x-2.8x and cuts energy consumption by up to 41% compared with existing partitioners.