Analysis

This paper addresses the slow inference of autoregressive (AR) image models, a significant bottleneck in practice. It proposes Adjacency-Adaptive Dynamical Draft Trees (ADT-Tree), a method that accelerates inference by dynamically adjusting the draft tree structure to the complexity of different image regions. Existing speculative decoding methods struggle with the spatially varying prediction difficulty of visual AR models, and ADT-Tree targets that weakness. The reported speedups are roughly 3x on the MS-COCO 2017 and PartiPrompts benchmarks.
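
To make the idea of a region-dependent draft tree concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the summary does not state the adaptation rule, so the heuristic below (deeper, narrower trees where neighboring positions accepted long drafts; shallower, wider trees where they did not) is an illustrative assumption. The names `DraftShape`, `choose_draft_shape`, and `accept_lengths`, as well as all default bounds, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DraftShape:
    depth: int  # number of draft levels to speculate
    width: int  # number of candidates kept per level


def choose_draft_shape(accept_lengths, row, col,
                       min_depth=2, max_depth=6,
                       min_width=2, max_width=8):
    """Pick a draft tree shape for grid position (row, col).

    accept_lengths[r][c] is assumed to hold how many drafted tokens were
    accepted at an already-decoded position. Only the left and top
    neighbors are consulted, since in raster order they are decoded first.
    """
    neighbors = []
    if col > 0:
        neighbors.append(accept_lengths[row][col - 1])
    if row > 0:
        neighbors.append(accept_lengths[row - 1][col])

    if not neighbors:
        # No history yet: start from the middle of each range.
        return DraftShape((min_depth + max_depth) // 2,
                          (min_width + max_width) // 2)

    avg = round(sum(neighbors) / len(neighbors))
    # Assumed heuristic: long accepted runs nearby suggest an easy region,
    # so speculate deeper with fewer branches; short runs suggest a hard
    # region, so speculate shallower with more branches.
    depth = max(min_depth, min(max_depth, avg + 1))
    width = max(min_width, min(max_width, max_width - avg))
    return DraftShape(depth, width)


if __name__ == "__main__":
    history = [[4, 4], [0, 0]]
    print(choose_draft_shape(history, 0, 1))  # easy neighborhood
    print(choose_draft_shape(history, 1, 1))  # mixed neighborhood
```

In a real speculative decoder the chosen shape would control how the draft model expands candidates before the target model verifies them; that part is omitted here.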
Reference

ADT-Tree achieves speedups of 3.13x and 3.05x, respectively, on MS-COCO 2017 and PartiPrompts.
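
As a rough guide to what these factors mean for wall-clock time, the sketch below converts a speedup into a relative latency reduction. It assumes the speedup is defined as baseline time divided by accelerated time; the summary does not state the baseline, and the function name `latency_reduction` is hypothetical.

```python
def latency_reduction(speedup):
    """Fraction of baseline time saved, assuming speedup = t_base / t_fast."""
    return 1.0 - 1.0 / speedup


for name, s in [("MS-COCO 2017", 3.13), ("PartiPrompts", 3.05)]:
    print(f"{name}: {latency_reduction(s):.1%} less time per image")
```

Under that assumption, the reported factors correspond to roughly 68% and 67% less generation time, respectively.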