Predicting Data Efficiency for LLM Fine-tuning
Analysis
Key Takeaways
- Addresses the problem of unknown data efficiency in LLM fine-tuning.
- Proposes a method to predict data efficiency using gradient cosine similarity.
- Aims to reduce the need for costly incremental annotation and retraining.
- Achieves 8.6% error in data efficiency prediction on a diverse set of tasks.
“The paper proposes using the gradient cosine similarity of low-confidence examples to predict data efficiency based on a small number of labeled samples.”
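To make the quoted idea concrete, here is a minimal sketch of the signal it describes: the average pairwise cosine similarity between per-example gradients, restricted to examples the model is unsure about. This is an illustration under stated assumptions, not the paper's implementation. The function names, the `threshold` of 0.5, and the representation of gradients as flat lists of floats are all hypothetical, and the summary does not say how this score is mapped to a data efficiency estimate, so that step is left out.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def low_confidence_gradient_similarity(
    grads: Sequence[Sequence[float]],
    confidences: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff; not taken from the paper
) -> float:
    """Mean pairwise cosine similarity of gradients for low-confidence examples.

    grads[i] is the flattened per-example gradient of example i, and
    confidences[i] is the model's confidence on that example.
    """
    # Keep only the gradients of examples below the confidence cutoff.
    selected = [g for g, c in zip(grads, confidences) if c < threshold]
    if len(selected) < 2:
        raise ValueError("need at least two low-confidence examples")
    # Average the cosine similarity over every distinct pair.
    pairs = list(combinations(selected, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy example: three low-confidence examples, one confident example that is ignored.
toy_grads = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
toy_confidences = [0.2, 0.3, 0.4, 0.9]
score = low_confidence_gradient_similarity(toy_grads, toy_confidences)
```

In the toy example, two of the three selected gradients point the same way and the third is orthogonal to both, so the score lands between 0 and 1. In practice the gradients would come from a real model on a small labeled sample, which this sketch does not cover.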