Paper #llm 🔬 Research • Analyzed: Jan 3, 2026 08:55

Training Data Optimization for LLM Code Generation: An Empirical Study

Published: Dec 31, 2025 02:30 • 1 min read • ArXiv

Analysis

This paper systematically evaluates training data optimization techniques for improving LLM-based code generation. It is significant because it provides empirical evidence on the effectiveness of individual techniques and their combinations, offering practical guidance for researchers and practitioners. The large-scale study across multiple benchmarks and LLMs adds to the paper's credibility and impact.

Key Takeaways

• Data synthesis is the most effective technique for improving functional correctness and reducing code smells.
• Data synthesis combined with data refactoring achieves the strongest overall performance.
• Most combinations of techniques do not further improve functional correctness but can enhance code quality (code smells and maintainability).

Reference

“Data synthesis is the most effective technique for improving functional correctness and reducing code smells.”
