Classifying Long Legal Documents Using Short Random Chunks

Paper #llm 🔬 Research | Analyzed: January 3, 2026 06:15

Published: December 31, 2025 17:48

1 min read

Analysis

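This paper addresses the practical challenge of classifying long legal documents with Transformer-based models. Its core contribution is a method that uses short, randomly selected chunks of text to work around computational limits and improve efficiency. The deployment pipeline, built on Temporal, is also a key aspect, underscoring the importance of robust, reliable processing in real-world applications. The reported F-score and processing times provide useful benchmarks.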
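The random-chunk idea can be illustrated with a minimal sketch. This summary does not state the paper's actual chunk length, number of chunks, or aggregation rule, so `chunk_len`, `n_chunks`, the averaging of per-chunk probabilities, and the `chunk_classifier` callable are all assumptions made for illustration, not the authors' method.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def sample_random_chunks(
    tokens: Sequence[str],
    chunk_len: int = 48,        # assumed value, not taken from the paper
    n_chunks: int = 8,          # assumed value, not taken from the paper
    seed: Optional[int] = None,
) -> List[List[str]]:
    """Draw n_chunks contiguous windows of chunk_len tokens at random offsets."""
    rng = random.Random(seed)
    if len(tokens) <= chunk_len:
        # Document is already short: use it whole as a single chunk.
        return [list(tokens)]
    max_start = len(tokens) - chunk_len
    starts = [rng.randint(0, max_start) for _ in range(n_chunks)]
    return [list(tokens[s:s + chunk_len]) for s in starts]


def classify_document(
    tokens: Sequence[str],
    chunk_classifier: Callable[[List[str]], Sequence[float]],  # hypothetical model wrapper
    n_classes: int,
    **sampling_kwargs,
) -> Tuple[int, List[float]]:
    """Average per-chunk class probabilities (an assumed aggregation rule)."""
    chunks = sample_random_chunks(tokens, **sampling_kwargs)
    totals = [0.0] * n_classes
    for chunk in chunks:
        for i, p in enumerate(chunk_classifier(chunk)):
            totals[i] += p
    mean = [t / len(chunks) for t in totals]
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean
```

Because only a fixed number of short windows reaches the model, the cost per document in this sketch does not grow with document length, which is the efficiency argument the summary describes.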

Quote / Source

"The best model had a weighted F-score of 0.898, while the pipeline running on CPU had a processing median time of 498 seconds per 100 files."
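To put the quoted figures in context, the sketch below shows one common reading of a "weighted F-score" and converts the reported batch time into throughput. It assumes the metric is the support-weighted mean of per-class F1 (the convention behind scikit-learn's `average="weighted"`); the quote itself does not define it. The function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f_score(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Per-class F1 averaged with weights proportional to each class's true support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * count / total
    return score


def files_per_hour(seconds_per_batch: float, batch_size: int = 100) -> float:
    """Convert a time per batch of files into files processed per hour."""
    return 3600.0 * batch_size / seconds_per_batch


# 498 s per 100 files is roughly 4.98 s per file on CPU.
print(round(files_per_hour(498.0), 1))
```

Under this reading, a median of 498 seconds per 100 files corresponds to roughly 720 files per hour for a single CPU pipeline.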

ArXiv, December 31, 2025 17:48

* Quoted lawfully under Article 32 of the Copyright Act.
