TabiBERT: A Modern BERT for Turkish NLP
Analysis
Key Takeaways
- Introduces TabiBERT, a new Turkish language model based on the ModernBERT architecture.
- Pre-trained on one trillion tokens drawn from a large, curated corpus.
- Offers improved inference speed and reduced GPU memory consumption.
- Introduces TabiBench, a unified benchmarking framework for Turkish NLP.
- Achieves state-of-the-art results in five of the eight TabiBench task categories.
“TabiBERT attains 77.58 on TabiBench, outperforming BERTurk by 1.62 points and establishing state-of-the-art on five of eight categories.”