Mistral Vibe + Devstral2 Small: Local LLM Performance
Published: Jan 4, 2026 03:11
• 1 min read
• r/LocalLLaMA
Analysis
The article summarizes a positive hands-on report of running Mistral Vibe with Devstral2 Small locally. The user praises its ease of use, its ability to handle the full 256k context across three GPUs at Q4KL quantization, and its speed (around 2000 tokens/s prompt processing, 40 tokens/s token generation). The user also notes how little configuration is needed to run larger models such as gpt120 (three lines in config.toml) and says this setup is replacing a previous tool (roo). The post is a user review from a forum, focused on practical performance and ease of use rather than technical detail.
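The post doesn't show the actual configuration, but the "three lines in config.toml" claim suggests a model entry along these lines. This is a hypothetical sketch only: the section name, key names, and endpoint below are assumptions for illustration, not taken from Mistral Vibe's documentation.

```toml
# Hypothetical sketch -- all keys below are assumed, not from Vibe's docs.
[models.gpt120]
provider = "local"                      # assumed: a locally served model
base_url = "http://localhost:8080/v1"   # assumed OpenAI-compatible endpoint
```

The point the user is making is less about the exact keys and more that adding a second model is a small config edit rather than a reinstall or rewrite.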
Reference
“I assumed all these TUIs were much of a muchness so was in no great hurry to try this one. I dunno if it's the magic of being native but... it just works. Close to zero donkeying around. Can run full context (256k) on 3 cards @ Q4KL. It does around 2000t/s PP, 40t/s TG. Wanna run gpt120, too? Slap 3 lines into config.toml and job done. This is probably replacing roo for me.”