Self-Hosting LLMs on Multi-CPU Servers and System RAM
Published: Dec 28, 2025 22:34 • 1 min read • r/LocalLLaMA
Analysis
The Reddit post discusses the feasibility of self-hosting large language models (LLMs) on a server with multiple CPUs and a large pool of system RAM. The author is considering a dual-socket Supermicro board with Xeon 2690 v3 processors and 2133 MHz memory. The central question is whether 256GB of RAM would be enough to run large open-source models at a usable speed, and what performance to expect from specific models such as Qwen3:235b. The discussion reflects the growing interest in running LLMs locally and the hardware trade-offs involved.
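As a rough sanity check on the 256GB question, the sketch below estimates the memory footprint and CPU decode speed for a Qwen3-235B-class MoE model. All figures are assumptions for illustration (roughly 235B total / 22B active parameters, a ~4.5 bits-per-weight Q4-class quantization, DDR4-2133 across four channels per socket), not measured benchmarks from the post.

```python
# Back-of-the-envelope feasibility estimate for CPU inference on a
# dual-socket DDR4-2133 system. All constants below are assumptions.

def footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory footprint in GB for a parameter count and quantization level."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Assumed model shape: ~235B total parameters, ~22B active per token (MoE), Q4-class quant.
total_params_b = 235
active_params_b = 22
bits_per_weight = 4.5

weights_gb = footprint_gb(total_params_b, bits_per_weight)
active_gb_per_token = footprint_gb(active_params_b, bits_per_weight)

# DDR4-2133 is ~17 GB/s per channel; assume 2 sockets x 4 channels,
# and that effective NUMA-aware bandwidth is only ~60% of theoretical.
theoretical_bw_gbs = 2 * 4 * 17.0
effective_bw_gbs = 0.6 * theoretical_bw_gbs

# Token generation on CPU is roughly memory-bandwidth-bound:
# each decoded token streams the active weights through memory once.
est_tokens_per_s = effective_bw_gbs / active_gb_per_token

print(f"Estimated weight footprint: {weights_gb:.0f} GB (fits in 256 GB? {weights_gb < 256})")
print(f"Assumed effective bandwidth: {effective_bw_gbs:.0f} GB/s")
print(f"Rough decode speed: {est_tokens_per_s:.1f} tokens/s")
```

Under these assumptions the quantized weights (~130GB) fit comfortably in 256GB, and decode speed lands in the single-digit tokens-per-second range, which is consistent with the post's framing of "meaningful speed" as the open question rather than raw capacity.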
Key Takeaways
- The post explores the viability of running large LLMs on older server hardware with significant RAM.
- The author is specifically considering a dual-socket Xeon system with 256GB of RAM.
- The primary concern is whether the system will provide acceptable performance for running open-source LLMs.
Reference
“I was thinking about buying a bunch more sys ram to it and self host larger LLMs, maybe in the future I could run some good models on it.”