Exciting Progress: Potential Fix Underway for GLM-4.7-Flash in llama.cpp!
Analysis
Key Takeaways
- The current llama.cpp implementation of GLM-4.7-Flash was suspected to have correctness issues.
- Its logprobs showed significant differences from those produced by vLLM (a minimal comparison sketch follows this list).
- A potential fix is under active development and is available via a pull request.
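To make the logprob comparison concrete, below is a minimal sketch (not taken from the PR) of how one might compare per-token logprobs between two OpenAI-compatible endpoints, such as a local llama.cpp server and a vLLM server. The URLs, ports, model name, and the assumption that both servers honor the `logprobs` field on `/v1/completions` are illustrative assumptions, not details from the original post.

```python
# Hedged sketch: compare per-token logprobs from two OpenAI-compatible servers.
# Endpoints, ports, and model name below are assumptions for illustration only.
import requests

PROMPT = "The quick brown fox"

def token_logprobs(base_url: str, model: str) -> list[float]:
    """Request a short greedy completion and return its per-token logprobs."""
    resp = requests.post(
        f"{base_url}/v1/completions",
        json={
            "model": model,
            "prompt": PROMPT,
            "max_tokens": 16,
            "temperature": 0,  # greedy decoding so both backends should emit the same tokens
            "logprobs": 1,     # request per-token logprobs of the chosen tokens
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["logprobs"]["token_logprobs"]

if __name__ == "__main__":
    # Assumed ports: llama.cpp server on 8080, vLLM on 8000.
    a = token_logprobs("http://localhost:8080", "glm-4.7-flash")
    b = token_logprobs("http://localhost:8000", "glm-4.7-flash")
    for i, (la, lb) in enumerate(zip(a, b)):
        print(f"token {i:2d}  llama.cpp={la:+.4f}  vLLM={lb:+.4f}  diff={abs(la - lb):.4f}")
```

With greedy decoding, consistently large per-token differences like those described above would point to a numerical or implementation discrepancy rather than sampling noise.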
“There is a potential fix already in this PR thanks to Piotr...”