Research · #llm · Community · Analyzed: Jan 3, 2026 06:20

Phind Model beats GPT-4 at coding, with GPT-3.5 speed and 16k context

Published: Oct 31, 2023 17:40
1 min read
Hacker News

Analysis

The article announces a new Phind model that outperforms GPT-4 on coding tasks while running significantly faster. It highlights the model's HumanEval performance and emphasizes its real-world helpfulness based on user feedback. The speed advantage is attributed to serving with NVIDIA's TensorRT-LLM library on H100 GPUs. The article also notes that the model is built on Phind's open-source CodeLlama-34B fine-tunes.
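
As a rough illustration of the serving claim above, the sketch below shows how a CodeLlama-style checkpoint might be served through TensorRT-LLM's high-level Python LLM API. The checkpoint path, prompt, and sampling settings are placeholders, and the API usage follows the library's documented quickstart pattern rather than anything described in the article; Phind's actual serving code is not public.

# Minimal sketch: serving a CodeLlama-style fine-tune with TensorRT-LLM's
# high-level LLM API. Model path, prompt, and sampling settings are
# illustrative assumptions, not details from the Phind announcement.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="./codellama-34b-finetune")  # hypothetical local checkpoint
params = SamplingParams(temperature=0.1, top_p=0.95, max_tokens=256)

outputs = llm.generate(
    ["Write a Python function that reverses a singly linked list."],
    params,
)
print(outputs[0].outputs[0].text)  # generated completion for the first prompt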
Reference

The current 7th-generation Phind Model is built on top of our open-source CodeLlama-34B fine-tunes that were the first models to beat GPT-4’s score on HumanEval and are still the best open source coding models overall by a wide margin.

Fine-tuned CodeLlama-34B Beats GPT-4 on HumanEval

Published: Aug 25, 2023 22:08
1 min read
Hacker News

Analysis

The article reports on fine-tuning CodeLlama-34B and CodeLlama-34B-Python on a proprietary dataset, achieving higher pass@1 scores on HumanEval than GPT-4. The authors emphasize the use of instruction-answer pairs in their dataset, native fine-tuning, and the application of OpenAI's decontamination methodology to validate the results. Training used DeepSpeed ZeRO 3, Flash Attention 2, and 32 A100-80GB GPUs, completing in three hours. The article frames this as the first open models to surpass GPT-4's HumanEval score.
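
The training recipe summarized above (instruction-answer pairs, DeepSpeed ZeRO 3, Flash Attention 2) maps onto a fairly standard multi-GPU fine-tuning setup. The sketch below shows one common way to express such a setup with Hugging Face Transformers; the model name, placeholder data, and hyperparameters are assumptions for illustration, not Phind's actual recipe or dataset.

# Hedged sketch of a ZeRO 3 + Flash Attention 2 fine-tuning setup.
# All hyperparameters and the placeholder data below are illustrative.
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# DeepSpeed ZeRO stage 3 shards parameters, gradients, and optimizer state.
ds_config = {
    "zero_optimization": {"stage": 3},
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

model_name = "codellama/CodeLlama-34b-hf"  # public base model; the fine-tuning data is proprietary
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    torch_dtype=torch.bfloat16,
)

# A single instruction-answer pair standing in for the proprietary dataset.
raw = Dataset.from_dict({"text": [
    "### Instruction: add two numbers\n### Answer: def add(a, b):\n    return a + b"
]})
tokenized = raw.map(lambda ex: tokenizer(ex["text"], truncation=True),
                    remove_columns=["text"])

args = TrainingArguments(
    output_dir="codellama-34b-finetune",
    per_device_train_batch_size=1,
    num_train_epochs=2,
    bf16=True,
    deepspeed=ds_config,  # hands the ZeRO 3 config to DeepSpeed
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # typically launched with `deepspeed train.py` across the GPU fleet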
Reference

We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67%.
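
For context on the metric in this quote, pass@1 comes from the HumanEval evaluation protocol: sample completions for each problem, run the problem's unit tests, and estimate the probability that at least one of k sampled completions passes. The snippet below implements the standard unbiased pass@k estimator from the Codex/HumanEval paper; with k = 1 it reduces to the fraction of completions that pass. The example counts are illustrative, not Phind's raw numbers.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k).

    n: completions sampled per problem
    c: completions that pass the problem's unit tests
    k: number of attempts being scored
    """
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# With k = 1 this is simply c / n (illustrative counts, not Phind's):
print(pass_at_k(10, 2, 1))  # 0.2
print(pass_at_k(10, 2, 5))  # ~0.78: more attempts give more chances to pass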