Show HN: I made the slowest, most expensive GPT
Analysis
The article describes a project that uses multiple LLMs (ChatGPT, Perplexity, Gemini, Claude) to answer the same question, aiming for a more comprehensive and accurate response by cross-referencing. The author highlights the limitations of current LLMs in handling fluid information and complex queries, particularly in areas like online search where consensus is difficult to establish. The project focuses on the iterative process of querying different models and evaluating their outputs, rather than relying on a single model or a simple RAG approach. The author acknowledges the effectiveness of single-shot responses for tasks like math and coding, but emphasizes the challenges in areas requiring nuanced understanding and up-to-date information.
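The cross-referencing idea can be sketched in a few lines: send the same question to several models, then keep only the answers a majority of them agree on. This is a minimal illustration, not the project's actual implementation — the stubbed `responses` dict stands in for real API calls to ChatGPT, Perplexity, Gemini, and Claude, and the voting scheme is an assumed, simplified consensus rule.

```python
from collections import Counter

def cross_reference(responses):
    """Given {model_name: ranked list of answers}, return the items
    that a majority of models mention, ordered by vote count, with
    ties broken by the best rank any model gave the item."""
    votes = Counter()
    best_rank = {}
    for model, items in responses.items():
        for rank, item in enumerate(items):
            votes[item] += 1
            best_rank[item] = min(best_rank.get(item, rank), rank)
    majority = len(responses) // 2 + 1  # strict majority of models
    consensus = [i for i, v in votes.items() if v >= majority]
    return sorted(consensus, key=lambda i: (-votes[i], best_rank[i]))

# Stubbed model outputs (illustrative only, standing in for API calls):
responses = {
    "chatgpt":    ["Vail", "Aspen", "Park City"],
    "perplexity": ["Alta", "Vail", "Jackson Hole"],
    "gemini":     ["Vail", "Alta", "Aspen"],
    "claude":     ["Aspen", "Alta", "Vail"],
}
print(cross_reference(responses))  # → ['Vail', 'Aspen', 'Alta']
```

Answers that only one model produces (here "Park City" and "Jackson Hole") drop out, which is the point of paying for four queries instead of one: idiosyncratic rankings get filtered by the majority threshold.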
Key Takeaways
- The project explores the use of multiple LLMs to improve answer quality by cross-referencing.
- Highlights the limitations of current LLMs in handling fluid information and complex queries.
- Focuses on an iterative approach of querying and evaluating different models.
- Emphasizes the challenges in areas requiring nuanced understanding and up-to-date information, like online search.
“An example is something like "best ski resorts in the US", which will get a different response from every GPT, but most of their rankings won't reflect actual skiers' consensus.”