[P] The Story Of Topcat (So Far)
Analysis
This post from r/MachineLearning recounts a long-running personal research project exploring alternatives to the softmax output activation. The author describes experiments with LSTM modifications and with applying the Golden Ratio to tanh activations. The results are presented as unreliable and not consistently beneficial, and the author asks whether the project merits publishing or further work. The post illustrates a common reality of AI research: many ideas fail to deliver consistent performance improvements. It also touches on the field's shifting landscape, with transformers having superseded LSTMs.
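The post does not spell out Topcat's actual formula, so the following is only a minimal sketch of what a tanh-based, Golden-Ratio-scaled alternative to softmax might look like. The function name `phi_tanh_output`, the placement of the `PHI` scaling, and the shift-and-normalize step are all illustrative assumptions, not the author's method.

```python
import numpy as np

PHI = (1 + 5 ** 0.5) / 2  # Golden Ratio, approximately 1.618

def softmax(logits):
    """Standard softmax, shown for comparison."""
    z = logits - logits.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def phi_tanh_output(logits):
    """Hypothetical tanh-based output activation involving the Golden Ratio.

    Squashes logits with tanh(x / PHI), shifts the result into the positive
    range, then normalizes so the outputs sum to 1 like a probability
    distribution. This is an illustrative guess at the kind of construction
    the post describes, not the actual Topcat function.
    """
    squashed = np.tanh(logits / PHI)   # values in (-1, 1)
    shifted = squashed + 1.0           # values in (0, 2)
    return shifted / shifted.sum(axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
print("softmax: ", softmax(logits))
print("phi-tanh:", phi_tanh_output(logits))
```

One reason such a function can behave inconsistently, as the post reports, is that the bounded tanh squashing compresses large logit gaps, producing flatter output distributions than softmax on confident predictions.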
Key Takeaways
- Exploration of alternative activation functions in neural networks.
- Challenges in achieving consistent performance improvements in AI research.
- The rapid evolution of AI architectures (LSTMs vs. Transformers).
“A story about my long-running attempt to develop an output activation function better than softmax.”