Research #llm 👥 Community | Analyzed: Jan 3, 2026 08:53

Wordllama: Lightweight Utility for LLM Token Embeddings

Published: Sep 15, 2024 03:25
2 min read
Hacker News

Analysis

Wordllama is a library for semantic string manipulation built on token embeddings taken from LLMs. It prioritizes speed, a small footprint, and ease of use: it targets CPU platforms and avoids dependencies on deep learning runtimes such as PyTorch. The model itself is a table of token embeddings that are average-pooled into a single vector per input, trained with techniques such as multiple negatives ranking loss and matryoshka representation learning. It is not as powerful as a full transformer model, but on MTEB benchmarks it performs well against word embedding models, the class it most resembles, while being smaller and faster at inference. The aim is a practical tool for tasks such as input preparation, information retrieval, and evaluation, lowering the barrier to entry for working with LLM embeddings.
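To make the average-pooling idea concrete, here is a minimal sketch in plain Python. It is not Wordllama's API: the embedding table, the whitespace tokenizer, and the function names `tokenize`, `embed`, and `cosine` are all hypothetical stand-ins, and the vector values are made up. The optional `dim` argument illustrates how a matryoshka-style embedding can be truncated to its leading dimensions, assuming the model was trained so that those leading dimensions remain useful on their own.

```python
import math

# Hypothetical toy embedding table: token -> 4-dim vector (values are invented).
EMBEDDINGS = {
    "cat":    [0.9, 0.1, 0.0, 0.2],
    "dog":    [0.8, 0.2, 0.1, 0.1],
    "sits":   [0.1, 0.9, 0.0, 0.3],
    "car":    [0.0, 0.1, 0.9, 0.4],
    "drives": [0.1, 0.8, 0.3, 0.2],
}


def tokenize(text):
    """Whitespace tokenizer standing in for a real subword tokenizer."""
    return [t for t in text.lower().split() if t in EMBEDDINGS]


def embed(text, dim=None):
    """Average-pool the token vectors of `text`.

    If `dim` is given, only the first `dim` dimensions of each token
    vector are used (matryoshka-style truncation).
    """
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("no known tokens in input")
    vectors = [EMBEDDINGS[t][:dim] for t in tokens]  # [:None] keeps all dims
    # zip(*vectors) iterates column-wise, so each mean is per dimension.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


similar = cosine(embed("cat sits"), embed("dog sits"))
different = cosine(embed("cat sits"), embed("car drives"))
truncated = embed("cat sits", dim=2)
```

With this toy table, `similar` comes out higher than `different`, and `truncated` has two dimensions instead of four. Because pooling is just a lookup and a mean, no deep learning runtime is needed at inference time, which is consistent with the CPU-only focus described above.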
Reference

The model is simply token embeddings that are average pooled... While the results are not impressive compared to transformer models, they perform well on MTEB benchmarks compared to word embedding models (which they are most similar to), while being much smaller in size (smallest model, 32k vocab, 64-dim is only 4MB).
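The quoted 4MB figure can be sanity-checked with simple arithmetic. The sketch below assumes each embedding value is stored in 16 bits (2 bytes) and reads "32k" as 32,000 tokens; neither detail is stated in the source, so treat the result as a plausibility check rather than a description of the actual file format.

```python
# Assumptions (not stated in the source): "32k" means 32,000 tokens and
# each embedding value is stored as a 16-bit (2-byte) number.
VOCAB_SIZE = 32_000
EMBED_DIM = 64
BYTES_PER_VALUE = 2  # assumed storage precision


def table_size_mb(vocab, dim, bytes_per_value):
    """Size of a dense vocab x dim embedding table in megabytes (10^6 bytes)."""
    return vocab * dim * bytes_per_value / 1_000_000


size_mb = table_size_mb(VOCAB_SIZE, EMBED_DIM, BYTES_PER_VALUE)
print(f"{size_mb:.3f} MB")
```

Under these assumptions the table holds 2,048,000 values and comes to roughly 4.1 MB, in line with the figure in the quote. At 32-bit precision the same table would be about twice that size.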