Research · #NLP · Blog · Analyzed: Dec 29, 2025 08:27

Taming arXiv with Natural Language Processing w/ John Bohannon - TWiML Talk #136

Published: May 7, 2018 16:25
Practical AI

Analysis

This podcast episode from Practical AI features John Bohannon, Director of Science at the AI startup Primer. The discussion centers on Primer Science, a tool built to cope with the volume of machine learning papers uploaded to arXiv. The tool harvests new papers, sorts them into natural topics using unsupervised learning, and summarizes the activity in each innovation area. The conversation covers the technical side of Primer Science: its data pipeline, the tools it relies on, how 'ground truth' is established for model training, and how heuristics are used to improve the NLP processing. The episode frames the tool as one response to the difficulty of keeping up with a fast-growing body of AI research.
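To make the topic-sorting step described above more concrete, here is a minimal sketch of grouping paper titles without labels. It is not Primer's method, which the episode summary does not specify. It swaps in a deliberately simple technique: TF-IDF vectors compared by cosine similarity, with single-pass "leader" clustering. The function names, the stopword list, and the 0.2 similarity threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with", "we"}


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed inverse document frequency, so no weight is zero.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0) for t, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(docs, threshold=0.2):
    """Leader clustering: join the most similar existing cluster, or start a new one.

    Each cluster is represented by its first member (its "leader").
    Returns one integer cluster label per document.
    """
    vectors = tfidf_vectors(docs)
    leaders = []  # index of the first document in each cluster
    labels = []
    for i, vec in enumerate(vectors):
        best_label, best_sim = None, threshold
        for label, leader in enumerate(leaders):
            sim = cosine(vec, vectors[leader])
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label is None:
            leaders.append(i)
            best_label = len(leaders) - 1
        labels.append(best_label)
    return labels


if __name__ == "__main__":
    titles = [
        "Neural machine translation with attention",
        "Attention models for neural machine translation",
        "Image segmentation with convolutional networks",
        "Convolutional networks for medical image segmentation",
    ]
    for title, label in zip(titles, cluster(titles)):
        print(label, title)
```

A production system would more likely use learned embeddings and a clustering algorithm that does not depend on input order; the sketch only shows that topics can emerge from the text itself, with no predefined category list.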

Reference

John and I discuss his work on Primer Science, a tool that harvests content uploaded to arXiv, sorts it into natural topics using unsupervised learning, then gives relevant summaries of the activity happening in different innovation areas.