Is 399 rows × 24 features too small for a medical classification model?
Analysis
Key Takeaways
- The dataset (399 samples, 24 features) is small, which can limit model performance.
- Classical ML techniques are likely the most appropriate approach at this dataset size.
- Data augmentation for tabular data at this scale is of questionable value and may not yield meaningful improvements.
- Robust validation and leakage prevention are crucial because of the risk of overfitting.
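
As a rough illustration of the validation and leakage points above, here is a minimal sketch using scikit-learn. The data is a synthetic stand-in generated with `make_classification` to match the 399 × 24 shape; the logistic regression model, the 5-fold × 10-repeat scheme, and the ROC AUC metric are assumptions chosen for illustration, not the author's setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data with the same shape as the dataset discussed
# (399 rows, 24 features). Replace with the real feature matrix and labels.
X, y = make_classification(
    n_samples=399, n_features=24, n_informative=8, random_state=0
)

# Putting the scaler inside the pipeline means it is re-fit on the training
# folds only, so no statistics from the held-out fold leak into preprocessing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Repeating stratified k-fold gives many train/test splits, which helps show
# how much the score varies on a dataset this small.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")

print(f"ROC AUC: {scores.mean():.3f} +/- {scores.std():.3f} "
      f"over {len(scores)} folds")
```

A wide spread across folds would suggest that a single train/test split is not a reliable estimate at this sample size. Any feature selection or imputation step would belong inside the same pipeline for the same reason as the scaler.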
“The author is working on a disease prediction model with a small tabular dataset and is questioning the feasibility of using classical ML techniques.”