
Correctly splitting data into train, validation, and test sets is fundamental to building models that actually work in production.
Never touch the test set until you're done building your model. Using test data to guide decisions such as feature selection, hyperparameter tuning, or model choice is data leakage: the reported score becomes optimistically biased and won't hold in production.
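As a rough illustration of keeping the three sets disjoint, here is a minimal sketch in plain Python. The function name `three_way_split`, the 70/15/15 proportions, and the fixed seed are assumptions made for this example, not values the note prescribes.

```python
import random


def three_way_split(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then carve out disjoint train/validation/test subsets.

    The default fractions are illustrative assumptions, not a recommendation.
    """
    if not 0 < val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be between 0 and 1")

    shuffled = list(items)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)

    test = shuffled[:n_test]                # set aside; evaluate on it once, at the end
    val = shuffled[n_test:n_test + n_val]   # used for tuning and model selection
    train = shuffled[n_test + n_val:]       # used for fitting
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

Splitting once, up front, and saving the test indices makes it harder to peek at the test set by accident during later experiments.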
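When data is limited, use k-fold cross-validation: divide the non-test data into k folds, train on k-1 of them, validate on the remaining one, and rotate until every fold has served as the validation set once. Averaging the k scores gives a more reliable estimate of true performance than a single train/validation split.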
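The rotation described above is commonly called k-fold cross-validation. The sketch below shows one way the index bookkeeping could look in plain Python; `k_fold_indices`, `cross_validate`, and the caller-supplied `score_fn` are hypothetical names introduced for this example, and it assumes the test set has already been set aside.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Striding by k gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def cross_validate(n_samples, score_fn, k=5, seed=0):
    """Average a score over k rotations.

    score_fn(train_idx, val_idx) is a hypothetical callback that fits a
    model on the training indices and returns its validation score.
    """
    scores = [score_fn(tr, va) for tr, va in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


splits = list(k_fold_indices(10, k=5))
for train_idx, val_idx in splits:
    print(sorted(val_idx), "held out;", len(train_idx), "used for training")
```

Each sample is used for training k-1 times and for validation once, which is why the averaged score is less sensitive to one lucky or unlucky split.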