There is no requirement for load development to be done in a statistically significant way. Indeed it is a waste of time for most of us.
There are common practices espoused by top-tier competitive shooters that are probably not statistically valid, but their common usage by those seeking the best precision points the way to processes that will be very successful for the normal shooter.
By adopting these best practices and some very limited sample testing we can be confident that we have a load that will generally perform as we require provided we maintain the quality of the reloading process and shoot straight.
By controlling the commonly accepted variables and with limited testing the shooter can have some confidence that the load will perform as expected. No need to shoot heaps of rounds wasting time, components and barrel life.
Having said all that, these are not the practices I see discussed as routine on this forum. Nor do the variables most often discussed reflect the (currently accepted) best understanding of the key variables in reloading. The comment above about the relationship between powder weight and group size is a case in point.
I only ever shoot 2-shot groups, and not many of them either. I cannot shoot a group tighter than the first 2 shots, and it will either shoot to my requirements or not. The statistically valid grouping will be a bit different, but so what? With control of my shooting and loading, it's not going to be hugely different. My load development runs to a total of 10 rounds, unless I have a snafu. Is this statistically valid? Nope, but with one graph and a 2-shot group I can tell if I am using the correct powder and if the rifle will "shoot". If it doesn't, I switch powder, looking for something more reliable. (Generally Viht!)
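For anyone curious how far a 2-shot group sits from a bigger one, here is a rough Python sketch. It is only an illustration, not anything I use in load development. It assumes every shot lands with independent, equal normal scatter (standard deviation sigma) in both the horizontal and vertical directions, which no real rifle and shooter will match exactly, and the function names are made up for the example.

```python
import math
import random
from itertools import combinations


def extreme_spread(shots):
    """Largest centre-to-centre distance between any two shots in a group."""
    return max(math.dist(a, b) for a, b in combinations(shots, 2))


def mean_group_size(shots_per_group, sigma=1.0, trials=20000, seed=1):
    """Average extreme spread over many simulated groups.

    Assumption: each shot has independent normal error with standard
    deviation `sigma` on both the horizontal and the vertical axis.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        shots = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                 for _ in range(shots_per_group)]
        total += extreme_spread(shots)
    return total / trials


if __name__ == "__main__":
    # Compare the average group size for a few shot counts, in units of sigma.
    for n in (2, 3, 5, 10):
        print(f"{n:2d} shots: mean group size = {mean_group_size(n):.2f} sigma")
```

Under that assumption the average 2-shot group works out to sqrt(pi) times sigma, about 1.77 sigma, and the average grows as shots are added, so the sketch gives a feel for how much bigger a longer string is likely to look than the first pair.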
To do this, though, you need to invest in some gear and minimise the shooter's influence on the testing.
Yer best resource on here is Laurie. I would hesitate to call him statistically significant, but if you follow his advice then results will follow.