"The Gold Standard of Data Science Project Management" was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
Gold standard data science.
However, getting a good data set is surprisingly tricky and takes longer than one expects.
Is really the appropriate form for reproducible data science.
The first stage in a data science project is often to collect training data.
The same software, the same software environment, the same hardware, the same lighting in the room, let's say.
Other times, a gold standard is the most accurate test possible without restrictions.
A statistical or machine learning algorithm aims to predict a criterion whose state does not depend on the algorithm; otherwise, the criterion is contaminated.
If good-quality single crystals of approximately 200 micrometers in size are grown, then an atomic map of the crystal lattice can be constructed.
In medicine and statistics, a gold standard test is usually the diagnostic test or benchmark that is the best available under reasonable conditions.
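One way to make the idea of a benchmark concrete is to score a cheaper candidate test against gold standard labels. The sketch below is only an illustration: the function name `sensitivity_specificity` and the sample labels are made up for this example and do not come from any of the sources quoted here.

```python
def sensitivity_specificity(gold, test):
    """Score a candidate test against gold standard labels.

    Both arguments are sequences of booleans (True = condition present).
    The gold labels are treated as the truth; the test labels are judged
    against them.
    """
    if len(gold) != len(test):
        raise ValueError("gold and test must have the same length")

    pairs = list(zip(gold, test))
    tp = sum(1 for g, t in pairs if g and t)          # true positives
    fn = sum(1 for g, t in pairs if g and not t)      # false negatives
    tn = sum(1 for g, t in pairs if not g and not t)  # true negatives
    fp = sum(1 for g, t in pairs if not g and t)      # false positives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative gold label")

    sensitivity = tp / (tp + fn)  # share of real positives the test catches
    specificity = tn / (tn + fp)  # share of real negatives the test clears
    return sensitivity, specificity


# Hypothetical labels, invented purely for illustration.
gold = [True, True, True, False, False, False, False, True]
test = [True, False, True, False, True, False, False, True]
sens, spec = sensitivity_specificity(gold, test)
print(sens, spec)
```

Note that the scores are only as trustworthy as the gold labels themselves, which is why the choice of gold standard matters.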
The diffraction pattern gives a unique fingerprint of the crystal structure.
Jenny Bryan, a statistician and data scientist from UBC who has strong views on the proper layout of R scripts, workflows, and file organization and naming, said it.
This talk describes our.
The gold standard was widely used in the 19th and early part of the 20th century.
A gold standard is a particular case of an external criterion.
Okay, so I would claim that the gold standard for reproducibility is that others can use your exact environment as it was at the time of the experiment.
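A small first step toward that goal is to record what the environment looked like when the experiment ran. The following is a minimal sketch using only the Python standard library; the function name `snapshot_environment` is hypothetical, and a snapshot like this only documents the environment rather than letting others re-create it exactly.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Collect a description of the current Python environment."""
    packages = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            packages.append(f"{name}=={dist.version}")

    return {
        "python": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": sorted(packages),
    }


snapshot = snapshot_environment()
text = json.dumps(snapshot, indent=2)  # store this next to the experiment results
print(text[:200])
```

Saving the JSON alongside each result at least tells a later reader which versions produced it; sharing the environment itself takes stronger tools than this.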
The two meanings differ because, for example, in medicine, for conditions that would require an autopsy for a perfect diagnosis, the gold standard test is the best one that keeps the patient alive, rather than the autopsy.
This is the gold standard for crystal structure analysis.
A number of other great resources I'd read in the past [1, 2, 3, 4] inspired me to create a GitHub repo for a gold standard workflow for setting up a new data science project directory.
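To give a flavor of what setting up a project directory can look like in code, here is a rough sketch. The folder names in `LAYOUT` and the function `scaffold_project` are assumptions chosen for illustration; the actual layout used in the repo is not described in this text.

```python
import tempfile
from pathlib import Path

# Hypothetical layout, assumed for this example only.
LAYOUT = ["data/raw", "data/processed", "notebooks", "src", "reports"]


def scaffold_project(root):
    """Create the folders in LAYOUT under root, plus a stub README."""
    root = Path(root)
    created = []
    for rel in LAYOUT:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(path)

    readme = root / "README.md"
    if not readme.exists():  # never overwrite an existing README
        readme.write_text(f"# {root.name}\n")
    return created


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp) / "my_project"
    created = scaffold_project(project_root)
    all_exist = all(p.is_dir() for p in created)
    has_readme = (project_root / "README.md").is_file()
    print(all_exist, has_readme)
```

Keeping raw data in its own folder, separate from processed data, is one common convention; it makes it harder to overwrite the original inputs by accident.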