Protein Drug Stability Prediction
Elements: Machine learning, prediction, extract/transform/load, science writing, Mathematica, small dataset, hypothesis testing, permutation tests, interdisciplinary collaboration, ultraviolet spectroscopy, basic wet-lab methods, experimental method development
Links: Poster (PDF)
Pre-peer-review manuscript (PDF)
Published journal article (behind paywall)
Published article also available upon request.
Summary

Toward the end of my PhD, I was looking for a data set on the storage stability of protein formulations, one that would let me model long-term protein drug stability in terms of inexpensive, quick-to-obtain measurements. Around the same time, our lab was finishing a collaboration with a lab at the University of Munich: we collected spectroscopic measurements of various formulations of a protein drug, while the German lab collected a long-term (two-year) stability data set for the same formulations. I heard about the data set from a colleague, requested permission to use it, and set out to see what I could do with it. Within a few weeks I had demonstrated a few prediction techniques and added another chapter to my thesis.