Margaret Oakley Dayhoff
Margaret Oakley Dayhoff | |
---|---|
Born | Margaret Belle Oakley, March 11, 1925, Philadelphia, Pennsylvania |
Died | February 5, 1983 (aged 57), Silver Spring, Maryland |
Nationality | United States |
Fields | Bioinformatics |
Institutions | Columbia University |
Known for | Substitution matrices; one-letter code |
Influences | George Kimball |
Dr. Margaret Belle (Oakley) Dayhoff (March 11, 1925 – February 5, 1983) was an American physical chemist and a pioneer in the field of bioinformatics.[1] Dayhoff was a professor at Georgetown University Medical Center and a noted research biochemist at the National Biomedical Research Foundation (NBRF), where she pioneered the application of mathematics and computational methods to the field of biochemistry. She dedicated her career to applying the evolving computational technologies to support advances in biology and medicine, most notably the creation of protein and nucleic acid databases and tools to interrogate the databases. Her PhD degree was from Columbia University in the Department of Chemistry, where she devised computational methods to calculate molecular resonance energies of several organic compounds. She did postdoctoral studies at the Rockefeller Institute (now Rockefeller University) and the University of Maryland, and joined the newly established National Biomedical Research Foundation in 1960. She was the first woman to hold office in the Biophysical Society and the first person to serve as both Secretary and eventually President.[2] She originated one of the first families of substitution matrices, the point accepted mutation (PAM) matrices. She also developed the one-letter code used for amino acids, an attempt to reduce the size of the data files used to describe amino acid sequences in an era of punch-card computing.
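The space saving behind the one-letter code can be illustrated with a minimal Python sketch. The mapping below is the standard one-letter code for the 20 common amino acids; the function name `to_one_letter`, the hyphen-separated input format, and the example peptide are assumptions made for illustration and are not taken from Dayhoff's own software.

```python
# Standard one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq, sep="-"):
    """Convert a separator-delimited three-letter sequence to one-letter form."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_seq.split(sep))


# Hypothetical five-residue peptide, written in three-letter notation.
peptide = "Gly-Ile-Val-Glu-Gln"
compact = to_one_letter(peptide)

# Characters needed before and after conversion.
print(len(peptide), "->", len(compact), compact)
```

Each residue occupies one character instead of three (plus a separator), which mattered when a punched card held only a limited number of columns.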
Early life
Dayhoff was born an only child in Philadelphia, but moved to New York City when she was ten.[3] Her academic promise was evident from the outset; she was valedictorian (class of 1942) at Bayside High School, Bayside, New York and from there received a scholarship to Washington Square College of New York University, graduating magna cum laude in mathematics in 1945 and being elected to Phi Beta Kappa.[4][5]
Research
Dayhoff began a PhD in quantum chemistry under George Kimball in the Columbia University Department of Chemistry. In her graduate thesis, Dayhoff pioneered the application of computer capabilities (i.e., mass-data processing) to theoretical chemistry; specifically, she devised a method of applying punched-card business machines to calculate the resonance energies of several polycyclic organic molecules. Her management of her research data was so impressive that she was awarded a Watson Computing Laboratory Fellowship. As part of this award, she received access to cutting-edge IBM electronic data processing equipment.[6] An article by Eleanor Krawitz, published in the Columbia Engineering Quarterly in 1949, details the punch cards and processes and provides a look into how Dayhoff might have used the technology at this time.
After completing her PhD, Dayhoff studied electrochemistry under Dr. Duncan A. MacInnes at the Rockefeller Institute from 1948 to 1951. In 1952, she moved to Maryland with her family and later received research fellowships from the University of Maryland (1957–1959), working on a model of chemical bonding with Ellis Lippincott. At Maryland, she gained her first exposure to a new high-speed computer: the IBM model 7094. After the fellowships ended, she joined the National Biomedical Research Foundation in 1960 as Associate Director (a position she held for 21 years).[5] At the NBRF, she began to work with Robert Ledley, a dentist who had obtained a degree in physics and become interested in the possibilities of applying computational resources to biomedical problems. He had authored one of the earliest studies of biomedical computation, "Report on the Use of Computer in Biology and Medicine."[7] With their combined expertise, they published a paper in 1962 entitled "COMPROTEIN: A computer program to aid primary protein structure determination" that described a "completed computer program for the IBM 7090" that aimed to convert peptide digests to protein chain data. They had begun this work in 1958, but were not able to start programming until late 1960.[7]
In the early 1960s, Dayhoff also collaborated with Ellis Lippincott and Carl Sagan to develop thermodynamic models of cosmo-chemical systems, including prebiological planetary atmospheres. She developed a computer program that could calculate equilibrium concentrations of the gases in a planetary atmosphere, enabling the study of the atmospheres of Venus, Jupiter, and Mars, in addition to the present day atmosphere and the primordial terrestrial atmosphere. Using this program, she considered whether the primordial atmosphere had the conditions necessary to generate life. Although she found that numerous small biologically important compounds can appear with no special nonequilibrium mechanism to explain their presence, there were compounds necessary to life that were scarce in the equilibrium model (such as ribose, adenine, and cytosine).[2]
Dayhoff also taught physiology and biophysics at Georgetown University Medical Center for 13 years, served as a Fellow of the American Association for the Advancement of Science and was elected councillor of the International Society for the Study of the Origins of Life in 1980 after 8 years of membership. Dayhoff also served on the editorial boards of three journals: DNA, Journal of Molecular Evolution and Computers in Biology and Medicine.[2]
In 1966, Dayhoff pioneered the use of computers in comparing protein sequences and reconstructing their evolutionary histories from sequence alignments. This work, co-authored with Richard Eck, was the first application of computers to infer phylogenies from molecular sequences: it reconstructed a phylogeny (evolutionary tree) by computer using a maximum parsimony method. In later years, she applied these methods to study a number of molecular relationships, such as the catalytic chain of bovine cyclic AMP-dependent protein kinase and the src gene product of Rous avian and Moloney murine sarcoma viruses; antithrombin-III, alpha-antitrypsin, and ovalbumin; epidermal growth factor and the light chain of coagulation factor X; and apolipoproteins A-I, A-II, C-I and C-III.[2]
Based on this work, Dayhoff and her coworkers developed a set of substitution matrices called the PAM (point accepted mutation), MDM (Mutation Data Matrix), or Dayhoff matrices. They are derived from global alignments of closely related protein sequences. The identification number included with the matrix (e.g., PAM40, PAM100) refers to the evolutionary distance; greater numbers correspond to greater distances. Matrices for greater evolutionary distances are extrapolated from those for lesser ones.[8] To produce a Dayhoff matrix, pairs of aligned amino acids in verified alignments are used to build a count matrix, which is then used to estimate a mutation matrix at 1 PAM (considered an evolutionary unit). From this mutation matrix, a Dayhoff scoring matrix may be constructed. Along with a model of indel events, alignments generated by these methods can be used in an iterative process to construct new count matrices until convergence.[9]
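The pipeline described above (count matrix, mutation matrix at 1 PAM, extrapolation to greater distances, scoring matrix) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reproduction of Dayhoff's published procedure: it uses a hypothetical three-residue alphabet and a single toy alignment, rescales the observed substitution frequencies linearly so that 1% of positions are expected to change, and omits the indel model and the iterative realignment step. All function and variable names are illustrative.

```python
import math

ALPHABET = "ADE"  # toy three-residue alphabet (hypothetical, for brevity)


def count_matrix(aligned_pairs, alphabet=ALPHABET):
    """Symmetric counts of residue pairs seen in aligned sequence pairs."""
    index = {aa: k for k, aa in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0.0] * n for _ in range(n)]
    for seq_a, seq_b in aligned_pairs:
        for a, b in zip(seq_a, seq_b):
            if a in index and b in index:  # skip gaps and unknown symbols
                counts[index[a]][index[b]] += 1
                counts[index[b]][index[a]] += 1
    return counts


def one_pam(counts):
    """Column-stochastic mutation matrix scaled to 1% expected change."""
    n = len(counts)
    col_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_totals)
    freqs = [c / total for c in col_totals]
    # p[i][j]: observed probability that residue j is aligned with residue i
    p = [[counts[i][j] / col_totals[j] for j in range(n)] for i in range(n)]
    changed = sum(freqs[j] * (1.0 - p[j][j]) for j in range(n))
    scale = 0.01 / changed  # linear rescaling: a simplifying assumption
    m = [[scale * p[i][j] for j in range(n)] for i in range(n)]
    for j in range(n):
        m[j][j] = 1.0 - scale * (1.0 - p[j][j])
    return m, freqs


def mat_mul(a, b):
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def extrapolate(m1, distance):
    """PAM-n mutation matrix, obtained by multiplying the 1 PAM matrix n times."""
    result = m1
    for _ in range(distance - 1):
        result = mat_mul(result, m1)
    return result


def log_odds(m, freqs):
    """Scoring matrix: 10 * log10 of mutation probability over background frequency."""
    n = len(m)
    return [[10.0 * math.log10(m[i][j] / freqs[i]) for j in range(n)]
            for i in range(n)]


# Toy alignment in which every residue and every substitution occurs at least once.
pairs = [("AAAADDDDEEEE", "AAADDDDEEEEA")]
counts = count_matrix(pairs)
pam1, freqs = one_pam(counts)
pam250 = extrapolate(pam1, 250)
scores250 = log_odds(pam250, freqs)
```

The extrapolation step is what the identification number refers to: a PAM250 matrix in this sketch is simply the 1 PAM mutation matrix applied 250 times, so diagonal (conservation) probabilities shrink as the distance grows.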
One of Dayhoff's most important contributions to bioinformatics was her Atlas of Protein Sequence and Structure, a book reporting all known protein sequences (totaling 65) that she published in 1965.[10] It was subsequently republished in several editions. This led to the Protein Information Resource database of protein sequences, the first online database system that could be accessed by telephone line and was available for interrogation by remote computers.[11] The book has since been cited nearly 4,500 times.[2] It and the parallel effort by Walter Goad, which led to the GenBank database of nucleic acid sequences, are the twin origins of the modern databases of molecular sequences. The Atlas was organized by gene families, and she is regarded as a pioneer in their recognition. Frederick Sanger's determination of the first complete amino acid sequence of a protein (insulin) in 1955 led a number of researchers to sequence various proteins from different species. In the early 1960s, a theory was developed that small differences between homologous protein sequences (sequences with a high likelihood of common ancestry) could indicate the process and rate of evolutionary change on the molecular level. The notion that such molecular analysis could help scientists decode evolutionary patterns in organisms was formalized in the published papers of Emile Zuckerkandl and Linus Pauling in 1962 and 1965.
David Lipman, director of the National Center for Biotechnology Information, has called Dayhoff the "mother and father of bioinformatics".[12]
Marriage and family
Dr. Margaret Dayhoff's husband was Dr. Edward S. Dayhoff, an experimental physicist who worked with magnetic resonance and with lasers.[13] They had two daughters who became physicians, Ruth and Judith.[14]
Judith Dayhoff, PhD, holds a PhD in mathematical biophysics from the University of Pennsylvania and is the author of Neural network architectures: An introduction and coauthor of Neural Networks and Pattern Recognition. She is also the author of many journal articles. She has three children.
Ruth Dayhoff, M.D., graduated summa cum laude in mathematics from the University of Maryland and focused on medical informatics while earning her MD at Georgetown University School of Medicine. During medical school, she co-authored a paper and a chapter in The Atlas of Protein Sequence and Structure with her mother, describing a new way to measure how closely proteins are related.[13] Her husband, Vincent Brannigan, is Professor Emeritus of Law and Technology at the University of Maryland School of Engineering. Ruth was a founding Fellow of the American College of Medical Informatics. She is recognized throughout the world as a pioneer in the integration of medical imaging and invented the Vista Imaging System. She was chosen for the National Library of Medicine's project on the 200 women physicians who "changed the face of medicine."[13] She serves as director of Digital Imaging in Medicine for the United States Department of Veterans Affairs.[5]
Later life
Dayhoff's Atlas became a template for many indispensable tools in large portions of DNA or protein-related biomedical research. In spite of this significant contribution, Dayhoff was marginalized by the community of sequencers. The contract to manage GenBank (a technology directly related to her research), awarded in the early 1980s by the NIH, went to Walter Goad at the Los Alamos National Laboratory. The reason for this attitude is unknown, with theories ranging from sexism to a clash of values with the experimental science community. The majority of experimental scientists considered sequence information very valuable, as publishing it could bring considerable recognition, and they did not want to submit it to a commonly accessed database such as the Atlas.[6]
During the last few years of her life, she focused on obtaining stable, adequate, long-term funding to support the maintenance and further development of her Protein Information Resource. She envisioned an online system of computer programs and databases, accessible by scientists all over the world, for identifying proteins from sequence or amino acid composition data, for making predictions based on sequences, and for browsing the known information. Less than a week before she died, she submitted a proposal to the Division of Research Resources at NIH for a Protein Identification Resource. After her death, her colleagues continued working to make her vision a reality, and the protein database was fully operational by the middle of 1984.[2]
Margaret Oakley Dayhoff died of a heart attack at the age of 57 on February 5, 1983. In 1984, after her death, a fund was established to endow the Margaret O. Dayhoff Award, one of the top national honors that one can receive in biophysics. The award is presented to a woman who "holds very high promise or has achieved prominence while developing the early stages of a career in biophysical research within the purview and interest of the Biophysical Society."[15] It is presented at the Annual Meeting of the Biophysical Society and includes an honorarium of $2,000.
She was survived by her husband, Dr. Edward S. Dayhoff of Silver Spring; two daughters, Dr. Ruth E. Dayhoff Brannigan of College Park, and Dr. Judith E. Dayhoff Goldberg of San Francisco, and her father, Kenneth W. Oakley of Silver Spring.[5]
Selected Publications
- Barker, W.C. and Dayhoff, M.O., 1980. Evolutionary and functional relationships of homologous physiological mechanisms. BioScience, 30(9), pp. 593–600.
- Barker, W.C. and Dayhoff, M.O., 1982. Viral src gene products are related to the catalytic chain of mammalian cAMP-dependent protein kinase. Proceedings of the National Academy of Sciences, 79(9), pp. 2836–2839.
- Barker, W.C. and Dayhoff, M.O., 1983, October. Identifying unknown proteins. In Proceedings of the Annual Symposium on Computer Application in Medical Care (p. 584). American Medical Informatics Association.
- Barker, W.C., Ketcham, L.K. and Dayhoff, M.O., 1978. A comprehensive examination of protein sequences for evidence of internal gene duplication. Journal of molecular evolution, 10(4), pp. 265–281.
- Barker, W.C., Ketcham, L.K. and Dayhoff, M.O., 1980. Origins of immunoglobulin heavy chain domains. Journal of molecular evolution, 15(2), pp. 113–127.
- Barnabas, J., Schwartz, R.M. and Dayhoff, M.O., 1982. Evolution of major metabolic innovations in the Precambrian. Origins of life, 12(1), pp. 81–91.
- Dayhoff, M.O., 1964. Computer Search for Active Site Configurations. Journal of the American Chemical Society, 86(11), pp. 2295–2297.
- Dayhoff, M.O., 1965. Computer aids to protein sequence determination. J. Theor. Biol. 8: 97–112. doi:10.1016/0022-5193(65)90096-2
- Dayhoff, M.O., 1969. Computer analysis of protein evolution. Scientific American, 221, pp. 86–95.
- Dayhoff, M.O., 1973. Atlas of Protein Sequence and Structure: Supplement No. 1; Edited [by] MO Dayhoff. National Biomedical Research Foundation.
- Dayhoff, M.O., 1983. Evolutionary connections of biological kingdoms based on protein and nucleic acid sequence evidence. Developments in Precambrian Geology, 7, pp. 191–210.
- Dayhoff, M.O. and Eck, R.V., 1970. MASSPEC: a computer program for complete sequence analysis of large proteins from mass spectrometry data of a single sample. Computers in biology and medicine, 1(1), pp. 5–28.
- Dayhoff, M.O., Eck, R.V., Lippincott, E.R. and Sagan, C., 1967. Venus: atmospheric evolution. Science, 155(3762), pp. 556–558.
- Dayhoff, M.O. and Kimball, G.E. 1949. Punched Card Calculation of Resonance Energies J. Chem. Phys. 17, 706–717, Ph.D. Thesis, Columbia University, Graduate School of Chemistry. DOI:10.1063/1.1747374
- Dayhoff, M.O., Perlmann, G.E. and MacInnes, D.A., 1952. The partial specific volumes, in aqueous solution, of three proteins. Journal of the American Chemical Society, 74(10), pp. 2515–2517.
- Dayhoff, M.O. and Ledley, R.S., 1962. Comprotein: A Computer Program to Aid Primary Protein Structure Determination. In Proceedings of the Fall Joint Computer Conference, 1962, 262–274. Santa Monica, CA: American Federation of Information Processing Societies. http://portal.acm.org/citation.cfm?id=1461546
- Dayhoff, M.O., Lippincott, E.R. and Eck, R.V., 1964. Thermodynamic equilibria in prebiological atmospheres. Science, 146(3650), pp. 1461–1464.
- Dayhoff, M.O. et al., 1967. Thermodynamic equilibrium in prebiological atmospheres of C, H, O, N, P, S, and Cl. Washington, D.C.: National Aeronautics and Space Administration, Office of Technology Utilization, Scientific and Technical Information Division. Foreword by Robert S. Ledley.
- Eck, R.V. and Dayhoff, M.O., 1966. Evolution of the structure of ferredoxin based on living relics of primitive amino acid sequences. Science, 152(3720), pp. 363–366.
- George, D.G. and Dayhoff, M.O., 1982. The chemical structure of DNA sequence signals for RNA transcription. Origins of Life and Evolution of Biospheres, 12(3), pp. 311–319.
- Hunt, L.T. and Dayhoff, M.O., 1972. The origin of the genetic material in the abnormally long human hemoglobin α and β chains. Biochemical and biophysical research communications, 47(4), pp. 699–704.
- Hunt, L.T., Barker, W.C. and Dayhoff, M.O., 1974. Epidermal growth factor: internal duplication and probable relationship to pancreatic secretory trypsin inhibitor. Biochemical and biophysical research communications, 60(3), pp. 1020–1028.
- Hunt, L.T. and Dayhoff, M.O., 1970. The occurrence in proteins of the tripeptides Asn-X-Ser and Asn-X-Thr and of bound carbohydrate. Biochemical and biophysical research communications, 39(4), pp. 757–765.
- Hunt, L.T. and Dayhoff, M.O., 1977. Amino-terminal sequence identity of ubiquitin and the nonhistone component of nuclear protein A24. Biochemical and biophysical research communications, 74(2), pp. 650–655.
- Hunt, L.T. and Dayhoff, M.O., 1980. A surprising new protein superfamily containing ovalbumin, antithrombin-III, and alpha1-proteinase inhibitor. Biochemical and biophysical research communications, 95(2), pp. 864–871.
- Hunt, L.T. and Dayhoff, M.O., 1982. Evolution of chromosomal proteins. In Macromolecular Sequences in Systematic and Evolutionary Biology (pp. 193–239). Springer US.
- Ledley, F.D., Mullin, B.R., Lee, G., Aloj, S.M., Fishman, P.H., Hunt, L.T., Dayhoff, M.O. and Kohn, L.D., 1976. Sequence similarity between cholera toxin and glycoprotein hormones: implications for structure activity relationship and mechanism of action. Biochemical and biophysical research communications, 69(4), pp. 852–859.
- Lippincott, E.R. and Dayhoff, M.O., 1960. Delta-function model of chemical binding. Spectrochimica Acta, 16(7), pp. 807–834.
- Lippincott, E.R., Eck, R.V., Dayhoff, M.O. and Sagan, C., 1967. Thermodynamic equilibria in planetary atmospheres. The Astrophysical Journal, 147, p. 753.
- MacInnes, D.A. and Dayhoff, M.O., 1952. The Partial Molal Volumes of Potassium Chloride, Potassium and Sodium Iodides and of Iodine in Aqueous Solution at 25°. Journal of the American Chemical Society, 74(4), pp. 1017–1020.
- MacInnes, D.A. and Dayhoff, M.O., 1953. The Apparent and Partial Molal Volumes of Potassium Iodide and of Iodine in Methanol at 25° from Density Measurements. Journal of the American Chemical Society, 75(21), pp. 5219–5220.
- MacInnes, D.A., Dayhoff, M.O. and Ray, B.R., 1951. A magnetic float method for determining the densities of solutions. Review of Scientific Instruments, 22(8), pp. 642–646.
- Orcutt, B.C. and Dayhoff, M.O., 1983, October. Protein identification system: methods of searching for similar sequences. In Proceedings of the Annual Symposium on Computer Application in Medical Care (p. 579). American Medical Informatics Association.
- Sagan, C.E., Lippincott, E.R., Dayhoff, M.O. and Eck, R.V., 1967. Organic molecules and the coloration of Jupiter. Nature 213, 273–274. doi:10.1038/213273a0
- Schwartz, R.M. and Dayhoff, M.O., 1978. An Outline of Biological Evolution Based on Macromolecular Sequences. In Comparative Planetology (Vol. 1, p. 225).
- Schwartz, R.M. and Dayhoff, M.O., 1979. Protein and nucleic Acid sequence data and phylogeny. Science (New York, NY), 205(4410), pp. 1038–1039.
References
- [1] Hunt, Lois T. (1983). "Margaret O. Dayhoff 1925–1983". DNA and Cell Biology. Mary Ann Liebert, Inc. 2 (2): 97–8. doi:10.1089/dna.1983.2.97. ISSN 0198-0238. PMID 6347589.
- [2] "Margaret Oakley Dayhoff 1925–1983". Bulletin of Mathematical Biology. 46 (4): 467–472. 1984-07-01. doi:10.1007/BF02459497. ISSN 0092-8240.
- [3] Windsor, Laura Lynn (2002-01-01). Women in Medicine: An Encyclopedia. ABC-CLIO. ISBN 978-1-57607-392-6.
- [4] "American National Biography Online". www.anb.org. Retrieved 2016-03-16.
- [5] "Biomedical Researcher Margaret Dayhoff Dies". The Washington Post. 1983-02-08. ISSN 0190-8286. Retrieved 2016-10-20.
- [6] November, Joseph A. (2012-05-22). Biomedical Computing: Digitizing Life in the United States. JHU Press. ISBN 978-1-4214-0665-7.
- [7] "Margaret Dayhoff, a founder of the field of bioinformatics | The OpenHelix Blog". blog.openhelix.eu. Retrieved 2016-10-20.
- [8] "Substitution Matrices". arep.med.harvard.edu. Retrieved 2016-10-22.
- [9] "How to Compute Mutation and Dayhoff Matrices". www.biorecipes.com. Retrieved 2016-10-22.
- [10] "MARGARET OAKLEY DAYHOFF, 57; EXPERT ON PROTEIN STRUCTURES". The New York Times. 1983-02-09. ISSN 0362-4331. Retrieved 2016-03-16.
- [11] "Oakley Margaret Dayhoff | Biographical summary". www.whatisbiotechnology.org. Retrieved 2016-10-20.
- [12] Moody, Glyn (2004). Digital Code of Life: How Bioinformatics is Revolutionizing Science, Medicine, and Business. ISBN 978-0-471-32788-2.
- [13] "Changing the Face of Medicine | Dr. Ruth E. Dayhoff". www.nlm.nih.gov. Retrieved 2016-10-20.
- [14] raylevy (2013-03-07). "Margaret & Ruth Dayhoff". Grandma Got STEM. Retrieved 2016-10-20.
- [15] Biophysical Society. "Society Awards". www.biophysics.org. Retrieved 2016-10-20.
External links
- Picture of Margaret Oakley Dayhoff, c. 1980. Owned by her daughter Ruth E. Dayhoff, M.D. Made available by the National Library of Medicine.
- Profile and photographs of Margaret O. Dayhoff in Grandma got STEM project. Information submitted to the project by Margaret Dayhoff's son-in-law Vincent, husband of Ruth E. Dayhoff. Also contains biographical information about descendants.
- Baby Joseph and Vrundha M. Nair, 2012. Woman Innovator in Bioinformatics: Dr. Margaret Oakley Dayhoff. Adv Bio Tech: 12(01), 32–34.
- http://videocast.nih.gov/Summary.asp?File=14412