Four Faces of the Pathogen: The Logic of Medical Microbiology and a Rationale for a Pathogens Database

(see: http://bio.med.harvard.edu/~colgrove/pathogens_table.html)

The study of microbial pathogens benefits from a vast abundance of fascinating material, but this very feature can make the field appear daunting and disconnected to those first learning the subject. In contrast with the now-clear conceptual framework of Immunology (the defense side of the host-pathogen "game"), Medical Microbiology (the offense) can seem to be just one impossibly long list of organisms and diseases. In one sense, this is not surprising. The vast majority of all life is microbial, and most branches of the phylogenetic tree of life on Earth contain pathogens, so studying Microbiology and Infectious Diseases can feel like trying to learn all of Biology at once.

There is, however, a deep organizing framework within which microbiologists and infectious diseases physicians think about the subject. It is simply not as apparent as that of some other fields, since it does not fit easily onto a chart or diagram. In a fashion analogous to Aristotle's enumeration of different kinds of "causes", the great biologist Julian Huxley pointed out that all biological entities have three "faces": their history (phylogeny and embryology), their structure (anatomy and morphology), and their function (physiology and ecology). Medically important pathogens fall within this schema as well, with a fourth face being the nature of the diseases they cause (their pathology).

Fortunately for students trying to learn the field, as well as for educators trying to teach it, scientists trying to advance it, and doctors trying to deal with it, most of the seemingly limitless varieties of infectious diseases can be understood as combinations arising within a relatively limited number of these basic categories.
Most of the remainder can be remembered as a manageable list of exceptions, whose very exceptionality often illuminates the nature of the patterns.

This way of grappling with a complex topic, seeing the categories that in combination account for most of the data and remembering the small number of exceptions that break the pattern, can be very helpful in many areas of Biology and Medicine. Nowhere is this more true than in Microbiology and Infectious Diseases. Most biologists learn these patterns over many years, and the resulting internal cognitive framework makes it much easier to remember old information and to absorb and retain new data within the field. Unlike the infections themselves, however, these frameworks are difficult to transmit between "hosts" using traditional educational tools.

The advent of powerful and easy-to-use computer tools, especially internet resources, offers an important aid in this regard. Both the content of and the connections within a huge assembly of biological data can be encoded into a "database" that brings together textual information, images, case histories, and links to external sources of information. Now-ubiquitous "web-based" tools can be used by students to sort and sift the data, and to travel link by link among related topics. In this way, the patterns can be appreciated more quickly and less painfully than by simply trying to memorize one long list after another.

In hopes of bringing some of these advantages to the students of the current Immunology, Microbiology and Infectious Diseases course in the first year of the Harvard Medical School curriculum, I have created a database of pathogens organized by the phylogeny/morphology/ecology/pathology schema outlined above. An important part of this project is to limit the number of entries so that the database itself does not become unmanageable, and to pick those entries that will be of the most use to students.
There is no one overriding criterion for these choices, but there are a few important principles to guide them: which pathogens are most important in the medical practices most students will enter, which are most important for the health of humanity at large, which have been most important in shaping the course of human history, which have been most important in advancing biomedical knowledge, which are most useful pedagogically as teaching examples, and which merit inclusion to ensure at least one example of each important category.

A key principle of complex system design is to build up from simple modules and prototypes that can be tested and improved. This approach is nearly essential for any software project to get "off the ground" with good function and reliability and with reasonable expenditures of time and resources. In this spirit, the first incarnation of the pathogens database is being offered to the current "crop" of HMS students, both in the hope that they will find it useful and interesting, and with the request that they help with suggestions and comments to make it better over time.

The full function of the database is available to anyone who wants to use a formal database query language, but the intent is for the database itself to be accessible from behind an easy-to-use web-page front end. The database software itself is able to generate sorted subsets of the data, and work to display these attractively on a web page is in progress. As a first pass, I have converted pre-sorted database queries into HTML tables for display within any web browser. I have also generated a full database table in forms that can be downloaded and used within a spreadsheet program (comma-separated values and Excel format). Once downloaded, the table can be sorted and filtered locally with the tools of a spreadsheet program to do many of the same things that can be accomplished with a database query.
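
The same local sort and filter can also be scripted rather than done by hand in a spreadsheet. What follows is only a sketch in Python, not part of the course site. It assumes the downloaded CSV file has a header row whose column names match those used in the pneumonia query shown later in this document (name, disease, pathology); the real file's layout may differ. The three pneumonia rows are copied from that query's output, the fourth row is an invented placeholder so that the filter has something to exclude, and the function name select_by_disease is hypothetical.

```python
import csv
import io

# Stand-in for the downloaded CSV file (an assumption: the real header may differ).
# Rows 1-3 come from the pneumonia query output in this document;
# the last row is a made-up placeholder, not real database content.
SAMPLE_CSV = """name,disease,pathology
Streptococcus pneumoniae,Pneumonia,inflammation
Influenza virus,Pneumonia,"tissue destruction, respiratory epithelium"
Mycobacterium tuberculosis,Pneumonia,"inflammation, chronic, granuloma formation"
Placeholder pathogen,Placeholder disease,placeholder pathology
"""


def select_by_disease(csv_text, disease):
    """Return (name, pathology) pairs for one disease, sorted by name."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Filter step: keep only rows whose disease column matches exactly.
    matches = [(r["name"], r["pathology"]) for r in rows if r["disease"] == disease]
    # Sort step: order the surviving rows by pathogen name.
    return sorted(matches, key=lambda pair: pair[0])


if __name__ == "__main__":
    for name, pathology in select_by_disease(SAMPLE_CSV, "Pneumonia"):
        print(f"{name}: {pathology}")
```

To work on a saved file instead of the inline sample, the text could be read with open(path, newline="") and passed to the same function, where path is wherever the CSV download was saved.
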
I hope some students will play with these functions and let me know what works and what does not.

Once login permissions are worked out, the database itself will be available remotely to those who want to use the "raw" data. In addition, we hope individual database entries can be linked to cases in a very nice external site at the MGH. I have also assembled a large set of pathogen images that will appear as clickable hyperlinks in the database tables. Finally, I have created a separate database table for the viruses (also present in the main table) to highlight some of their special properties (since I am a virologist, I have started with what I know best).

As an example of database use, if I wanted to know the names of pathogens in the database that cause pneumonia, I could use the raw database software (MySQL) to say:

    mysql> select name, pathology from pathogen_table3 where disease="Pneumonia" order by name;

and get:

    +----------------------------+--------------------------------------------+
    | name                       | pathology                                  |
    +----------------------------+--------------------------------------------+
    | Influenza virus            | tissue destruction, respiratory epithelium |
    | Mycobacterium tuberculosis | inflammation, chronic, granuloma formation |
    | Streptococcus pneumoniae   | inflammation                               |
    +----------------------------+--------------------------------------------+

An analogous sort from the virus table could be:

    mysql> select name, nucleotide_type from virus_table2 where cell_location="nuclear" order by genome_size;

    +----------------------+-----------------+
    | name                 | nucleotide_type |
    +----------------------+-----------------+
    | Herpes Simplex virus | DNA             |
    | Cytomegalovirus      | DNA             |
    | Epstein-Barr virus   | DNA             |
    | Adenovirus           | DNA             |
    | Influenza virus      | RNA             |
    | HIV                  | RNA->DNA        |
    | Hepatitis B virus    | RNA->DNA        |
    | Parvovirus           | DNA             |
    | JC virus             | DNA             |
    | Papilloma virus      | DNA             |
    +----------------------+-----------------+

The same sorts could be done locally by clicking on the CSV or Excel format links on the pathogens page, saving the file, and loading it into your favorite spreadsheet program.

We hope to make this pathogens database project grow over time and to make it something students can both use while in the class and return to later for review. Please give it a try if you have a chance and let me know what you think. Students interested in contributing to or working on this project should contact the course directors, Arlene Sharpe and David Knipe, and/or e-mail me at robin@hms.harvard.edu. I hope you find this "first draft" useful and interesting, and I hope you can help me improve it for future classes.

Robin Colgrove
Department of Microbiology and Molecular Genetics
Harvard Medical School