Wednesday, May 16, 2012

Draft: The Centers for Defect Control and Prevention: Public Health and Epidemiology Principles for the Development of Information Systems

Introduction

Have you ever thought much about the following statement?

"CDC 24/7: Saving Lives, Protecting People, Saving Money through Prevention"

This is the banner headline on http://www.cdc.gov, the home page for the United States Centers for Disease Control and Prevention. It's an important statement that conveys the constant vigilance, goals, and primary mindset needed in today's world to help keep people healthy!

Another thing you may never have thought about is the vast number and variety of information systems that epidemiologists and other public health professionals need to quickly and reliably perform the public health surveillance and other scientific work required to achieve their goals of improving human health. It's easy to understand why such systems are necessary, though. Simply consider how quickly people travel today from country to country and how quickly infectious diseases can spread. Recall the SARS outbreak of 2003 as an example.

In the world of public health, these systems operate all over the United States and world, at local, state, territorial, and federal levels and in collaboration across national boundaries. They empower the public health workforce to control and prevent disease in a variety of biological populations. Human health also depends upon animal health and the health of plants, trees, and ecosystems as a whole. The entire ecosystem is the shared environment, or context, within which we all live.

Disclaimer

I do not work directly for CDC or as a federal employee, so these opinions are based only on my own experience working with contracting companies on technology teams providing services to CDC and the public health community at large. I am also not a public health expert, so these ideas are a work in progress as my own understanding of public health and epidemiology evolves.

Article Series Goal: Building The Centers for Defect Control and Prevention

Having helped build CDC mission-critical information systems that protect the public's health, I feel it is important to share ideas for improving those systems and the process undertaken to build them. This is the first of a multi-part series of articles that will create a vision for CDC's information systems acquisition and development process, a vision that applies the very principles of public health and epidemiology to guide those processes. As we'll see, there are already many parallel concepts between the disciplines. The goal is that CDC should also stand for Centers for Defect Control and Prevention when it comes to its information systems.

This first article will introduce several fundamental concepts of epidemiology and disease control and prevention while drawing parallels with the activities necessary for designing and developing successful, useful, and cost-effective information systems. 

Terms we'll introduce related to epidemiology are:
  • Epidemiology
  • Populations
  • Control (as in controlling health problems)
  • Disease
  • Determinant
  • Incidence
  • Prevalence
  • Incubation Period
  • Subclinical Infection
  • Quarantine and Isolation
For each of these concepts from the domain of epidemiology, which pertains to biological, chemical, ecological (ultimately physical) objects, we'll draw parallel models within the world of information systems which pertain, ultimately, to technological objects.

Definition: Epidemiology 

CDC defines epidemiology as:

The study of the distribution and determinants of health-related states in specified populations, and the application of this study to control health problems.


There is a lot more to say about that, but for this article, let's highlight these two parts:
  • Populations—One of the most important distinguishing characteristics of epidemiology is that it deals with groups of people rather than with individual patients.
  • Control—Although epidemiology can be used simply as an analytical tool for studying diseases and their determinants, it serves a more active role. Epidemiological data steers public health decision making and aids in developing and evaluating interventions to control and prevent health problems. This is the primary function of applied, or field, epidemiology.

Controlling and Preventing Information System Disease in Populations of Technological Objects 

Information systems are like ecosystems. But instead of being composed of populations of biological objects, they're composed of populations of technological objects. Beyond the obvious differences between these types of populations lie a great many similarities in the surveillance and intervention techniques needed to keep them healthy and free of disease.

Wait, can information systems really be diseased? I believe they can, and that all too many of them are. 

Here's a standard dictionary definition of the word "disease":

"a disordered or incorrectly functioning organ, part, structure, or system of the body resulting from the effect of genetic or developmental errors, infection, poisons, nutritional deficiency or imbalance, toxicity, or unfavorable environmental factors; illness; sickness; ailment."

Definition: Information System Disease 

Here's my adapted definition for "Information System Disease":
"an incorrectly functioning or incomplete component, feature, sub-system, or unit of an information system resulting from the effect of requirements, design, or developmental errors and defects, performance, usability, or capability deficiency, or unfavorable environmental factors such as network communications failures or operating system incompatibilities."
Aside: With the increasing use of biotechnology and nanotechnology that interacts with our own biology, it will become increasingly difficult to draw any clear distinction between a designed, technologically augmented biological system and one that is strictly naturally evolved.

The phrase "developmental errors and defects" has a much catchier name: Bugs! That actually sounds a bit like the germ theory of disease, doesn't it? A lot of people refer to catching "the flu bug" or being "sick with some bug".

Here is a photo of the "first actual bug", a moth found in a relay of the Harvard Mark II computer in 1947:


Trivia aside, our definition encompasses many different types of "inputs", though not all of them. It opens, however, with one critical observation:

an incorrectly functioning or incomplete component, feature, sub-system, or unit

This brings us to one more important definition before we move on.

Definition: Determinant 

any factor that brings about change in a health condition or in other defined characteristics

In epidemiology, a determinant can take on a broad range of concrete forms. In summary, the World Health Organization groups them into these categories: 
  • the social and economic environment,
  • the physical environment, and
  • the person's individual characteristics and behaviors. 

The Determinants of Information System Health are Almost Always Human-Caused 

Information systems differ from biological systems because they are specifically designed by humans to serve human needs or goals. Because information systems are designed by us, we have better internal control over the resulting behavior, and thus the healthy status, of information systems. Compare this to the medical or epidemiology professions, where purely naturalistic, biological systems are constrained only by the laws of nature, many of which we only partially understand and over which we have only partial external control.

Software development is entirely human-made and consists of a closed set of concepts that are entirely understandable and controllable, provided we understand and follow a few simple guiding principles that we'll introduce in the next article. Because of this, software development can be done in a way that builds defect prevention in from the beginning. But for now, let's introduce a few more epidemiology terms and see how they apply to software development.

Definition: Incidence 

Incidence refers to the occurrence of new cases of disease or injury in a population over a specified period of time.

Definition: Prevalence

Prevalence, sometimes referred to as prevalence rate, is the proportion of persons in a population who have a particular disease or attribute at a specified point in time or over a specified period of time. Prevalence differs from incidence in that prevalence includes all cases, both new and preexisting, in the population at the specified time, whereas incidence is limited to new cases only.

Applying Incidence and Prevalence to Information System Development and Defect Control and Prevention 

We saw above that defect prevention can be built into the software development process from the beginning. While this is true and will be explained in detail in another article, we need to consider the all too common scenario that we are all used to: buggy software.

Let us equate a software defect, bug, or otherwise "incorrectly functioning or incomplete component, feature, sub-system, or unit" with the "disease or injury" from the definition of incidence.
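To make that mapping concrete, here is a minimal sketch of how incidence and prevalence might be computed from a defect log. The Defect record, the component names, the dates, and the population of 10 components are all hypothetical and chosen only for illustration. Treating a system's components as the "population" is my assumption, not a CDC convention.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Defect:
    component: str          # the affected "individual" in the population
    found: date             # date the defect was first observed
    fixed: Optional[date]   # None while the defect is still open


def incidence(defects: List[Defect], start: date, end: date) -> int:
    """Count NEW defects first observed within [start, end]."""
    return sum(1 for d in defects if start <= d.found <= end)


def prevalence(defects: List[Defect], population_size: int, on: date) -> float:
    """Proportion of components with at least one open defect on a given day."""
    affected = {
        d.component
        for d in defects
        if d.found <= on and (d.fixed is None or d.fixed > on)
    }
    return len(affected) / population_size


# Hypothetical defect log for a system made up of 10 components.
log = [
    Defect("login", date(2012, 1, 10), date(2012, 2, 1)),
    Defect("reports", date(2012, 3, 5), None),
    Defect("reports", date(2012, 3, 20), None),
    Defect("export", date(2012, 3, 25), date(2012, 4, 2)),
]

march_incidence = incidence(log, date(2012, 3, 1), date(2012, 3, 31))
march_31_prevalence = prevalence(log, 10, date(2012, 3, 31))
print(march_incidence, march_31_prevalence)  # prints: 3 0.2
```

Three defects were newly found in March, so the March incidence is 3. On March 31, two of the ten components ("reports" and "export") still carry an open defect, so the prevalence is 0.2. The "login" defect was fixed in February and counts toward neither figure.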

Now, suppose an organization hires a contracting company to build a large information system. The contractor says the system will be ready to deploy to a production environment for use by the end of one year's time from project inception.

Next, suppose this company sets out to analyze and define all the requirements to build that system before building even a single small portion of the system. Suppose this process takes six months before any new code is written at all. The company delivers large requirements and design documents to their customer at the end of this process.

At this point, there may already be a high prevalence of undiagnosed defects inside the requirements and design documents for that system! Yet any ensuing "disease" has not had a "date of first occurrence", because none of the system's code has been written, tested, or used -- not even in prototype or proof-of-concept form!

Here are a few more epidemiological terms that draw immediate analogies:

Definition: Incubation Period 

A period of subclinical or unapparent pathologic changes following exposure, ending with the onset of symptoms of infectious disease.

Definition: Latency Period 

A period of subclinical or unapparent pathologic changes following exposure, ending with the onset of symptoms of chronic disease.

Defects Latent in Large Documents Have a Long Incubation Period Followed by Sudden Onset 

Now we can understand that when the contractor spent six months building a large requirements and design document, but built no physical code for others to review and use, they raised the risk of "infection", which will likely result in a sudden, or acute, onset of a variety of problems. Ultimately, this will be measured as both a high incidence and a high prevalence during the time period in which the defects are discovered.
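As a rough illustration of that incubation-then-onset pattern, the sketch below measures the "incubation period" of each defect as the number of days between when it was introduced into a document and when its first symptom appeared in working code. The dates are invented for the fictional project above. In practice, the date a defect was introduced is often only an estimate.

```python
from collections import Counter
from datetime import date

# (introduced_in_document, first_symptom_in_code) pairs; all dates hypothetical.
defects = [
    (date(2012, 1, 15), date(2012, 7, 10)),
    (date(2012, 2, 1), date(2012, 7, 12)),
    (date(2012, 4, 20), date(2012, 7, 12)),
]

# Incubation period: days a defect stayed "subclinical" before showing symptoms.
incubation_days = [(seen - introduced).days for introduced, seen in defects]

# Onset by (year, month): a cluster in one month is the "sudden onset".
onsets_by_month = Counter((seen.year, seen.month) for _, seen in defects)

print(incubation_days)        # prints: [177, 162, 83]
print(dict(onsets_by_month))  # prints: {(2012, 7): 3}
```

Although the defects were introduced months apart, all of them surface in the same month, the first month in which anyone could exercise real code.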

Latent Defects are Like Subclinical Infections Until Onset 

Wikipedia defines a subclinical infection as follows:

"A subclinical infection is the asymptomatic (without apparent sign) carrying of an (infection) by an individual of an agent (microbe, intestinal parasite, or virus) that usually is a pathogen causing illness, at least in some individuals. Many pathogens spread by being silently carried in this way by some of their host population. Such infections occur both in humans and nonhuman animals."

Now we know such infections occur in humans, nonhuman animals, and large requirements and design documents not yet tested by tangible development. Keep in mind that "tangible development" does not mean 100% complete and ready for release, but it does mean, at minimum, prototyped and delivered in a visible, clickable, malleable form -- not just words on paper or promises in contractual agreements.

Applying Quarantine and Isolation Tactics Not Just at Borders 

Let's now consider quarantine and isolation practices in light of the SARS outbreak mentioned above. When SARS happened, public health officials acted quickly and implemented quarantine procedures to try to control and prevent the spread of the pathogen into their own populations. Consider this summation of quarantine measures from Taiwan:

During the 2003 Severe Acute Respiratory Syndrome (SARS) outbreak, traditional intervention measures such as quarantine and border control were found to be useful in containing the outbreak. We used laboratory verified SARS case data and the detailed quarantine data in Taiwan, where over 150,000 people were quarantined during the 2003 outbreak, to formulate a mathematical model which incorporates Level A quarantine (of potentially exposed contacts of suspected SARS patients) and Level B quarantine (of travelers arriving at borders from SARS affected areas) implemented in Taiwan during the outbreak. We obtain the average case fatality ratio and the daily quarantine rate for the Taiwan outbreak. Model simulations is utilized to show that Level A quarantine prevented approximately 461 additional SARS cases and 62 additional deaths, while the effect of Level B quarantine was comparatively minor, yielding only around 5% reduction of cases and deaths. The combined impact of the two levels of quarantine had reduced the case number and deaths by almost a half. The results demonstrate how modeling can be useful in qualitative evaluation of the impact of traditional intervention measures for newly emerging infectious diseases outbreak when there is inadequate information on the characteristics and clinical features of the new disease-measures which could become particularly important with the looming threat of global flu pandemic possibly caused by a novel mutating flu strain, including that of avian variety.


What this summary illustrates is that quarantine, when applied at a higher level in the chain of transmission (Level A, the contacts of suspected patients), led to a far greater reduction in the incidence of infection. The other measure (Level B, travelers arriving at borders) led to a more modest reduction of around 5% in cases and deaths.

What would happen if we applied this kind of model to the development of information systems, and did it at many levels, in order to prevent large populations of infected, buggy, defect-ridden documents or code from becoming integrated with healthy, corrected, defect-free populations (of software objects)?

Defining the Quarantine Model of Integration

Let's define a simplified "Quarantine Model of Integration" that applies not just to humans with possible infections crossing borders, but also to requirements documents, design documents, napkin sketches, whiteboard scrawlings, information system releases or upgrades, specific system features, and certainly all the way down to discrete units of software code.

Population A: Some set of individual objects.
Population B: Another set of individual objects similar to Population A.
Population B-Harmful: Some potential subset of population B with harmful characteristics that would disrupt and weaken the integrity of desired characteristics if introduced into Population A.
Population B-Benign: Some potential subset of Population B without harmful characteristics, which can be integrated into Population A without disrupting it.
Mitigating Filter Procedures: A set of actions that can be taken upon Population B to identify Population B-Harmful and Population B-Benign, thus allowing Population B-Benign to be integrated into Population A without harming it (while also isolating Population B-Harmful and preventing it from integrating).
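
The definitions above can be sketched in a few lines of code. This is only one possible reading of the model: the quarantine function, the two example checks (has_passing_tests and was_reviewed), and the change records are hypothetical names invented for illustration. The sketch also assumes that an item counts as benign only when it passes every filter.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def quarantine(
    population_b: Iterable[T],
    filters: List[Callable[[T], bool]],
) -> Tuple[List[T], List[T]]:
    """Split Population B into (benign, harmful).

    Each filter returns True when an item passes that check. An item is
    treated as benign only if it passes every filter (an assumption).
    """
    benign: List[T] = []
    harmful: List[T] = []
    for item in population_b:
        if all(check(item) for check in filters):
            benign.append(item)
        else:
            harmful.append(item)
    return benign, harmful


# Two illustrative Mitigating Filter Procedures for code changes.
def has_passing_tests(change: dict) -> bool:
    return change["tests_pass"]


def was_reviewed(change: dict) -> bool:
    return change["reviewed"]


# Population A: what is already integrated. Population B: incoming candidates.
population_a = ["existing-feature"]
population_b = [
    {"name": "change-1", "tests_pass": True, "reviewed": True},
    {"name": "change-2", "tests_pass": False, "reviewed": True},
    {"name": "change-3", "tests_pass": True, "reviewed": False},
]

benign, harmful = quarantine(population_b, [has_passing_tests, was_reviewed])
population_a.extend(change["name"] for change in benign)  # integrate B-Benign only

print(population_a)                   # prints: ['existing-feature', 'change-1']
print([c["name"] for c in harmful])   # prints: ['change-2', 'change-3']
```

The same shape works at any level of the list above: swap the change records for requirements, designs, or releases, and swap the filters for whatever inspection suits that level.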

Improving Outcomes by Applying the Quarantine Integration Model Throughout the Development of an Information System 

We will delve into the specifics of how to apply a model like this to control the development process in the next article. However, the type of control and prevention practices that are necessary when building an information system are different from what you might have seen in many large projects, such as the fictional one described above. Many projects undertaken by large corporations or governments attempt, with good intention, to prevent exposure to risks and defects by trying to define as many "requirements" and "design details" in large documents long before any of the software system is constructed. This is most often a mistake. It's a mistake, as we'll see, that goes back at least 42 years to 1970, but perhaps even further.

You probably remember that I earlier wrote:

Because information systems are designed by us, we have better internal control over the resulting behavior, and thus the healthy status, of information systems.

The key phrase there is "resulting behavior". What is unstated is that the process of creating that resulting behavior can itself take a very meandering path, one that is both iterative (completed in multiple passes) and incremental (completed as a series of smaller divisions of a larger whole).

It's often said that an empirical model of process control is needed to properly manage this kind of creative, evolutionary process. 

Definition: Empirical Process Control Model 

The empirical model of process control provides and exercises control through frequent inspection and adaptation for processes that are imperfectly defined and generate unpredictable and unrepeatable outputs.

Notice that an empirical process control model is a lot like the scientific method. In the next article, we'll also discuss how scientific knowledge advances through iterative, incremental, and evolutionary spurts. For example: we all know that one woman's hypothesis and experiment would not overturn the germ-theory of disease if she claimed that illness was caused by another mechanism. 

Peer Review is the Hallmark of Sound Science (And Also of a Sound Information Systems Development Process)


In the case above, we know the scientist's ideas must face the rigor of the peer review system that is the hallmark of science. The peer review process is just one implementation of the "Quarantine Model of Integration" we just defined. And peer review is, in fact, the self-correcting mechanism built into the heart of science, which differentiates it from the countless other "ways of knowing" that our human species has utilized and continues to utilize.

That peer-review system is also, naturally, at the heart of what CDC does in its constant effort to do sound science. And, as we'll preview next time, several types of peer review, and even-wider-review, are at the heart of any successful process for developing a winning, useful, and cost-effective information system.