Monday, May 28, 2012

Draft: Steps in a Defect Outbreak Investigation

     

Introduction

While it may not seem obvious at first thought, the steps required in the process of disease outbreak investigation have strong correlates with steps required for two separate software development processes:

  • Most obviously: the defect investigation and resolution process conducted against software already developed and made available to users
  • Less obviously, but more importantly: the software development process itself before it is formally complete or released to users

 

Investigating an Epidemic of Defect-Infected Software

After reading the following excerpt from a CDC education site, you can see a very clear analogy between disease outbreak investigation and "defect" outbreak investigation:

"In investigating an outbreak, speed is essential, but getting the right answer is essential, too. To satisfy both requirements, epidemiologists approach investigations systematically, using the following 10 steps:

  1. Prepare for field work
  2. Establish the existence of an outbreak
  3. Verify the diagnosis
  4. Define and identify cases
  5. Describe and orient the data in terms of time, place, and person
  6. Develop hypotheses
  7. Evaluate hypotheses
  8. Refine hypotheses and carry out additional studies
  9. Implement control and prevention measures
  10. Communicate findings

The steps are presented here in conceptual order. In practice, however, several may be done at the same time, or they may be done in a different order. For example, control measures should be implemented as soon as the source and mode of transmission are known, which may be early or late in any particular outbreak investigation."

Source: http://www.cdc.gov/excite/classroom/outbreak/steps.htm

Visualizing a Food-Borne Illness Investigation

To help set a mental picture of this process, and its iterative, feedback-driven nature, consider the following diagram from CDC's OutbreakNet web site:


Steps in a Foodborne Outbreak Investigation
Source: http://www.cdc.gov/outbreaknet/investigations/figure_outbreak_process.html

Notice the loop that branches off from step four. It shows that controlling the outbreak, or learning how to prevent it from occurring again, involves a feedback cycle: the results and theoretical implications of hypotheses and experiments are continually evaluated against ongoing empirical observations in the field.

Mapping Disease Outbreak Steps to Software Defect Resolution Steps

Assume below that there is a browser-based application that is used by epidemiologists and public health professionals in the field, investigating outbreaks. When these users discover problems with the software, they must report those problems to the development team that maintains the software.

So, the right side represents the actions that members of the development team must take to investigate and resolve problems when reported. Of course, this process is not at all specific to public health information systems, but can apply to just about any system that is "deployed in the field".

Disease Outbreak Investigation Step : Defect Root-Cause Investigation & Resolution Step

  • Prepare for field work : Establish defect communication channels
  • Establish the presence of an outbreak : Establish confirmed presence of a defect
  • Verify the diagnosis : Verify the steps to reproduce
  • Define and identify cases : Define and identify type of issue
  • Describe and orient the data in terms of time, place, and person : Describe and orient the defect report in terms of time, browser, operating system, user role and other system-specific characteristics
  • Develop hypotheses : Reproduce defect under controlled conditions
  • Evaluate hypotheses : Determine root cause
  • Refine hypotheses and carry out additional studies : Develop a "fix" for the defect and perform system regression testing
  • Implement control and prevention measures : Implement control and prevention measures
  • Communicate findings : Deploy fix and communicate resolution

 

Details of Defect Root-Cause Investigation and Resolution Process Steps

 

Prepare for Field Work : Establish defect communication channels

End users are the ones "in the field", and management must provide communication channels (automatic error trapping within deployed systems and telephone, email, issue tracking systems) that enable users to provide their feedback.
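
One of those channels, automatic error trapping, can be sketched in a few lines of browser JavaScript. This is a minimal sketch assuming a hypothetical issue-tracking backend; the field names are illustrative only:

```javascript
// Build a structured defect report from what window.onerror provides.
function buildDefectReport(message, source, line, userAgent, time) {
  return {
    message: message,              // the error text
    source: source,                // script URL where it occurred
    line: line,                    // line number within that script
    userAgent: userAgent,          // browser/OS fingerprint
    reportedAt: time.toISOString()
  };
}

// In the browser, wire it to the global error handler (guarded so the
// same file can also load outside a browser):
if (typeof window !== 'undefined') {
  window.onerror = function (message, source, line) {
    var report = buildDefectReport(message, source, line,
                                   navigator.userAgent, new Date());
    // An XMLHttpRequest POST to the team's issue tracker would go here.
  };
}
```

This way every uncaught error becomes a structured report rather than a silent failure the user must later describe from memory.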

Establish the presence of an outbreak : Establish confirmed presence of a defect

When users report an issue as a defect, the team must verify whether it is truly a defect or perhaps some other kind of malfunction or a user training issue (misunderstanding, lack of experience, etc.).

Verify the diagnosis : Verify the steps to reproduce

In order to investigate the defect, the precise sequence of actions taken by the user must be documented. These steps are needed to reproduce the same issue under controlled testing conditions. This normally involves a dedicated "test" environment, often implemented with virtual machine software that mimics the end user's environment as closely as possible. There are, in fact, entire businesses that exist solely to provide that capability to other companies.

Define and identify cases : Define and identify type of issue

Not all reported issues fall neatly into the category of "defect". Other categories include:

  • Request for enhancement
  • Difficulty using feature
  • Annoyance with a feature

It is still important to classify each reported item for the purpose of improving the system based upon its users' experiences. This helps with a planning and prioritization process we will discuss in future articles.

Describe and orient the data in terms of time, place, and person : Describe and orient the defect report in terms of time, browser, operating system, user role and other system-specific characteristics

Document conditions that can help characterize the event for detailed investigation by the management team. For a browser-based application, this includes, at minimum, the characterizing features listed above. You might wonder why "time" would be important. It can be important for a number of reasons, including:

  • Patterns of system load (how many people are using it at the same time) which vary by time of day
  • Complications with other transactions undertaken by different users (concurrency violations, dead-locks, threading issues, etc)
  • System dependencies upon other, external systems -- many systems in corporate or government environments depend on "Single Sign On" servers being functional. When these systems fail, your own system may fail, but your users will still blame you :-)
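
Orienting a report "in terms of time" can be as simple as bucketing the report into load periods. In the sketch below, the 9:00-17:00 peak window and all field names are assumptions for illustration:

```javascript
// Classify a report time into assumed peak vs. off-peak load hours.
function loadPeriod(date) {
  var hour = date.getHours();
  return (hour >= 9 && hour < 17) ? 'peak' : 'off-peak';
}

// Attach the characterizing features discussed above to a defect report.
function orientReport(report, env) {
  return {
    message: report.message,
    time: report.time,
    loadPeriod: loadPeriod(report.time),
    browser: env.browser,          // e.g. navigator.userAgent in the field
    operatingSystem: env.os,
    userRole: env.role
  };
}
```

A report oriented this way lets the team notice, for example, that a given defect only ever appears during peak-load hours.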

Develop Hypotheses : Reproduce Defect within Controlled Environment

Following directly from the previous step, the team must reproduce the defect by performing the same set of steps that the user who encountered the defect reported. Reproducing a defect is not entirely the same as performing the steps in a scientific experiment, but it's pretty close! When performing a test, there are three parts:

  • Starting conditions (including any steps to take to prepare the conditions)
  • Series of steps to execute
  • Expected observations

After the test steps are executed, the actual observations are compared against the expected observations to evaluate the results. When attempting to reproduce a defect, if the actual observations do not match (meaning the bug cannot be reproduced!), it does not automatically mean that the user was incorrect or lying. The cause could be:

  • Faulty or incomplete starting conditions
  • Incorrect or incomplete execution of steps
  • Incorrect or improperly measured expected observations or actual observations

As you can see, this section is a subject for an entire article unto itself.
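
The three parts above translate directly into a tiny JavaScript harness. Everything in this sketch is illustrative; note that `expected` holds the observation the user reported, so a match means the defect was reproduced:

```javascript
// Run one reproduction attempt: prepare conditions, execute the steps,
// then compare actual observations against the expected ones.
function attemptReproduction(testCase) {
  var state = testCase.startingConditions();   // part 1: starting conditions
  testCase.steps.forEach(function (step) {     // part 2: series of steps
    step(state);
  });
  var actual = testCase.observe(state);        // part 3: observations
  return {
    reproduced: actual === testCase.expected,
    expected: testCase.expected,
    actual: actual
  };
}

// Hypothetical report: a user saw 0.30000000000000004 where 0.3 was
// intended (classic floating-point accumulation).
var result = attemptReproduction({
  startingConditions: function () { return { total: 0 }; },
  steps: [
    function (s) { s.total += 0.1; },
    function (s) { s.total += 0.2; }
  ],
  observe: function (s) { return s.total; },
  expected: 0.30000000000000004   // the value the user reported seeing
});
// result.reproduced === true here: the defect shows under controlled
// conditions. A false result would point first at faulty starting
// conditions or steps, not at the user.
```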

Evaluate Hypotheses : Determine Root-Cause

As foreshadowed in the previous step, this step may involve a number of trial-and-error "hypotheses" depending upon:

  • How well-reported the defect is
  • How well-executed the reproduction process is
  • How well-constructed the system itself is according to quality engineering practices


Refine hypotheses and carry out additional studies : Develop a "fix" for the defect and perform system regression testing

This simplifies a much larger series of steps, but the most important part is that a "regression test" is performed to verify that the resolution of this specific defect does not introduce additional defects in other areas (or even in the same area!). This will be the subject of its own article as well.
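
Here is a minimal sketch of such a regression suite, assuming a hypothetical stats helper whose rounding behavior was just "fixed":

```javascript
// The "fix": round results to two decimal places explicitly.
function round2(x) {
  return Math.round(x * 100) / 100;
}

// Each earlier test is re-run after the fix; a failure would mean the
// fix broke neighboring behavior.
var regressionSuite = [
  { name: 'sums round cleanly',      run: function () { return round2(0.1 + 0.2) === 0.3; } },
  { name: 'whole numbers unchanged', run: function () { return round2(5) === 5; } },
  { name: 'extra digits drop off',   run: function () { return round2(1.23456) === 1.23; } }
];

function runRegression(suite) {
  // Return the names of any failing tests; an empty list means no
  // regressions were introduced in the covered areas.
  return suite.filter(function (t) { return !t.run(); })
              .map(function (t) { return t.name; });
}
```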

Implement control and prevention measures : Implement control and prevention measures

In this step, the team must reflect upon the root cause of the defect and analyze what it can do to prevent similar defects from occurring again. I've sometimes heard this kind of activity described as a "post-mortem" in the realm of public health. Because visions of autopsies and mortality are not exactly omens of success, in the realm of systems development it's often called a "retrospective". You can learn more about retrospectives at http://www.retrospectives.com/.

Communicate findings : Deploy fix and communicate resolution

This step involves updating the system with the defect corrected (and the rest of the potentially affected system fully regression tested). Naturally, this step is very system-specific. But, in a future article I will demonstrate how to structure a modern web application in such a way that its specific features, or modules, are independently version-controlled, and independently upgradable. These practices allow for teams to improve user experience with little to no "down-time" and even to experiment with new functionality without the cost of expensive, time-consuming "whole system rewrites".

Saturday, May 26, 2012

Defect Detection (and Prevention) in JavaScript and HTML Browser Applications

Background


Thanks to the connection from my coworker Zack Gao at CDC I've been introduced to Andy Dean and his OpenEpi.com open source project. I reached out to Andy and his collaborators offering my assistance and they were happy to accept the help.

Originally funded by a grant from the Bill and Melinda Gates Foundation to Emory University's Rollins School of Public Health, the project's goals are to offer free tools for health professionals everywhere:

OpenEpi provides statistics for counts and measurements in descriptive and analytic studies, stratified analysis with exact confidence limits, matched pair and person-time analysis, sample size and power calculations, random numbers, sensitivity, specificity and other evaluation statistics, R x C tables, chi-square for dose-response, and links to other useful sites.

OpenEpi is free and open source software for epidemiologic statistics. It can be run from a web server or downloaded and run without a web connection. A server is not required. The programs are written in JavaScript and HTML, and should be compatible with recent Linux, Mac, and PC browsers, regardless of operating system. (If you are seeing this, your browser settings are allowing JavaScript.) A new tabbed interface avoids popup windows except for help files.
Needless to say, this is a great project that benefits the epidemiology and medical community around the world. I am happy to get to contribute to it.

As mentioned above, the software is implemented in JavaScript and HTML and is thus extremely cross-platform. Sometimes it is hard to debug these kinds of applications, however. Since offering to help Andy and his team, I've tracked down and eliminated two defects related to dynamic content generation. It took me under an hour combined to remove those defects and relay the solutions back to Andy. There are a number of tips and tricks for becoming proficient in JavaScript debugging that I'll now share.


Tips for Tracking Down Defects in JavaScript & Browser Based Applications


When it comes to investigating a defect inside of a JavaScript program, there are some advanced tools available inside of most modern web browsers to help with this process. You should get intimately familiar with a few of the standard tool-belt items for the major browsers.


Internet Explorer: FireBug Lite and IE Developer Tools


I will cover FireBug in just a moment for Firefox, but be aware there is a "Firebug Lite" available for IE too. Find out about it here: https://getfirebug.com/firebuglite

Since the IE Developer Tools are built in, let's cover those here.


How to Find


Hit F12 or go to Tools menu > Developer Tools.


What You Get


This gives you a number of powerful features, like:
  • HTML Element DOM Inspection
  • CSS in-line editing
  • Console window (Like an immediate window in an IDE)
  • JavaScript debugger with interactive breakpoints, and
    • Watch
    • Locals
    • Callstack
  • Profiler
  • Network Request capture and playback
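
Two of those features reward daily habits: the console and interactive breakpoints. In the sketch below, `calcStats` is a hypothetical stand-in for a real module function:

```javascript
function calcStats(values) {
  // Un-commenting the next line halts execution here whenever the
  // developer tools are open, exposing Watch, Locals, and the Callstack:
  // debugger;
  var sum = values.reduce(function (a, b) { return a + b; }, 0);
  // Console logging traces intermediate values without a breakpoint:
  console.log('sum of', values.length, 'values =', sum);
  return { n: values.length, mean: sum / values.length };
}
```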


Screen Shot


Here's a screen capture of me debugging the CalcStats method on OpenEpi.com's Std.Mort.Ratio module:

Internet Explorer Developer Tools debugging OpenEpi.com Std.Mort.Ratio module


FireFox: FireBug and the Web Developer Toolbar


FireFox is probably still my favorite browser, but Chrome is getting closer every day. The godfather of powerful browser tools has to be The Web Developer Toolbar, but it's almost certainly been overthrown by FireBug.


How to Install


Visit http://getfirebug.com and install it, now. Right away. Do not delay.

Firebug's introductory video, presented by a self-described "pyroentomologist", is also worth watching on that site.



Screen Shot


Here's me inspecting the HTML and CSS of an item in OpenEpi. Notice the CSS tab shows the explicit styles and the computed styles. This is extremely helpful information.


Using FireBug to inspect the table cell properties of an EStratum table in OpenEpi.com





You should also install and become familiar with the Web Developer Toolbar: https://addons.mozilla.org/en-US/firefox/addon/web-developer/


This allows you to do things like show ALL of the CSS or ALL of the JavaScript aggregated into a single output page. It can highlight box models or tables, block-level elements, etc. Many of these features exist in Firebug now, but the Web Developer Toolbar is still very helpful. See http://chrispederick.com/work/web-developer/features/ for more info too.


Google Chrome: Developer Tools or Firebug Lite


You can also run Firebug Lite in Chrome, but Chrome has its own built-in tools as well, so let's just highlight those.


How to Find


Hit F12 or find it under the menu.


What you Get


You'll get all the same things we've discussed, though some of them are more impressive or more fully featured. See the screen shot for an example of the box model metrics for the same element inspected above in Firebug.

Screen Shot

Chrome's developer tools with box model metrics


More Information


This covered the basic tools, but there is some more information here: http://javascript.open-libraries.com/development/debugging/9-great-javascript-debugging-tools

Thursday, May 24, 2012

Draft: The Centers for Defect Control and Prevention: Public Health and Epidemiology Principles for the Development of Information Systems

Introduction

Have you ever thought much about the following statement?

"CDC 24/7: Saving Lives, Protecting People, Saving Money through Prevention"

This is the banner headline on http://www.cdc.gov, the home page for the United States Centers for Disease Control and Prevention. It's an important statement that conveys the constant vigilance, goals, and mindset required to help keep people healthy in today's world!

Another thing you may never have thought about is the vast and varied number of information systems required for epidemiologists and other public health professionals to quickly and reliably perform the public health surveillance and other scientific work required to achieve their goals of improving human health. It's easy to understand why such systems are necessary, though. Simply consider how quickly people travel today from country to country and how quickly infectious diseases can spread. Recall the 2003 SARS outbreak as an example.

In the world of public health, these systems operate all over the United States and world, at local, state, territorial, and federal levels and in collaboration across national boundaries. They empower the public health workforce to control and prevent disease in a variety of biological populations. Human health also depends upon animal health and the health of plants, trees, and ecosystems as a whole. The entire ecosystem is the shared environment, or context, within which we all live.

Disclaimer

I do not work directly for CDC or as a federal employee, so these opinions are based only in my own experience working with contracting companies on technology teams providing services to CDC and the public health community at large. I am also not a public health expert, so these ideas are a work-in-progress as my own understanding of public health and epidemiology evolves.

Article Series Goal: Building The Centers for Defect Control and Prevention

Having helped build CDC mission-critical information systems that protect the public's health, I feel it is important to share ideas for improving those systems and the processes undertaken to build them. This is the first of a multi-part series of articles that will create a vision for CDC's information systems acquisition and development process, a vision that applies the very principles of public health itself and epidemiology to guide those processes. As we'll see, there are already many parallel concepts between the disciplines. The goal is that CDC should also stand for the Centers for Defect Control and Prevention when it comes to its information systems.

This first article will introduce several fundamental concepts of epidemiology and disease control and prevention while drawing parallels with the activities necessary for designing and developing successful, useful, and cost-effective information systems. 

Terms we'll introduce related to epidemiology are:
  • Epidemiology
  • Populations
  • Control (as in controlling health problems)
  • Disease
  • Determinant
  • Incidence
  • Prevalence
  • Incubation Period
  • Subclinical Infection
  • Quarantine and Isolation
For each of these concepts from the domain of epidemiology, which pertains to biological, chemical, ecological (ultimately physical) objects, we'll draw parallel models within the world of information systems which pertain, ultimately, to technological objects.

Definition: Epidemiology 

CDC defines epidemiology as:

The study of the distribution and determinants of health-related states in specified populations, and the application of this study to control health problems.


There is a lot more to say about that, but for this article, let's highlight these two parts:
  • Populations—One of the most important distinguishing characteristics of epidemiology is that it deals with groups of people rather than with individual patients.
  • Control—Although epidemiology can be used simply as an analytical tool for studying diseases and their determinants, it serves a more active role. Epidemiological data steers public health decision making and aids in developing and evaluating interventions to control and prevent health problems. This is the primary function of applied, or field, epidemiology.

Controlling and Preventing Information System Disease in Populations of Technological Objects 

Information systems are like ecosystems. But, instead of being composed of populations of biological objects, they're composed of populations of technological objects. Beyond the obvious differences between these types of populations lie a great many similarities in the control and prevention techniques, both surveillance and intervention, needed to keep these populations healthy and free of disease.

Wait, can information systems really be diseased? I believe they can, and that all too many of them are. 

Here's a standard dictionary definition of the word "disease":

"a disordered or incorrectly functioning organ, part, structure, or system of the body resulting from the effect of genetic or developmental errors, infection, poisons, nutritional deficiency or imbalance, toxicity, or unfavorable environmental factors; illness; sickness; ailment."

Definition: Information System Disease 

Here's my adapted definition for "Information System Disease": 
"an incorrectly functioning or incomplete component, feature, sub-system, or unit of an information system resulting from the effect of requirements, design, or developmental errors and defects, performance, usability, or capability deficiency, or unfavorable environmental factors such as network communications failures or operating system incompatibilities." 
Aside: With the increasing use of biotechnology and nanotechnology that interacts with our own biology, it will become increasingly difficult to draw any clear distinction between a designed, technologically-augmented biological system and one that is strictly naturally evolved.

The phrase "developmental errors and defects" has a much catchier name: Bugs! That actually sounds a bit like the germ-theory of disease doesn't it? A lot of people refer to catching "the flu bug" or being "sick with some bug".

Here is a photo of the "first actual bug", found in 1947:


Trivia aside, our definition encompasses many different types of "inputs", though not all, and it focuses from the beginning on one critical perception: 

an incorrectly functioning or incomplete component, feature, sub-system, or unit

This brings us to one more important definition before we move on.

Definition: Determinant 

any factor that brings about change in a health condition or in other defined characteristics

In epidemiology, a determinant can take on a broad range of concrete forms. In summary, the World Health Organization groups them into these categories: 
  • the social and economic environment,
  • the physical environment, and
  • the person's individual characteristics and behaviors. 

The Determinants of Information System Health are Almost Always Human-Caused 

Information systems differ from biological systems because they are specifically designed by humans to serve human needs and goals. Because information systems are designed by us, we have better internal control over the resulting behavior, and thus the healthy status, of information systems. Compare this to the medical and epidemiology professions, where purely naturalistic, biological systems are constrained only by the laws of nature, many of which we only partially understand and over which we have only partial external control.

Since software development is entirely human-made, it consists of a closed set of concepts that are entirely understandable and controllable, provided we understand and follow a few simple guiding principles that we'll introduce in the next article. Because of this, software development can be done in a way that builds defect prevention in from the beginning. But for now, let's introduce a few more epidemiology terms and see how they apply to software development.

Definition: Incidence 

Incidence refers to the occurrence of new cases of disease or injury in a population over a specified period of time 

 Definition: Prevalence

Prevalence, sometimes referred to as prevalence rate, is the proportion of persons in a population who have a particular disease  or attribute at a specified point in time or over a specified period of time. Prevalence differs from incidence in that prevalence includes all cases, both new and preexisting, in the population at the specified time, whereas incidence is limited to new cases only.

Applying Incidence and Prevalence to Information System Development and Defect Control and Prevention 

We saw above that defect prevention can be built into the software development process from the beginning. While this is true and will be explained in detail in another article, we need to consider the all too common scenario that we are all used to: buggy software.

Let us equate a software defect, bug, or otherwise "incorrectly functioning or incomplete component, feature, sub-system, or unit" with "disease or injury" from the definition of incidence.

Now, suppose an organization hires a contracting company to build a large information system. The contractor says the system will be ready to deploy to a production environment for use by the end of one year's time from project inception.

Next, suppose this company sets out to analyze and define all the requirements to build that system before building even a single small portion of the system. Suppose this process takes six months before any new code is written at all. The company delivers large requirements and design documents to their customer at the end of this process.

At this point, there may already be a high prevalence of undiagnosed defects inside of the requirements and design documents for that system! Thus, any ensuing "disease" has not yet had a "date of first occurrence" because none of the system's code has been written, tested, or used -- not even in prototype or proof-of-concept form!
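
The two measures translate directly into code. Here is a sketch, assuming a defect log where each record carries a discovery time and (if ever resolved) a resolution time; times are simple day numbers and all names are illustrative:

```javascript
// Incidence: new cases only -- defects first discovered in the period.
function incidence(defects, periodStart, periodEnd) {
  return defects.filter(function (d) {
    return d.discovered >= periodStart && d.discovered < periodEnd;
  }).length;
}

// Prevalence: all cases still open at a point in time, new and
// preexisting, as a proportion of the system's components.
function prevalence(defects, atTime, populationSize) {
  var open = defects.filter(function (d) {
    return d.discovered <= atTime &&
           (d.resolved === null || d.resolved > atTime);
  }).length;
  return open / populationSize;
}
```

Tracking both numbers over releases shows whether the team is merely finding defects faster (rising incidence) or actually working the backlog down (falling prevalence).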

Here are a few more epidemiological terms that draw immediate analogies:

Definition: Incubation Period 

A period of subclinical or unapparent pathologic changes following exposure, ending with the onset of symptoms of infectious disease.

Definition: Latency Period 

A period of subclinical or unapparent pathologic changes following exposure, ending with the onset of symptoms of chronic disease.

Defects Latent in Large Documents Have a Long Incubation Period Followed by Sudden Onset 

Now we can understand that when the contractor spent six months building a large requirements and design document, but built no working code for others to review and use, they raised the risk of "infection", which will likely result in a sudden, or acute, onset of a variety of problems. Ultimately, this will be measured as both a high incidence and a high prevalence during the time period the defects are discovered.

Latent Defects are Like Subclinical Infections Until Onset 

Wikipedia defines a subclinical infection as follows:

"A subclinical infection is the asymptomatic (without apparent sign) carrying of an (infection) by an individual of an agent (microbe, intestinal parasite, or virus) that usually is a pathogen causing illness, at least in some individuals. Many pathogens spread by being silently carried in this way by some of their host population. Such infections occur both in humans and nonhuman animals."

Now we know such infections occur in humans, nonhuman animals, and large requirements and design documents not yet tested by tangible development. Keep in mind that "tangible development" does not mean 100% complete and ready for release, but it does mean, at minimum, prototyped and delivered in a visible, clickable, malleable form -- not just words on paper or promises in contractual agreements.

Applying Quarantine and Isolation Tactics Not Just at Borders 

Let's now consider quarantine and isolation practices, considering the SARS outbreak mentioned above. When SARS happened, public health officials acted quickly and implemented quarantine procedures to try to control and prevent the spread of the pathogen into their own populations. Consider this summation of quarantine measures from Taiwan:

During the 2003 Severe Acute Respiratory Syndrome (SARS) outbreak, traditional intervention measures such as quarantine and border control were found to be useful in containing the outbreak. We used laboratory verified SARS case data and the detailed quarantine data in Taiwan, where over 150,000 people were quarantined during the 2003 outbreak, to formulate a mathematical model which incorporates Level A quarantine (of potentially exposed contacts of suspected SARS patients) and Level B quarantine (of travelers arriving at borders from SARS affected areas) implemented in Taiwan during the outbreak. We obtain the average case fatality ratio and the daily quarantine rate for the Taiwan outbreak. Model simulations is utilized to show that Level A quarantine prevented approximately 461 additional SARS cases and 62 additional deaths, while the effect of Level B quarantine was comparatively minor, yielding only around 5% reduction of cases and deaths. The combined impact of the two levels of quarantine had reduced the case number and deaths by almost a half. The results demonstrate how modeling can be useful in qualitative evaluation of the impact of traditional intervention measures for newly emerging infectious diseases outbreak when there is inadequate information on the characteristics and clinical features of the new disease-measures which could become particularly important with the looming threat of global flu pandemic possibly caused by a novel mutating flu strain, including that of avian variety.


What this summary illustrates is that quarantine, when applied at a higher level in the chain of transmission, led to a far greater reduction in the incidence of infection. The other measure led to a more modest reduction of around 5% in cases and deaths.

What would happen if we applied this kind of model to the development of information systems, and did it at many levels, in order to prevent large populations of infected, buggy, defect-ridden documents or code from becoming integrated with healthy, corrected, defect-free populations (of software objects)?

Defining the Quarantine Model of Integration

Let's define a simplified "Quarantine Model of Integration" that can apply to more than just humans with possible infections crossing borders, but can also apply to requirements documents, design documents, napkin sketches, whiteboard scrawling, information system releases or upgrades, specific system features, and certainly all the way down to discrete units of software code.

Population A: Some set of individual objects.
Population B: Another set of individual objects similar to Population A.
Population B-Harmful: Some potential subset of population B with harmful characteristics that would disrupt and weaken the integrity of desired characteristics if introduced into Population A.
Population B-Benign: Some potential subset of population B without harmful characteristics if integrated into Population A.
Mitigating Filter Procedures: A set of actions that can be taken upon Population B to identify Population B-Harmful and Population B-Benign, thus allowing Population B-Benign to be integrated into Population A without harming it (while also isolating and preventing Population B-Harmful from integrating)
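
The model above can be sketched directly in code. In this minimal sketch, the mitigating filter is any predicate that can tell B-Harmful from B-Benign; an assumed "failed peer review" flag stands in for it here:

```javascript
// Integrate only the benign subset of Population B into Population A,
// isolating the harmful subset for further study.
function quarantineIntegrate(populationA, populationB, isHarmful) {
  var harmful = populationB.filter(isHarmful);        // Population B-Harmful
  var benign  = populationB.filter(function (item) {  // Population B-Benign
    return !isHarmful(item);
  });
  return {
    integrated: populationA.concat(benign),  // A absorbs only benign members
    quarantined: harmful                     // kept out of A entirely
  };
}

// Example: integrating proposed code changes, quarantining any that
// have not passed peer review.
var merged = quarantineIntegrate(
  [{ name: 'core', reviewed: true }, { name: 'ui', reviewed: true }],
  [{ name: 'patch1', reviewed: true }, { name: 'patch2', reviewed: false }],
  function (change) { return !change.reviewed; }
);
// merged.integrated holds core, ui, and patch1; patch2 is quarantined.
```

The same shape applies whether the populations are document sections, feature branches, or individual units of code.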

Improving Outcomes by Applying the Quarantine Integration Model Throughout the Development of an Information System 

We will delve into the specifics of how to apply a model like this to control the development process in the next article. However, the type of control and prevention practices that are necessary when building an information system are different from what you might have seen in many large projects, such as the fictional one described above. Many projects undertaken by large corporations or governments attempt, with good intention, to prevent exposure to risks and defects by trying to define as many "requirements" and "design details" in large documents long before any of the software system is constructed. This is most often a mistake. It's a mistake, as we'll see, that goes back at least 42 years to 1970, but perhaps even further.

You probably remember that I earlier wrote:

Because information systems are designed by us, we have better internal control over the resulting behavior, and thus the healthy status, of information systems.

The key phrase there is "resulting behavior". What is unstated is that the process of creating that resulting behavior can itself take a very meandering path, one that is iterative (completed in multiple passes) and incremental (completed as a series of smaller divisions of a larger whole).

It's often said that an empirical model of process control is needed to properly manage this kind of creative, evolutionary process. 

Definition: Empirical Process Control Model 

The empirical model of process control provides and exercises control through frequent inspection and adaptation for processes that are imperfectly defined and generate unpredictable and unrepeatable outputs.

Notice that an empirical process control model is a lot like the scientific method. In the next article, we'll also discuss how scientific knowledge advances through iterative, incremental, and evolutionary spurts. For example: we all know that one woman's hypothesis and experiment would not overturn the germ-theory of disease if she claimed that illness was caused by another mechanism. 

Peer Review is the Hallmark of Sound Science (And Also of a Sound Information Systems Development Process)


In the case above, we know the scientist's ideas must face the rigor of the peer review system that is the hallmark of science. The peer review process is just one implementation of the "Quarantine Model of Integration" we just defined. And peer review is, in fact, the self-correcting mechanism built into the heart of science, which differentiates it from countless other "ways of knowing" that our human species has utilized and continues to utilize.

That peer-review system is also, naturally, at the heart of what CDC does in its constant effort to do sound science. And, as we'll preview next time, several types of peer review, and even-wider-review, are at the heart of any successful process for developing a winning, useful, and cost-effective information system.

Epideviology: the study of the causes of successful software development efforts, and the application of this study to increase success rates

Welcome to Epideviology!

We define epideviology as:
"the study of the causes of successful software development efforts, and the application of this study to increase success rates."
Epideviology is a play on the word epidemiology, the basic science of public health. The science of epidemiology, as defined by the CDC, is:
"the study of the distribution and determinants of health-related states in specified populations, and the application of this study to control health problems." 
Source: http://www.cdc.gov/excite/classroom/intro_epi.htm

Vision

While epidemiology deals with populations of biological objects, software systems consist of a large population of technological objects. These objects are designed and created by humans for human purposes. Thus, epideviology is concerned with how those populations of objects depend upon each other, interact with each other, and otherwise co-exist.

We believe that many of the lessons learned by public health scientists and epidemiologists can be applied to the process of developing software systems to keep those systems free of the "software diseases" like a high prevalence of defects or an inability to modify the program.

Please stay tuned as we develop these ideas through future blog posts, lessons learned, and code samples.