3.0 Introduction
Do the research results add up to something important or are the results trivial? For the results to be important, the study needs to have a narrow focus, it has to measure the right outcomes, and the change in the outcome has to be large from a clinical perspective.
3.1 Did they measure the right thing?
3.2 Did they measure it well?
3.3 Were the changes clinically important?
3.1 Did they measure the right thing (suitable outcomes)?
There's a well-known story about a man who was fumbling about in the middle of the street on a very dark night. A passerby stopped and asked what was going on. The man replied, "I dropped my keys and I can't find them." So the passerby agreed to help look for the lost keys. After half an hour, the passerby got frustrated and asked the man if he remembered exactly where he was standing when he dropped the keys. "Over in the alley there," came the response. The passerby looked with surprise and exasperation at the man. "Over in the alley? Then why are you looking out here in the middle of the street?" The man replied, "Because the light is better here."
3.1.1 Surrogate Measures
Patients are generally interested in one of four things: mortality (will I die?), morbidity (will I go blind?), symptoms (will I throw up?), or quality of life (will I be able to walk up a flight of steps without getting winded?). They don't care about the concentration of homocysteine in their blood, or what their CD4 cell count is, unless those values relate to something that is important to them.
Good research, then, should measure something that is important to the patient. There is an acronym for this, POEM, which stands for Patient Oriented Evidence that Matters (www.infopoems.com). Ideally, every research study would directly measure an outcome that matters to the patient. Direct measurements, though, are often difficult to obtain. So sometimes researchers will examine intermediate measures that are faster and easier to assess, but which may or may not be predictive of more important endpoints. These intermediate measures are called surrogate measures.
Some examples of surrogate measures are forced expiratory volume and premature ventricular contractions. These measures are not important to a patient in themselves, but only in their ability to predict events like asthma difficulties or recurrence of heart attacks.
Improvement in forced expiratory volume may not translate into a reduction in asthma attacks. A reduction in abnormal ventricular depolarization may not translate into a reduction in the recurrence of heart attacks.
You have to show a strong correlation between the surrogate measure and the patient-oriented outcome. If there is only a weak correlation, then establishing a large effect on the surrogate measure will not translate into a large effect on the patient-oriented outcome.
You also need to establish that changes in the surrogate measure lead to changes in the outcome of interest. The surrogate measure might be strongly correlated with the patient-oriented outcome but only because both are related to a third factor. That third factor might end up being the measure that you need to change, not the surrogate measure.
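The third-factor problem can be illustrated with a small simulation. This is only a sketch under assumed conditions, not data from any study: the variable names, the noise levels, and the size of the shift are all hypothetical. A hidden third factor drives both the surrogate and the outcome, so the two are strongly correlated, yet an intervention that shifts the surrogate directly leaves the outcome untouched.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean(values):
    return sum(values) / len(values)


def simulate(n=5000, seed=1, surrogate_shift=0.0):
    """Simulate n hypothetical patients.

    Both measures depend on a shared third factor plus independent noise.
    surrogate_shift models a treatment that acts on the surrogate only.
    """
    rng = random.Random(seed)
    surrogate, outcome = [], []
    for _ in range(n):
        third_factor = rng.gauss(0, 1)
        # The surrogate tracks the third factor; the treatment shifts it.
        surrogate.append(third_factor + rng.gauss(0, 0.5) + surrogate_shift)
        # The outcome depends on the third factor, never on the surrogate.
        outcome.append(third_factor + rng.gauss(0, 0.5))
    return surrogate, outcome


# Same seed, so the same hypothetical patients appear in both runs.
untreated_surrogate, untreated_outcome = simulate()
treated_surrogate, treated_outcome = simulate(surrogate_shift=-1.0)

print("correlation:", round(pearson(untreated_surrogate, untreated_outcome), 2))
print("surrogate change:", round(mean(treated_surrogate) - mean(untreated_surrogate), 2))
print("outcome change:", round(mean(treated_outcome) - mean(untreated_outcome), 2))
```

With these assumed noise levels the correlation comes out at roughly 0.8, which would look like a promising surrogate. The treatment lowers the surrogate by a full unit, but the average outcome does not move at all, because the outcome was never caused by the surrogate in the first place.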
Example: A study that showed an association between duration of breast feeding and brachial artery distensibility at 20 to 28 years of age (Leeson 2001) recognized that brachial artery distensibility is a surrogate outcome. Distensibility is a measure of stiffness, and could be considered a marker for cardiovascular disease in mid and later life. Such a link is tenuous, and the authors themselves, as well as an accompanying editorial (Booth 2001), admit that this does not establish a cause and effect relationship between breast feeding and heart disease.
Example: A study of chemotherapy for colorectal cancer (Buyse 2000) noted that tumor response was often used to assess the value of new treatments, but there was an uncertain connection between tumor response and mortality. The authors demonstrated through a meta-analysis that there was a link between tumor response and survival, but this link was weak. A 50% improvement in tumor response would only lead to a 6% change in the odds of death.
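To get a feel for how small a 6% change in odds is, it helps to convert odds back into a probability. The sketch below is purely illustrative: the baseline risk of 50% is a hypothetical value, not a figure from Buyse 2000, and the 6% change is assumed to be a reduction (an odds ratio of 0.94).

```python
def risk_to_odds(risk):
    """Convert a probability (0 to 1, exclusive of 1) into odds."""
    return risk / (1 - risk)


def odds_to_risk(odds):
    """Convert odds back into a probability."""
    return odds / (1 + odds)


def risk_after_odds_change(baseline_risk, odds_ratio):
    """Apply an odds ratio to a baseline risk and return the new risk."""
    return odds_to_risk(risk_to_odds(baseline_risk) * odds_ratio)


baseline_risk = 0.50  # hypothetical baseline risk of death, for illustration
odds_ratio = 0.94     # assumed reading of "a 6% change in the odds of death"

new_risk = risk_after_odds_change(baseline_risk, odds_ratio)
print(f"risk of death: {baseline_risk:.3f} -> {new_risk:.3f}")
```

Under these assumptions the risk of death moves from 50% to about 48.5%, a change of roughly one and a half percentage points in return for a 50% improvement in tumor response.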
In contrast, a study of cholesterol lowering drugs (Law 2003) showed a significant decrease in LDL cholesterol and tied that lowering to a decreased risk of heart attacks and strokes. A 1.8 mmol/l change, for example, was achieved and could be linked to a 61% reduction in the risk of ischemic heart disease and a 17% reduction in the risk of stroke.
You also need to assure yourself that the measure is sensitive to changes associated with improvement in health. There is a wide range of measures of pulmonary function, for example, and some are more responsive than others to changes in health (de Torres 2002).
3.1.2 Short term changes in outcome
Perhaps it is just human nature, but we are all impatient and we want to focus on the short term and the immediate. That's true for researchers also. They want to do the research, publish it, and move on as quickly as possible. Using a short term outcome measure facilitates this way of life. I'm sure that budgetary constraints have something to do with this as well.
The problem with the focus on short term outcomes is that it is usually easier to get a short term change, but that's not what is really important from a clinical perspective. It's easy, for example, to get a smoker to quit smoking for a day, or maybe even a week. But most interventions that try to help people quit smoking don't work as well for keeping people off cigarettes for three months or for two years. Pretty much any diet works well in the first week or so. People will lose a few pounds right away. But can people continue to lose weight and maintain that weight loss for a full year? That's a much harsher but much more realistic test of the value of a diet.
Example: A study of a youth tobacco education program (Mahoney 2002) looked at immediate recall and recall four months later of the knowledge and attitudes that this program was trying to reinforce. Although most concepts were retained for the short term, only two, "recognition that smokers have yellow teeth and fingers" and "smoking one pack of cigarettes a day costs several hundred dollars per year," were retained at the four month evaluation.
3.1.3 Multiple outcome measures
The presence of a narrowly drawn research plan developed prior to the start of data collection adds a great deal to the credibility of a study. In contrast, a scattershot approach will dilute the credibility of the research. There is a saying in statistics circles: "If you torture your data long enough, it will confess to something."
Example: A study of the relationship between childhood cancer and diet (Sarasua 1994) examined five different types of meat consumption (ham/bacon/sausage, hot dogs, hamburgers, lunch meats, and charcoal broiled foods), two different types of cancer (acute lymphocytic leukemia and brain tumor), and considered diet both of the child and of the mother during pregnancy. This led to 20 different combinations of these factors. In addition, the authors provided additional discussion using a different definition of high and low consumption. High consumption of hot dogs, for example, was defined as one or more hot dogs per week, but later results defining high consumption as two or more hot dogs were described.
A good research study has limited objectives that are specified in advance. There is solid empirical evidence that specifying a hypothesis prior to data collection reduces the chances of a false positive finding by a factor of three (Swaen 2001). Failure to limit the scope of a study leads to problems with multiple testing.
There are good reasons to look at multiple outcomes when you are trying to explore a new area. The results of this exploratory analysis would then provide justification and focus to a second study that would replicate the results. Looking at multiple outcomes is also fine if there are several distinct dimensions, like efficacy and side effects, that need to be evaluated. But looking at multiple outcome measures just because you can leads to a "fishing expedition," a study that looks at a large number of exposures or a large number of outcomes without any effort to prioritize.
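The arithmetic behind the multiple testing problem is easy to sketch. The calculation below rests on assumptions that do not come from any of the studies cited here: each comparison is tested at a significance level of 0.05, the tests are independent, and none of the effects are real. Under those assumptions, it shows how quickly the chance of at least one false positive grows, and what a Bonferroni correction (one common, conservative remedy) would require.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n_tests independent
    tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the overall false positive
    rate at or below alpha (Bonferroni correction)."""
    return alpha / n_tests


for n_tests in (1, 5, 20):
    rate = familywise_error_rate(n_tests)
    threshold = bonferroni_threshold(n_tests)
    print(f"{n_tests:2d} tests: chance of a false positive = {rate:.2f}, "
          f"Bonferroni per-test level = {threshold:.4f}")
```

With 20 comparisons, the chance of at least one spurious "significant" finding is about 64%, even if nothing real is going on. Real comparisons are rarely independent, so the exact figure will differ, but the direction of the problem is the same.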
3.1.4 Subgroup comparisons
Examining a larg


