This article discusses: 1. Selection of Cases and Controls 2. Comparability of Cases and Controls 3. Obtaining Information about Factor of Interest 4. Analysis of Case-Control Study.
1. Selection of Cases and Controls:
In addition to being clear and concise, the criteria required to be a case should be highly specific in order to exclude false-positive units. If, by design, certain types of sampling units are to be excluded from the case group (e.g., cases with known causes other than the factor of interest), these units should be excluded from the study altogether; they should not be included in the control group even if not diseased.
In most studies, lists of cases are obtained from one or more clinics or diagnostic laboratories. Except for specified exclusions, all cases first diagnosed in a specified time period can be included in the study.
Usually there are a very large number of potential controls. If little or no effort is required to obtain the history of exposure to the factor(s) of interest, then all non-cases or all non-cases with specified other diseases may be used as controls.
Whether explicit sampling of non-cases is used depends on the time and expense required to obtain the factor status for each unit selected. When sampling from a large number of potential controls, random or random systematic selection is preferred, provided no matching of cases and controls is to be used.
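As an illustration of random systematic selection, the following is a minimal Python sketch, not a prescribed procedure. It assumes the potential controls are available as an ordered list (for example, clinic accession order); the function name `systematic_sample` and the unit identifiers are hypothetical.

```python
import random

def systematic_sample(units, n_wanted, rng=None):
    """Take every k-th unit after a random start (random systematic selection)."""
    rng = rng or random.Random()
    if n_wanted <= 0 or not units:
        return []
    k = max(1, len(units) // n_wanted)  # sampling interval
    start = rng.randrange(k)            # random start within the first interval
    return units[start::k][:n_wanted]

# Hypothetical list of 200 potential controls, from which 20 are wanted.
potential_controls = [f"cow_{i:03d}" for i in range(1, 201)]
controls = systematic_sample(potential_controls, 20, random.Random(1))
```

The random start gives every unit the same chance of selection, while the fixed interval spreads the controls evenly over the list.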
When both of the study groups are obtained by purposive selection from laboratory or clinic records, the cases and/or controls may not be representative of all cases and non-cases in the source population.
In particular, the prevalence of the factor(s) of interest in the available controls may not reflect its prevalence in the source population, as it must if valid estimates of the importance of the association are desired.
If there is doubt about the representativeness of the cases and/or controls, additional data should be obtained to help evaluate the situation. Unfortunately, in practice only qualitative data are readily available to test how representative the groups are, and these deficits should be borne in mind when interpreting and extrapolating the results.
A particular form of unrepresentative sample that gives rise to biased estimates of association arises when the rate of admission to the laboratory or clinic is associated with both the factor(s) of interest and the disease status. When these records are used in a subsequent study, the differential admission rate acts as a confounding variable and can bias the estimate of the association between the factor(s) and disease.
This phenomenon is often called Berkson’s fallacy after the person initially describing it. A classic example of Berkson’s fallacy occurred in a study of the association between cancer and tuberculosis based on human autopsies. The initial study results indicated less tuberculosis in autopsied cancer victims than in autopsied people dying from diseases other than cancer, thus suggesting a sparing effect of tuberculosis on cancer.
It was later found that the autopsy series contained a disproportionately large number of tuberculosis cases because the latter were more likely to be autopsied, and when this was taken into account the association between tuberculosis and cancer disappeared.
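The mechanism can be illustrated with a small numerical sketch in Python. All numbers are hypothetical, and the admission model is an assumption made for illustration: a unit can be admitted independently because of the factor, because of the disease, or for other reasons. The source population is constructed so that factor and disease are unrelated (odds ratio = 1).

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed cases, exposed non-cases; c, d = unexposed cases, non-cases."""
    return (a * d) / (b * c)

def admission_probability(exposed, diseased, p_factor, p_disease, p_other):
    """Chance of admission when the three reasons for admission act independently."""
    p_not_admitted = 1 - p_other
    if exposed:
        p_not_admitted *= 1 - p_factor
    if diseased:
        p_not_admitted *= 1 - p_disease
    return 1 - p_not_admitted

# Hypothetical source population, keyed by (exposed, diseased).
source = {(1, 1): 100, (1, 0): 900, (0, 1): 100, (0, 0): 900}

# Expected numbers reaching the clinic under assumed admission rates.
hospital = {
    cell: n * admission_probability(cell[0], cell[1], 0.3, 0.5, 0.1)
    for cell, n in source.items()
}

source_or = odds_ratio(source[1, 1], source[1, 0], source[0, 1], source[0, 0])
hospital_or = odds_ratio(hospital[1, 1], hospital[1, 0], hospital[0, 1], hospital[0, 0])
print(f"source OR = {source_or:.2f}, hospital OR = {hospital_or:.2f}")
```

Under these assumed rates, the records-based odds ratio falls well below 1 although no association exists in the source population. This is the same kind of spurious sparing effect seen in the autopsy series.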
Documented instances of Berkson’s fallacy in veterinary medicine are rare; however, the effects of differential admission rates may have been observed, using hospital records, in a case-control study of the relationship between clinical mastitis and age of dairy cows.
No association between age and mastitis was found in the case-control study; yet in a subsequent longitudinal study in the population of cows giving rise to the data for the case-control study, the rate of mastitis was found to increase significantly with the cow’s age.
The difference in results was due to the fact that many diseases of dairy cows increase in frequency with age, and thus the population of cows with diseases (the hospital population) was older than the average age in the source population.
Hence, only diseases whose frequency increased with age more rapidly than the average of other diseases were observed to have a significant association with age in the case-control study. Thus in this example, submission rate for diagnosis was related to both age and diagnosis and biased the association between these two variables.
The likelihood of admission rate bias can be assessed by comparing the characteristics of the control group(s) to independent samples from the source population; if the control group and population appear to have similar distributions with respect to a number of factors, admission rate bias is unlikely.
Also, the probability of admission bias occurring may be reduced by selecting controls from all available non-cases. It may be slight comfort that the majority of case-control study results apparently have not been unduly affected by this phenomenon.
In some studies (e.g., the association between lung cancer and smoking based on hospitalized patients), when the effects of admission rate are removed, the association between smoking and lung cancer becomes stronger because smokers are more likely to be hospitalized than non-smokers, and lung cancer patients are more likely to be hospitalized than non-lung cancer patients.
Thus the observed association based on hospital data is weaker than the association in the source population. Further, admission rate biases are unlikely to explain strong associations (relative risk > 3) and are unlikely to explain a gradient of risk with different levels of exposure. This is an additional reason for inclusion of these two items when considering the likelihood that an observed association is causal.
When using non-case patients from a clinic as controls, it is advisable to select the controls from all non-case patients rather than a specific subset of other diagnoses. It is possible to select different sets of controls from a number of diagnostic categories — one set from all non-case patients and another from patients with diseases X, Y, or Z.
When this is done, it is advisable to record biologically reasonable interpretations for all possible associations prior to conducting the study. Often, logical explanations for some possible differences in associations between different control groups are not apparent, and the investigator should reconsider the selection of controls.
The use of controls selected from the source population is another way of circumventing the problem of admission rate bias. Population-based controls are particularly useful when the list of cases represents essentially all cases in a defined population (such as all infected farms in a county) or all cases of a disease in a set of farms serviced by a veterinary practice.
When selecting controls from defined populations, make every reasonable effort to maximize collaboration among potential controls; otherwise, non-response may bias the results in a manner similar to differential admission rates.
If genetic comparability between cases and controls is desirable for the study, relatives of the case may be selected as controls. However, since siblings tend to share similar environments, their selection will indirectly make the environment of cases and controls more comparable, and this is not always desirable.
In selecting siblings as controls it is important to select a fixed number of controls per case and to exclude those cases where this ratio cannot be obtained. Otherwise large sibling groups may bias the results. Usually one would not select relatives of cases if the factor of interest is related to genotype (e.g., if the factor was phenotype).
If environmental comparability is required, controls may be selected from the same original source as the cases (i.e., from the same farm or kennel). Again, cases and controls should be selected in a fixed ratio to ensure that larger farms or kennels do not bias the results. (This was also noted when the example of ureaplasma and infertility was introduced.)
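Fixed-ratio selection within the same farm or kennel might be sketched as follows in Python. This is only an illustration: the function `select_fixed_ratio`, the farm names, and the animal identifiers are hypothetical, and a ratio of two controls per case is an arbitrary choice.

```python
import random

def select_fixed_ratio(cases_by_farm, noncases_by_farm, ratio=2, rng=None):
    """Pair each case with `ratio` randomly chosen non-cases from its own farm.

    Cases for which the ratio cannot be met are excluded, so that large
    farms (or large sibling groups) do not contribute extra controls.
    """
    rng = rng or random.Random()
    matched = []
    for farm, cases in cases_by_farm.items():
        pool = list(noncases_by_farm.get(farm, []))
        rng.shuffle(pool)
        for case in cases:
            if len(pool) < ratio:
                break  # ratio cannot be met: drop the remaining cases on this farm
            matched.append((case, [pool.pop() for _ in range(ratio)]))
    return matched

cases_by_farm = {"farm_A": ["a1", "a2"], "farm_B": ["b1"]}
noncases_by_farm = {"farm_A": ["a3", "a4", "a5", "a6", "a7"], "farm_B": ["b2"]}
matched = select_fixed_ratio(cases_by_farm, noncases_by_farm, ratio=2,
                             rng=random.Random(0))
```

Here both cases on farm_A receive two controls each, while the case on farm_B is excluded because only one non-case is available.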
2. Comparability of Cases and Controls:
Theoretically, the cases and controls should be similar in all respects except for the disease (dependent variable) being investigated. Of course, they would also differ with respect to the exposure factor if it were associated with the disease.
One indication of comparability of groups is a similar response (collaboration) rate in both groups. Very different response rates should lead to skepticism about the validity of results, particularly if the overall response rate is low (less than 75-80%).
In practice, the cases and controls may differ in many ways, and two commonly used methods to increase the comparability of groups are analytic control and matching. Restricted selection (e.g., only selecting cows between 4 and 7 years of age) also tends to make the groups more similar, since the restriction applies to both the cases and controls.
In analytic control, data on ancillary factors are obtained and appropriate statistical methods (such as the Mantel-Haenszel technique) are used to prevent distortion of results from extraneous factors.
Host factors are frequently confounding variables and should be included in the list of ancillary factors if it is known that the risk of disease is influenced by them. If the list of ancillary factors is long, complex analytic methods beyond the scope of this text (such as logistic regression) may be required for analysis of the data.
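To show what analytic control does, the sketch below computes the Mantel-Haenszel summary odds ratio, the sum of a·d/n over strata divided by the sum of b·c/n, for data stratified on one ancillary factor. The counts and the two age strata are hypothetical. Each stratum is a tuple (a, b, c, d) = (exposed cases, exposed controls, unexposed cases, unexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2 x 2 table."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata of a confounder."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical data stratified by age group (young cows, old cows).
strata = [(20, 10, 10, 10), (10, 20, 10, 40)]

crude = odds_ratio(*[sum(cell) for cell in zip(*strata)])  # strata collapsed
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")
```

In this constructed example the odds ratio is 2.0 within each stratum, but collapsing the strata gives 2.5. The difference between the crude and the adjusted value is the distortion attributable to the ancillary factor.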
Matching may be used to increase the similarity of cases and controls. The characteristics of each case with regard to potential confounding factors are noted, and a control is sought with the same characteristics. In most studies the number of factors that can be matched is small (perhaps two or three); otherwise it becomes difficult to identify controls with the required characteristics.
In case-control studies, only factors known to be associated with the risk of disease should be included as matching factors. It is a peculiarity of case-control studies that overmatching (matching for non-causal factors) may reduce the ability (power) to detect true associations between the factor and disease. If one wishes to study the effect of an extraneous factor, it is necessary to use analytic control rather than matching, since the effects of matched factors cannot be studied.
As an example, matching was used in a study of factors related to mycoplasma mastitis in dairy herds. Two sources of control herds were used, one matching on size of herd, the other on level of milk production. See Table 6.5.
3. Obtaining Information about Factor of Interest:
A major objective in case-control studies is to collect accurate, unbiased information about the factor of interest. To assist in this, data should be obtained in the same manner and with the same rigor from both cases and controls. One way of ensuring equal rigor is to keep the investigator blind to the disease status and/or to keep the respondent unaware of the exact reason for the study.
To test its validity, the information collected may be compared with the data in other records or the results of selected tests. This is very similar to evaluating a screening test. If the sensitivity and specificity of the question are equal in both cases and controls, errors may reduce the apparent strength of the association, but they will not falsely inflate it.
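The effect of equal error rates in both groups can be checked with expected counts. The Python sketch below uses hypothetical numbers: a true exposure history in 60 of 100 cases and 30 of 100 controls, and a question with an assumed sensitivity of 0.8 and specificity of 0.9 in both groups.

```python
def observed_counts(true_exposed, true_unexposed, sensitivity, specificity):
    """Expected numbers classified as exposed and unexposed by an imperfect question."""
    observed_exposed = (true_exposed * sensitivity
                        + true_unexposed * (1 - specificity))
    observed_unexposed = true_exposed + true_unexposed - observed_exposed
    return observed_exposed, observed_unexposed

def odds_ratio(a, b, c, d):
    """a, b = exposed cases, exposed controls; c, d = unexposed cases, controls."""
    return (a * d) / (b * c)

sensitivity, specificity = 0.8, 0.9  # assumed equal in cases and controls

case_exp, case_unexp = observed_counts(60, 40, sensitivity, specificity)
ctrl_exp, ctrl_unexp = observed_counts(30, 70, sensitivity, specificity)

true_or = odds_ratio(60, 30, 40, 70)
observed_or = odds_ratio(case_exp, ctrl_exp, case_unexp, ctrl_unexp)
print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

With these assumed values the observed odds ratio lies between 1 and the true odds ratio: the association is weakened, not exaggerated.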
4. Analysis of Case-Control Study:
The proportions being compared (the proportion of cases that are exposed and the proportion of non-cases that are exposed) in the case-control study should be calculated and displayed together with the results of statistical analysis and the appropriate epidemiologic measures of association (Table 6.6).
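For a single 2 x 2 table, these quantities can be computed as in the following Python sketch. The counts are hypothetical and are not taken from Table 6.6. The choice of the Pearson chi-square statistic (without continuity correction) and of Woolf's logarithmic method for the confidence interval is an assumption made here for illustration; other methods may be equally appropriate.

```python
import math

def analyse_2x2(a, b, c, d, z=1.96):
    """a, b = exposed cases, exposed controls; c, d = unexposed cases, controls."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = (n * (a * d - b * c) ** 2
                  / ((a + b) * (c + d) * (a + c) * (b + d)))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf's method
    return {
        "prop_cases_exposed": a / (a + c),
        "prop_controls_exposed": b / (b + d),
        "odds_ratio": odds_ratio,
        "chi_square": chi_square,  # 1 degree of freedom
        "ci_lower": math.exp(math.log(odds_ratio) - z * se_log_or),
        "ci_upper": math.exp(math.log(odds_ratio) + z * se_log_or),
    }

# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
result = analyse_2x2(40, 20, 60, 80)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes that no cell is zero; with sparse tables an exact method would be preferable.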
If the factor has more than two levels on the nominal scale (e.g., breeds), the level of factor that makes the most biologic or practical sense should be chosen as the reference group. If the factor is ordinal in type, the non-exposed or least exposed group may be used as the reference group (odds ratio = 1).
A series of 2 x 2 tables each containing the referent group is constructed, and the strength of association assessed in the usual manner. As an example, the referent group in a study of the association between breed and hip dysplasia in dogs was “other breeds.” This group consisted of a number of crossbred dogs and a number of breeds having only a few dogs each (see Table 6.7).
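The series of 2 x 2 tables can be sketched in Python as below. The breed labels and counts are hypothetical placeholders and are not the data of Table 6.7; each level of the factor is compared in turn with the referent group.

```python
def odds_ratios_vs_referent(counts, referent):
    """Odds ratio of each level against the referent level.

    counts maps each level to (number of cases, number of controls).
    """
    ref_cases, ref_controls = counts[referent]
    result = {}
    for level, (cases, controls) in counts.items():
        if level == referent:
            result[level] = 1.0  # referent group, by definition
        else:
            # 2 x 2 table: this level versus the referent level
            result[level] = (cases * ref_controls) / (controls * ref_cases)
    return result

# Hypothetical counts of (cases, controls) per breed.
breed_counts = {
    "other breeds": (20, 80),
    "breed_A": (30, 40),
    "breed_B": (10, 80),
}
breed_ors = odds_ratios_vs_referent(breed_counts, referent="other breeds")
for breed, value in breed_ors.items():
    print(f"{breed}: OR = {value:.2f}")
```

Because every table shares the same referent group, the resulting odds ratios are directly comparable with one another.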