Epi Info 7.2

Background | Test for Interaction | Mantel-Haenszel Methods
When Interaction and Confounding Are Minimal | Strategy for Analysis | Exercises

This chapter considers the analysis of a binary outcome (disease D) and binary exposure (exposure E) with data stratified according to an extraneous cofactor (cofactor C). Two phenomena -- confounding and statistical interaction -- are considered.


Confounding

Confounding (from the Latin confundere: to mix together) is a distortion of an association between E and D brought about by a cofactor C. Confounding occurs when E is associated with C and C is an independent risk factor for D. In addition, C is not intermediate in the causal pathway.

For example, smoking (C) confounds the relation between alcohol consumption (E) and lung cancer (D) because alcohol users are more likely to smoke than non-users. Thus, the effect of smoking gets mixed in with the effect of alcohol consumption: smoking confounds the association between alcohol consumption and lung cancer.

One way to address confounding is to subset data into relatively homogeneous subgroups ('strata') according to the confounding cofactor. Not surprisingly, data can show one thing in aggregate form and another once disaggregated.

Measures of association in the aggregate are called crude measures of association since relations are unadjusted. Let us precede symbols for measures of association with a c when referring to crude measures of association. For example, cRR will represent the crude risk ratio (i.e., the risk ratio based on all data combined in a single 2-by-2 table).

Subscripts will denote strata-specific measures of association. For example, RR1 will represent the risk ratio in stratum 1, RR2 will represent the risk ratio in stratum 2, and so on.

Suppose, in the aggregate, we see the following crude data:

          D+     D-   Total
E+       200    800    1000
E-        50    950    1000
Total    250   1750    2000

Therefore, p1 = 200 / 1000 = .20, p2 = 50 / 1000 = .05, and cRR = .20 / .05 = 4.0.

Now suppose we stratify by confounding factor C. In stratum 1 (positive for C) we find:

          D+     D-   Total
E+       194    606     800
E-        24     76     100
Total    218    682     900

In this stratum, p1,1 = 194 / 800 = .2425, p2,1 = 24 / 100 = .24, and RR1 = .2425 / .24 ≈ 1.0.

In stratum 2 (negative for factor C) we find:

          D+     D-   Total
E+         6    194     200
E-        26    874     900
Total     32   1068    1100

In this stratum, p1,2 = 6 / 200 = .03, p2,2 = 26 / 900 = .0289, and RR2 = .03 / .0289 ≈ 1.0.

Therefore, the strong positive association seen in the aggregate disappears within the subgroups. This indicates that C confounded the association between E and D in the aggregate.
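The arithmetic in this demonstration can be checked with a few lines of Python. This is only a sketch: the function name risk_ratio and the cell labels a, b, c, d are illustrative assumptions, not part of Epi Info.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio for a 2-by-2 table laid out as

              D+   D-
        E+     a    b
        E-     c    d
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Counts copied from the hypothetical tables above.
crude = risk_ratio(200, 800, 50, 950)
stratum_1 = risk_ratio(194, 606, 24, 76)   # positive for C
stratum_2 = risk_ratio(6, 194, 26, 874)    # negative for C

print(f"cRR = {crude:.2f}, RR1 = {stratum_1:.2f}, RR2 = {stratum_2:.2f}")
```

The crude ratio is far from 1 while both stratum-specific ratios sit close to 1, which is the pattern described in the text.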

Statistical Interaction (Effect-Measure Heterogeneity)

The term 'interaction' has two distinct meanings in epidemiology. Biological interaction is the interdependent operation of two or more factors in a cause. There is always biological interaction in epidemiologic data. Statistical interaction occurs when the statistical model being used does not explain the joint effects of two or more independent variables. Biological interaction and statistical interaction are two distinct phenomena that should not be confused. Here, we consider statistical interaction only.

Statistical interaction is synonymous with effect-measure heterogeneity. In epidemiology, this occurs when the value of the effect measure being used (e.g., the risk ratio) differs among subgroups. A numerical example will illustrate.

Once again we may start with the crude (unstratified) data:

          D+     D-   Total
E+       200    800    1000
E-        50    950    1000
Total    250   1750    2000

Again, p1 = 200 / 1000 = .20, p2 = 50 / 1000 = .05, and cRR = .20 / .05 = 4.0.

Suppose, on stratification, we find:

Stratum 1 (negative for C)

          D+     D-   Total
E+        12    188     200
E-        48    752     800
Total     60    940    1000

Therefore, p1,1 = 12 / 200 = .06, p2,1 = 48 / 800 = .06, and RR1 = .06 / .06 = 1.0.

Stratum 2 (positive for C)

          D+     D-   Total
E+       188    612     800
E-         2    198     200
Total    190    810    1000

Therefore, p1,2 = 188 / 800 = .2350, p2,2 = 2 / 200 = .01, and RR2 = .235 / .01 = 23.5.

Because the risk ratio is heterogeneous in the two strata, we say there is a statistical interaction between E and C as relates to D.
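A short Python sketch makes the contrast with the confounding example concrete. It collapses the two strata back into the crude table by adding them cell by cell, then compares the stratum-specific risk ratios. The dictionary layout and variable names are assumptions made for illustration.

```python
# Each table is (a, b, c, d): exposed D+, exposed D-, unexposed D+, unexposed D-.
strata = {
    "negative for C": (12, 188, 48, 752),
    "positive for C": (188, 612, 2, 198),
}


def risk_ratio(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


# Collapsing over C means adding the stratum tables cell by cell.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))

stratum_rr = {name: risk_ratio(*table) for name, table in strata.items()}
crude_rr = risk_ratio(*crude_table)

for name, rr in stratum_rr.items():
    print(f"RR ({name}) = {rr:.1f}")
print(f"cRR = {crude_rr:.1f}")
```

Here the stratum-specific ratios differ sharply from each other, so no single number (crude or pooled) describes the effect of E in both subgroups.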

The above demonstrations suggest a strategy for dealing with extraneous factors. In essence, data are explored through stratification.

Illustrative Data Set (SEXBIAS.REC)

To illustrate methods in this chapter, let us consider a data set that demonstrates both interaction and confounding. Data were collected as part of a University of California at Berkeley study to assess whether men were being given preferential treatment over women in admission to graduate programs (Bickel & O'Connell, 1975, Freedman et al., 1991, pp. 16 - 19). Assuming that the men and women who applied for admission to the graduate programs were equally well-qualified, one would expect equal acceptance rates by gender. However, it initially appeared as if men were being admitted in greater proportions than women. Hence, the investigation.

The experience of applicants to the six largest majors at the school is stored in SEXBIAS.ZIP. This data set contains 4526 records and the following variables:
Variable   Type     Len   Description
MAJOR      Alpha    9     Department major: A, B, C, D, E, and F
SEX        Alpha    9     1 = Male, 2 = Female
ACCEPT     Yes/no   1     Application accepted: +/-

Crude analysis (TABLES SEX ACCEPT) derives:

ACCEPT
SEX + - | Total
-----------+---------------+------
1 | 1198 1493 | 2691 Acceptance rate, men = 1198 /2691 = 0.445
2 | 557 1278 | 1835 Acceptance rate, women = 557 / 1835 = 0.304
-----------+---------------+------ RR = 0.445 / 0.304 = 1.46 (1.47 before rounding the rates)
Total | 1755 2771 | 4526 P < 0.00001

Therefore, men appear to have a higher acceptance rate than women (evidence that would support a claim of preferential treatment). However, what if men had applied to majors with more favorable acceptance rates than the majors to which women applied? Then the cofactor MAJOR would confound the observed relation. To investigate this possibility, data are stratified by MAJOR.

Stratification

Table stratification is accomplished with the command:

EPI6> TABLES <E> <D> <C>


For the illustrative example, the following command is issued:

EPI6> TABLES SEX ACCEPT MAJOR

This produces separate tables for each of the 6 majors. Annotated output is shown below:

MAJOR =A
ACCEPT
SEX | + - | Total
-----------+-------------+------
1 | 512 313 | 825 Acceptance rate, men = 512 / 825 = 0.621
2 | 89 19 | 108 Acceptance rate, women = 89 / 108 = 0.824
-----------+-------------+------ RR = 0.621 / 0.824 = 0.75
Total | 601 332 | 933 p = 0.000033

MAJOR =B
ACCEPT
SEX | + - | Total
-----------+-------------+------
1 | 353 207 | 560 Acceptance rate, men = 353 / 560 = 0.630
2 | 17 8 | 25 Acceptance rate, women = 17 / 25 = 0.680
-----------+-------------+------ RR = 0.630 / 0.680 = 0.93
Total | 370 215 | 585 p = 0.61

MAJOR =C
ACCEPT
SEX | + - | Total
-----------+-------------+------
1 | 120 205 | 325 Acceptance rate, men = 120 / 325 = 0.369
2 | 202 391 | 593 Acceptance rate, women = 202 / 593 = 0.341
-----------+-------------+------ RR = 0.369 / 0.341 = 1.08
Total | 322 596 | 918 p = 0.39

MAJOR =D
ACCEPT
SEX | + - | Total
-----------+-------------+------
1 | 138 279 | 417 Acceptance rate, men = 138 / 417 = 0.331
2 | 131 244 | 375 Acceptance rate, women = 131 / 375 = 0.349
-----------+-------------+------ RR = 0.331 / 0.349 = 0.95
Total | 269 523 | 792 p = 0.59

MAJOR =E
ACCEPT
SEX | + - | Total
-----------+-------------+------
1 | 53 138 | 191 Acceptance rate, men = 53 / 191 = 0.277
2 | 94 299 | 393 Acceptance rate, women = 94 / 393 = 0.239
-----------+-------------+------ RR = 0.277 / 0.239 = 1.16
Total | 147 437 | 584 p = 0.32

MAJOR =F
ACCEPT
SEX | + - | Total
-----------+-------------+------
1 | 22 351 | 373 Acceptance rate, men = 22 / 373 = 0.059
2 | 24 317 | 341 Acceptance rate, women = 24 / 341 = 0.070
-----------+-------------+------ RR = 0.059 / 0.070 = 0.84
Total | 46 668 | 714 p = 0.54

Therefore, only Major A demonstrates a significant difference in acceptance rates by sex, and this difference favors women (0.824 versus 0.621). Notice that the initial crude analysis hid this pattern, a reversal known as Simpson's paradox. It is now evident that MAJOR confounds the relation between SEX and ACCEPT, and that there is an interaction between SEX and MAJOR.
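The annotated rates and risk ratios can be re-derived outside Epi Info. The following Python sketch uses the cell counts printed in the stratified output; the names majors and acceptance_summary are illustrative assumptions.

```python
# (men accepted, men rejected, women accepted, women rejected) per major,
# copied from the stratified tables above.
majors = {
    "A": (512, 313, 89, 19),
    "B": (353, 207, 17, 8),
    "C": (120, 205, 202, 391),
    "D": (138, 279, 131, 244),
    "E": (53, 138, 94, 299),
    "F": (22, 351, 24, 317),
}


def acceptance_summary(a, b, c, d):
    """Return (rate in men, rate in women, risk ratio of men versus women)."""
    rate_men = a / (a + b)
    rate_women = c / (c + d)
    return rate_men, rate_women, rate_men / rate_women


for major, counts in majors.items():
    men, women, rr = acceptance_summary(*counts)
    print(f"Major {major}: men {men:.3f}, women {women:.3f}, RR {rr:.2f}")

# Adding the six tables cell by cell recovers the crude table.
crude_counts = tuple(sum(column) for column in zip(*majors.values()))
crude_rr = acceptance_summary(*crude_counts)[2]
print(f"Crude: RR {crude_rr:.2f}")
```

Summing the strata reproduces the crude table (1198, 1493, 557, 1278), which confirms that the stratified output and the crude output describe the same 4526 applicants.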

A chi-square test for interaction may be used to help assess whether effect-measure heterogeneity is present. Because this test applies to both risk ratios and odds ratios (and other measures of association), let MA refer to the measure of association parameter being studied. The null and alternative hypotheses are:

H0: MA1 = MA2 = . . . = MAS (no interaction)
H1: at least one of the strata-specific measures of association differs (interaction)

The method of calculating the chi-square interaction statistic in Epi Info is unspecified, but it is assumed to be a general Wald statistic (see Epidemiology Kept Simple Formula 15.1). Under the null hypothesis, this chi-squared interaction statistic has S - 1 degrees of freedom, where S represents the number of strata being tested.


Illustrative example. In SEXBIAS.REC we test H0: RR1 = RR2 = RR3 = RR4 = RR5 = RR6. Results, printed in the summary section of the stratified output, are:

Chi Square for evaluation of interaction 18.10
P value 0.00282859

Since there are 6 strata, df = 5. This, along with the divergent incidence (risk) ratio in stratum 1 (Major A), suggests that statistical interaction is present.
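The p value that accompanies a chi-square statistic can be checked by hand. The sketch below computes the upper-tail probability using only the Python standard library; chi2_sf is a hypothetical helper name. It is not a reconstruction of how Epi Info computes the interaction statistic itself, which the text notes is unspecified.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    # Closed forms for the two base cases.
    if df % 2:                        # odd df: start from df = 1
        tail, k = math.erfc(math.sqrt(half)), 1
    else:                             # even df: start from df = 2
        tail, k = math.exp(-half), 2
    # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


p_interaction = chi2_sf(18.10, df=5)
print(f"p = {p_interaction:.4f}")
```

Because the printed statistic (18.10) is rounded, the result agrees with the Epi Info p value only to about three significant digits.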

It is often advantageous to summarize the relation being studied with a single, unconfounded measure of association and tests. This can be accomplished by pooling unconfounded strata-specific measures of association to form a summary measure of association.

Summary Measure of Association

The Mantel-Haenszel method pools strata-specific estimates as a weighted average, with weights proportional to N1*N2/N, where N represents the total number of people in the stratum (Cochran 1954; Mantel & Haenszel 1959). This assumes the measures of association are uniform among strata. This homogeneity assumption allows us to combine strata-specific measures of association to form a single summary measure that has been adjusted for confounding. Any non-uniformity is suppressed through summarization. The pooled measure of association may be viewed as a statistical convenience whose purpose is to help draw correct conclusions about the effect of the exposure.

Illustrative Example (SEXBIAS.REC). By suppressing the non-uniformity of the incidence (risk) ratios in SEXBIAS.REC, we find:

SUMMARY RISK RATIO (RR)
Crude RR without stratification 1.47
Summary RR of (ACCEPT=+) for (SEX=1) 0.94
95% confidence limits for RR 0.87 < RR < 1.03

Comments:
(1) The crude RR estimate of 1.47 indicates higher acceptance for men, whereas the summary estimate of 0.94 indicates slightly higher acceptance rates in women. Thus, the summary RR is an unconfounded estimate of the effect of gender on acceptance to graduate school at UC Berkeley.
(2) The 95% confidence interval for the summary RR is calculated using the method in Robins et al., 1986.
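The summary RR can be reproduced with the standard Mantel-Haenszel risk ratio estimator, sum(a*n0/N) / sum(c*n1/N), where n1 and n0 are the exposed and unexposed row totals in each stratum. The Python sketch below assumes this textbook form; it is not taken from Epi Info's source, and the names sexbias and mantel_haenszel_rr are illustrative.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    strata: iterable of (a, b, c, d) = exposed D+, exposed D-,
            unexposed D+, unexposed D-.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n     # a * n0 / N
        denominator += c * (a + b) / n   # c * n1 / N
    return numerator / denominator


# SEXBIAS counts by major: men accepted, men rejected, women accepted, women rejected.
sexbias = [
    (512, 313, 89, 19),
    (353, 207, 17, 8),
    (120, 205, 202, 391),
    (138, 279, 131, 244),
    (53, 138, 94, 299),
    (22, 351, 24, 317),
]

summary_rr = mantel_haenszel_rr(sexbias)
print(f"Summary RR = {summary_rr:.2f}")
```

The confidence limits are not reproduced here, since they rely on the Robins et al. variance formula cited in comment (2).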

Mantel-Haenszel Summary Test Statistic

A test of H0: aMA = 1 (where aMA represents the parameter for the Mantel-Haenszel adjusted measure of association) is performed with a Mantel-Haenszel chi-square statistic. Under the null hypothesis, this test statistic has a chi-square sampling distribution with 1 degree of freedom.

Illustrative Example (SEXBIAS.REC). The null hypothesis H0: aRR = 1 is tested with a Mantel-Haenszel summary chi-square statistic. The Mantel-Haenszel test statistics for SEXBIAS.REC are:

** Summary of 6 Tables With Non-Zero margins **
N = 4526
Mantel-Haenszel Summary Chi Square 1.43
P value 0.23226346

Comment: The p value of .23 fails to provide evidence against H0. We conclude no significant difference in acceptance rates by gender.
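The summary chi-square can also be checked by hand. The Python sketch below uses the usual Mantel-Haenszel form (observed minus expected count in the first cell, summed over strata, squared and divided by the summed hypergeometric variance). The 0.5 continuity correction is an assumption: it is the variant that reproduces the printed value of 1.43, but the text does not say which form Epi Info uses. The names sexbias and mantel_haenszel_chi2 are illustrative.

```python
import math


def mantel_haenszel_chi2(strata, continuity=True):
    """Mantel-Haenszel summary chi-square (1 df) for a series of 2-by-2 tables."""
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    diff = abs(observed - expected)
    if continuity:
        diff = max(diff - 0.5, 0.0)   # assumed continuity correction
    return diff ** 2 / variance


# SEXBIAS counts by major: men accepted, men rejected, women accepted, women rejected.
sexbias = [
    (512, 313, 89, 19),
    (353, 207, 17, 8),
    (120, 205, 202, 391),
    (138, 279, 131, 244),
    (53, 138, 94, 299),
    (22, 351, 24, 317),
]

chi2 = mantel_haenszel_chi2(sexbias)
p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
print(f"M-H chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Passing continuity=False gives a slightly larger statistic, so the choice of correction matters little for the conclusion in this example.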

In the absence of interaction and confounding, stratification and adjustments are unnecessary. In such instances, crude measures of association offer the benefit of better precision (compared with M-H summary measures of association).


Illustrative Example. Data from a case-control study of esophageal cancer and tobacco consumption (Breslow & Day, 1980; Tuyns, 1977) are available in BD1NEW.ZIP. We are interested in the relation between tobacco consumption (TOBHIGH: 1 = 20+ g/day, 2 = less than 20 g/day) and esophageal cancer (CASE: 1 = case, 2 = control) while considering the possible confounding or effect-measure modifying effects of alcohol consumption (ALCHIGH: 1 = 80+ g /day, 2 = < 80 g/day). The following commands are issued to analyze the data:

EPI6> READ BDNEW
EPI6> TABLES TOBHIGH CASE ALCHIGH

Key output includes:

Stratum 1 (ALCHIGH = 1)
CASE
TOBHIGH | 1 2 | Total
-----------+-------------+------
1 | 30 23 | 53
2 | 66 86 | 152
-----------+-------------+------
Total | 96 109 | 205

Single Table Analysis Stratum 1 Odds ratio = 1.70

Stratum 2 (ALCHIGH = 2)
CASE
TOBHIGH | 1 2 | Total
-----------+-------------+------
1 | 34 127 | 161
2 | 70 539 | 609
-----------+-------------+------
Total | 104 666 | 770

Single Table Analysis Stratum 2 Odds ratio = 2.06

Thus, the strata-specific odds ratios are 1.70 and 2.06, respectively. We might now ask if it makes sense to summarize these two odds ratios with a single summary statistic. The chi-square interaction statistic (H0: OR1 = OR2) is helpful in this regard. Epi Info prints this information in the area labeled 'Summary Odds Ratio':

Chi Square for evaluation of interaction 0.24
P value 0.62621898

In this instance df = 2 - 1 = 1 (not shown by Epi Info), χ²int = 0.24, and p = 0.63. This supports the assumption that the difference between the strata-specific odds ratios may be due to chance (no statistical interaction).

The crude odds ratio and M-H summary odds ratio are also listed in the area labeled 'Summary Odds Ratio':

SUMMARY ODDS RATIO
Crude OR 1.96
Mantel-Haenszel weighted Odds ratio 1.92

We also note that the crude odds ratio and the Mantel-Haenszel weighted odds ratio are similar, which suggests little confounding by alcohol. Therefore, it is reasonable to report the crude odds ratio. To get the confidence interval and p value for the crude odds ratio, issue a two-variable TABLES command:

EPI6> TABLES TOBHIGH CASE

Output is:

TOBHIGH | 1 2 | Total
-----------+-------------+------
1 | 64 150 | 214
2 | 136 625 | 761
-----------+-------------+------
Total | 200 775 | 975

Odds ratio 1.96
Cornfield 95% confidence limits for OR 1.36 < OR < 2.82
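The odds ratios quoted in this example can be re-derived from the printed tables. The Python sketch below computes the stratum-specific, crude, and Mantel-Haenszel odds ratios, using the standard estimator sum(a*d/N) / sum(b*c/N). The function and variable names are illustrative assumptions, and the Cornfield confidence limits are not reproduced.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (cross-product ratio) for one 2-by-2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a series of 2-by-2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# (high-tobacco cases, high-tobacco controls, low-tobacco cases, low-tobacco controls)
high_alcohol = (30, 23, 66, 86)     # ALCHIGH = 1
low_alcohol = (34, 127, 70, 539)    # ALCHIGH = 2
crude_table = tuple(x + y for x, y in zip(high_alcohol, low_alcohol))

print(f"OR1 = {odds_ratio(*high_alcohol):.2f}")
print(f"OR2 = {odds_ratio(*low_alcohol):.2f}")
print(f"Crude OR = {odds_ratio(*crude_table):.2f}")
print(f"M-H OR = {mantel_haenszel_or([high_alcohol, low_alcohol]):.2f}")
```

The crude and pooled values differ only in the second decimal place, which is the basis for reporting the crude odds ratio here.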

Although the detection and control of confounding is crucially important in epidemiologic research, there is no single way of dealing with this problem. Nevertheless, epidemiologists agree that potential confounders must be identified before data are collected so that data on these factors can be gathered to allow further evaluation. So how does one know what variables might confound an analysis? Briefly, this information comes from an understanding of the systems being investigated, and is based on previous research, clinical insight, and understanding of the processes being studied. It is essential that the investigator 'does their homework,' researching all potential confounders, before collecting data. With this said, a few rules of thumb are presented:

(1) Adjustments for confounding are contraindicated when interaction is present, as such summary measures of association would obscure important modifications of effect.

(2) Since confounding is a matter of systematic error (not random error), hypothesis tests should not be used in the detection of confounding.

(3) A pragmatic strategy for calculating good measures of association suggests:

  • Before the study is begun, the investigator attempts to understand the complex causal interrelations among the exposure, disease, and various other factors. This may require lots of homework on the part of the investigator, as well as close collaboration with subject matter specialists.
  • Measurements and coding for E, D, and C1, C2, ..., Ck must be valid based on understanding of phenomena.
  • The research question must be defined in an insightful way. 'Finding the question is often more important than finding the answer' (Tukey, 1980).
  • Study designs are based on choices that maximize the likelihood of delineating causal relations.
  • After data are collected, entered and cleaned, the analyst explores inter-relations, starting with simple comparisons and descriptions. Identified relationships between E and C and C and D heighten the awareness of the potential for confounding.
  • Data are stratified and explored for interaction. (The above test for interaction may be applied.) When interaction is confirmed, strata-specific estimates are reported.
  • The continued consultation with a subject matter specialist may be necessary before a decision is made whether or not to control for potential confounder C.
  • In the absence of interaction and confounding, crude (unadjusted) estimates of association may be reported.
  • The best estimate of association is both valid and precise. If interaction is present, strata-specific measures of association are reported. If interaction is absent but confounding is present, summary (adjusted) measures of association are reported. If neither interaction nor confounding are present, crude (unadjusted) measures of association are reported.
  • In practice, there will always be uncertainty about whether a given set of variables confounds the association. 'Science does not begin with a tidy question. Nor does it end with a tidy answer' (Tukey, 1980).
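The reporting rule in the list above can be written as a small decision function. This is only a sketch of the rule as stated. The two inputs are judgments made by the analyst (informed by the interaction test, by comparing crude and adjusted estimates, and by subject-matter knowledge), not quantities the code can determine. The function name is hypothetical.

```python
def estimate_to_report(interaction: bool, confounding: bool) -> str:
    """Return which measure of association the strategy says to report."""
    if interaction:
        # Summary measures would obscure the modification of effect.
        return "strata-specific"
    if confounding:
        return "summary (adjusted)"
    # Neither present: the crude estimate is valid and more precise.
    return "crude (unadjusted)"


# The two illustrative data sets in this chapter, as judged in the text.
print("SEXBIAS:", estimate_to_report(interaction=True, confounding=True))
print("Esophageal cancer:", estimate_to_report(interaction=False, confounding=False))
```

Interaction is checked first because, under rule (1), adjustment is contraindicated whenever interaction is present, regardless of confounding.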

(1) GENERIC.ZIP: Simpson's Paradox (Hypothetical Data). This exercise illustrates Simpson's Paradox while applying a strategy for detecting and accounting for confounding and interaction. Three case-control data sets are presented: GENERIC1.REC, GENERIC2.REC, and GENERIC3.REC. Each data set contains the variables E (exposure), D (disease), and C (potential confounder). For each data set, determine if interaction is present. If interaction is present, stop there and report strata-specific odds ratios and other relevant case-control statistics. If interaction is absent, assess the potential for confounding. Summarize your assessment. If confounding is present, report an adjusted odds ratio and associated case-control statistics. If interaction and confounding are absent, report the crude (unadjusted) case-control statistics.

(2) BD2.ZIP: Breslow & Day 2: The Oxford Childhood Cancer Survey (Breslow & Day, 1980, p. 238; Kneale, 1971; Stewart & Kneale, 1970). Data are from a case-control study of childhood leukemia and lymphoid tumors and in utero X-ray exposure (Kneale et al., 1971). The primary variables of interest are CASE (1 = case, 2 = control) and XRAY (1 = exposed, 2 = unexposed). The potential confounder is AGE (years). Analyze these data and report the 'best' odds ratio estimate and a 95% confidence interval for the parameter. Summarize your results in narrative form.

(3) BI-HELM1.ZIP: Bicycle Helmet Use in Two Northern California Counties (Perales et al., 1994). This data set contains information on bicycle helmet use in Santa Clara County and Contra Costa County, two counties in northern California (U.S.A.). Data definitions are included in a data documentation file in the ZIP archive (bi-helm1-dd.htm). Review this data documentation file and then perform the following analyses.
(A) Determine crude incidences of helmet use in Santa Clara County (p1) and Contra Costa County (p2). (The easiest way to derive these statistics is to use a two-variable tables command TABLES COUNTY HELMETUSE ). Test whether these proportions differ, and summarize your results.
(B) Stratify the data on the socioeconomic matching variable MATCHVAR (TABLES COUNTY HELMETUSE MATCHVAR). Report strata-specific helmet use rates by school and test whether within-strata rates differ significantly. Summarize your results narratively.
(C) Test the incidence (risk) ratio parameter for interaction. Be explicit in listing the null and alternative hypotheses. Report all relevant test statistics and state your conclusion.
(D) Discuss your findings. In so doing, consider the potential for interaction and confounding. Which schools show higher helmet-use rates compared with their matched counterpart? etc.

(4) CERVICAL: Cervical Cancer and Smoking (Nischan et al., 1988; Pagano & Gauvreau, 1993, p. 359). Data from a case-control study of cervical cancer and smoking are shown below.

           Case   Control
Smoke +     108       163
Smoke -     117       268
(A) Based on these data calculate the odds ratio of smoking for cervical cancer.
(B) Data stratified by number of sexual partners are shown below. Calculate stratum specific odds ratios.
Stratum 1: Zero or One Partner
           Case   Control
Smoke +      12        21
Smoke -      25       118
Stratum 2: Two or More Partners
           Case   Control
Smoke +      96       142
Smoke -      92       150
(C) Based on these exploratory analyses, would you say there is interaction? Justify your response. How would you report your results?

(5) ASBESTOS.ZIP: Asbestos Exposure and Lung Cancer (Hypothetical data). Data are from a case-control study of lung cancer and asbestos exposure. The data set includes information on smoking (SMOKE: + / -), asbestos exposure (ASBESTOS: + / -), and lung cancer (LUNGCA: + / -).

(A) Calculate the odds ratio of lung cancer associated with smoking. Include a 95% confidence interval, and interpret your findings.
(B) Calculate the odds ratio of lung cancer associated with asbestos exposure. Include a 95% confidence interval and interpret your findings.
(C) An investigator thinks it would be interesting to sort out the inter-relationship between asbestos, smoking, and lung cancer by looking at the lung cancer risk associated with asbestos in smokers and non-smokers separately. Perform such a stratified analysis. In so doing, report strata-specific odds ratios. Perform a test for interaction. (Include all hypothesis testing steps.) Is interaction present? Calculate and report the summary (adjusted) odds ratio. Is confounding evident? Would it make sense to report the adjusted odds ratio in light of your findings about interaction? State how you would report your results, then report them.

