DataToText: Mediation Macro (David A. Kenny)

David A. Kenny

April 19, 2013

Mediation Macro

This mediation macro, called MedText, was written by David A. Kenny, Department of Psychology, University of Connecticut. This is Version 1h, completed on September 25, 2011. This macro has not been extensively tested and will almost certainly be revised, and so it is advisable to return for updates. SMALL BUT IMPORTANT CHANGES WERE MADE IN APRIL 2013, AND THIS NEW VERSION, NOT ANY OLDER ONE, SHOULD NOW BE USED.

In January of 2013, an R version of MedText was released. Go to the MedText.R page.

Thank You!

I thank Andrew Hayes and Kris Preacher for allowing me to adapt their bootstrapping macro into MedText. I also thank Dave MacKinnon, Shengquan Ye, Betsy McCoach, Tamar Saguy, Eileen Pitpitan, Stefano Livi, Sylwia Bedyńska, and Amanda Snook for suggestions. Finally, I thank my good friend the late Bob Calsyn for the sample data. I am very open to suggestions and advice from users of this program. However, priority for changes will go to MedTextR, the R version of the macro.

I have created a Frequently Asked Questions page; consult it if you have questions.

Download:

MedText.SPS (You need SPSS to open this and the next 3 files.)

Sample Data File

Macro Call

SPSS Output

Macro Output (You do not need SPSS to open this file. But if you do not use “wordwrap” it will look ugly.)

To understand how to run a macro, return to the DataToText page. The macro takes a few minutes to run, so please be patient. Make sure to back up the raw data file, as sometimes an error in the macro can alter the data file. Note that the output (the text file) cannot be viewed in SPSS. It needs to be opened using NotePad or Word.

The Macro Call

This is the statement for the sample data:

Medtext x = Treatment/y = stable_housing/m = hous_conts /

xn='Treatment'yn='Days Housed' mn='Housing Contacts'

xm='yes' ofile='c:\medtext.txt' directory ='c:\'.

Please note the slashes after the entry for x, y, and m. Between all other entries, just use spaces. Make sure to end the macro with a period. Carefully use single quotes. The macro is very finicky!

The defaults are as follows:

x = X

y = Y

m = M

xn ='Causal Variable'

yn ='Outcome'

mn ='Mediator'

alpha =.05

xm='no'

ofile ='medtext.txt'

trials = 5000

directory = 'C:\'

ncov = 0

clist =

That is, if you just say "MedText.", the program will assume that the SPSS data file contains variables named X, Y, and M.

MedText was written on SPSS 16 and 18; there is no guarantee that it will work on earlier or later versions of SPSS. It appears that the two tables at the end of the file are not correct if the macro is run in a version of SPSS earlier than 16.

The macro:

allows for only a single causal variable, mediator, and outcome (however, if you have multiple causal variables, you can treat all but one of them as a "covariate"; also, separate runs can be conducted for each outcome; multiple mediators are problematic),

presumes that the mediator and the outcome are measured on an interval scale,

uses listwise deletion, and

does not do a moderator analysis (though another macro has been released).

Variables in the macro:

x = name of the causal variable in the SPSS data set (do not use quotes). If a dichotomy, then assign labels in "values" in the SPSS datafile for this variable.

   y = name of the outcome variable in the SPSS data set

   m = name of the mediating variable in the SPSS data set

   xn = name of the causal variable in the text output (use quotes; spaces are allowed)

yn = name of the outcome variable in the text output (use quotes; spaces are allowed)

mn = name of the mediating variable in the text output (use quotes; spaces are allowed)

For the above three variables, it is advised to use English names and not SPSS acronyms. Also it is advised to use capitals.

   alpha = significance level (defaults to .05)

   xm = whether the causal variable is manipulated (yes) or not (no) (use quotes and all lower case)

   ofile = the name of the output file (use quotes); this is where you go to find the text

   directory = the name of the directory where temporary files are written (use quotes); this must be a directory you are allowed to write to; MedText will leave some files in this directory when it is done; I am working on finding a way to erase them.

   trials = the number of trials for bootstrapping, in multiples of 1000. Do not set to zero! However, if you want a quick run, set it to 1000.

   ncov = the number of covariates

   clist = the SPSS names of the covariates separated by spaces (make sure the number of names is equal to the number of covariates).

Note carefully which terms have quotes and which do not, and where the slashes are and where they are not.
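Because the punctuation rules are easy to get wrong, it can help to assemble the call programmatically and paste the result into an SPSS syntax window. The following Python sketch is only an illustration: the helper name build_medtext_call is hypothetical, it covers only the entries used in the sample call plus trials, and it has not been run against the macro itself.

```python
def build_medtext_call(x, y, m, xn, yn, mn, xm="no",
                       ofile="medtext.txt", directory="C:\\", trials=5000):
    """Assemble a MedText call using the punctuation rules described on this page.

    Hypothetical helper: covariates (ncov, clist) and alpha are left out.
    """
    if xm not in ("yes", "no"):
        raise ValueError("xm must be 'yes' or 'no' (all lower case)")
    if trials <= 0 or trials % 1000 != 0:
        raise ValueError("trials must be a positive multiple of 1000")
    quoted = {"xn": xn, "yn": yn, "mn": mn, "ofile": ofile, "directory": directory}
    for name, value in quoted.items():
        if "'" in value:
            raise ValueError(name + " must not contain a single quote")
    # Slashes follow the x, y, and m entries; all other entries are
    # separated by spaces, and the call ends with a period.
    lines = [
        f"Medtext x = {x}/y = {y}/m = {m} /",
        f"xn='{xn}' yn='{yn}' mn='{mn}'",
        f"xm='{xm}' ofile='{ofile}' directory='{directory}' trials = {trials}.",
    ]
    return "\n".join(lines)


call = build_medtext_call(
    "Treatment", "stable_housing", "hous_conts",
    "Treatment", "Days Housed", "Housing Contacts",
    xm="yes", ofile="c:\\medtext.txt", directory="c:\\",
)
print(call)
```

The unquoted entries (x, y, m, trials) and the quoted entries (xn, yn, mn, xm, ofile, directory) are kept apart in the code so that the distinction is visible at a glance.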

If a non-English version of SPSS is being used, MedText changes the language to English. It does not currently change the language back to the original language.

Using DataToText

There is no guarantee for accuracy. Examine not only the DataToText output file, but also the SPSS output file. The user needs to carefully edit the MedText output in research reports. Please cite this MedText webpage if you do use it. Moreover, you need a footnote that says: “Some of the material here was produced by the SPSS macro MedText (Kenny, 2011).” It is also strongly encouraged that if you use essentially the same material as generated by MedText, you put such material in quotations.

Warnings

MedText provides twelve possible warnings. The user needs to pay careful attention to them. Note that the example below produces two warnings.

1. The outcome variable is dichotomous and logistic regression, not multiple regression, should be used. The output from MedText in this case is wrong!

2. The mediator is dichotomous and logistic regression, not multiple regression, should be used. The output from MedText in this case is wrong!

3. Given the large sample size, you might want to consider lowering alpha.

4. The small sample size might preclude a mediational analysis.

5. As zero does not appear to be a meaningful value for the mediator, you might consider grand mean centering the mediator.

6. As zero does not appear to be a meaningful value for the causal variable, you might consider grand mean centering the causal variable.

7. There are outliers in the dataset. Examine the output to see what cases are considered to be outliers. (MedText uses the SPSS definition of plus or minus three and one half standard deviations to determine if a case is an outlier. It should be noted that this procedure is very conservative and not a very robust way of determining outliers.)

8. There is evidence that the effect of the mediator or the causal variable is nonlinear and either a data transformation or a nonlinear term might be advisable.

9. The causal and mediator variables explain less than 1 percent of the variance of the outcome variable.

10. There are sufficient missing data so as to make the use of the listwise missing data option less than optimal. Better strategies for missing data (e.g., multiple imputation or full information maximum likelihood) should be considered.

11. The causal variable and mediator interact to explain the outcome variable, and the interaction needs to be added to the model.

12. No "values" were given for the causal variable, which is a dichotomy, and so "One" and "Two" have been assigned.

Again the user needs to pay special attention to these warnings and make the necessary modifications. For instance for the example below, the mediator was transformed using a square root transformation. Warnings are planned for non-normality and heterogeneity of variance.
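To make the outlier rule in warning 7 concrete, here is a minimal Python sketch of a screen at plus or minus three and one half standard deviations. It assumes the sample standard deviation (n - 1 denominator) and uses made-up data; it is not taken from the macro and may differ from what SPSS does internally.

```python
import statistics


def flag_outliers(values, cutoff=3.5):
    """Return the indices of values more than `cutoff` SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    if sd == 0:
        return []  # no spread, so nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > cutoff]


# Made-up contact counts (not the sample data file): 30 ordinary cases, one extreme.
contacts = [0, 1, 2, 3, 4, 5] * 5 + [40]
print(flag_outliers(contacts))
```

Only the last case is flagged here. Because the extreme case itself inflates the mean and the standard deviation used in the screen, a rule of this kind flags very few cases, which is consistent with the caution in warning 7 that the procedure is conservative and not very robust.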

Details about the Output

Some of the output might not be clear. If so, the user should consult relevant references.

Power analysis: MedText takes the sample size and the number of covariates and does a power analysis for a “medium” effect size, r = .3. However, for the test of Step 1, it presumes complete mediation and r = .09 (.3*.3) as the effect size measure.
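As a rough illustration of what such a power figure means, the sketch below approximates the power of a two-sided test of a correlation using the Fisher z transformation. This is a standard textbook approximation and an assumption on my part: it ignores covariates and is not necessarily the computation MedText performs.

```python
import math
from statistics import NormalDist


def power_for_r(r, n, alpha=0.05):
    """Approximate two-sided power to detect a correlation r with n cases.

    Uses the Fisher z approximation with no covariates (an assumption).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = math.atanh(r) * math.sqrt(n - 3)  # expected test statistic
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


n = 109  # sample size of the example data
print(round(power_for_r(0.3, n), 2))        # "medium" effect, r = .3
print(round(power_for_r(0.3 * 0.3, n), 2))  # Step 1 under complete mediation, r = .09
```

For the sample size of 109 in the example, this gives approximately .89 and .15, which agree with the Step 2 and Step 1 figures in the sample output. That agreement does not establish that the macro uses this formula.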

Tests of non-linearity: MedText reports the quadratic effects of the mediator and the causal variable. The user needs to examine the output. Sometimes the finding of nonlinearity may just be a Type I error.
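Indirect effect: the figures reported in the Indirect Effects section of the sample output can be checked by hand. The Python sketch below starts from the rounded values printed in that output (paths a, b, c, c' and the Sobel standard error) and recomputes the indirect effect, the percentage mediated, and the Sobel Z with its two-sided normal p value. Because the inputs are rounded, the results agree with the reported values only up to rounding.

```python
import math

# Rounded values copied from the sample output.
a, b = 1.831, 1.398          # paths a and b
c, c_prime = 6.558, 3.998    # total effect (c) and direct effect (c')
sobel_se = 1.157             # Sobel standard error as reported

ab = a * b                                    # indirect effect
pct_mediated = 100 * ab / (c_prime + ab)      # share of the total effect
z = ab / sobel_se                             # Sobel Z
p = math.erfc(abs(z) / math.sqrt(2))          # two-sided normal p value

print(round(ab, 3), round(c_prime + ab, 3))
print(round(pct_mediated, 2), round(z, 3), round(p, 3))
```

The total effect c' + ab reproduces path c (6.558), and the percentage mediated falls below the 80 percent threshold that the sample summary uses when it describes the mediation as partial.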

Macro Output

If the X variable is dichotomous, then you need to have labels for that variable, which DataToText will use. It is advisable to use singular labels (e.g., Man and Woman). If Notepad is used, make sure you use the wordwrap option in the Format menu. Also, for the tables to align, use Courier font. Below is the sample output.

WARNINGS: 1. There is one outlier for the variable Housing Contacts. Examine the output to see what observations are considered to be outliers. 2. There is evidence that the effect of Housing Contacts on Days Housed is nonlinear and either a data transformation or a nonlinear term might be advisable.

MEDIATIONAL MODEL

        The causal variable or X is Treatment, a manipulated variable, and is a dichotomy with 42.2% Controls and 57.8% Treateds, the outcome variable or Y variable is Days Housed, and the mediator or M is Housing Contacts. The causal mediational model is as follows: The variable Treatment is presumed to cause Housing Contacts, which in turn is presumed to cause Days Housed. If there were complete mediation, then the causal effect of Treatment on Days Housed controlling for Housing Contacts would be zero. For the estimates below to be valid, it must be assumed that there is no measurement error in Housing Contacts. Additionally, it must be assumed that there are no unmeasured common causes of Housing Contacts and Days Housed. Finally, it must be assumed that Days Housed does not cause Housing Contacts.

RESULTS

Descriptive Statistics

        There are a total of 109 observations. The means and standard deviations are presented in Table 1. The unexplained variance in Housing Contacts is equal to 14.077 (sd = 3.752) controlling for Treatment, with a multiple correlation for the regression equation of .236. The unexplained variance in Days Housed is equal to 136.467 (sd = 11.682) controlling for Treatment and Housing Contacts, with a multiple correlation for the regression equation of .469.

Power

        In this section, theoretical power analyses are computed using the study's sample size 109 with an alpha of .05. Baron and Kenny (1986) terminology is used. (The power of the test for Steps 1 and 2 does not take into account that Treatment is a dichotomy.) The power of the Step 1 test is .15, assuming that direct effect (path c') is zero and that all other paths have a moderate effect size (r = .3). The power of the Step 1 test, if a moderate effect size is assumed, would be the same as the Step 2 test below. The power of the Step 2 test or a is .89, assuming that effect size is moderate (r = .3). The power of the Step 3 (path b) and Step 4 (path c') tests is .87, assuming that the tested path has a moderate effect size (r = .3) and the other path is zero, and the correlation between Treatment and Housing Contacts is .236 (the actual correlation between those variables). A conservative estimate of power of the test of the indirect effect is .61 assuming that a and b have moderate effect sizes and that the direct effect is zero. Again, all of these power calculations are hypothetical.

The Four Steps

        The results of the four Baron and Kenny (1986) steps, which are summarized in Table 2, are as follows. The effect of Treatment on Days Housed or path c is equal to 6.558 (p = .009), with a 95% confidence interval of 1.654 to 11.462 and a medium effect size (d = .514). The mean for Treateds is equal to 12.784 and the mean for Controls is equal to 19.342. Step 1 has been passed. The effect of Treatment on Housing Contacts or path a is equal to 1.831 (p = .013), with a 95% confidence interval of .389 to 3.274 and a small effect size (d = .488). The mean for Controls is equal to 2.689 and the mean for Treateds is equal to 4.520. Step 2 has been passed. The effect of Housing Contacts on Days Housed controlling for Treatment or path b is equal to 1.398 (p < .001), with a 95% confidence interval of .801 to 1.995 and a medium effect size (r = .411). Step 3 has been passed. The effect of Treatment on Days Housed controlling for Housing Contacts or path c' is equal to 3.998 (p = .089), with a 95% confidence interval of -.625 to 8.621 and a small effect size (d = .342). The least squares mean for Treatment Controls is equal to 12.784 and the least squares mean for Treatment Treateds is equal to 16.782. Step 4 has been passed. A mediational diagram for unstandardized estimates is contained in Figure 1 and for standardized estimates is contained in Figure 2. (In contemporary analyses, Baron and Kenny (1986) are no longer reported, but rather total, direct, and indirect effects are reported and tested.)

Indirect Effects

        The indirect effect of Treatment on Days Housed or ab is equal to 2.560, with a smaller than small effect size (d*r = .211; see note at the bottom for an explanation of effect size of an indirect effect), and the direct effect is equal to 3.998. The percentage of the total effect or c' + ab that is mediated is equal to 39.04 percent. The mediator is said to be "distal" (Hoyle & Kenny, 1999) in that standardized path b is greater than standardized path a. Thus, Housing Contacts is "closer" to Days Housed than to Treatment. The Sobel standard error is equal to 1.157, which makes the Z test of the indirect effect equal to 2.213 (p = .027). Because the Sobel test is statistically significant, it is concluded that the indirect effect is significantly different from zero. The bootstrap estimated indirect effect is 2.634 (p = .010) with a standard error of 1.129 (Preacher & Hayes, 2008). The 95 percent bias corrected bootstrap confidence interval (5000 trials) is from .522 to 4.930, and because zero is not in the confidence interval, it is concluded that the indirect effect is different from zero. (In contemporary analyses, the bootstrapped test, and not the Sobel test, is reported.)

Tests of Nonlinearity and Interaction

        The tests of nonlinearity are as follows: Because Treatment is a dichotomy, its quadratic effects cannot be measured. The quadratic effect of Housing Contacts squared on Days Housed is -.106 and is statistically significant (p = .034). There are concerns about nonlinear effects and either a data transformation or a nonlinear term might be advisable. The interactive effect of Treatment and Housing Contacts is not statistically significant (p = .492).

OVERALL SUMMARY

        Here is an attempt to summarize the results, but they need to be carefully verified by the investigator. The direct effect from Treatment to Days Housed equals 3.998 and is not statistically significant (p = .089). The predicted mean difference between the Treateds and Controls groups on Days Housed equals 3.998. The indirect effect from Treatment to Days Housed equals 2.560 and is statistically significant (p = .010). For the indirect effect, the predicted mean difference indirectly via Housing Contacts between the Treateds and Controls groups on Days Housed equals 2.560. There is evidence of partial mediation of the effect of Treatment on Days Housed given that the indirect effect is statistically significant but the percentage of the total effect mediated is less than 80 percent.

                        Table 1: Descriptive Statistics

Variable                  Mean        Standard Deviation

--------------------------------------------------------

Treatment                 .422              .496

Days Housed             15.552            13.107

Housing Contacts         3.462             3.843

                         Table 2: Baron & Kenny Steps

Step       Path       Estimate        95% CI        Beta        p

------------------------------------------------------------------

   1         c          6.558     1.654 to 11.462    .248     .009

   2         a          1.831       .389 to 3.274    .236     .013

   3         b          1.398       .801 to 1.995    .410    <.001

   4         c'         3.998      -.625 to 8.621    .151     .089

Note: Effect sizes are partial correlations (r) unless the predictor is a dichotomy where it is Cohen's d. Because an indirect effect is the product of two effect sizes, the effect size is the product of partial correlations (r*r) or Cohen's d times the partial correlation (d*r). If the causal variable is a dichotomy, all predicted means presume that the mediator and covariates equal zero.

Figure 1

Mediation Diagram with Unstandardized Coefficients

                                  Housing Contacts

                                    /\       \

                                    /          \

                                  /              \

                                /                  \

                      1.831* /                      \ 1.398*

                            /                          \

                          /                              \

                        /                                  \

                      /                                    \/

              Treatment ______________________________> Days Housed

                               3.998 (6.558*)

                                   * p < .05



Figure 2

Mediation Diagram with Standardized Coefficients

                            Housing Contacts

                                /\      \

                                /          \

                              /              \

                            /                  \

                   .236* /                      \ .410*

                        /                          \

                      /                              \

                    /                                  \

                  /                                    \/

            Treatment ______________________________> Days Housed

                               .151 (.248*)

                                * p < .05

                         References

      Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

      Hoyle, R. H., & Kenny, D. A. (1999). Sample size, reliability, and tests of statistical mediation. In R. H. Hoyle (Ed.), Statistical strategies for small sample research (pp. 195-222). Thousand Oaks, CA: Sage.

      MacKinnon, D. P., Fairchild, A. J., & Fritz, M. S. (2007). Mediation analysis. Annual Review of Psychology, 58, 593-614.

      Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40, 879-891.
