ANOVA Notes 2

Overview of ANOVA

When we are comparing means between two groups, we use a t-test. With more than two groups, we use ANOVA. If we have one IV, we use one-way ANOVA. If the result is significant and the IV has more than two levels, we run post-hoc pairwise comparisons.

If we have two IVs, we use two-way ANOVA. If, however, we have within-subjects variables (aka repeated measures), we use Repeated Measures ANOVA. If sphericity holds, we run the univariate analysis; if it does not, we either apply a correction to the univariate analysis or run a multivariate analysis. From this point on, in all analyses: if we have an interaction, we run a simple effects analysis, and if we have significance, we run post-hoc pairwise comparisons.

[Figure: statistics decision tree from Howell]

Unweighted vs. Weighted Means and SS Type I, II, III, and IV

Unweighted means are the simple average of the cell means, ignoring the sample size in each cell. Weighted means take the sample size into account. For example, let’s say a study analyzed 40 drivers from Nevada (30 drunk, 10 sober) and 40 California drivers (10 drunk, 30 sober). If sober drivers in both states each scored a 2 on the exam and drunk drivers scored an 8, the unweighted means would be identical across states: (8 + 2)/2 = 5 in each, because every cell mean counts equally. The weighted means, however, would vary between the states. Nevada drivers would score a weighted average of 6.5, that is (30*8 + 10*2)/40, and California drivers would score a weighted average of 3.5, that is (10*8 + 30*2)/40, because each weighted average is pulled toward its larger cell.
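A quick arithmetic check of the example, as a minimal Python sketch (the cell means and sizes are the hypothetical driver numbers from the paragraph above):

```python
import numpy as np

# Hypothetical cell means and sizes from the driver example:
# rows = state (Nevada, California), columns = condition (drunk, sober)
cell_means = np.array([[8, 2],    # Nevada: drunk, sober
                       [8, 2]])   # California: drunk, sober
cell_ns    = np.array([[30, 10],
                       [10, 30]])

# Unweighted state means: simple average of the cell means
unweighted = cell_means.mean(axis=1)

# Weighted state means: cell means weighted by cell sizes
weighted = (cell_means * cell_ns).sum(axis=1) / cell_ns.sum(axis=1)

print(unweighted)  # [5. 5.]   -> identical across states
print(weighted)    # [6.5 3.5] -> differ because of the unequal cell sizes
```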

SPSS offers four ways to specify sums of squares: Type I (hierarchical), Type II (experimental), Type III (unique), and Type IV, which is rarely used (it is intended for designs with empty cells).

The default in SPSS is Type III, which is generally what we’re going to be looking for. In the drunk and sober driver example above, Type III is based on the unweighted means, which shows the unique effects of A, B, and AB. (The unweighted means are also what SPSS lists as the estimated marginal means.)

Type I would give us the effects of A ignoring B (or B ignoring A, depending on the order in which we entered the terms). In this instance, if we call A the effect of drinking and B the effect of state, then the AB interaction would be drinking and state together. So, entering the terms as A, B, AB would give us the following: the main effect of A (drinking), the effect of B beyond the effect of A (state beyond drinking), and the AB interaction above and beyond the main effects of A and B (drinking and state interaction beyond the main effects of drinking and state). The effects of each treatment are adjusted for any treatments considered before it (order of entry matters).

Type II would give the effect of B beyond the effect of A. The effects of each individual treatment are adjusted for all other effects at the same or lower order. The effects of factors A, B, and AB (regardless of order) will test: the main effect of A beyond the effects of B, the main effect of B beyond the effects of A, and the interaction effect of AB beyond what the main effects of A and B explain.

If we ‘partial out’ an effect, such as the effect of state, we are holding the state constant. We can also say that we’re “removing the effects of state” or “controlling for the effects of state.”

When factors are orthogonal, the SS for Type I, II, and III are all equal. It is when they are non-orthogonal that the type of test matters.

Q: Test the effect of A ignoring B
A: Type I SS, enter A first

Q: Test the effect of B beyond the effect of A
A: Type I SS, enter A first, then B OR Type II SS (any order)

Q: Test the unique effects of A, B, and AB
A: Type III SS (any order)
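For readers working outside SPSS, the three SS types can be reproduced with statsmodels’ anova_lm. A minimal sketch with made-up data and hypothetical column names (score, drinking, state) mirroring the driver example; note that Type III is only meaningful with sum-to-zero contrasts:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical unbalanced data mirroring the driver example
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "state":    ["NV"] * 40 + ["CA"] * 40,
    "drinking": ["drunk"] * 30 + ["sober"] * 10 + ["drunk"] * 10 + ["sober"] * 30,
})
df["score"] = np.where(df["drinking"] == "drunk", 8, 2) + rng.normal(0, 1, 80)

# Sum-to-zero coding so that Type III tests are meaningful
model = smf.ols("score ~ C(drinking, Sum) * C(state, Sum)", data=df).fit()

print(anova_lm(model, typ=1))  # Type I: sequential; order of entry matters
print(anova_lm(model, typ=2))  # Type II: each effect adjusted for same-order effects
print(anova_lm(model, typ=3))  # Type III: unique contribution of every term
```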

Fixed and Random Factors

When a factor is fixed, its levels are chosen systematically. A factor is random if its levels are chosen randomly from a larger population of levels. For example, if there are only two schools in a city and you use both in your analysis, that factor is fixed. However, if there are more than two schools in the city and you randomly select two from the sampling frame, the factor is random. In most applications, subjects will be random and everything else will be fixed.
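To make the distinction concrete in code: a fixed factor enters the model as an ordinary term, while a random factor enters as a grouping variable with its own variance component. A minimal statsmodels sketch with made-up data and hypothetical column names (score, treatment, school):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: 6 schools, 10 students each, two treatments
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "school":    np.repeat([f"s{i}" for i in range(6)], 10),
    "treatment": np.tile(["control", "intervention"], 30),
})
school_effect = dict(zip(df["school"].unique(), rng.normal(0, 2, 6)))
df["score"] = ((df["treatment"] == "intervention") * 3
               + df["school"].map(school_effect)
               + rng.normal(0, 1, 60))

# School as a FIXED factor: we care about these particular schools
fixed_fit = smf.ols("score ~ C(treatment) + C(school)", data=df).fit()

# School as a RANDOM factor: these schools stand in for a population of schools
random_fit = smf.mixedlm("score ~ treatment", data=df, groups=df["school"]).fit()
print(random_fit.summary())
```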

Univariate

In univariate analysis in Repeated Measures ANOVA, we have a fourth assumption added to our ANOVA assumptions: the assumption of sphericity. Sphericity is a weaker form of compound symmetry; concretely, it requires that the variances of the difference scores between every pair of repeated-measures levels be equal. If the data fail to meet the sphericity assumption, our best options are to perform a univariate analysis with the Huynh-Feldt correction or run a multivariate analysis with Pillai’s test. Note: the univariate analysis is more powerful.
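Sphericity can be inspected directly: compute the variance of the difference scores for every pair of levels and check that they are roughly equal. A minimal sketch with a hypothetical subjects-by-conditions array:

```python
import numpy as np
from itertools import combinations

# Hypothetical repeated-measures data: rows = subjects, columns = conditions
rng = np.random.default_rng(2)
scores = rng.normal(5, 1, size=(12, 3)) + np.array([0.0, 0.5, 1.0])

# Sphericity holds when the variances of all pairwise difference scores are equal
for i, j in combinations(range(scores.shape[1]), 2):
    d = scores[:, i] - scores[:, j]
    print(f"var(cond{i} - cond{j}) = {d.var(ddof=1):.3f}")
# Large discrepancies among these variances suggest a sphericity violation,
# calling for a Huynh-Feldt (or Greenhouse-Geisser) correction or a multivariate test.
```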

Main effects: the means differ across levels of one factor.

Simple effects: often used when you get a significant interaction; they let you consider the effect of one variable at only one level of the other variable.

Interaction: the effect of one variable depends on the level of the other variable(s).
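To make the simple-effects idea concrete, here is a minimal sketch that tests the effect of one factor separately at each level of the other, using made-up 2x2 data. (Textbook simple-effects tests often reuse the omnibus error term; this sketch simply subsets the data, which is the common quick version.)

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical 2x2 data with an interaction: A matters only when B == "b2"
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "A": np.tile(["a1", "a2"], 40),
    "B": np.repeat(["b1", "b2"], 40),
})
df["score"] = rng.normal(10, 2, 80) + 3 * ((df["A"] == "a2") & (df["B"] == "b2"))

# Simple effects: test A separately within each level of B
for level, sub in df.groupby("B"):
    groups = [g["score"].values for _, g in sub.groupby("A")]
    f, p = stats.f_oneway(*groups)
    print(f"Effect of A at B = {level}: F = {f:.2f}, p = {p:.4f}")
```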

A design is viewed as nonorthogonal when the numbers of cases in the various cells of a factorial ANOVA are unequal. If, however, the cell sizes are equal (or proportional across rows and columns), the design is orthogonal.

Completely Randomized Factorial Design (CRF): factors are fully crossed, all between-subjects. There is no way to estimate a subjects-by-treatment interaction because each subject is observed only once.

Split-Plot Factorial Design (SPF): both within-subjects and between-subjects factors (a mixed design).

Randomized Block Factorial Design (RBF): repeated measures design, all factors are within-subjects.

Completely Randomized Hierarchical (CRH): levels of at least one treatment are nested within those of another treatment.

Randomized Block Hierarchical (RBH): like CRH, with subjects exposed to all conditions, and one variable nested in the other.

Between-Subjects Factors: subjects receive only one level of a variable and are then compared to groups from other levels.

Within-Subjects Factors: subjects receive all levels of a variable, allowing us to calculate subject effects (pi). Within-subjects designs are generally more powerful because we are able to remove subject variability from the error term. Within-subjects analyses must adjust for violations of sphericity or use multivariate tests. In these designs, we must be aware of potential carryover effects.
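In Python, a within-subjects analysis of this kind can be run with statsmodels’ AnovaRM. A sketch with made-up data and hypothetical column names; note that AnovaRM reports uncorrected univariate tests, so sphericity corrections must be applied separately:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical within-subjects data: 10 subjects each measured under 3 conditions
rng = np.random.default_rng(4)
subj_effect = np.repeat(rng.normal(0, 2, 10), 3)   # stable subject differences
cond_effect = np.tile([0.0, 1.0, 2.0], 10)
df = pd.DataFrame({
    "subject":   np.repeat(np.arange(10), 3),
    "condition": np.tile(["c1", "c2", "c3"], 10),
    "score":     10 + subj_effect + cond_effect + rng.normal(0, 1, 30),
})

# Subject variability is removed from the error term, as described above
print(AnovaRM(df, depvar="score", subject="subject", within=["condition"]).fit())
```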

Crossed: Every level of each variable is paired with every level of the other variable.

Nested: if a subject is in one condition, they do not see the other condition; levels of one factor appear under only one level of the other.

Final Notes

See the lists at the end of this section for the post-hoc tests that assume equal variances and those that do not.

The One-Way ANOVA procedure produces a one-way analysis of variance for a quantitative dependent variable by a single factor (independent) variable. Analysis of variance is used to test the hypothesis that several means are equal. This technique is an extension of the two-sample t test.

In addition to determining that differences exist among the means, you may want to know which means differ. There are two types of tests for comparing means: a priori contrasts and post hoc tests. Contrasts are tests set up before running the experiment, and post hoc tests are run after the experiment has been conducted. You can also test for trends across categories.

Example. Doughnuts absorb fat in various amounts when they are cooked. An experiment is set up involving three types of fat: peanut oil, corn oil, and lard. Peanut oil and corn oil are unsaturated fats, and lard is a saturated fat. Along with determining whether the amount of fat absorbed depends on the type of fat used, you could set up an a priori contrast to determine whether the amount of fat absorption differs for saturated and unsaturated fats.
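A worked version of the doughnut example with made-up absorption numbers (the data are purely illustrative). The a priori contrast compares the two unsaturated fats against lard using weights (1, 1, -2):

```python
import numpy as np
from scipy import stats

# Hypothetical fat-absorption data (grams) for doughnuts cooked in each fat
peanut = np.array([64, 72, 68, 77, 56, 95])
corn   = np.array([78, 91, 97, 82, 85, 77])
lard   = np.array([75, 93, 78, 71, 63, 76])

# Omnibus one-way ANOVA: do the three fats differ at all?
f, p = stats.f_oneway(peanut, corn, lard)
print(f"F = {f:.2f}, p = {p:.4f}")

# A priori contrast: unsaturated (peanut, corn) vs. saturated (lard)
means = np.array([peanut.mean(), corn.mean(), lard.mean()])
ns = np.array([len(peanut), len(corn), len(lard)])
weights = np.array([1, 1, -2])

# Pooled error term (MSE) from the one-way ANOVA
sse = sum(((g - g.mean()) ** 2).sum() for g in (peanut, corn, lard))
df_error = ns.sum() - 3
mse = sse / df_error

psi = weights @ means
se = np.sqrt(mse * (weights ** 2 / ns).sum())
t = psi / se
p_contrast = 2 * stats.t.sf(abs(t), df_error)
print(f"contrast t = {t:.2f}, p = {p_contrast:.4f}")
```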

Statistics. For each group: number of cases, mean, standard deviation, standard error of the mean, minimum, maximum, and 95% confidence interval for the mean. Levene’s test for homogeneity of variance, analysis-of-variance table and robust tests of the equality of means for each dependent variable, user-specified a priori contrasts, and post hoc range tests and multiple comparisons: Bonferroni, Sidak, Tukey’s honestly significant difference, Hochberg’s GT2, Gabriel, Dunnett, Ryan-Einot-Gabriel-Welsch F test (R-E-G-W F), Ryan-Einot-Gabriel-Welsch range test (R-E-G-W Q), Tamhane’s T2, Dunnett’s T3, Games-Howell, Dunnett’s C, Duncan’s multiple range test, Student-Newman-Keuls (S-N-K), Tukey’s b, Waller-Duncan, Scheffé, and least-significant difference.

One-Way ANOVA also offers:
• Group-level statistics for the dependent variable
• A test of variance equality
• A plot of group means
• Range tests, pairwise multiple comparisons, and contrasts, to describe the nature of the group differences

• Descriptive. Calculates the number of cases, mean, standard deviation, standard error of the mean, minimum, maximum, and 95% confidence intervals for each dependent variable for each group.

• Fixed and random effects. Displays the standard deviation, standard error, and 95% confidence interval for the fixed-effects model, and the standard error, 95% confidence interval, and estimate of between-components variance for the random-effects model.

• Homogeneity of variance test. Calculates the Levene statistic to test for the equality of group variances. This test is not dependent on the assumption of normality.

• Brown-Forsythe. Calculates the Brown-Forsythe statistic to test for the equality of group means. This statistic is preferable to the F statistic when the assumption of equal variances does not hold.

• Welch. Calculates the Welch statistic to test for the equality of group means. This statistic is preferable to the F statistic when the assumption of equal variances does not hold.
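The variance-equality check is easy to reproduce with scipy. A sketch reusing hypothetical doughnut-style samples; if Levene’s test rejects, the Welch or Brown-Forsythe tests of means above are the safer omnibus tests:

```python
import numpy as np
from scipy import stats

# Hypothetical samples (same doughnut scenario as above)
peanut = np.array([64, 72, 68, 77, 56, 95])
corn   = np.array([78, 91, 97, 82, 85, 77])
lard   = np.array([75, 93, 78, 71, 63, 76])

# Levene's test for equality of group variances (does not assume normality)
w, p = stats.levene(peanut, corn, lard, center="mean")
print(f"Levene W = {w:.2f}, p = {p:.4f}")
# A small p value casts doubt on the equal-variance assumption, pointing
# toward robust omnibus tests and the unequal-variance post-hoc tests below.
```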


Equal Variances Assumed:

• LSD. Uses t tests to perform all pairwise comparisons between group means. No adjustment is made to the error rate for multiple comparisons.

• Bonferroni. Uses t tests to perform pairwise comparisons between group means, but controls overall error rate by setting the error rate for each test to the experimentwise error rate divided by the total number of tests. Hence, the observed significance level is adjusted for the fact that multiple comparisons are being made.

• Sidak. Pairwise multiple comparison test based on a t statistic. Sidak adjusts the significance level for multiple comparisons and provides tighter bounds than Bonferroni.

• Scheffe. Performs simultaneous joint pairwise comparisons for all possible pairwise combinations of means. Uses the F sampling distribution. Can be used to examine all possible linear combinations of group means, not just pairwise comparisons.

• R-E-G-W F. Ryan-Einot-Gabriel-Welsch multiple stepdown procedure based on an F test.

• R-E-G-W Q. Ryan-Einot-Gabriel-Welsch multiple stepdown procedure based on the Studentized range.

• S-N-K. Makes all pairwise comparisons between means using the Studentized range distribution. With equal sample sizes, it also compares pairs of means within homogeneous subsets, using a stepwise procedure. Means are ordered from highest to lowest, and extreme differences are tested first.

• Tukey. Uses the Studentized range statistic to make all of the pairwise comparisons between groups. Sets the experimentwise error rate at the error rate for the collection of all pairwise comparisons. (A code sketch for Tukey and Bonferroni follows this list.)

• Tukey’s b. Uses the Studentized range distribution to make pairwise comparisons between groups. The critical value is the average of the corresponding value for the Tukey’s honestly significant difference test and the Student-Newman-Keuls.

• Duncan. Makes pairwise comparisons using a stepwise order of comparisons identical to the order used by the Student-Newman-Keuls test, but sets a protection level for the error rate for the collection of tests, rather than an error rate for individual tests. Uses the Studentized range statistic.

• Hochberg’s GT2. Multiple comparison and range test that uses the Studentized maximum modulus. Similar to Tukey’s honestly significant difference test.

• Gabriel. Pairwise comparison test that uses the Studentized maximum modulus and is generally more powerful than Hochberg’s GT2 when the cell sizes are unequal. Gabriel’s test may become liberal when the cell sizes vary greatly.

• Waller-Duncan. Multiple comparison test based on a t statistic; uses a Bayesian approach.

• Dunnett. Pairwise multiple comparison t test that compares a set of treatments against a single control mean. The last category is the default control category; alternatively, you can choose the first category. The 2-sided option tests whether the mean at any level of the factor (except the control category) differs from that of the control category. The < Control option tests whether the mean at any level of the factor is smaller than that of the control category, and the > Control option tests whether it is greater.
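Two of the tests above, Tukey’s HSD and Bonferroni-adjusted pairwise t tests, are straightforward to reproduce in Python. A sketch with the same hypothetical doughnut samples; pairwise_tukeyhsd and multipletests are from statsmodels:

```python
import numpy as np
from itertools import combinations
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multitest import multipletests

# Hypothetical doughnut samples again
groups = {
    "peanut": np.array([64, 72, 68, 77, 56, 95]),
    "corn":   np.array([78, 91, 97, 82, 85, 77]),
    "lard":   np.array([75, 93, 78, 71, 63, 76]),
}
scores = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])

# Tukey HSD: all pairwise comparisons at a familywise error rate of .05
print(pairwise_tukeyhsd(scores, labels, alpha=0.05))

# Bonferroni: ordinary pairwise t tests, p values adjusted afterwards
pairs = list(combinations(groups, 2))
pvals = [stats.ttest_ind(groups[a], groups[b]).pvalue for a, b in pairs]
reject, p_adj, _, _ = multipletests(pvals, method="bonferroni")
for (a, b), p_val in zip(pairs, p_adj):
    print(f"{a} vs {b}: adjusted p = {p_val:.4f}")
```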


Equal Variances Not Assumed:

• Tamhane’s T2. Conservative pairwise comparisons test based on a t test. This test is appropriate when the variances are unequal.

• Dunnett’s T3. Pairwise comparison test based on the Studentized maximum modulus. This test is appropriate when the variances are unequal.

• Games-Howell. Pairwise comparison test that is sometimes liberal. This test is appropriate when the variances are unequal. (A hand-rolled sketch of this test follows the list.)

• Dunnett’s C. Pairwise comparison test based on the Studentized range. This test is appropriate when the variances are unequal.
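Of these, Games-Howell is simple enough to sketch by hand: each pair gets its own standard error and Welch-corrected degrees of freedom, and the statistic is referred to the studentized range distribution. A minimal implementation with made-up unequal-variance samples (requires scipy >= 1.7 for studentized_range):

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Hypothetical samples with visibly unequal variances
groups = {
    "a": np.array([10.1, 9.8, 10.4, 10.0, 9.9, 10.2]),
    "b": np.array([12.0, 8.5, 14.2, 9.9, 11.3, 13.1]),
    "c": np.array([11.5, 11.8, 11.2, 11.6, 11.4, 11.7]),
}
k = len(groups)

for (name1, g1), (name2, g2) in combinations(groups.items(), 2):
    # Per-pair standard error built from each group's own variance
    se2_1, se2_2 = g1.var(ddof=1) / len(g1), g2.var(ddof=1) / len(g2)
    se = np.sqrt(se2_1 + se2_2)
    # Welch-Satterthwaite degrees of freedom for this pair
    df = (se2_1 + se2_2) ** 2 / (
        se2_1 ** 2 / (len(g1) - 1) + se2_2 ** 2 / (len(g2) - 1)
    )
    # Refer q to the studentized range distribution over k groups
    q = abs(g1.mean() - g2.mean()) / se * np.sqrt(2)
    p = stats.studentized_range.sf(q, k, df)
    print(f"{name1} vs {name2}: q = {q:.2f}, df = {df:.1f}, p = {p:.4f}")
```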