ANOVA is an acronym for ANalysis Of VAriance. It's actually very similar to regression, except we're using a categorical variable to predict a continuous one.
Note:
Sum of Squares for Regression (SSR): the amount of variation explained by our model.
Sum of Squares for Error (SSE): the amount of variation that our model doesn't explain.
Sum of Squares Total (SST): all the variation in the data; SST = SSR + SSE, and it works out to N times the (population) variance.
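The three sums of squares above can be computed directly. Here's a minimal sketch in plain Python, using made-up example data (the three groups below are hypothetical, not from the video):

```python
# Hypothetical data: three groups of three observations each
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

all_values = [x for g in groups.values() for x in g]
n = len(all_values)
grand_mean = sum(all_values) / n

# SST: squared distance of every observation from the grand mean
sst = sum((x - grand_mean) ** 2 for x in all_values)

# SSR (between groups): squared distance of each group mean from the
# grand mean, weighted by group size
ssr = sum(
    len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups.values()
)

# SSE (within groups): squared distance of each observation from its
# own group mean
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)

print(sst, ssr, sse)  # SST should equal SSR + SSE
```

Note that SST = SSR + SSE holds exactly, and SST divided by N gives the population variance of the pooled data.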
Now that we have that information, we can calculate our F-statistic, just like we did
for regression. The F-statistic compares how much variation our model accounts for vs. how much it can't account for:

F = (SSR / df_model) / (SSE / df_error)

where df_model = k − 1 for k groups and df_error = N − k for N total observations.
The larger that F is, the more information our model is able to give us.
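As a quick illustration, here's a sketch of the F calculation with hypothetical numbers (3 groups, 9 total observations, SSR = 54, SSE = 6; these values are made up for the example):

```python
k, n = 3, 9           # number of groups, total observations (hypothetical)
ssr, sse = 54.0, 6.0  # between-group and within-group sums of squares

ms_model = ssr / (k - 1)  # mean square for the model (between groups)
ms_error = sse / (n - k)  # mean square for error (within groups)
f_stat = ms_model / ms_error
print(f_stat)  # 27.0
```

A large F like this says the between-group variation dwarfs the within-group variation, so the group variable is telling us something.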
An F-test is an example of an Omnibus test, which means it tests many
groups at once. When we get a significant F-statistic, it means that there's SOME statistically significant difference somewhere between the groups, but we still have to look for it.
For example, if we had 3 groups, there are 3 possible pairs, so we need to do 3 t-tests to find the statistically significant difference or differences. To conduct these t-tests, we take just the data in the two groups for that t-test, and calculate the t-statistic and p-value.
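These pairwise follow-up tests can be sketched in plain Python. The groups below are hypothetical (not from the video), and this computes only the t-statistic with a pooled variance estimate; getting the p-value would additionally need the t distribution's CDF (e.g. from scipy.stats):

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical groups to illustrate the pairwise t-tests that follow
# a significant omnibus F
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

def pooled_t(a, b):
    """Two-sample t-statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# One t-test for each of the 3 possible pairs of groups
for (name1, g1), (name2, g2) in combinations(groups.items(), 2):
    print(name1, "vs", name2, "t =", round(pooled_t(g1, g2), 3))
```

One caveat worth remembering: running several t-tests inflates the chance of a false positive, which is why corrections like Bonferroni are often applied to these follow-up comparisons.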
(Example: Video 9:00 - Individual T-Test)
Youtube Source: ANOVA: Crash Course Statistics #33