
How do you check the multicollinearity requirement for the linear regression model in R and SPSS?
Definition
Multicollinearity is a condition where there is a strong correlation between the independent variables in a statistical model. This can occur in a linear regression model when there are multiple independent variables that correlate with each other and thus there is some type of redundancy in the data.
The problem with multicollinearity is that it inflates the standard errors of the regression coefficients, so the estimates become imprecise and unstable: small changes in the data can change them considerably. As a result, the p-values used to judge the significance of individual regression coefficients become unreliable.
There are several ways to check the multicollinearity requirement for a linear regression model. One way is to examine the correlation matrix of the independent variables. If the absolute correlation between two independent variables is high (as a rule of thumb, 0.8 or higher is considered critical), multicollinearity may be present. Pairwise correlations can miss cases in which one variable is nearly a linear combination of several others, so a second check is useful.
Another option is to calculate the variance inflation factor (VIF). The VIF indicates how much the variance of an estimated regression coefficient is inflated because the corresponding independent variable is correlated with the other independent variables. A VIF of 1 means that the variable is uncorrelated with the others, while values above 1 indicate increasing multicollinearity (as a rule of thumb, a value of 5 or higher is considered critical).
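The usual textbook definition of the VIF makes the threshold easier to interpret. The symbols below are notation introduced here for illustration: j indexes an independent variable, and R_j^2 is the coefficient of determination obtained when variable j is regressed on all the other independent variables.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

Under this definition, a VIF of 5 corresponds to R_j^2 = 0.8, meaning that 80% of the variance of variable j is explained by the other independent variables.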
Example in R
In R, the correlation between two or more variables can be checked, for example, with the cor() function. Here is an example with the "swiss" dataset in R, where we examine the pairwise correlations of the four variables Fertility, Agriculture, Examination and Education. None of the correlations are above 0.8 or below -0.8.
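A minimal sketch of this check could look as follows. The object names vars, cor_matrix and high are chosen here for illustration, and the 0.8 cutoff is the rule of thumb mentioned earlier rather than a fixed standard.

```r
# swiss ships with base R (package "datasets"), so no extra packages are needed
data(swiss)

# Pairwise Pearson correlations of the four variables named in the text
vars <- c("Fertility", "Agriculture", "Examination", "Education")
cor_matrix <- cor(swiss[, vars])
round(cor_matrix, 2)

# Flag any off-diagonal correlation at or beyond the 0.8 rule of thumb
# (upper.tri() restricts the check to each pair once and skips the diagonal)
high <- abs(cor_matrix) >= 0.8 & upper.tri(cor_matrix)
any(high)
```

If no pair reaches the cutoff, any(high) returns FALSE. To see which pairs are flagged in other data, which(high, arr.ind = TRUE) lists their row and column positions.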

To get the VIF values of the independent variables, we can use the vif() function of the car package in R (note the lowercase name, since R is case-sensitive). In this example none of the values are above 5. By this rule of thumb, there is therefore no indication of problematic multicollinearity in our linear model.
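A sketch of the VIF check, assuming the car package is installed and its lowercase vif() function is used. The text does not spell out the model formula, so taking Fertility as the outcome and the other three variables as predictors is an assumption made here. The last two lines cross-check one value by hand using the common definition VIF = 1 / (1 - R^2).

```r
# Assumes the car package is installed: install.packages("car")
library(car)

# Assumed model: Fertility as outcome, the other three variables as predictors
model <- lm(Fertility ~ Agriculture + Examination + Education, data = swiss)

# One VIF per independent variable
vif(model)

# Cross-check for Examination without car:
# regress it on the other predictors and apply VIF = 1 / (1 - R^2)
aux <- lm(Examination ~ Agriculture + Education, data = swiss)
1 / (1 - summary(aux)$r.squared)
```

The hand-computed value should match the Examination entry returned by vif(model).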

Example in SPSS
To make a correlation table in SPSS, the following steps are necessary:
- Click Analyze
- Go to Correlate and then Bivariate
- Select your independent variables and click OK.
- Check whether any of the correlations are strikingly high (above 0.8 or below -0.8).
To display the VIF values in SPSS, select the corresponding option when estimating the linear regression model:
- Click Analyze
- Go to Regression and Linear
- Click on the "Statistics" button
- Check the "Collinearity diagnostics" box and click Continue
- Select the dependent and independent variables for your regression model and click OK
- In the output, the VIF value for each independent variable appears in the last column of the Coefficients table, next to the Tolerance value.

There are several ways to address multicollinearity, such as removing one of the highly correlated variables from the model or creating a new variable by combining the correlated variables (for example, into an index). It may also be necessary to revise the model and use alternative approaches. It is important to identify and address multicollinearity before interpreting results and drawing conclusions.
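The two remedies named above can be sketched in R as follows, again assuming the car package and a model with Fertility as the outcome. The choice of which variable to drop and the name of the combined variable (Schooling) are purely illustrative and not a recommendation for this dataset.

```r
library(car)  # assumed to be installed

# Starting point: model with all three predictors
full <- lm(Fertility ~ Agriculture + Examination + Education, data = swiss)

# Option 1: leave out one of the correlated predictors
# (dropping Examination is an arbitrary choice for illustration)
reduced <- lm(Fertility ~ Agriculture + Education, data = swiss)

# Option 2: combine correlated predictors into a single index,
# here the mean of their z-scores (hypothetical variable "Schooling")
swiss2 <- swiss
swiss2$Schooling <- rowMeans(scale(swiss2[, c("Examination", "Education")]))
combined <- lm(Fertility ~ Agriculture + Schooling, data = swiss2)

# Compare the VIF values across the three specifications
vif(full)
vif(reduced)
vif(combined)
```

Which option is appropriate depends on the research question: dropping a variable changes what the remaining coefficients mean, and a combined index is only sensible if the variables measure a similar concept.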