Binary logistic regression analysis enables the estimation of the relationship between one or more independent (or explanatory) variables and a dependent (or outcome) variable with two categories. The regression coefficient (β) of a logistic regression is the estimated increase in the log odds of the outcome per unit increase in the value of the predictor variable.
More formally, let Y be the binary outcome variable, coded 0 for "no" and 1 for "yes", and let p be the probability that Y equals 1, so that p = prob(Y=1). Let X1, ..., Xk be a set of explanatory variables. Then, the logistic regression of Y on X1, ..., Xk estimates the parameters β0, β1, ..., βk of the following equation by maximum likelihood:
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k$$
Additionally, the exponential function of the regression coefficient (exp(β)) is obtained, which is the odds ratio (OR) associated with a one-unit increase in the explanatory variable. In terms of probabilities, the logistic regression equation translates into the following:
$$p = \frac{\exp(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}{1 + \exp(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}$$
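As an illustration of the two conversions just described, the following minimal Python sketch turns a coefficient into an odds ratio and a set of coefficients into a predicted probability. The function names and the coefficient values are hypothetical and are not taken from the estimated models.

```python
import math

def odds_ratio(beta):
    """Odds ratio associated with a one-unit increase in the predictor."""
    return math.exp(beta)

def predicted_probability(intercept, coefficients, values):
    """Probability that Y = 1 given the explanatory variable values."""
    # Linear predictor: the log odds of the outcome.
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    # Inverse logit; algebraically equal to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients and values, for illustration only.
beta_0 = -1.0
betas = [0.7, -0.2]
x = [1.0, 3.0]

or_first_predictor = odds_ratio(betas[0])
p_example = predicted_probability(beta_0, betas, x)
print(or_first_predictor, p_example)
```

A coefficient of zero corresponds to an odds ratio of one, and a log odds of zero corresponds to a probability of 0.5.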
The transformation of log odds (β) into odds ratios (exp(β); OR) makes the results easier to interpret. The odds ratio (OR) is a measure of the relative likelihood of a particular outcome across two groups. The odds ratio for observing the outcome when an antecedent is present is:
$$OR = \frac{p_{11}/p_{12}}{p_{21}/p_{22}}$$
where p11/p12 represents the “odds” of observing the outcome when the antecedent is present, and p21/p22 represents the “odds” of observing the outcome when the antecedent is not present. Thus, an odds ratio indicates the degree to which an explanatory variable is associated with a categorical outcome variable with two categories (e.g. yes/no) or more than two categories. An odds ratio below one denotes a negative association; an odds ratio above one indicates a positive association; and an odds ratio of one means that there is no association. For instance, if the association between being a female teacher and having chosen teaching as first choice as a career is being analysed, the following odds ratios would be interpreted as:
0.2: Female teachers are five times less likely to have chosen teaching as a first choice as a career than male teachers.
0.5: Female teachers are half as likely to have chosen teaching as a first choice as a career as male teachers.
0.9: Female teachers are 10% less likely to have chosen teaching as a first choice as a career than male teachers.
1: Female and male teachers are equally likely to have chosen teaching as a first choice as a career.
1.1: Female teachers are 10% more likely to have chosen teaching as a first choice as a career than male teachers.
2: Female teachers are twice as likely to have chosen teaching as a first choice as a career as male teachers.
5: Female teachers are five times more likely to have chosen teaching as a first choice as a career than male teachers.
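The odds ratio defined above can be sketched in a few lines of Python. The function name and the four probabilities are hypothetical; they are chosen only so that the result matches the 0.5 case in the list.

```python
def odds_ratio_2x2(p11, p12, p21, p22):
    """Odds ratio from the four cell probabilities of a 2x2 table.

    p11 / p12: odds of the outcome when the antecedent is present.
    p21 / p22: odds of the outcome when the antecedent is not present.
    """
    odds_present = p11 / p12
    odds_absent = p21 / p22
    return odds_present / odds_absent

# Hypothetical shares, for illustration only:
# female teachers: 60% chose teaching as a first choice, 40% did not;
# male teachers:   75% chose teaching as a first choice, 25% did not.
example_or = odds_ratio_2x2(0.60, 0.40, 0.75, 0.25)
print(example_or)
```

With these made-up shares the odds are 1.5 for female teachers and 3 for male teachers, which gives an odds ratio of about 0.5.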
Odds ratios shown in bold are statistically significantly different from 1 at the 95% confidence level. To test statistical significance against the value of 1 (the null hypothesis), the relative-risk/odds-ratio statistic is assumed to follow a log-normal distribution, rather than a normal distribution, under the null hypothesis.
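One common way to apply the log-normal assumption is to build a confidence interval on the log-odds scale and exponentiate its bounds. The sketch below assumes a Wald-type interval with a critical value of 1.96; the text does not specify the exact procedure, so this is an illustration rather than the method actually used. The function names, coefficient and standard error are hypothetical.

```python
import math

def odds_ratio_interval(beta, standard_error, z=1.96):
    """Return (odds ratio, lower bound, upper bound).

    Assumes the estimated log odds ratio (beta) is approximately normal,
    so the odds ratio itself is approximately log-normal.
    """
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return math.exp(beta), lower, upper

def differs_from_one(beta, standard_error, z=1.96):
    """True if the interval around the odds ratio excludes 1."""
    _, lower, upper = odds_ratio_interval(beta, standard_error, z)
    return lower > 1.0 or upper < 1.0

# Hypothetical estimate and standard error, for illustration only.
print(odds_ratio_interval(0.7, 0.2))
print(differs_from_one(0.7, 0.2))
```

Because the interval is computed on the log scale, it is not symmetric around the odds ratio itself.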
The logistic models described in Tables II.2.53, II.2.54, II.2.55 and II.2.56 (Chapter 2) measure how the probability of experiencing work-related stress “a lot” (binary outcome variable) varies across teachers as a function of specific task intensities (expressed in number of hours, i.e. continuous explanatory variable) and of their quadratic terms, to take into account possible nonlinearities.
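Once estimated, the coefficients of the logistic model are converted into probabilities as follows:
$$P(Y=1 \mid \text{intensity}_i) = \frac{\exp(\beta_0 + \beta_1\,\text{intensity}_i + \beta_2\,\text{intensity}_i^2)}{1 + \exp(\beta_0 + \beta_1\,\text{intensity}_i + \beta_2\,\text{intensity}_i^2)}$$
where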
P (Y=1|intensity) is the probability of experiencing work-related stress “a lot”, given the number of hours task i is performed (intensity i, with i being teaching, individual planning or preparation of lessons, marking/correcting student work, general administrative work)
β0, β1, β2 are the coefficients of the logistic model, β0 being the intercept
Finally, the probability of experiencing work-related stress “a lot” at a given task intensity is multiplied by 100 in order to obtain the expected share of teachers experiencing stress in their work “a lot” at the given task intensity.
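The conversion from coefficients to an expected share of teachers can be sketched as follows, assuming a model with a linear and a quadratic term in task intensity as described above. The function name and the coefficient values are hypothetical and do not come from the tables cited.

```python
import math

def expected_stress_share(hours, beta_0, beta_1, beta_2):
    """Expected share (in %) of teachers experiencing stress "a lot"
    at a given task intensity, expressed in hours."""
    # Log odds with a linear and a quadratic term in task intensity.
    log_odds = beta_0 + beta_1 * hours + beta_2 * hours ** 2
    # Inverse logit converts the log odds into a probability.
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    # Multiply by 100 to express the probability as a percentage share.
    return 100.0 * probability

# Hypothetical coefficients, for illustration only.
b0, b1, b2 = -2.0, 0.05, -0.0005
for hours in (0, 10, 20, 30, 40):
    print(hours, round(expected_stress_share(hours, b0, b1, b2), 1))
```

Evaluating the function over a range of hours shows how the quadratic term lets the predicted share rise at a changing rate rather than along a single logistic curve in hours.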