PHP4006 Advanced Statistics Assignment
Section A (Answer ALL questions)
Answer each question as TRUE or FALSE in the answer book provided.
- The null hypothesis of the Shapiro-Wilk test can be given as “The sample comes from a normally distributed population”.
- Outliers are influential cases.
- SPSS has a non-parametric equivalent to a Factorial ANOVA for a completely randomized design.
- If p = .01, you can conclude that there is a 1% chance the null hypothesis is true.
- The odds of an event are the ratio of the probability of an event happening to the probability of the event not happening.
- The dependent/outcome variable in a logistic regression is a binary variable.
- In a Principal Component Analysis, the first component accounts for more of the variability in the data than the second component.
- A MANOVA assumes that the variables in each group are normally distributed.
- A loglinear model with two factors is equivalent to a contingency table analysis.
- An ANCOVA allows there to be errors in the measured covariate.
Section B (Answer THREE questions)
- You are planning a study of the effectiveness of two therapies for treating depression. Group A will receive the current therapy and Group B will receive a new and improved therapy. During the investigation all patients will be assessed three times over a total of 12 weeks of treatment. You are trying to decide whether the new therapy is more effective than the current therapy.
- a) State the name and design of the statistical test you intend to use. (3 marks)
- b) State the null hypotheses of your test chosen in (a). (3 marks)
- c) Explain what the significance level of the chosen test is. (2 marks)
- d) State the assumptions that the chosen test requires to hold. (6 marks)
- e) Describe how you would assess whether the assumptions hold. (6 marks)
- f) State two other analyses you would consider using if the assumptions do not hold. (5 marks)
- Write short notes describing the type of data, the question to be answered and your understanding of how each of the following statistical techniques works.
- i. Factor Analysis (6 marks)
- ii. Path Analysis (6 marks)
- iii. Structural Equation Modelling (6 marks)
Then describe a dataset and justify the use of ONE of the above techniques. DO NOT use a dataset used in the lectures. (7 marks)
- You have studied the impact of cognitive fatigue on behavioural impulsivity. You measured impulsivity using a decision-making task where a higher score represents less impulsivity. 60 participants were allocated to one of three conditions: a control condition and two conditions inducing cognitive fatigue (Mild or High). After the treatment, participants completed the decision-making task. The output for the One-way independent ANOVA is below (assumptions were met).
- Calculate the Omega-squared effect size for the differences in impulsivity scores across conditions from the information below. (9 marks)
- Calculate the Pearson’s r effect size for the statistically significant contrast result using the information below. (7 marks)
- Explain some advantages of using effect sizes in addition to measures of statistical significance. (5 marks)
- Briefly summarise the main findings of the data below. (4 marks)
Descriptives
Impulsivity
| | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean: Lower Bound | 95% CI for Mean: Upper Bound | Minimum | Maximum |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Control | 20 | 100.80 | 8.817 | 1.972 | 96.67 | 104.93 | 79 | 114 |
| Mild | 20 | 85.05 | 11.009 | 2.462 | 79.90 | 90.20 | 65 | 100 |
| High | 20 | 81.10 | 6.601 | 1.476 | 78.01 | 84.19 | 64 | 96 |
| Total | 60 | 88.98 | 12.319 | 1.590 | 85.80 | 92.17 | 64 | 114 |
ANOVA
Impulsivity
| | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- |
| Between Groups | 4345.033 | 2 | 2172.517 | 26.874 | .000 |
| Within Groups | 4607.950 | 57 | 80.841 | | |
| Total | 8952.983 | 59 | | | |
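As an illustration of how the omega-squared calculation can be set up, the sketch below uses one common textbook form for a one-way independent design, omega² = (SS_between - df_between × MS_within) / (SS_total + MS_within). The choice of this particular formula and the function name `omega_squared` are assumptions made for the example; the input values are read from the ANOVA table above.

```python
# Values read from the one-way ANOVA table above.
ss_between = 4345.033   # Between Groups sum of squares
ss_total = 8952.983     # Total sum of squares
df_between = 2          # Between Groups degrees of freedom
ms_within = 80.841      # Within Groups mean square


def omega_squared(ss_between, ss_total, df_between, ms_within):
    """Omega-squared for a one-way independent ANOVA (assumed textbook form)."""
    return (ss_between - df_between * ms_within) / (ss_total + ms_within)


omega_sq = omega_squared(ss_between, ss_total, df_between, ms_within)
print(f"omega squared = {omega_sq:.3f}")
```

With these inputs the result is roughly 0.46, slightly below the eta-squared ratio SS_between / SS_total, because omega-squared corrects for the bias in that ratio.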
Contrast Coefficients
| Contrast | Fatigue Condition: Control | Mild | High |
| --- | --- | --- | --- |
| 1 | -2 | 1 | 1 |
| 2 | 0 | -1 | 1 |
Contrast Tests
Impulsivity
| | Contrast | Value of Contrast | Std. Error | t | df | Sig. (2-tailed) |
| --- | --- | --- | --- | --- | --- | --- |
| Assume equal variances | 1 | -35.45 | 4.925 | -7.198 | 57 | .000 |
| Assume equal variances | 2 | -3.95 | 2.843 | | | .170 |
| Does not assume equal variances | 1 | -35.45 | 4.877 | -7.268 | 37.957 | .000 |
| Does not assume equal variances | 2 | -3.95 | 2.870 | -1.376 | 31.096 | .179 |
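A contrast t statistic can be converted to a Pearson's r effect size with the standard relation r = sqrt(t² / (t² + df)). The sketch below applies it to contrast 1 under the equal-variances row (t = -7.198, df = 57); using the equal-variances row is an assumption based on the statement that the ANOVA assumptions were met, and the function name `r_from_t` is illustrative.

```python
import math


def r_from_t(t, df):
    """Convert a t statistic and its degrees of freedom to an effect size r."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# Contrast 1 (Control vs. the two fatigue conditions), equal variances assumed.
t_contrast1 = -7.198
df_contrast1 = 57

r_contrast1 = r_from_t(t_contrast1, df_contrast1)
print(f"r = {r_contrast1:.2f}")
```

The result is about 0.69. The formula squares t, so r is returned as a magnitude; the direction of the effect has to be read from the sign of the contrast value.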
- The data analysed below is on glucose control in diabetic patients. Good control is measured by a low value of Glucose in the blood.
| Role | Variable | Description |
| --- | --- | --- |
| response | G | Glucose in the blood |
| predictors | K | Knowledge of the illness |
| | F | Measure of attribution called fatalistic externalism |
| | D | Duration of the illness in months |
| | S | Length of schooling: 0 = less than 13 years, 1 = more than 13 years |
The output below is taken from the use of a forced entry regression and a forward regression to predict Glucose from Knowledge, Fatalism, Duration and Schooling.
- How many diabetic patients were in the study? (1 mark)
- Which predictor variable does the forced entry regression suggest is the best predictor of Glucose? Quote the p-value. (marks)
- Which predictor variable does the forward regression suggest is the best predictor of Glucose? Quote the p-value. (2 marks)
- Explain why the best predictor chosen by the forward regression does not have to be the same as that suggested by the forced entry regression. (4 marks)
- From the forced entry regression, state the model equation and predict the glucose level of a person with Duration = 141, Fatalism = 19, Knowledge = 36 and Schooling = 1. (10 marks)
- Using the forward regression output, what conclusions do you come to about the use of the 4 predictors to predict blood Glucose level? (marks)
- Looking at the definitions of the four predictor variables, why would you perform further analyses, and what would they be? (4 marks)
Regression
Variables Entered/Removed (b)
| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | K, D, S, F (a) | | |
a. All requested variables entered.
b. Dependent Variable: G
Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | | | | 18.198 |
a. Predictors: (Constant), K, D, S, F
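ANOVA (b)
| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 5557.055 | 4 | 1389.264 | 4.195 | .004 (a) |
| | Residual | 20863.710 | 63 | 331.170 | | |
| | Total | 26420.765 | 67 | | | |
a. Predictors: (Constant), K, D, S, F
b. Dependent Variable: G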
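Two quantities can be recovered directly from a regression ANOVA table like the one above: the proportion of variance explained (regression sum of squares divided by total sum of squares) and the number of cases (total degrees of freedom plus one). The following is a minimal sketch with the figures from the forced entry table; the variable names are illustrative.

```python
# Values read from the forced entry regression ANOVA table above.
ss_regression = 5557.055
ss_total = 26420.765
df_total = 67

# Proportion of variance in G explained by the four predictors together.
r_squared = ss_regression / ss_total

# Total df in a regression ANOVA is n - 1, so n is df_total + 1.
n_cases = df_total + 1

print(f"R squared = {r_squared:.3f}, n = {n_cases}")
```

The same two lines of arithmetic apply unchanged to the forward regression ANOVA table, which allows the two models to be compared on variance explained.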
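Coefficients (a)
| Model | | B | Std. Error | Beta | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 130.385 | | | 6.983 | .000 |
| | F | .028 | .485 | .008 | .057 | .955 |
| | D | -.053 | .028 | -.225 | -1.890 | .063 |
| | S | -11.283 | 4.850 | -.284 | -2.327 | .023 |
| | K | -.731 | .364 | -.267 | -2.008 | .049 |
a. Dependent Variable: G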
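A fitted regression equation is applied by multiplying each predictor value by its unstandardized coefficient and adding the constant. The sketch below does this with the forced entry coefficients above and the predictor values given in the question (Duration = 141, Fatalism = 19, Knowledge = 36, Schooling = 1). The dictionary layout and the function name `predict_glucose` are assumptions for illustration only.

```python
# Unstandardized coefficients from the forced entry Coefficients table above.
constant = 130.385
coefficients = {"F": 0.028, "D": -0.053, "S": -11.283, "K": -0.731}


def predict_glucose(values):
    """Return the predicted G: constant plus the sum of coefficient * value."""
    return constant + sum(coefficients[name] * values[name] for name in coefficients)


# Predictor values stated in the question.
person = {"D": 141, "F": 19, "K": 36, "S": 1}

predicted_g = predict_glucose(person)
print(f"Predicted glucose = {predicted_g:.3f}")
```

Written out, the equation is G = 130.385 + 0.028 F - 0.053 D - 11.283 S - 0.731 K, which gives a predicted value of about 85.8 for this person.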
Regression
Variables Entered/Removed (a)
| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | K | . | Forward (Criterion: Probability-of-F-to-enter <= .050) |
a. Dependent Variable: G
Model Summary
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | | | | 18.810 |
a. Predictors: (Constant), K
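ANOVA (b)
| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 3067.774 | 1 | 3067.774 | 8.670 | .004 (a) |
| | Residual | 23352.990 | 66 | 353.833 | | |
| | Total | 26420.765 | 67 | | | |
a. Predictors: (Constant), K
b. Dependent Variable: G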
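Coefficients (a)
| Model | | B | Std. Error | Beta | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 126.438 | | | 11.056 | .000 |
| | K | -.932 | .317 | -.341 | -2.945 | .004 |
a. Dependent Variable: G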
Excluded Variables (b)
| Model | | Beta In | t | Sig. | Partial Correlation | Collinearity Statistics: Tolerance |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | F | -.023 (a) | -.173 | .863 | -.021 | .759 |
| | D | -.162 (a) | -1.401 | .166 | -.171 | .988 |
| | S | -.231 (a) | -1.920 | .059 | -.232 | .889 |
a. Predictors in the Model: (Constant), K
b. Dependent Variable: G