Advanced statistical methods in epidemiology
Assessment Exercise
For the assessment exercise, you are asked to analyse data from a cohort study of childhood malaria, carried out in a rural population in Northern Ghana.
Infants residing in a random sample of households in the study area were enrolled in the cohort when they attended a health facility for their first DPT vaccination. Follow-up was through passive case detection at local health facilities. An episode of clinical malaria was diagnosed if a child presented to a facility with fever and if malaria parasites were detected on microscopy. Data on potential risk factors for clinical malaria were obtained from a household survey carried out in the same study area.
A total of 875 infants were enrolled between September 2000 and June 2002, and follow-up continued until June 2004.
You are asked to use the data from this cohort study to address the following questions:
- What was the incidence rate of clinical malaria in this cohort?
- How did this rate vary over time?
- What were the risk factors for clinical malaria in this cohort?
- Was there any evidence that the incidence of clinical malaria was either increased or decreased if children had experienced a previous episode?
In addressing these questions you may use either or both of the datasets provided.
The data
Two datasets are provided. The first is restricted to the first event of clinical malaria experienced by each child during the follow-up period. The second gives data on all events.
Important note: Please make sure you download these two new datasets from the U: drive on or after Friday 22 April as these replace earlier datasets that are now obsolete.
The Stata dataset ghana.first.2016.dta contains 875 observations and 21 variables as shown below. In this dataset, observation ends at the time (devent) of the first episode of malaria (malepi = 1) or exit from the study (malepi = 0).
Variable name | Description | Coding
id | Unique identification number |
hhsize | Number of household members |
nroom | Number of rooms in household |
ethnicity | Ethnicity of household head | 1 = Kassem, 2 = Nankam, 3 = Buli, 4 = Other
religion | Religion of household head | 1 = Traditional, 2 = Catholic, 3 = Muslim
nhis | Health insurance status of household members | 1 = Yes, all; 2 = Yes, household head; 3 = Yes, others; 4 = Yes, household head and others; 5 = None
water | Source of drinking water of household | 1 = Bought, 2 = Piped water, 3 = Borehole/well, 4 = Surface water
electric | Source of light for household from electric grid | 1 = Yes, 2 = No
tele | Household possession of television | 1 = Yes, 2 = No
radio | Household possession of radio | 1 = Yes, 2 = No
netsum | Number of bednets possessed by household |
netuse | Study child uses bednet | 1 = Yes, 2 = No
itnuse | Study child uses insecticide-treated bednet | 1 = Yes, 2 = No
urban | Urban residence | 1 = Yes, 2 = No
dob | Date of birth |
sex | Sex of child | 1 = Male, 2 = Female
doe | Date of enrolment |
vistat | Vital status at exit from study | 1 = Alive, 2 = Died/Migrated
doexit | Date of exit from study |
malepi | Episode of clinical malaria | 0 = Exit from study, 1 = Episode of malaria
devent | Date of first episode or exit from study |
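As a rough illustration of how the first-event records relate to an incidence rate, the sketch below divides the number of first episodes (malepi = 1) by the total time at risk. It assumes that time at risk runs from doe to devent and that a year is 365.25 days. The three records are made-up toy values, not taken from the dataset, and the function name incidence_rate is hypothetical.

```python
from datetime import date

# Hypothetical toy records (NOT real study data), using the variable names
# from the table above: doe = enrolment, devent = first episode or exit.
records = [
    {"id": 1, "doe": date(2001, 1, 1), "devent": date(2001, 7, 2), "malepi": 1},   # 182 days
    {"id": 2, "doe": date(2001, 3, 1), "devent": date(2002, 3, 1), "malepi": 0},   # 365 days
    {"id": 3, "doe": date(2000, 10, 1), "devent": date(2001, 4, 1), "malepi": 1},  # 182 days
]

DAYS_PER_YEAR = 365.25  # assumed conversion from days to years


def incidence_rate(rows):
    """Return (events, person_years, rate per person-year).

    Assumes each child is at risk from enrolment (doe) until devent.
    """
    events = sum(r["malepi"] for r in rows)
    days_at_risk = sum((r["devent"] - r["doe"]).days for r in rows)
    person_years = days_at_risk / DAYS_PER_YEAR
    return events, person_years, events / person_years


events, person_years, rate = incidence_rate(records)
print(f"{events} events in {person_years:.2f} person-years: "
      f"{rate:.2f} per person-year")
```

In Stata the same quantities would normally come from the survival-time commands; the Python version is only meant to make the arithmetic explicit.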
The Stata dataset ghana.all.2016.dta contains 2522 observations and 23 variables as follows:
Variable name | Description | Coding
id | Unique identification number |
hhsize | Number of household members |
nroom | Number of rooms in household |
ethnicity | Ethnicity of household head | As above
religion | Religion of household head | As above
nhis | Health insurance status of household members | As above
water | Source of drinking water of household | As above
electric | Source of light for household from electric grid | As above
tele | Household possession of television | As above
radio | Household possession of radio | As above
netsum | Number of bednets possessed by household |
netuse | Study child uses bednet | As above
itnuse | Study child uses insecticide-treated bednet | As above
urban | Urban residence | As above
dob | Date of birth |
sex | Sex of child | As above
doe | Date of enrolment |
vistat | Vital status at exit from study | As above
doexit | Date of exit from study |
start | Start date of time period |
end | End date of time period |
malepi | Episode of clinical malaria | 0 = Exit from study, 1 = Episode of malaria
prevmal | Number of previous episodes of clinical malaria since enrolment |
In the above dataset, there are multiple records for children who experienced an episode of clinical malaria. Each record corresponds to a period at risk, and ends either with a malaria episode (malepi = 1) or exit from the study (malepi = 0).
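The multiple-record layout described above can be sketched as follows: each record contributes end minus start days at risk and at most one event, so events and person-time can be totalled within levels of prevmal. The records are made-up toy values, not taken from the dataset. The sketch assumes a new period starts on the day the previous one ends, which may not match how the real records were constructed. The function name rates_by_prevmal is hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy records (NOT real study data): one row per period at risk.
periods = [
    # Child 1: two episodes, then exit from the study.
    {"id": 1, "start": date(2001, 1, 1), "end": date(2001, 4, 11), "malepi": 1, "prevmal": 0},
    {"id": 1, "start": date(2001, 4, 11), "end": date(2001, 7, 20), "malepi": 1, "prevmal": 1},
    {"id": 1, "start": date(2001, 7, 20), "end": date(2002, 7, 20), "malepi": 0, "prevmal": 2},
    # Child 2: no episodes, a single record ending at exit.
    {"id": 2, "start": date(2001, 6, 1), "end": date(2002, 6, 1), "malepi": 0, "prevmal": 0},
]

DAYS_PER_YEAR = 365.25  # assumed conversion from days to years


def rates_by_prevmal(rows):
    """Map prevmal -> (events, person_years, rate per person-year)."""
    totals = defaultdict(lambda: [0, 0])  # prevmal -> [events, days at risk]
    for r in rows:
        totals[r["prevmal"]][0] += r["malepi"]
        totals[r["prevmal"]][1] += (r["end"] - r["start"]).days
    result = {}
    for level, (events, days) in sorted(totals.items()):
        person_years = days / DAYS_PER_YEAR
        result[level] = (events, person_years, events / person_years)
    return result


result = rates_by_prevmal(periods)
for level, (events, person_years, rate) in result.items():
    print(f"prevmal={level}: {events} events, "
          f"{person_years:.2f} person-years, rate {rate:.2f}")
```

These are crude rates only: the sketch treats every record as independent and takes no account of the fact that several records can belong to the same child.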
There were 875 children and a total of 1648 episodes of clinical malaria:
- 200 children had no episodes
- 227 children had 1 episode
- 188 children had 2 episodes
- 123 children had 3 episodes
- 76 children had 4 episodes
- 31 children had 5 episodes
- 30 children had more than 5 episodes
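The tallies above can be cross-checked with a little arithmetic, which is a useful sanity check to repeat after loading the data. The counts below are copied from the text; the key 6 is just a placeholder for the "more than 5" group, whose exact episode counts are not given.

```python
# Number of children by number of episodes, copied from the text.
# Key 6 is a placeholder meaning "more than 5 episodes".
children_by_episodes = {0: 200, 1: 227, 2: 188, 3: 123, 4: 76, 5: 31, 6: 30}
TOTAL_EPISODES = 1648  # total stated in the text

# The groups should account for every child in the cohort.
total_children = sum(children_by_episodes.values())

# Episodes contributed by children with 0 to 5 episodes.
episodes_up_to_5 = sum(k * n for k, n in children_by_episodes.items() if k <= 5)

# The remainder must come from the 30 children with more than 5 episodes.
episodes_in_top_group = TOTAL_EPISODES - episodes_up_to_5

print(total_children, episodes_up_to_5, episodes_in_top_group)
```

The groups sum to 875 children. Children with up to 5 episodes account for 1431 episodes, so the 30 children with more than 5 episodes account for the remaining 217, roughly 7 episodes each on average.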
Background notes
Malaria is caused by a parasite transmitted by the bite of an infected anopheline mosquito.
Worldwide, it causes around 600,000 deaths each year, mostly among young children in sub-Saharan Africa. After repeated infections, children gradually develop partial immunity to the parasite. Not all infections give rise to severe symptoms, and not all symptomatic cases are seen or treated at health facilities.
The project report
Your report on this analysis should be printed on A4 paper (single-spaced), and must be no more than 5 pages including any tables or figures. Small fonts and small margins must not be used. The report should include:
- A brief discussion of the strategy you used in analysing the data (maximum 1 page). This should be more detailed than a Methods section of a scientific paper, since you should make explicit the structure of your analyses.
- A concise presentation of your results, including tables and figures as appropriate. Because of the strict space limits, you will need to be selective in the analyses you present.
- A brief discussion, summarising your main conclusions and discussing any potential sources of error or bias.
Criteria for grading
5 Excellent: An outstanding report which clearly answers the question, shows in-depth understanding of the analysis and is well explained.
4 Very good: A thorough analysis, with all relevant information reported.
3 Good: Sound analysis, but some relevant points are omitted and/or the presentation lacks clarity.
2 Satisfactory: Basic understanding of major points is shown, but with some errors in the analysis or interpretation, or muddled presentation.
1 Unsatisfactory: Inadequate analysis and lack of understanding shown.
0 Very poor: Serious lack of understanding shown; inappropriate analysis used.