Major Research Projects
Statistical Inference for Probability-Linked Longitudinal Data
Linked datasets are critical for scientific research. Recent research on the use of linked data has focused on confidentiality, essentially ignoring linkage errors. These errors arise when distinct population units are linked or when a population unit is not linked. They can lead to incorrect and inefficient inferences if ignored, reducing the power to detect important changes and relationships. This project will lead to new and efficient methods for statistical analysis of linked datasets, particularly linked longitudinal datasets. It will develop measures for information loss caused by linkage, allowing assessment of the cost-benefit trade-off in the creation of linked datasets.
The Strategic Roadmap for the Australian Government's National Collaborative Research Infrastructure Strategy states that analysis of linked data, and particularly linked longitudinal data, has the potential to revolutionise Australian public health research. Similar benefits should flow from analysis of linked datasets in other areas, e.g. the Statistical Longitudinal Census Dataset that the Australian Bureau of Statistics intends to create by linking individual records across censuses. These benefits will be maximised by controlling the impact of linkage error when analysing these datasets. This project will develop the statistical theory and related methodology to solve this problem in a statistically efficient manner.
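The effect of ignoring linkage error can be illustrated with a small simulation. The sketch below is not the project's methodology. It assumes a deliberately simple, hypothetical error mechanism in which a random fraction of records is linked to the wrong unit, and the function names and parameter values are illustrative only. Under that assumption the fitted regression slope is pulled towards zero, by roughly the proportion of wrong links.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_linkage_error(n=20000, beta=2.0, error_rate=0.2, seed=1):
    """Return (slope with perfect linkage, slope with some records mislinked)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # variable held on file A
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]  # variable held on file B

    # Assumed error mechanism (hypothetical): a random subset of records has
    # its y values permuted within the subset, so most of them are linked to
    # the wrong unit.
    wrong = [i for i in range(n) if rng.random() < error_rate]
    donors = wrong[:]
    rng.shuffle(donors)
    y_linked = y[:]
    for i, j in zip(wrong, donors):
        y_linked[i] = y[j]

    return ols_slope(x, y), ols_slope(x, y_linked)


slope_true, slope_linked = simulate_linkage_error()
print(f"perfect linkage: {slope_true:.2f}, with linkage error: {slope_linked:.2f}")
```

With these settings the slope estimated from the mislinked file falls well below the slope from the correctly linked file, which is the kind of bias that analysis methods for linked data need to correct.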
The Dynamic Analyses to Optimise Ageing (DYNOPTA)
The Dynamic Analyses to Optimise Ageing (DYNOPTA) project involves data from nine Australian longitudinal surveys that either are aimed specifically at older people or include a sample of older people, with a combined pool of over 50 000 participants. DYNOPTA aims to combine the data from all of these surveys to produce more reliable estimates and more powerful analyses, which may be useful for making decisions about public health and policy. The objective of the project is to use the analysis of all or some of the surveys to examine research questions in ageing. Pooling can increase the power of analyses by increasing the effective sample size, particularly the number of cases of rare conditions. Combining the data may also widen the range of people and explanatory variables in the analysis, again increasing power, and it can help adjust for coverage bias or other biases in particular studies.
The nine studies cover a variety of survey designs and differ in their geographic and demographic coverage as well as in their years of observation. Some surveys used simple random samples; others used stratified or cluster designs. This project considers the statistical methodologies for combining, integrating and analysing the survey data. There are major issues of adjustment and calibration for differences in sample design, weighting procedures, variable definitions, collection methods and missing data.
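One simple way to see why combining studies gives more reliable estimates is inverse-variance pooling. This is a minimal sketch rather than the DYNOPTA methodology: it assumes the studies provide independent, unbiased estimates of the same quantity with known variances, which sets aside the design and harmonisation issues described above. The function name and the numbers are made up for illustration.

```python
def pooled_estimate(estimates, variances):
    """Inverse-variance weighted mean of independent estimates and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total


# Hypothetical prevalence estimates of a rare condition from three surveys.
estimates = [0.040, 0.055, 0.048]
variances = [0.00010, 0.00020, 0.00040]

pooled, pooled_var = pooled_estimate(estimates, variances)
print(f"pooled estimate: {pooled:.4f}, pooled variance: {pooled_var:.6f}")
```

Under these assumptions the pooled variance is always smaller than the variance of the most precise single study, which is the sense in which pooling increases the effective sample size.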
The DYNOPTA project is led by Associate Professor Anstey at the Australian National University. Professor David Steel is a chief investigator at the University of Wollongong, with other chief investigators from the University of Canberra, the University of Newcastle, the University of Sydney, Monash University and the University of New South Wales. More information is available on the DYNOPTA website: http://dynopta.anu.edu.au
The role of households, neighbourhoods and networks in social statistics
This project will determine the importance of accounting for households, spatial proximity and social networks in the design, estimation and analysis of social statistics. It will develop criteria indicating when it is necessary to account for lack of independence between people, and methods to identify the important sources of that dependence. It will also create new methods to appropriately incorporate households and other groups and networks in the analysis of social survey data, and determine the implications for the design of social surveys.
The project involves Professor David Steel, Professor Ray Chambers and Dr Robert Clark of the Centre for Statistical and Survey Methodology, University of Wollongong, and Dr Mark Tranmer and Dr Mark Elliot of the Centre for Census and Survey Research, University of Manchester. It is funded by a grant under the Linkage International Social Sciences Collaboration between the Australian Research Council (ARC) and the UK Economic and Social Research Council, with combined funding of $390,000.
Generalised Linear Mixed Models: Theory and New Areas of Application
This grant will be used to significantly advance and widen the scope of generalised linear mixed models (GLMM) in several directions.
It will include the development of new theory and methodology for GLMM analysis and inference. New areas of application of GLMM will be explored and GLMM research will be suitably modified. The research will draw upon recent developments in Statistics and Computing Science: variational approximation, sequential Monte Carlo, slice sampling, kernel machines and models for extreme data. Specifically, it will develop the following:
- Variational approximation theory for approximate GLMM fitting and inference.
- Fast, simple and effective Monte Carlo methods for fitting and inference.
- Mixed model-based splines for extreme values.
- GLMM geared towards Machine Learning and Data Mining.
- GLMM for analysis of tissue array data.
Statistical Methods for Flow Cytometric Data
Flow cytometry is a method by which multiple characteristics of single cells or other particles are simultaneously measured as they pass through a laser beam in a fluid stream.
It continues to expand as a tool for medical and biological research, particularly in the areas of immunology, molecular biology, parasitology and haematology. The ability of flow cytometry to measure multiple characteristics of cells is a major advantage compared to conventional cell-based arrays, which are limited to recording population averages.
A typical flow cytometry experiment results in multivariate samples of size about 100,000, corresponding to the number of cells. The properties of each cell are described by a multi-dimensional vector. Current commercial technology supports dimensions between 3 and 13. Two of the dimensions usually correspond to the intensity of forward scatter and side scatter, which characterise the physical properties of the cell (e.g. size and granularity). The remaining dimensions correspond to the intensity of the cell's fluorescence at a given wavelength (colour).
With funding from an Australian Research Council Discovery Project, we are developing new statistical methodology to aid flow cytometric data analysis. Even though the methodology is driven by issues arising in flow cytometry research, it is mostly generic and applicable to other large multi-dimensional settings. Examples of methodology developed so far include: feature significance for multivariate densities, on generalised linear mixed model ideas.
Outcomes from the grant are regularly posted on the grant's website.
Handling Missing Data in Complex Household Surveys
Household surveys are a widely used methodology in research, government and industry. These surveys are affected by missing data at a variety of levels, such as time points within person, people within households and households within areas. This project will create methods which deal with missing data incorporating these complex multi-level structures.
The project is funded by an ARC Linkage Proposal, with industry partner the Australian Bureau of Statistics. The total funding from the ARC and the ABS is $103,950 over three years. The chief investigators are Dr Robert Clark and Professor Ray Chambers. A PhD student, Luise Lago, commenced work on the project in July, 2007.
Efficient Design for Generalized Linear Models
In industrial, commercial and social research, we collect data in order to predict the outcome of a process based on the inputs to that process. We want to maximize the information that is gained from the data. Good planning is crucially important to achieve this. This project will determine how best to select the inputs to the process for many situations that occur in research. A computer package to answer these questions will be written. The nation will benefit from a fundamental increase in efficiency of research and, therefore, in efficient use of research dollars.
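As an illustration of what "selecting the inputs" means for a generalised linear model, the sketch below searches for a locally D-optimal two-point design for a logistic regression with one input. It is a textbook-style example, not the project's software. It assumes the parameter values are known in advance (intercept 0 and slope `beta1`), restricts attention to designs with equal weight at `-c` and `+c`, and uses hypothetical function names.

```python
import math


def logistic_information_det(points, weights, beta0, beta1):
    """Determinant of the 2x2 Fisher information matrix for a logistic model."""
    m00 = m01 = m11 = 0.0
    for x, w in zip(points, weights):
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        v = w * p * (1.0 - p)  # GLM weight at this design point
        m00 += v
        m01 += v * x
        m11 += v * x * x
    return m00 * m11 - m01 * m01


def best_symmetric_design(beta1=1.0, step=0.001, max_c=5.0):
    """Grid search for the c maximising the determinant of the design {-c, +c}.

    Assumes an intercept of zero, so that a design symmetric about 0 is sensible.
    """
    best_c, best_det = None, -1.0
    n_steps = int(round(max_c / step))
    for k in range(1, n_steps + 1):
        c = k * step
        det = logistic_information_det([-c, c], [0.5, 0.5], 0.0, beta1)
        if det > best_det:
            best_c, best_det = c, det
    return best_c, best_det


c_opt, _ = best_symmetric_design(beta1=1.0)
print(f"best design points: -{c_opt:.3f} and +{c_opt:.3f}")
```

With a slope of 1 the search settles near ±1.54, where the success probabilities are about 0.18 and 0.82. Changing `beta1` moves the best points, which shows the basic difficulty in designing experiments for generalised linear models: the best design depends on the parameters that the experiment is meant to estimate.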
The chief investigators are A/Prof KG Russell, Prof JA Eccleston, Prof SM Lewis and Dr DC Woods. The project has received funding of $164,000 for 2006-2008.
New methods for small group analysis from sample surveys
Ray Chambers and David Steel are chief investigators on this linkage project, which commenced in January 2008. The industry partners are the Australian Bureau of Statistics, the Australian Bureau of Agricultural and Resource Economics, the NSW Department of Health Centre for Epidemiology and Research, and the NZ Ministry of Health.
Total funding is $601,000.
The aims of this project are:
- To create and evaluate cost-effective methods to produce data for small groups using statistical modeling to combine survey data with information from other sources.
- To develop methods for small sample inference from sample surveys, based on unit level models with correlated group effects.
- To develop a theory of robust inference for group effects applicable to small samples, where group definitions are flexible and can depend on the variable of interest.
- To determine the implications of the need to produce data for small groups for sample design, particularly in multistage surveys and repeated surveys.
- To provide high quality research training in applied statistics for research students in an area where there is strong demand in industry.
- To provide the opportunity for a postdoctoral researcher to establish international research credentials and industry relevant expertise.
- To establish ongoing research collaborations with four major organizations that conduct surveys and analyse the resulting data for small groups.
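The first aim, combining survey data with information from other sources, is commonly illustrated by a composite (shrinkage) estimator of the kind that arises in standard area-level small area models. The sketch below is only an illustration of that general idea and not the methods developed in this project. It treats both variances as known, and the function name and numbers are hypothetical.

```python
def composite_estimate(direct, direct_var, synthetic, model_var):
    """Shrink a direct survey estimate towards a model-based (synthetic) one.

    direct      -- estimate from the survey sample in the small group
    direct_var  -- sampling variance of the direct estimate (assumed known)
    synthetic   -- prediction from a model using auxiliary information
    model_var   -- variance of the group effect around the model (assumed known)
    """
    gamma = model_var / (model_var + direct_var)  # weight on the direct estimate
    return gamma * direct + (1.0 - gamma) * synthetic, gamma


# Hypothetical small group: a noisy direct estimate and a model prediction.
estimate, gamma = composite_estimate(
    direct=0.30, direct_var=0.01, synthetic=0.20, model_var=0.0025
)
print(f"weight on direct estimate: {gamma:.2f}, composite estimate: {estimate:.3f}")
```

The smaller the sample in a group, the larger its sampling variance and the more weight the composite estimate places on the model prediction.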
The research consists of five separate but related strands:
- A. Efficient sample design and weighting with spatially correlated group effects.
- B. Rotating panel designs for measuring change.
- C. M-quantile models for spatially correlated data and categorical variables.
- D. Outlier robust inference for small domains and small areas.
- E. Statistical modeling and inference under informative grouping.
The following projects are currently underway as part of the overall research project.
Personnel employed as part of the project
| Name | Position | Title |
| --- | --- | --- |
| George Sofronov | Post-doctorate | Efficient small area estimation with spatially correlated data |
| Diane Hindmarsh | PhD | Small area estimation and analysis: theory and application to health surveys |
| Anang Kurnia | PhD | Small area inference with long tailed distributions |
Other people associated with the project include Hukum Chandra from the Indian Agricultural Statistics Research Institute, New Delhi; Nikos Tzavidis from the University of Manchester; and Nicola Salvati and Monica Pratesi from the University of Pisa. They are working on research related to strands C and D of the project in association with Ray Chambers.
Other students in the School of Mathematics and Applied Statistics at the University of Wollongong who are working on related topics include: Mohammad Namazi Rad: 'Topics in Survey Design and Analysis' under the supervision of David Steel and Ray Chambers, and Wilford Molefe: 'Multilevel Models and Sample Design for Small-Area Estimation' under the supervision of Robert Clark.
This project will lead to advances in statistical theory relating to small sample inference from national and sub-national surveys. These advances will have a substantial impact on the information provision and analytic capacities of each of the four partner organizations.
Links to outcomes such as talks, papers and papers in progress will be posted on this website as they become available.
Methodology Development Partnership with Australian Bureau of Statistics
The pressing need to establish a program of research into new and innovative methods of collecting, analysing and exploiting statistical information has led to the Australian Bureau of Statistics (ABS) partnering with the University of Wollongong (UOW) to establish a Chair in Statistical Methodology.
Professor Ray Chambers commenced in this position in May 2006, following a distinguished career as Director of the Southampton Statistical Sciences Institute at the University of Southampton in the UK.
The Australian Statistician, Mr Dennis Trewin, said the agreement would forge stronger links between the ABS and the university sector and ensure that statistical methodology skills are available more widely around the nation. "Statistical methodology is an area of critical importance to the ABS and other providers of statistics. It requires specialist training to ensure quality statistics are produced for Australia. It is increasingly important in the information age and there is need for more people with these skills," he said.
Recent activities as part of this partnership include:
- A series of one-day Fellows Research Meetings held at Goulburn has been established. "Goulburn 4" was held in July 2007, and "Goulburn 5" is planned for November 2007. These meetings are hosted by UoW and include a range of researchers and statisticians from UoW, the ABS, the Australian Bureau of Research Economics and other government departments and universities.
- UoW has conducted annual symposiums at the ABS since 2005.
- UoW has presented a number of short courses at the ABS including Survey Estimation, Principles and Practices of Methodological Research and Logistic Regression.
- ABS managers and statisticians have visited UoW a number of times to discuss research plans and statistical issues at the ABS, and to plan joint activities.
- A mentoring scheme has been established where UoW academic statisticians provide technical leadership and mentoring for selected ABS staff.
An ARC Linkage project on "Handling Missing Data in Complex Household Surveys" commenced in July 2007, with industry partner the ABS. An ARC Linkage project on "New Methods for Small Group Analysis" was approved in 2007 and will commence shortly. The industry partners for this project are the ABS, the Australian Bureau of Agricultural and Resource Economics, the Centre for Epidemiology and Research in the NSW Department of Health, and Public Health Intelligence in the NZ Ministry of Health.
Sampling for Subpopulations in Household Surveys with Application to Maori and Pacific Sampling
This project was conducted in 2006-07 by a team consisting of Robert Clark (CSSM, lead investigator), Robert Templeton (NZ Ministry of Health), Mike Doherty (Statistics New Zealand) and Angela Forbes (Statistics New Zealand). The project was funded by the Statistics New Zealand Official Research Fund with a budget of NZ$69,100.
Many NZ national household surveys have a requirement to produce statistics with adequate precision both for the whole of NZ and for important subpopulations, particularly the Maori and Pacific populations. This project developed strategies for sampling subpopulations including:
- optimal allocations for two-stage sampling with screening for subpopulations;
- comparison of the Maori electoral roll with ethnicity as measured by the NZ Health Survey;
- the effect of inter-censal mobility on ethnicity benchmarks based on the NZ Census;
- the accuracy of proxy screening for ethnicity in the NZ Health Survey.
A full report has been published in the Official Statistical Research series at http://www.statisphere.govt.nz/official-statistics-research/series/vol-4.htm
Collaboration with New Zealand Ministry of Health
In October 2006, the Public Health Intelligence Unit of the New Zealand Ministry of Health (MoH) and the CSSM signed a two-year collaboration agreement, including funding of up to NZ$225,000.
The partnership includes a range of training activities and statistical projects. Projects to date include:
- Development of the sample design for the 2006-2007 New Zealand Health Survey;
- Development of preliminary sampling weights for the New Zealand Health Survey;
- Sample design and analysis advice on other surveys including the NZ Alcohol and Drug Use Survey, the NZ Oral Health Survey and the NZ Adult Nutrition Survey.
CSSM has presented short courses in Wellington to MoH and other researchers and policy analysts including:
- Sample Design and Estimation for Household Surveys (two day course, October 2005)
- Interpreting Survey Data (session as part of Public Health Intelligence Workshop, May 2006)
- Survey Quality and Cost (one day course, September 2006)
- Logistic Regression (one day course, April 2007)
- Multilevel Modelling (two day course, planned for December 2007)
Activities planned for 2008 include a scholarship program and further expansion of the statistical training program.
For more information on the New Zealand Health Survey sample design, see the attached paper (Clark and Gerritsen Symposium Proceedings Submission.pdf), which was presented at the Statistics Canada Symposium in November 2006.