Yidong Zhang

DPhil student

Learning associations between genetic variation and human diseases using large biobank datasets and statistical methods

Modelling hidden structure between genomic variations and human diseases

The relationship between genomic variation and human disease has been of great interest to researchers for decades. A large body of results has already been obtained using standard statistical methods such as GWAS. However, with next-generation sequencing data accumulating at unprecedented speed and large-scale biobank datasets being established, more advanced statistical tools are needed to mine these data further and to gain deeper insight into disease mechanisms and the relationships between diseases.

My work involves developing new statistical tools to analyse large-scale phenotype data and applying these algorithms to biobank datasets to cluster diseases. I am currently building a new Bayesian statistical model based on the widely used machine learning tool "LDA". In the near future, I will analyse UK Biobank phenotype data with the new algorithm, and various downstream analyses will be carried out based on the results.