The module covers the fundamental principles of informatics and bioinformatics applied to clinical genomics. Students are taught to find and use major genomic and genetic data resources; to use software packages, in silico tools, databases and literature searches to align sequence data to the reference genome; and to critically assess, annotate and interpret findings from genetic and genomic analyses. Theoretical sessions are coupled with practical assignments in which students analyse and annotate predefined data sets. This module is central to the Genomic Medicine Programme as it provides students with the skills to begin to analyse genomic data. This module is suitable for beginners and does not involve the use of the Unix command line or R. Students with previous bioinformatics experience are advised to take the advanced bioinformatics module.
In GM7 you will learn about:
Methods for aligning sequencing data to the reference genome using state-of-the-art alignment programs
Assessment of data quality through application of quality control measures
How to determine the analytical sensitivity and specificity of genomic tests
Use of tools to call sequence variants (e.g. GATK) and annotate variant-call files using established databases
Filtering strategies for variants, in the context of clinical data, and using publicly available control data sets
Use of multiple database sources, in silico tools and literature for pathogenicity evaluation, and familiarity with the statistical programs that support this
Principles of integration of laboratory and clinical information, and the place of best-practice guidelines for indicating the clinical significance of results
Principles of biomedical ontologies (e.g. HPO, SNOMED, ICD) and how to use them for the annotation of clinical phenotypes
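As an illustration of the sensitivity and specificity topic listed above, the following is a minimal sketch in Python. It assumes a genomic test has been compared against a truth set so that the counts of true and false positives and negatives are known. The function name and the example counts are hypothetical and are not taken from the module materials.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of real variants the test detects.
    specificity = TN / (TN + FP): fraction of non-variant sites correctly
    reported as negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true positive site and one true negative site")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical validation counts, for illustration only.
sens, spec = sensitivity_specificity(tp=95, fp=10, tn=990, fn=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these illustrative counts the test detects 95 of 100 true variants and correctly rejects 990 of 1000 non-variant sites.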
By the end of this module students will be able to:
Understand the principles applied to quality control of sequencing data, alignment of sequence to the reference genome, calling and annotating sequence variants, and filtering strategies to identify pathogenic mutations in sequencing data
Understand the challenges associated with the analysis of variation data, how to treat candidate variants given known false positive/negative rates and population frequencies as well as the implications for disease diagnosis
Interrogate major database sources of genomic sequence (e.g. Ensembl), protein sequences (e.g. UniProt), short variations (e.g. 1000 Genomes, dbSNP, HapMap, EVA), structural variation (e.g. dbVar), variant-disease association (e.g. ClinVar, OMIM, DECIPHER), GWAS and other association studies (e.g. dbGaP, EGA, GWAS Catalog), pathways (e.g. Reactome), cancer genes and variants (e.g. ICGC, COSMIC, TCGA), and integrate this information with clinical data to assess the potential pathogenic and clinical significance of the identified sequence variants
Identify and critically evaluate biomedical ontologies for the annotation of clinical phenotypes
Acquire relevant basic computational skills for handling and analysing sequencing data for application in both diagnostic and research settings
Discuss and critically evaluate statistical methods for handling and analysing sequencing data
Gain practical experience of the bioinformatics pipeline for variant calling through the Genomics England programme
Justify and defend the place of Professional Best Practice Guidelines in the diagnostic setting for the reporting of genomic variation
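The outcomes on filtering strategies and population frequencies can be sketched with a small example. This is a hedged illustration only: the record layout, the `pop_af` field name, the variant identifiers and the 1% threshold are all assumptions made for the sketch, not a prescribed clinical cut-off or a format used by any particular database.

```python
def filter_rare_variants(variants, max_af=0.01):
    """Keep variants whose control-population allele frequency is at most max_af.

    Variants with no recorded frequency (None) are kept, on the assumption
    that absence from the control data set means the variant is rare.
    """
    kept = []
    for variant in variants:
        af = variant.get("pop_af")
        if af is None or af <= max_af:
            kept.append(variant)
    return kept


# Hypothetical annotated candidates, for illustration only.
candidates = [
    {"id": "var1", "pop_af": 0.25},    # common in controls, filtered out
    {"id": "var2", "pop_af": 0.0004},  # rare, kept
    {"id": "var3", "pop_af": None},    # not seen in controls, kept
]

rare = filter_rare_variants(candidates)
print([v["id"] for v in rare])
```

A frequency filter of this kind only narrows the candidate list; the remaining variants still need pathogenicity evaluation against database, in silico and literature evidence, as described in the outcomes above.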