Genome-wide association studies with Hail, scalable genomic software
Heidi Steiner
March 16, 2023
Installations
Discuss
Hail + scalability 🏗️
Genome-wide association studies (GWAS) 🧬
Hands-on
Load CyVerse
Launch JupyterLab DataScience App
Install Java
sudo apt update
sudo apt install openjdk-8-jdk
pip install hail
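After the three commands above, a quick sanity check can confirm that both pieces are in place. The following is a minimal sketch using only the Python standard library; the function name `check_hail_prerequisites` is hypothetical and is not part of Hail. It only looks for a `java` executable and an installed `hail` package, and does not verify versions.

```python
import importlib.util
import shutil


def check_hail_prerequisites():
    """Report whether a Java runtime and the hail package can be found."""
    return {
        # The steps above install a JDK, so `java` should be on PATH.
        "java_on_path": shutil.which("java") is not None,
        # find_spec looks the package up without importing it.
        "hail_installed": importlib.util.find_spec("hail") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_hail_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If either entry reports missing, rerun the matching install command before continuing.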
Open source data science library
Scalable genomic software
Unified genomic data representation
Community 🤗
Problem:
Not feasible to process tens or hundreds of thousands of whole genomes on a single computer
Solution:
Focus on the contents of a pipeline and let Hail handle how to parallelize it
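The idea of separating what a pipeline computes from how the work is spread out can be sketched with a toy example. This is not Hail's API: it uses Python's standard-library `ThreadPoolExecutor` as a stand-in, and the data, function names, and `workers` parameter are all hypothetical. The point is the shape of the code rather than speed: the per-variant step knows nothing about parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def alt_allele_frequency(genotypes):
    """Pipeline step: alternate-allele frequency for one variant.

    genotypes holds 0, 1, or 2 alternate-allele copies per sample.
    """
    return sum(genotypes) / (2 * len(genotypes))


def run_pipeline(variants, workers=4):
    """Apply the step to every variant; the executor decides how work is spread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves input order, so results line up with variants.
        return list(pool.map(alt_allele_frequency, variants))


# Hypothetical toy data: 3 variants x 4 samples.
toy_variants = [
    [0, 1, 2, 1],
    [0, 0, 0, 1],
    [2, 2, 1, 2],
]
frequencies = run_pipeline(toy_variants)
print(frequencies)
```

Changing `workers`, or swapping the executor for a different backend, leaves `alt_allele_frequency` untouched, which is the division of labor the slide describes.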
Statistical method that surveys large numbers of genetic variants for an association with a disease (or a particular trait)
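To make the definition concrete, here is a toy sketch of the core GWAS step: test each variant, one at a time, for association with a trait. It assumes a quantitative trait and uses ordinary least squares of phenotype on genotype dosage, written in plain Python. The variant labels and data are made up for illustration, and covariates, p-values, and multiple-testing correction are left out. Hail provides its own regression methods for doing this at scale (for example `hl.linear_regression_rows`); this sketch does not use them.

```python
import math


def association_test(genotypes, phenotypes):
    """Regress phenotype on genotype dosage (0/1/2) for one variant.

    Returns (slope, t_statistic) from ordinary least squares.
    Assumes the variant is not monomorphic and the fit is not perfect.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_g
    # Residual sum of squares around the fitted line.
    rss = sum((p - (intercept + slope * g)) ** 2 for g, p in zip(genotypes, phenotypes))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


# Hypothetical toy data: one trait value per sample, two variants.
phenotypes = [1.0, 2.1, 2.9, 1.2, 2.0, 3.1]
variants = {
    "variant_1": [0, 1, 2, 0, 1, 2],
    "variant_2": [1, 0, 1, 2, 1, 0],
}

# The "survey": run the same test on every variant.
results = {name: association_test(g, phenotypes) for name, g in variants.items()}
top_hit = max(results, key=lambda name: abs(results[name][1]))
print(top_hit, results[top_hit])
```

A real study repeats this test across millions of variants and many thousands of samples, which is where the scalability discussed earlier matters.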
Back to CyVerse…