The advent of next-generation sequencing (NGS) technologies has given researchers an unparalleled opportunity to identify novel therapeutic targets for drug discovery and development. We apply and develop quantitative tools for target identification and validation based on associations between genomic structural variation ('omics) data and clinical data. Our research expertise includes managing large-scale, complex family- and population-based GWAS, NGS, and phenotype data, such as electronic medical records and longitudinal outcomes.

For each study, we perform quality control and data management, followed by analysis using an array of genomic, multivariate, machine learning, and visualization tools to identify and annotate variants for functional follow-up and drug targeting. Our team has extensive research experience in epidemiology, statistical and computational genetics and genomics, systems biology, statistics, and bioinformatics. We provide support in all phases of a project, from study design to interpreting and reporting results. We work with diverse data formats (VCF, BAM/SAM, etc.), sequencing approaches (Sanger, WES, WGS, custom capture targeted sequencing, etc.), and profiling technologies (microarrays, RNA-Seq, etc.).
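As an illustration of the kind of variant-level quality control mentioned above, the following is a minimal sketch rather than our actual pipeline. It assumes genotypes are coded as 0, 1, or 2 copies of the alternate allele, with `None` for a missing call. The function names, the toy variants, and the thresholds (call rate of at least 0.95, minor allele frequency of at least 0.01) are hypothetical and chosen only for demonstration.

```python
from typing import Dict, List, Optional, Sequence

# Assumed coding: 0, 1, or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_qc(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.95,  # illustrative threshold, not a recommendation
    min_maf: float = 0.01,        # illustrative threshold, not a recommendation
) -> bool:
    """Keep a variant only if it meets both thresholds."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Toy data: three hypothetical variants genotyped in ten samples.
variants: Dict[str, List[Genotype]] = {
    "var_a": [0, 1, 2, 0, 1, 0, 0, 1, 0, 0],            # fully called, common
    "var_b": [0, 0, None, None, 1, 0, None, 0, 0, 0],   # low call rate
    "var_c": [0] * 10,                                  # monomorphic
}

kept = [name for name, g in variants.items() if passes_qc(g)]
print(kept)
```

In practice, filters of this kind are usually applied with dedicated tools such as PLINK on full genotype files, and the thresholds are chosen per study.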

We have extensive experience with programming languages such as Python, C/C++, shell, Java, R, SAS, and MATLAB, and with software tools such as PLINK, PLINK/SEQ, MaCH, IMPUTE2, HLA*IMP, METAL, GWAMA, BWA, scikit-learn, Weka, SAMtools, GATK, FreeBayes, MuTect, and VarScan.

  • Experience with cloud computing environments (e.g. AWS, Google Cloud), distributed computing tools (StarCluster, Hadoop, Spark), and containerization (e.g. Docker).
  • Expertise in publicly available databases (e.g. ExAC, NHLBI, 1000G, UK10K, 100KUK, ENCODE) to aid interpretation of findings.
