SINGH
HARPREET
About
I'm a computational biologist and ML engineer with a deep curiosity about the hidden architecture of life. My journey started at IISER Mohali, where I spent years studying how chromosomes fold in 3D space and how that folding shapes which genes get turned on or off. In my final year I completed two independent research projects in parallel: one on 3D genome organization and network analysis, and one investigating how conserved non-coding elements regulate gene expression during mammalian brain development, which contributed directly to a publication in Genetics. In all, my research has led to three peer-reviewed publications (in Nature NPJ Aging, BMC Genomics, and Genetics) and a Government of India research grant.
Over time I fell in love with the engineering side of science: building the tools, not just using them. That led me to cloud-native pipelines, single-cell ML systems, and interactive visualizations that make complex biology readable to anyone. I am currently completing my Master of Data Science at UBC, where I also TA a visualization course taught by one of the leading researchers in the field.
I am actively looking for bioinformatics, computational biology, and ML scientist roles in Vancouver where I can keep doing what I love: asking hard biological questions and building production-grade systems to answer them. If that sounds like your team, I would love to talk.
Skills
Projects
GenBrowser
Interactive 3D chromosome visualization implementing the novel CSAA algorithm to map mutation-vulnerable genomic regions on the chromosome surface.
- Novel CSAA algorithm derived from protein solvent-accessible surface area (SASA)
- Renders 225 Mb of chromosome 1 in real time using Three.js and WebGL
- Deployed as a zero-dependency static app on GitHub Pages
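For readers unfamiliar with SASA, the idea the bullets above borrow from can be sketched in a few lines. This is not the CSAA algorithm itself, which is not described on this page. It is a minimal Shrake-Rupley-style estimate applied to hypothetical chromatin "beads", and the bead radius, probe radius, and function names are all assumptions made for illustration.

```python
import math


def sphere_points(n):
    """Roughly even points on a unit sphere (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def accessible_fraction(beads, radius=1.0, probe=0.5, n_points=200):
    """For each bead, the fraction of its probe-expanded surface that is
    not inside any other bead's probe-expanded sphere.

    beads: list of (x, y, z) centres. radius and probe are illustrative
    values, not parameters taken from GenBrowser.
    """
    reach = radius + probe
    unit = sphere_points(n_points)
    fractions = []
    for i, (cx, cy, cz) in enumerate(beads):
        exposed = 0
        for ux, uy, uz in unit:
            p = (cx + reach * ux, cy + reach * uy, cz + reach * uz)
            buried = any(
                j != i and math.dist(p, other) < reach
                for j, other in enumerate(beads)
            )
            if not buried:
                exposed += 1
        fractions.append(exposed / n_points)
    return fractions
```

A bead with no neighbours scores 1.0 and a bead packed in on all sides scores close to 0.0, which is the kind of surface-versus-interior signal a chromosome-surface map needs.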
ChromApipe
Cloud-native Nextflow DSL2 pipeline for chromatin accessibility analysis, deployed on AWS Batch with Wave containers and Fusion file system.
- Wave containers and Fusion file system eliminate S3 staging overhead entirely
- Full CI/CD with Docker, AWS ECR, and CloudWatch monitoring
- v1.0 GitHub release with complete documentation
spaceGen
End-to-end single-cell multiome ML pipeline analyzing NASA OSDR spaceflight data to characterize how microgravity rewires gene expression in brain tissue.
- Hexagonal architecture with bronze/silver/gold medallion data layers
- XGBoost and scikit-learn classifiers with full MLflow experiment tracking
- MuData gold layer enables future snATAC-seq integration without refactoring
PolicyLens
LLM-powered course policy QA system that routes queries through intent classification and retrieves answers exclusively from a structured facts database.
- LLM used only for intent classification, never for factual generation
- Streaming responses with real-time word-by-word delivery via FastAPI
- Supports multiple courses via structured JSON facts databases
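The routing idea in the bullets above can be sketched as follows. This is an illustrative sketch, not the PolicyLens code: the keyword matcher stands in for the LLM intent classifier, the course ID and fact texts are made-up placeholders, and the JSON schema is an assumption. The word generator only mimics word-by-word delivery and does not use FastAPI.

```python
import json

# Hypothetical facts database; course ID, keys, and texts are placeholders.
FACTS = json.loads("""
{
  "COURSE-101": {
    "late_policy": "Late submissions lose 10% per day.",
    "office_hours": "Office hours are Tuesdays at 2 pm."
  }
}
""")

# Stand-in for the LLM: maps keywords to intent labels (assumption).
INTENT_KEYWORDS = {
    "late_policy": ("late", "deadline", "extension"),
    "office_hours": ("office hour",),
}

FALLBACK = "I don't have that information."


def classify_intent(question):
    """Return an intent label, or None if nothing matches."""
    text = question.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


def answer(course, question):
    """Answer only from the facts database; never generate facts."""
    intent = classify_intent(question)
    if intent is None:
        return FALLBACK
    return FACTS.get(course, {}).get(intent, FALLBACK)


def stream_words(text):
    """Yield the answer one word at a time, as a streaming endpoint might."""
    words = text.split()
    for i, word in enumerate(words):
        yield word if i == len(words) - 1 else word + " "
```

Because the classifier only ever returns a label, a wrong classification can produce the wrong fact or the fallback message, but never an invented policy.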
Master's Thesis: Role of Conserved Non-Coding Elements in Gene Expression
Investigated how genomic proximity to conserved non-coding elements regulates gene expression during mammalian brain development, contributing directly to a publication in Genetics.
Neural Network for Gene Length Prediction
40% improvement in model convergence via architecture optimization
3D Genome Structure as a Predictor of Gene Expression
Combined biological and physical features outperform either modality alone
Web Dashboard for 3D Genome Organization
Awarded competitive Government of India research grant
Sleep Disorder Analysis
Full pipeline reproducibility via Docker, Makefile orchestration, and Pandera validation
Master's Thesis: Behavioral Patterns in Student Populations
1500+ participants across student and faculty populations at IISER Mohali
Timeline
Education & Experience
Publications
Convergent evolution of a genomic rearrangement may explain cancer resistance in hystrico- and sciuromorpha rodents
Biased visibility in Hi-C datasets marks dynamically regulated condensed and decondensed chromatin states genome-wide
Evolutionary loss of genomic proximity to CNEs impacted gene expression dynamics during mammalian brain development
Awards
DST-INSPIRE Fellowship, Bachelor's (2012-2015)
DST-INSPIRE Fellowship, Master's (2015-2017)
Government of India Research Grant (2019)
Contact
Credits
This site was designed and built from scratch without any templates. Planning, content refinement, and architectural decisions were worked through with Claude Opus 4.6. The project structure, codebase organization, and initial implementation were developed in Kiro IDE. Rapid iteration and visual refinement were done in Cursor IDE.
Built With
SvelteKit · D3.js · LayerCake · Three.js · Canvas API · Claude Opus 4.6 · Kiro IDE · Cursor IDE