The Ismagilov Lab at the California Institute of Technology is hiring a Research Lead, Stochastic Sequencing Analysis to support an ambitious DARPA BTO project to overcome the ubiquitous noise and scale limitations of using biological sequencing data for artificial intelligence (AI) training and inference. This intense, fast-paced project runs for one year, with the goal of transitioning it to a much larger-scale effort in industry and/or academia.
The Research Lead, Stochastic Sequencing Analysis will work under the supervision of the PI and be integrated into a team of experimental and computational co-workers to develop and implement the necessary computational tools to advance the project. The successful candidate will have extensive experience developing models and algorithms for analysis of noisy biological data, a deep background in analysis of stochastic processes, and strong bioinformatics expertise in analyzing both long- and short-read sequencing data. The successful candidate will have a strong drive and ambition to develop novel computational methodologies, drive scientific innovation to advance the frontiers of knowledge and technology, and have an outsized impact on multiple fields, including human health and agricultural biosecurity.
Essential Job Duties
- Execute the deliverables of the DARPA project, providing leadership in the computational components of the effort to guide the strategic research direction of computational analyses.
- Design, develop, and implement novel stochastic models and algorithms for overcoming the noise and scale limitations of using biological data for AI training and inference.
- Deploy, maintain, and optimize state-of-the-art computational tools, including NGS bioinformatic pipelines, that are developed by the lab, its collaborators, and other leaders in the field for this project.
- Collaborate with students/postdocs in the lab and with external collaborators to improve and innovate computational pipelines, enhancing their accuracy, efficiency, and usability.
- Collaborate with students/postdocs and research staff in quality control assessment and troubleshooting of human and microbial sequencing, providing advanced technical guidance and leadership in computational analysis and interpretation.
- Collaborate with and mentor students/postdocs in the analysis and interpretation of next-generation sequencing (NGS) datasets from complex clinical samples. Examples of datasets include Illumina short-read shotgun metagenomes, real-time nanopore sequencing, long-read shotgun metagenomes, human and bacterial RNA-Seq, and 16S/ITS amplicon sequencing.
- Contribute specialized scientific and technical knowledge to interpret complex results and inform research directions.
- Collaborate with and mentor students/postdocs to design appropriately powered experiments and to perform statistical analyses, including advising on computational study design and analysis strategies.
- Apply specialized, subject-matter expertise to train students/postdocs and research staff on project-relevant computational pipelines and the use of Caltech’s high-performance computing cluster.
- Assist with generating data for grant reports, patents, and publications, including leading or contributing to the preparation of scientific manuscripts and presentations.
- Work with Caltech’s IT to ensure proper data and computing management, helping establish scalable computational infrastructure to support project objectives.
Basic Qualifications
General Qualifications:
- Ph.D. in bioengineering, computational biology, bioinformatics, biology, molecular biology, or related field, plus 2 years of post-degree industry or academic experience.
- A record of independent and impactful research accomplishments.
- Broad, interdisciplinary mindset along with flexibility and eagerness to continue learning.
- Ability to lead rigorous scientific investigations, including thoughtful study design and selection of appropriate controls and robust analysis methods.
- Ability to clearly communicate technical findings through high-quality scientific writing and effective data visualization. Experience coordinating complex research efforts and managing project deliverables involving multiple researchers and collaborators.
- Excellent critical-thinking abilities and sound, evidence-based decision-making skills. Ability to identify and solve practical, relevant, and complex problems.
- Ability to effectively manage multiple competing priorities and deadlines.
- Must be self-motivated, take initiative, and demonstrate scientific leadership while working independently and collaboratively.
- Demonstrated ability to communicate complex scientific concepts effectively in both written and oral formats. Excellent interpersonal skills and demonstrated ability to engage productively with diverse, interdisciplinary researchers and external collaborators.
Computational Qualifications:
- Demonstrated ability to go beyond basic bioinformatics pipeline development to develop novel computational methodologies or analytical frameworks for complex biological datasets.
- Prior experience designing and/or developing novel stochastic models and algorithms.
- Strong expertise in stochastic mathematics and computation.
- Prior experience developing stochastic models and algorithms for biological data.
- Strong quantitative and statistical skills.
- Strong general computational skills.
- Strong understanding of the fundamental principles underlying modern sequencing-analysis tools and methodologies.
- Strong computational experience with NGS datasets.
- Strong experience with high-performance computing.
- Strong experience and proficiency in Python.
Preferred Qualifications
- Experience with analysis of real-time sequencing data.
- Experience with analysis of low-biomass sequencing data.
- Experience with analysis of long-read sequencing data.
- Experience with training and deploying AI and ML models that rely on quantitative sequencing data.
- Familiarity with and interest in modern AI tools.
Required Documents
- Resume/CV
- A statement describing the candidate’s relevant experience in at least one, preferably both, of the following two areas: (1) Demonstrated ability to extend beyond standard bioinformatics pipeline development by creating novel computational methods or analytical frameworks for complex biological datasets; or (2) Experience designing and/or developing new stochastic models and algorithms. The statement must: (i) Clearly reference the specific item(s) in the candidate’s Resume/CV that correspond to the work described, and explain how the listed experience connects to the claimed capability; (ii) Include concrete details about the work performed, such as the nature of the contribution, the methods or models developed, and tangible outcomes or deliverables; (iii) Specify the position under which this work was conducted (e.g., role, institution, or project affiliation); (iv) Indicate the duration of the candidate’s involvement in the relevant project or work. Please note that applications will only be considered if they address these points with high factual specificity.
Hiring Range
$78,200 - $147,500 Per Year
The salary of the finalist(s) selected for this role will be set based on a variety of factors, including, but not limited to, internal equity, experience, education, specialty, and training.
As one of the largest employers in Pasadena, CA, Caltech is committed to providing comprehensive benefits to eligible employees and their eligible dependents. Our benefits package includes competitive compensation, health, dental, and vision insurance, retirement savings plans, generous paid time off (vacation, holidays, sick time, parental leave, bereavement, etc.), tuition reimbursement, and more. Non-benefit eligible employees will have access to some benefits such as onsite counseling and sick time. Learn more about our benefits and staff perks.
EEO Statement
We are an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law.
Caltech is a VEVRAA Federal Contractor.
To read more about Equal Employment Opportunity (EEO), go to eeoc_self_print_poster.pdf.
Disability Accommodations
Caltech complies with the Fair Employment and Housing Act (FEHA) and the Americans with Disabilities Act (ADA). We consider reasonable accommodation measures that may be necessary for eligible applicants and employees to perform the essential functions of a position.
If you would like to request an accommodation to complete this application, interview, or otherwise participate in the employee selection process, please contact Caltech Recruiting at employment@caltech.edu.
Additionally, if you do not meet the basic qualifications of a role but believe you can perform the essential functions of the job with reasonable accommodation, please reach out to Caltech Recruiting at employment@caltech.edu.