Senior Deep Learning Engineer
I’m currently partnering with a well-funded biotech company building large-scale AI models to accelerate drug discovery and biological research.
They’re hiring a Senior Deep Learning Engineer to focus on training, building, and deploying transformer-based models across biological data modalities.
Key responsibilities of the role:
- Design and implement transformer architectures for biological sequence and multimodal data
- Build and scale distributed training pipelines (multi-GPU / multi-node)
- Optimize large-model training (FSDP, DeepSpeed, mixed precision, etc.)
- Deploy models into production research platforms
- Improve inference performance (quantization, distillation, optimization)
- Collaborate with computational biologists and platform engineers
Key experience needed:
- 4+ years of hands-on deep learning experience
- Strong expertise with transformers and large-scale model training
- Production experience deploying ML systems
- Advanced proficiency in PyTorch (or similar framework)
- Experience working in high-performance compute environments
Biotech or biological sequence modeling experience is a strong plus, but deep transformer experience from other domains (LLMs, multimodal models, etc.) is also highly valued.
This is an opportunity to work on foundation-style models in biology with real-world scientific impact.
The role pays a salary of up to $400,000 per annum and comes with a wealth of benefits. It is hybrid, with 3 days a week in SF.
Apply within if this is interesting to you.
About Strativ Group
A scaling, SOTA generative AI startup operating with a world-class team (the founders have multiple prior exits) and talent from OpenAI, IBM, MIT, and several other top organizations, focused on pioneering work in large language models (LLMs), code generation, and code translation. Their projects directly involve industry-leading partners, where they're applying advanced AI to solve meaningful, practical challenges with real-world impact.