Machine Learning Engineer
$180k – $250k/yr | Emeryville, US | On-site | Full time | Mid-level | Posted 27d ago
About this role
Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Emeryville, CA, we are backed by leading investors including Altimeter Capital, Bezos Expeditions, Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures, and have raised over $150M to date.
We're looking for an experienced Machine Learning Engineer to build and improve the models and ML systems that drive our protein design efforts. In this role, you'll deploy and optimize large-scale generative models for protein design, and develop the surrounding infrastructure and tooling that enable our ML and protein design scientists to work faster and more confidently. As an early member of a small, fast-moving engineering team, you'll have significant ownership over our ML stack and the opportunity to shape how our platform evolves.
Responsibilities
Build robust, reproducible and user-friendly pipelines for automated model fine-tuning, alignment and evaluation
Design and implement modular, easy-to-maintain, multi-model pipelines for protein design
Develop highly scalable ETL pipelines to process petabyte-scale protein data for model pretraining
Optimize model training and inference code to maximize throughput and resource utilization when deployed at scale
Develop software and infrastructure that enable the ML team to work quickly and frictionlessly in distributed and multi-cloud environments
Partner with ML and protein design scientists to prototype research ideas and bring them into production
Who You Are
You're comfortable taking ownership and working independently in a fast-moving environment
You're an execution-oriented engineer who maintains high standards and focuses on the highest-impact work
You're comfortable owning the full stack of your work, from training code to the infrastructure it runs on
You care deeply about model quality, efficiency, and reliability
You're willing to step beyond your core responsibilities when the team needs it
Representative Projects
Building hyperparameter search frameworks for SFT and alignment workflows
Increasing protein language model throughput during long-context generation
Updating existing model architectures to work and run efficiently on new GPU hardware
Implementing a protein design pipeline that integrates prompt retrieval, sequence generation, attribute prediction, and structure prediction
Establishing an ETL pipeline for sampling and tokenizing training datasets from an internal database of billions of sequences
Developing a benchmarking and evaluation system for newly trained sequence generation models
Contributing to the development of an internal service that provides transparent multi-node job submission for ML scientists
Qualifications
BS or MS in Computer Science, Machine Learning, or a related field
3+ years of hands-on experience building and training ML models in PyTorch
Strong Python and software engineering fundamentals, including testing, code quality, and version control
Experience profiling, benchmarking, and optimizing ML model training and inference
Experience implementing or optimizing transformer-based architectures
Familiarity with cloud infrastructure and containerization (GCP, AWS, Azure, Kubernetes, Docker)
Strong fundamentals in ML, statistics, and/or linear algebra
Preferred (but not required)
Familiarity with protein language models or computational biology
Experience with GPU-level optimization (CUDA, Triton)
Experience with distributed training (DDP, FSDP, multi-node GPU clusters)
Experience with databases and data processing pipelines
Experience orchestrating multi-step ML workflows
Experience building backend systems that serve ML models in production
Contributions to open source ML projects or published research
What We Offer
High-growth opportunity with meaningful impact on the future of protein design
Competitive compensation package with equity participation
401(k) with a strong employer match
Comprehensive benefits including health/dental/vision insurance
Generous PTO policy and commitment to work-life balance
Professional development opportunities in a cutting-edge field at the intersection of AI and biology
Profluent Bio, Inc. is an equal opportunity employer promoting diversity and inclusion in the workplace. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical conditions, veteran status, sexual orientation, gender (including gender identity and gender expression), sex (which includes pregnancy, childbirth, and breastfeeding), genetic information, taking or requesting statutorily protected leave, or any other basis protected by law.
Employment Eligibility Verification
Legal authorization to work in the United States is required. In compliance with federal law, all persons hired must verify their identity and work eligibility and complete the required employment verification form upon hire.
Hiring Salary Range: $180,000 – $250,000 USD
Offices: Emeryville, California, United States (Emeryville, CA)