odysseyml

Member of Technical Staff, Data Platform Lead

Palo Alto, US | on-site | full-time | senior | Mar 11, 2026

About this role

WHO WE ARE

Odyssey (https://odyssey.ml/) is an AI lab pioneering general-purpose world models: causal, multimodal systems that learn to predict and interact with the world over long horizons, while generating real-time, interactive simulations from any starting point. This foundational technology promises to revolutionize robotics, science, healthcare, education, gaming, defense, and beyond.

WHAT WE'RE LOOKING FOR

We need a deeply experienced Data Platform Lead to take full ownership of our data practice. This is a crucial technical leadership position focused on architecture, strategy, and getting things done. You should be an expert with serious, hands-on data engineering chops, capable of defining the long-term architectural vision while still diving into the code. Success in this role requires a complete understanding of the data lifecycle: from partnering with Operations to source data, to designing robust data recipes, to ensuring the resulting data assets are optimized for our world models.

WHAT YOU’LL DO

- Define and implement the long-term technical architecture for our data platform, ensuring scalability, reliability, and support for high-volume, multimodal datasets.
- Take ownership of the end-to-end data lifecycle, from sourcing and acquisition to delivery for machine learning model training.
- Design and build robust data processing pipelines, including data recipes for cleaning, feature engineering, and normalization, specifically addressing the complexity of inputs required for world models.
- Develop and manage the data curation system, including flexible metadata schemas, evolving labels, and modular tagging pipelines, to allow researchers to dynamically categorize, resample, and select high-quality training data.
- Work closely with ML Research and Engineering teams to understand immediate and future data requirements, translating research needs into actionable data infrastructure and acquisition strategies.
- Lead the integration of sophisticated signals and quality filtering into the data flow, such as VLM analysis, pose estimation, and aesthetic scoring, to ensure training datasets meet high quality standards.
- Drive the strategy for data acquisition, evaluating the trade-offs between various methods and aligning them with budget constraints and quality requirements.

WHO YOU ARE

- You live and breathe data, with a strong belief in data quality and diversity as a primary lever for optimizing model performance.
- 8+ years building data platforms, focused on data architecture and engineering.
- Experience supporting ML teams, specifically preparing and optimizing data for model training.
- Great at designing and building reliable, high-volume data pipelines (ETL/ELT).
- Expert in cloud data technologies like data warehousing and lakehouse architectures (e.g., Snowflake, Databricks, BigQuery, and AWS S3/Redshift).
- Proficient with modern data processing frameworks (e.g., Spark, Flink, Kafka) and various databases (NoSQL, graph, relational).
- Knows how to set up practical data governance, quality checks, and metadata management.
- A strong technical leader who can set a clear technical direction and mentor other engineers.
- Experienced with complex data types (images, video, text) and signal processing.
- Degree in Computer Science, Engineering, or a related field.