Senior Product Manager, LLM Post-Training & Evaluation

About Centific

Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem—comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets—to create contextual, multilingual, pre-trained datasets; fine-tuned, industry-specific LLMs; and RAG pipelines supported by vector databases. Our zero-distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.

Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.

About the Job

Company: Centific

Location: Palo Alto, CA or Seattle, WA (Hybrid/Remote)

Type: Full-time

Role Overview

As the Senior Product Manager, LLM Post-Training & Evaluation, you will own the product strategy, roadmap, and go-to-market execution for Centific’s platform capabilities in large language model post-training and evaluation. This is a high-impact, technically demanding PM role at the intersection of cutting-edge AI research and enterprise product delivery.

You will translate complex customer requirements from leading AI labs, enterprise teams, and foundation model builders into a coherent product vision — spanning evaluation frameworks, benchmark infrastructure, human-in-the-loop workflows, and post-training pipelines. You will work hand-in-hand with Research Scientists, AI/ML Engineers, and Language Data Scientists to define what gets built, why, and for whom.

The ideal candidate is deeply technical, customer-obsessed, and comfortable making hard prioritization tradeoffs in a fast-moving research-adjacent environment. You have enough ML fluency to earn the respect of research scientists and enough product instinct to build things customers actually want.

Key Responsibilities

Product Strategy & Vision: Define and own the product roadmap for LLM post-training and evaluation capabilities — including benchmark infrastructure, evaluation-as-a-service offerings, post-training workflow tooling, and model quality platforms. Align roadmap with Centific’s enterprise AI strategy and customer commitments.
Customer & Market Discovery: Conduct deep discovery with AI labs, enterprise ML teams, and foundation model builders to understand evaluation pain points, post-training needs, and quality measurement gaps. Synthesize insights into structured requirements and opportunity sizing.
Requirements & Prioritization: Translate complex technical and business requirements into clear, actionable product specifications. Make rigorous prioritization decisions across a broad research-and-product backlog, balancing customer urgency, strategic value, and engineering feasibility.
Cross-Functional Leadership: Partner with Research Scientists, AI/ML Research Engineers, and Language Data Scientists to drive delivery from concept to production. Lead sprint planning, milestone tracking, and cross-team alignment without direct authority.
Go-to-Market Execution: Partner with Sales, Solutions Engineering, and Marketing to develop positioning, packaging, and pricing for evaluation and post-training offerings. Create compelling product narratives and technical collateral for customer-facing engagements.
Customer Engagement & Feedback Loops: Serve as the product voice in customer conversations with technical stakeholders at leading AI organizations. Gather feedback, validate hypotheses, and iterate rapidly on platform capabilities.
Metrics & Outcomes: Define and track product KPIs for adoption, quality, and customer satisfaction. Use data to drive product decisions and communicate progress to executive leadership.
Platform Thinking: Ensure that evaluation frameworks, benchmark datasets, and post-training pipelines are built as reusable, scalable platform assets rather than one-off solutions — creating durable competitive differentiation for Centific.

Core Competencies & Areas of Ownership

You will drive product thinking and roadmap decisions across the following domains:

LLM Evaluation & Benchmarking

Deep understanding of LLM evaluation approaches: automated metrics, human evaluation, model-as-judge, and hybrid frameworks
Ability to reason about tradeoffs in benchmark design: validity, reproducibility, scalability, and cost
Familiarity with long-context, multimodal, and agentic evaluation challenges

Post-Training Methods & Workflows

Working knowledge of post-training techniques (SFT, RLHF, RLAIF, DPO, PPO, GRPO) sufficient to define product requirements and scope engineering work
Understanding of how evaluation design interacts with fine-tuning outcomes and model alignment
Familiarity with safety, robustness, and governance considerations in model development

Platform & Infrastructure Product Management

Experience productizing ML tooling, evaluation pipelines, or data workflows into repeatable, scalable platform capabilities
Ability to define APIs, data contracts, and integration patterns in collaboration with engineering teams
Track record of building internal- and external-facing research or ML infrastructure products

Enterprise & Customer Engagement

Skilled at engaging with highly technical customer stakeholders (research scientists, ML engineers, technical PMs)
Ability to structure and run discovery conversations, distill insights, and build business cases
Experience supporting enterprise sales cycles with product expertise and technical storytelling

Required Qualifications

Experience: 4–5 years of product management experience, with 3+ years in AI/ML, data platform, or developer tools products.
Technical Fluency: Strong understanding of ML concepts, LLM training pipelines, and evaluation methodologies. Able to engage credibly with research scientists and engineers on technical tradeoffs without being the deepest technical expert in the room.
Education: BS/MS in Computer Science, Engineering, Statistics, or a related technical field. Equivalent applied experience considered.
0-to-1 Track Record: Demonstrated success taking AI/ML products from early concept through customer adoption, including defining MVPs, iterating on feedback, and scaling to broader use.
Discovery & Requirements: Strong customer discovery and requirements engineering skills; experience translating ambiguous, complex needs into structured specs and clear prioritization.
Cross-Functional Execution: Proven ability to lead without direct authority — aligning research, engineering, design, and go-to-market teams toward shared outcomes.
Communication: Exceptional written and verbal communication skills; able to write crisp PRDs, present to executives, and hold your own in deep technical discussions with research scientists.

Preferred Qualifications

Research-Adjacent Experience: Prior experience as an ML engineer, data scientist, or research engineer before moving into product management.
Evaluation Expertise: Hands-on experience designing or running LLM evaluation studies, benchmark datasets, or quality measurement frameworks.
Post-Training Practice: Exposure to fine-tuning or post-training workflows (SFT, RLHF, preference optimization) in a research or applied setting.
Foundation Model Ecosystem: Experience working at or closely with AI labs, foundation model companies, or enterprise AI infrastructure providers.
Scientific Contribution: Familiarity with the research literature in LLM evaluation, alignment, and post-training; able to read and critically assess papers at top venues (NeurIPS, ICML, ICLR, ACL, EMNLP).
Enterprise SaaS & Platform: Experience with platform-as-a-service or API-first products in the B2B enterprise space.
Safety & Governance: Familiarity with responsible AI, model governance, and compliance considerations relevant to enterprise GenAI deployments.

How to Apply

Please send your resume, a brief product portfolio summary (products you’ve owned, outcomes delivered), and a short note giving your perspective on the biggest unsolved problems in LLM evaluation to:

diana.moeck@centific.com

Subject Line: Senior PM – LLM Post-Training & Evaluation

Salary: $160k–$170k

Centific is an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.