Principal MLOps Engineer

  • Company: Raft
  • Employment: Full-time
  • Location: 🇺🇸 United States, Massachusetts
  • Submitted: Posted 3 weeks ago, updated 3 hours ago
<div class="content-intro"><p><strong>This is a U.S.-based position. All of the programs we support require </strong><strong>U.S. citizenship to be eligible for employment. All work must be conducted within the continental U.S.</strong></p></div><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><strong>Who we are:</strong></span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Raft (<a class="theme markdown__link" href="https://teamraft.com/" target="_blank">https://TeamRaft.com</a>) is a customer-obsessed, non-traditional defense tech company dedicated to empowering U.S. military and government agencies with cutting-edge AI/ML and data solutions. We are a leader in autonomous data fusion and Agentic AI, with a purposeful focus on Distributed Data Systems, Platforms at Scale, and Complex Application Development. Headquartered in McLean, VA, we serve innovative federal and public agencies that leverage design thinking, a cutting-edge tech stack, and a cloud-native ecosystem. We build digital solutions that impact the lives of millions of Americans.</span></p><p class="p1"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">We’re looking for an experienced <strong>Principal MLOps&nbsp;Engineer </strong>to support our customers and join our passionate team of high-impact problem solvers.</span></p><p><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><strong>About the role: </strong></span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Raft is building mission-critical AI and data platforms for the Department of Defense (DoD). 
Our systems ingest and process massive volumes of real-time data from hundreds of sensors and operational sources, transform that data into usable intelligence, and deliver it to operators through mission applications and common operational pictures that support time-sensitive decision-making.</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Our platform operates at scale, processing billions of events per day with low-latency data pipelines and cloud-native infrastructure. As Raft expands its AI capabilities, we are investing in a more mature end-to-end machine learning platform to support model development, evaluation, deployment, monitoring, and lifecycle management across both cloud and constrained operational environments.</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">In this role, you will help design, deploy, and mature Raft’s ML platform and MLOps infrastructure. You will work across Kubernetes-based deployment environments, GPU-enabled infrastructure, model serving systems, CI/CD pipelines, and secure production operations to enable rapid and reliable delivery of machine learning capabilities. 
This role is ideal for someone who understands both the infrastructure needed to run ML systems in production and the practical needs of ML engineers building and deploying models.</span></p><h3><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">What you’ll do:</span></h3><ul><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Design, build, and maintain secure, scalable MLOps infrastructure and deployment pipelines for production ML systems</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Help mature Raft’s internal ML platform and model lifecycle capabilities, including model packaging, registry/catalog workflows, deployment, monitoring, and operational support</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Deploy and manage machine learning workloads on Kubernetes, including GPU-enabled clusters</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Support model serving and inference infrastructure for a range of ML use cases, including traditional ML, computer vision, speech/audio, and LLM-based systems</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Build and maintain CI/CD workflows for ML services, model artifacts, and platform components</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Partner closely with ML engineers, software engineers, and product teams to move models from experimentation to reliable 
operational deployment</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Improve observability, reliability, security, and maintainability across ML infrastructure and services</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Help evaluate and standardize runtime patterns, serving frameworks, and deployment architectures for production ML workloads</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Contribute to infrastructure decisions across edge, on-prem, and cloud-hosted deployment environments</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Support compliance-driven deployment practices and secure software supply chain requirements in defense environments</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Get hands-on with customers at the most forward-leaning places in the Department of War</span></li></ul><p><br><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><strong>What we are looking for:</strong></span></p><ul><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">7+ years of relevant hands-on experience in software engineering, platform engineering, DevOps, MLOps, or related technical roles</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">5+ years of experience with Docker and Kubernetes in production 
environments</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">5+ years of experience supporting enterprise cloud infrastructure or applications in AWS, Azure, or similar environments</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Strong experience provisioning, operating, and troubleshooting Kubernetes clusters in production</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience building and maintaining machine learning platforms, infrastructure, or pipelines used by engineering or data science teams</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Practical experience deploying machine learning workloads on Kubernetes</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience managing clusters or workloads that use GPUs</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Strong understanding of Helm and Kubernetes deployment patterns</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Strong scripting or programming skills, preferably in Python</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience with modern software engineering practices including Git, CI/CD, DevOps, and Agile/Scrum workflows</span></li><li 
style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Strong troubleshooting, systems thinking, and communication skills</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Ability to work independently and collaboratively in a fast-moving environment</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Ability to obtain and maintain a Top Secret clearance</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Ability to obtain Security+ certification within the first 90 days of employment</span></li></ul><p><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><strong>Highly preferred:</strong></span></p><ul><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience with ML model serving and inference platforms such as Triton Inference Server, KServe, Ray Serve, vLLM, or similar technologies</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience with secure and compliant deployment practices in regulated or government environments</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience with Kubernetes-based ML platforms such as Kubeflow</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Familiarity with service mesh technologies 
such as Istio</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience provisioning and debugging complex CI/CD systems</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience with infrastructure-as-code tools such as Terraform</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Familiarity with software supply chain security, container hardening, vulnerability management, and runtime scanning</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Experience supporting ML systems across multiple deployment environments, including cloud, on-prem, and edge</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Background working with machine learning engineers on model training, evaluation, packaging, and release workflows</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;">Familiarity with storage and artifact systems used in ML platforms, such as S3-compatible object stores, registries, and metadata/catalog systems</span></li></ul><h3><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">What success looks like:</span></h3><ul><li style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">You help Raft stand up a more mature and repeatable ML platform for deploying and managing models in production</span></li><li style="font-size: 12pt; 
font-family: arial, helvetica, sans-serif;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">ML engineers can move faster because deployment, serving, and platform workflows are clearer, more reliable, and easier to use</span></li><li style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Model deployments become more secure, observable, and supportable across real-world mission environments</span></li><li style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">The organization gains stronger infrastructure for model lifecycle management, including deployment standards, runtime patterns, and platform ownership</span></li></ul><p><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><strong>Clearance Requirements:</strong></span></p><ul><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Ability to obtain and maintain a Top Secret clearance&nbsp;</span></li></ul><p><span style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><strong>Work Type:&nbsp;</strong></span></p><ul><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Remote in DMV; McLean, VA; Boston, MA; San Antonio, TX; Colorado Springs, CO; Tampa, FL; Honolulu, HI <strong>Locations ONLY</strong></span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">May require up to 40% travel<br></span></li></ul><p><strong><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Salary Range: $150,000.00 - $200,000.00</span></strong></p><p><span style="font-size: 12pt; font-family: arial, 
helvetica, sans-serif;"><strong>What we will offer you: </strong></span></p><ul><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Highly competitive salary</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Fully covered healthcare, dental, and vision coverage</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">401(k) and company match</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Take as you need PTO + 11 paid holidays</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Education &amp; training benefits</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Annual budget for your tech/gadgets needs</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Monthly box of yummy snacks to eat while doing meaningful work</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Remote, hybrid, and flexible work options</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Team off-site in fun places!</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, 
sans-serif;">Generous Referral Bonuses</span></li><li style="font-family: arial, helvetica, sans-serif; font-size: 12pt;"><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">And More!</span></li></ul><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><strong>Our Vision Statement:&nbsp;</strong></span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">We bridge the gap between humans and data through radical transparency and our obsession with&nbsp;the&nbsp;mission.&nbsp;</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><strong>Our Customer Obsession:</strong>&nbsp;</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">We will approach every deliverable like it's a product. We will adopt a customer-obsessed mentality. As we grow, and our footprint becomes larger, teams and employees will treat each other not only as teammates but customers. We must live the customer-obsessed mindset, always. This will help us scale and it will translate to the interactions that our Rafters have with their clients and other product teams that they integrate with. Our culture will enable our success and set us apart from other companies.</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;"><strong>How do we get there?</strong>&nbsp;</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Public-sector modernization is critical for us to live in a better world. We, at Raft, want to innovate and solve complex problems. 
And, if we are successful, our generation and the ones that follow us will live in a delightful, efficient, and accessible world where out-of-the-box thinking and collaboration are the norm.&nbsp;</span></p><p><span style="font-size: 12pt; font-family: arial, helvetica, sans-serif;">Raft’s core philosophy is&nbsp;<strong>Ubuntu: I&nbsp;Am, Because&nbsp;We Are</strong>. We support our&nbsp;<em>“nadi”</em>&nbsp;by elevating the other Rafters. We work as a hyper-collaborative team where each team member brings a unique perspective, adding value that did not exist before. People make Raft special. We celebrate each other and our cognitive and cultural diversity. We are devoted to our practice of innovation and collaboration.&nbsp;</span></p><div class="content-conclusion"><p><strong>We’re an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.</strong></p></div>

