Remote (US) - Eastern Time
Urbint uses AI and the latest industry science to identify threats to workers and infrastructure to stop safety incidents before they happen. We are a tight-knit team working together to build powerful technology that prevents serious injuries and infrastructure damages. Many of the largest energy and infrastructure companies in North America trust Urbint to protect workers, assets, communities, and the environment.
Job Summary
We are seeking a Senior Site Reliability Engineer to spearhead the platform operations team, which is responsible for our production environments in North America. Urbint has a mix of self-hosted services deployed within Google Cloud, Azure, and AWS. Our primary environment is Google Kubernetes Engine (GKE).
What You'll Do
- Design High-Availability Systems - ensure that all of the systems that we deploy and depend on are configured to maintain full uptime. Plan out deployment strategies to ensure that uptime is maintained during upgrades and maintenance. Design and build out infrastructure-as-code projects. Perform resiliency, load, and disaster recovery tests.
- Maintain System and Network Security - manage patching and ensure that dependencies are kept up to date. Stay informed about zero-day vulnerabilities and other risks that cannot be patched immediately, and devise alternative ways to mitigate them.
- Logging, Metrics and Alerting - set up and monitor logs, metrics, and alerts for the systems.
- Diagnosis and Troubleshooting - diagnose and resolve production issues. Contribute to retrospectives and post-mortems. Participate in the on-call rotation.
- Guiding Development Team with Best Practices - work with the development team to ensure that the software being built will be practical to deploy and maintain.
- Build Engineering - manage build and deployment pipelines and ensure that they follow best practices.
- Continuous Learning - stay up to date with industry best practices, tools, and technologies related to infrastructure.
- Mentorship - lead and mentor a team of SREs, providing guidance, coaching, and technical expertise in infrastructure management.
Who You Are
- 5+ years of experience designing and maintaining application systems in the cloud
- A friendly person first and a technologist second
- A deep understanding of operating systems and computer architecture
- Good programming abilities for application diagnosis, infrastructure-as-code, scripting, and glue components
- Excellent communication and organizational skills are a must
Benefits
- Mission Driven - Some companies use AI to serve better digital ads and trade stocks; we seek to make our communities safer and more resilient.
- We are 100% Distributed - work from almost anywhere. This role requires business travel for in-person meetings; the ideal candidate is located in the US Central or Mountain time zones.
- Competitive compensation and benefits packages
Urbint's Core Values
- Passionate about customers: We strive to deliver sustainable value and exceed expectations, and we're not satisfied until our customers are raving fans.
- Be decisive: We make timely, informed, and pragmatic decisions to keep the organization moving forward.
- Build trust: Our values are the building blocks to trust. As we live them, we grow and build lasting relationships.
- Focus on impact: We measure and strive to continuously improve our real-world impact.
- Be tenacious: We are agile in our approach to addressing challenges but firm in our beliefs.
- Win together: We efficiently leverage our diverse skills and perspectives for one another, united by our shared vision.
We're an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.