About Exegy
Exegy is a global leader in intelligent market data, advanced trading systems, and future-proof technology. Exegy serves as a trusted partner to the complete ecosystem of the buy-side, sell-side, exchanges, and financial services technology firms around the globe. Headquartered in St. Louis with regional offices in North America, the UK/Europe and Asia Pacific, Exegy has the global footprint to deliver world-class support and managed services to its customer base of elite financial market participants.
Job Summary
Exegy is seeking an Associate Site Reliability Engineer (SRE) to help support the reliability, scalability, and performance of our global data center and hybrid infrastructure environments. This role is an excellent opportunity to build foundational skills in systems engineering, automation, and operational rigor while supporting Exegy's mission-critical market data products.
As an Associate SRE, you will work closely with senior engineers to maintain operational processes, utilize automation tools, monitor system health, and help troubleshoot failures. You will collaborate with Infrastructure, Network Engineering, Security, and DevOps teams to learn and apply best practices for delivering resilient and secure platforms.
Responsibilities
Day-to-Day Operations & User Support
Provision and deprovision users across Active Directory, Okta, and Exchange
Manage access requests and permissions across on-prem and cloud systems
Troubleshoot Tier 1/2 support issues across hardware, software, connectivity, and accounts
Support device provisioning, configuration, and onboarding
Maintain email distribution lists and shared mailboxes
Provide general office and A/V technology support
Infrastructure Reliability & Operations
Assist in maintaining uptime across core systems, including compute, storage, virtualization, and network infrastructure
Provide day-to-day support for production services across on-prem data centers, co-locations, and hybrid cloud environments
Participate in a 24×7 on-call rotation (with the support and escalation path of senior engineers) and assist in major incident response
Shadow and assist senior team members in root cause analysis (RCA) and postmortem documentation
Automation & AI
Help maintain and execute existing automation scripts using tools like Ansible, Terraform, PowerShell, Python, or Puppet
Learn to automate routine operational workflows, configuration management tasks, and deployments
Assist in managing Infrastructure-as-Code (IaC) templates under the guidance of senior SREs
Monitoring, Observability & Performance
Monitor systems, networks, and applications using platforms such as Prometheus, Grafana, Datadog, New Relic, SolarWinds, or Splunk
Respond to system alerts, update health dashboards, and perform routine log analysis
Assist in conducting proactive performance health checks across hardware, virtualization, and OS layers (Windows/Linux)
Data Center & Systems Engineering
Assist with the management of physical and virtual data center infrastructure, including basic hardware lifecycle tasks and capacity monitoring
Perform scheduled patching, firmware upgrades, and routine system maintenance
Participate in Disaster Recovery/Business Continuity (DR/BCP) testing alongside senior engineers
Help maintain secure baseline configurations aligned to industry standards
Collaboration & Documentation
Partner with Network, Security, DevOps, and Application Engineering teams to execute cross-departmental tasks
Actively participate in architecture and scaling discussions to learn system design principles
Assist with basic office desktop support, including A/V systems and employee onboarding
Create, update, and follow operational runbooks, playbooks, and standard operating procedures (SOPs)
Security & Compliance Integration
Follow and support security controls, including MFA, access management, and logging protocols
Assist in gathering evidence and generating reports for audit requirements (SOC 2, ISO 27001)
Apply patches and updates as part of routine vulnerability remediation efforts
Our Ideal Candidate Has:
Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent practical experience (recent graduates are encouraged to apply)
2–4 years of experience or relevant internships in Systems Administration, IT Operations, or Site Reliability Engineering
Foundational understanding of Linux and/or Windows server administration
Basic exposure to or familiarity with scripting languages (e.g., PowerShell, Bash, Python)
Understanding of fundamental networking concepts and data center operations
Familiarity with basic monitoring and alerting concepts
Strong willingness to learn new technologies (like VMware, AWS/Azure/GCP, and IaC tools) and grow within the SRE field
Ability to participate in an on-call rotation with a collaborative, team-first mindset