Site Reliability Engineer

100% Remote

Health Tech

A successful healthcare tech company that is a leader in cancer research is looking for an SRE to join its team and help accomplish its mission: to improve lives by learning from the experience of every cancer patient.

What You'll Do

In this role, you'll work with the TechOps organization to accelerate their mission to improve cancer care and learn from patient experiences by ensuring that their technical infrastructure and staff maintain the highest levels of reliability, performance, and agility. You'll provide best-practice guidance on reliability and scalability to their engineering teams. As a member of their SRE team, you will play a key role in scaling their technology platforms and empowering their development teams to consume them without friction. In addition, you'll:

Design and build infrastructure and systems that provide high levels of scalability, reliability, and performance, while balancing security, maintainability, and operational excellence
Interface across teams to codify and reliably test infrastructure changes using their software development lifecycle
Partner with product and application teams to provide guidance and best practices around scalability, reliability, and performance of our production systems, infrastructure, and software
Actively participate in code and configuration reviews
Craft solid, clearly explained designs, playbooks, and documentation for consumption by teammates and the larger engineering organization
Improve operational efficiency through automation and deployment or development of new tools
Be proactive in performance and availability monitoring; provide remediations for systemic issues
Gather requirements, scope work, produce estimates, and help define deliverables with project timelines
Actively participate in on-call duties
Work as a team on escalations, resolving critical issues that impact our high-SLA production systems

Who You Are

You're a Site Reliability Engineer with 4+ years of experience working in a DevOps or software engineering role. You're excited by the prospect of rolling up your sleeves to tackle meaningful problems every day. You're a kind, passionate, and collaborative problem-solver who seeks and gives candid feedback and values the chance to make an important impact.

You have experience writing simple, readable, useful code, especially for operational tooling
You have experience with cloud environments such as AWS, Azure, or GCP
You have experience working with a production environment with high uptime requirements and measurable SLAs
You are familiar with container technologies such as Docker, Kubernetes, or Mesos
You are proficient with configuration management, orchestration, and infrastructure-as-code tools such as Ansible and Terraform
You have demonstrated the ability to deliver high-quality, on-time solutions that are reliable, scalable, and maintainable
You have strong communication skills and the ability to work effectively across multiple business and engineering teams
You prefer working in a dynamic environment and are comfortable challenging the status quo
You have the ability to adjust quickly to changing priorities and make quick decisions with limited information
You believe that a team working well together is truly smarter than the single smartest person on that team

