Systems Reliability Engineer (SRE) at Qntfy (Arlington, VA) (allows remote)
Qntfy is looking for a talented and highly motivated SRE to join our ops team. This team is responsible for deploying, configuring, and maintaining the core systems and services that our software depends on. We like to move fast and aren’t beholden to any single technology, but we do have some favorites. An ideal candidate will have experience with, or the ability to quickly learn, technologies like Mesos/Marathon, Kubernetes, Ansible, and Docker. As an SRE at Qntfy, you will have the freedom and responsibility to recommend and implement core architectural changes in support of our long-term technological vision. We manage both on-premises and AWS deployments and are looking to increase our use of Kubernetes.
- Communicate with peers, customers, and partners to foster cooperation and development.
- Design and implement the systems to support major new features for our platform.
- Translate customer needs into technical requirements and produce stable solutions.
- Accurately estimate the time needed to implement solutions.
- Plan, execute, and maintain infrastructure as code.
- BS or Master’s degree in Computer Science/Engineering, related degree, or equivalent experience.
- 3+ years of experience with DevOps, SysAdmin, and/or datacenter operations.
- Ability to architect and deploy core services to support distributed systems while maintaining flexibility and high-quality documentation.
- Strong work ethic and passion for problem-solving.
- Experience with Kubernetes, Docker, and/or Mesos/Marathon.
- Experience provisioning new systems in a reproducible and maintainable fashion (including the use of technologies like Ansible, Terraform, and Kops).
- Proficiency with AWS, Linux systems, and supporting services.
- Strong understanding of security best practices and how to implement them in the real world.