As a Senior Site Reliability Engineer at Instapro Group, you will combine your software engineering expertise with a deep understanding of cloud infrastructure, particularly within the AWS ecosystem. You will be responsible for our platform's reliability, scalability, and performance, and for ensuring that other teams deliver their work through a safe and effective continuous delivery setup. Instapro Group’s products positively impact countless homeowners and tradespeople across Europe and Canada.
Team and Technology
The Cloud Infrastructure Engineering team provides reliable & secure infrastructure to efficiently develop solutions that address user needs and foster business growth. You will work with a team of experienced engineers and a technical product manager who are passionate about cloud technologies, distributed systems, and automation. Additionally, an engineering manager and a director of technical strategy & execution will support your team.
We provision infrastructure using Terraform. AWS is our cloud provider, and our platform consists of applications that mostly run on EKS clusters, with some running on Lambda. We also have a self-managed GitLab, which engineers in the organisation use to collaborate, test, and deploy their work.
Your role includes the following:
Collaborate effectively with the team and stakeholders to deliver high-quality solutions that improve reliability, scalability, observability & monitoring, and performance
Eliminate toil through automation and managed solutions to optimise the operation of our infrastructure and services
Monitor the health and performance of our infrastructure
Respond to and troubleshoot production incidents, and lead blameless post-mortems for learning and improvement
Support engineers outside the team with their needs and promote industry best practices
Proactively suggest improvements, share ideas and knowledge, and offer help