
    Site Reliability Engineer - Vancouver, Canada - Axiom Zen

    Axiom Zen
    Axiom Zen Vancouver, Canada

    1 week ago

    Description

    We're looking for a Site Reliability Engineer who wants to be at the technical core of an organization that's completely reshaping how distributed applications on blockchains can reach massive audiences.

    You will join a Site Reliability Engineering team that has the ability to architect, build, and iterate on resilient, scalable systems.

    SRE also guides the organization in areas of Observability, Reliability, and Incident Response.

    The support we provide the other engineering teams enables them to deliver features that wow and delight our customers at a fast pace.

    In the role, you can expect to help us launch reliable products and services with your experience and skills.

    You'll join an established team with a focus on providing highly technical support to the rest of the Engineering organization.

    You will be leveraging infrastructure-as-code, submitting code changes via Pull Requests, and finding creative solutions for the unique and varying needs of each Engineering team.

    You'll contribute to the improvement of our in-house systems by researching and applying the latest and greatest technology to our stack.

    You'll be empowered to fully apply your experience, lessons learned, and technical abilities in an environment with little tech debt, no on-prem servers, and a strong foundation based on cloud-native technologies such as Kubernetes and industry-leading cloud platforms.

    Every day, you'll collaborate with a world-class team, both in our Vancouver office and distributed worldwide.

    What we'll accomplish together:

    Develop effective infrastructure (cloud platform services, networking, Kubernetes, etc.) for our projects to deploy onto, ensuring projects are scalable, resilient, and reliable in support of growing products.

    Build shared observability services, including metrics, logs, tracing, and dashboarding, and act as a center of excellence that partners with other teams to define SLOs and actionable error budgets for everyone's services.

    Respond to infrastructure incidents and support the larger Engineering team with their product incident response strategy.

    Perform post-mortems and in-depth root cause analysis to ensure we are always improving.

    Enhance tools and automation to fill the gaps in our current systems as well as build entirely new ones as we face bigger and more complex challenges.

    A little about you:

    You execute on defined projects to achieve team-level goals and independently define the right solutions or use existing approaches to solve defined problems.

    You understand OS, networking, Kubernetes, and other cloud-native services, and can debug system issues and identify system bottlenecks.

    You have experience working with Infrastructure as Code systems like Terraform, Pulumi, or CloudFormation.

    You have experience collecting and processing metrics from tools such as Prometheus/Datadog/New Relic and are familiar with the concepts of SLOs and SLI targets.

    You are comfortable responding to production incidents and can fight fires with a calm and level head, leveraging post-mortems to apply lessons learned.

    You have experience coding and developing applications. Bonus points for Go experience.

    You are comfortable diving into an unfamiliar system and finding your way around.

    While you believe in processes and the power of planning, you understand that you will often have to roll with the punches and prioritize the most impactful tasks on the fly.

    You have a strong ability to collaborate with cross-functional teams and build solid working relationships with everyone in the organization, from individual contributors to the CEO.

    You have experience building and working on deployment systems.

    You have self-awareness about your strengths and areas for development.

    At Dapper Labs, we're looking for people who are passionate about what they do. You're encouraged to apply even if your experience doesn't precisely match the job description.

    Full-time



  • Stafflink Vancouver, BC, Canada

    Job Description · Position: Site Reliability Engineer · Duration: 12 Months · Location: Principally remote, with at least one day per month in office for applicants in the lower mainland. Local candidates are given preference. · Work hours: Monday – Friday, 9:00 am – 5:00 ...


  • T-Net British Columbia Vancouver, BC, Canada

    Site Reliability Engineer Co-op (Sept May 2025) Job Overview · Our innovative technology transforms the way that organisations make decisions, allowing them to elevate their employees and drive better business outcomes. Embarking on an exciting new chapter in our growth story, w ...


  • Dapper Labs Vancouver, Canada Full time

    We're looking for a Site Reliability Engineer who wants to be at the technical core of an organization that's completely reshaping how distributed applications on blockchains can reach massive audiences. · You will join a Site Reliability Engineering team that has the ability t ...


  • Visier, Inc Vancouver, BC, Canada

    Our innovative technology transforms the way that organizations make decisions, allowing them to elevate their employees and drive better business outcomes. Embarking on an exciting new chapter in our growth story, we are looking for talented individuals who can help both Visier ...


  • Visier, Inc Vancouver, BC, Canada

    Visier Co-op Opportunity · Our innovative technology transforms the way that organisations make decisions, allowing them to elevate their employees and drive better business outcomes. Embarking on an exciting new chapter in our growth story, we are looking for talented individua ...


  • Visier Inc. Vancouver, BC, Canada

    Our co-op experience is unique and designed to prepare you for professional success as you work on real, impactful work from the beginning. Our ultimate goal is to give you the mentorship, training, and work experience you need to start your career. A number of our students retur ...


  • Red Hat British Columbia, Canada

    About the job · Red Hat is seeking a Senior Site Reliability Engineer (SRE) to develop, scale, and operate our OpenShift managed cloud services. OpenShift is Red Hat's enterprise Kubernetes distribution. As an SRE you will contribute to running OpenShift at scale by enabling cus ...


  • RAZR Marketing, Inc. Vancouver, BC, Canada

    You will be required to be in our office In Vancouver, BC three times per week. · These values have made RAZR what it is for years, and today, they are more important than ever. You can't wait to get out of bed in the morning & get on with your day · We are seeking a skilled an ...


  • Taurus SA Vancouver, BC, Canada

    Are you ready to take on an entrepreneurial challenge in the digital asset industry? Taurus, a global leader in digital asset infrastructure, has an exciting opportunity for you. · Founded in April 2018, Taurus provides enterprise-grade solutions to issue, custody, and trade dig ...


  • Razr Marketing Vancouver, BC, Canada

    Senior Site Reliability Engineer · These values have made RAZR what it is for years, and today, they are more important than ever. You can't wait to get out of bed in the morning & get on with your day · We are seeking a skilled and motivated Site Reliability Engineer (SRE) to ...


  • Sentry Vancouver, BC, Canada

    About the role · The Site Reliability Engineering team is responsible for the deployment, configuration, maintenance and monitoring of Sentry's hosted platform. We do this by leveraging automation tools to automatically spin up and scale services to meet the traffic demands of 1 ...


  • Stafflink Vancouver, BC, Canada

    Position: Site Reliability Engineer · Location: Principally remote, with at least one day per month in office for applicants in the lower mainland. Local candidates are given preference. · Monday - Friday, 9:00 am - 5:00 pm PST · Serve as the subject matter expert (SME) for Dynat ...


  • Dapper Labs Vancouver, BC, Canada

    We're looking for a Site Reliability Engineer who wants to be at the technical core of an organization that's completely reshaping how distributed applications on blockchains can reach massive audiences. · You will join a Site Reliability Engineering team that has the ability to ...


  • Taurus SA Vancouver, Canada CDI

    Are you ready to take on an entrepreneurial challenge in the digital asset industry? Taurus, a global leader in digital asset infrastructure, has an exciting opportunity for you. · Founded in April 2018, Taurus provides enterprise-grade solutions to issue, custody, and trade dig ...


  • Red Hat, Inc. British Columbia, Canada

    About the job · Red Hat is seeking a Senior Site Reliability Engineer (SRE) to develop, scale, and operate our OpenShift managed cloud services. OpenShift is Red Hat's enterprise Kubernetes distribution. As an SRE you will contribute to running OpenShift at scale by enabling cu ...


  • TEEMA Vancouver, Canada Full time

    MUST LIVE IN CANADA NEAR AN AIRPORT · Looking for a technical lead with 10+ years of DevOps/SRE experience · MUST HAVE - 5+ years permanent residence or Citizenship (can't have lived out of Canada for the last 5 years) ...


  • Red Hat, Inc. British Columbia, Canada

    About the job · Red Hat is seeking a Senior Site Reliability Engineer (SRE) to develop, scale, and operate our OpenShift managed cloud services. OpenShift is a cloud native application platform for the enterprise, powered by Kubernetes. As an SRE you will contribute to runnin ...


  • Electronic Arts Vancouver, Canada

    EA's Digital Platform (EADP) organization drives important technology decisions and investments for EA on a global basis, across all divisions and studio teams. Technology and engineering leadership at EA is essential to making the industry's best games and services and the EADP ...


  • Electronic Arts Vancouver, Canada Regular

    Responsibilities: · You will create monitoring, alerting and dashboarding solutions that improve visibility into EA's application performance and business metrics. · You will help design and develop robust, supportable tools to automate the deployment and management of distrib ...


  • New Value Solutions Richmond, Canada

    New Value Solutions, a national IT consulting company, is seeking a Site Reliability Engineer for our client. · Responsibilities: · Serve as the subject matter expert (SME) for Dynatrace, responsible for configuring, optimizing, and managing Dynatrace monitoring solutions. · Des ...