Site Reliability Engineer - Mississauga, Canada - Randstad Digital

    Description

    Site Reliability Engineer - SRE (Contract Position)

    Number of Positions: 1; Filled: 0; Duration: 6 months

    Location: Mississauga, ON, CA

    Must be eligible to work in Canada

    This is a contract-to-hire position: a 6-month contract, then full-time permanent.

    Hybrid role: 2-3 days/week onsite is mandatory.

    The candidate must have a development background (of any kind).

    Part of an SRE team with 3 other members and one cloud architect.

    Interaction with dev, QA, and vendor teams spread across the US and overseas.

    Looking to automate the legacy system on a cloud-native platform.

    Must-have skills and experience:


    • 3+ years of experience as an SRE supporting production infrastructure.


    • 5+ years of overall software engineering experience in a development environment.


    • Bachelor's degree in computer science and/or a wide range of relevant work experience.


    • Extensive experience with Azure and Windows systems.


    • Experience with container orchestration platforms such as Kubernetes.


    • Experience using IaC tools such as Terraform, Docker, Helm, Packer, Ansible, and ARM.


    • Experience with configuration management tools such as Ansible, YAML and Terraform.


    • Experience managing observability tools such as Grafana, Kibana and Prometheus.


    • Experience with enterprise-grade software.


    • Experience with software development.


    • Experience with microservices architecture.


    • At least two years of experience managing Kubernetes production systems.


    • Experience with PowerShell and shell scripting.


    • Strong verbal and written communication skills.


    • Solid knowledge of web architecture and systems.


    • Strong analytical and problem-solving skills.

    Roles and Responsibilities


    • Design and implement Kubernetes clusters according to business requirements, including scalability and security.


    • Build and maintain Docker containers for use in the AKS environment.


    • Develop and maintain monitoring systems to ensure the health and availability of SQL DBs, AKS clusters, ACA, APIM, file shares, Service Bus, web apps, etc. across production, dev, and staging environments.


    • Build and own infrastructure as code, and work closely with development, systems, and networking teams to automate CI/CD pipelines, removing repetitive manual processes and simplifying operations.


    • Manage and optimize existing CI/CD pipelines.


    • Design, architect, and develop cloud-native solutions using services such as AKS, ACA, APIM, Azure SQL, Azure Functions, Service Bus, and Data Factory on the Azure cloud platform.


    • Create and maintain technical documentation and build books.


    • Deploy application packages and new workloads to the production environment.


    • Streamline and maintain QA and dev environments that allow our developers and quality assurance teams to work more effectively and efficiently.


    • Perform regular disaster recovery (DR) drills and maintain the disaster recovery plan (DRP) in collaboration with systems and development teams.


    • Identify and diagnose deficiencies with existing systems, frameworks, tools, and processes, and recommend creative solutions based on best practices and industry standards.


    • Create dashboards that provide visibility into production metrics.