    System Reliability Engineer - Toronto, Canada - CGI

    CGI
    Full time
    Description

    Position Description:

    We are Canada's largest independent information technology services firm, and after 40 years, we're still growing. Innovation, technology, and service delivery are our focus. Our goal is to ensure our clients remain ahead of the competition. We provide a full spectrum of managed services, from IT and business process outsourcing to systems integration and consulting, that are transforming our clients' operations and helping them succeed.

    Do you enjoy working with a highly motivated and talented team to deliver mission critical developer tooling? We are currently expanding our System Reliability Engineering team that helps one of our key clients deploy, manage, troubleshoot, and enhance their developer tooling platform, servicing over developers.

    As a System Reliability Engineer, you will be responsible for designing, implementing, and supporting a variety of developer productivity tools, including Ansible Tower, GitLab, Artifactory, and SonarQube. The technology stack used to manage the platform includes Ansible, Terraform, Python, Prometheus, Splunk, and ELK.

    You will build automation solutions to provision and validate infrastructure and help debug and resolve problems. You will help to improve operational performance by focusing on user experience, effectively assessing and managing risk, and minimizing the impact of failures.

    Responsibilities

    •Keeping all components of the developer productivity platform up and running

    •Working closely with internal partners and platform users to ensure that all services meet security, SLA, and performance requirements

    •Writing, updating, and using documentation, including runbooks and playbooks

    •Automating infrastructure deployment, testing, application failover, failure mitigation, user self-service functions, and more

    •Debugging complex problems across the entire stack

    •Participating in various meetings with the Operations and Delivery teams

    •Leading daily/weekly meetings to discuss the overall health of the systems

    •Leading Root Cause Analysis calls

    •Proposing and implementing monitoring improvements, optimizations, and automation opportunities

    •Taking part in PI (Program Increment) Planning sessions

    Key Skills and Attributes

    •5 years of experience with software engineering, software development, or system operations

    •Experience working with Linux, including writing shell scripts and understanding Linux internals and performance tuning

    •Strong understanding of networking principles

    •Experience debugging large scale complex systems in production

    •Experience in building, implementing, and supporting highly available production systems

    •Experience automating infrastructure and deployments using Terraform, Ansible, and Python or equivalent technologies

    •Understanding of DevOps engineering, CI/CD, and software deployment

    •Working knowledge of developer tooling such as Artifactory, GitLab, SonarQube, and Ansible Tower

    •Experience with various monitoring and observability tools

    •Experience deploying and managing workloads on one of the major public cloud platforms or on private clouds such as OpenStack

    •Experience deploying and managing workloads on one of the major container management platforms like Kubernetes, OpenShift, PCF or Rancher

    •A curiosity about how complex socio-technical systems operate and what happens during failure

    It's not expected that any single candidate would have experience across all these areas – we are looking for someone who is strong in a few areas and has interest and curiosity in others.

    Skills:

  • DevOps Engineering
  • GitHub
  • OpenShift
  • Linux


  • Manulife Insurance Malaysia Toronto, ON, Canada $92,190 - $171,210

    Senior Site Reliability Engineer · Apply · Locations: Waterloo, Ontario; Toronto, Global Head Office (200 Bloor) · Time type: Full time · Posted on: Posted yesterday · Job requisition ID: JR · We are a financial services provider ...


  • Manulife Insurance Malaysia Toronto, ON, Canada $92,190 - $171,210

    Senior Site Reliability Engineer · Apply · Locations: Waterloo, Ontario; Toronto, Global Head Office (200 Bloor) · Time type: Full time · Posted on: Posted yesterday · Job requisition ID: JR · We are a financial services provider ...


  • Manulife Insurance Malaysia Old Toronto, Canada

    Senior Site Reliability Engineer · Apply · Locations: Waterloo, Ontario; Toronto, Global Head Office (200 Bloor) · Time type: Full time · Posted on: Posted yesterday · Job requisition ID: JR · We are a financial services provider ...


  • Agnico Eagle Ontario, Canada

    YOUR NEXT CHALLENGE: · Reporting to the Senior Reliability Specialist, you will be part of the Process Plant Department. You will also ensure that the goals and objectives are achieved while promoting and respecting Agnico Eagle's values, Health & Safety Code of Conduct and the ...

  • Tata Consultancy Services

    Reliability Engineer

    2 weeks ago


    Tata Consultancy Services Toronto, Canada

    About TCS: · TCS operates on a global scale, with a diverse talent base of more than 600,000 associates representing 153 nationalities across 55 countries. TCS has been recognized as a Global Top Employer by the Top Employers Institute - one of only eight companies worldwide to h ...


  • Irving Consumer Products Limited Toronto, ON, Canada

    Corporate Reliability Engineering Co-op – Toronto - Fall 2024 Administration, Facilities & Secretarial · Corporate Reliability Engineering Co-op – Toronto - Fall 2024 · Irving Consumer Products is a leading manufacturer of premium tissue products – including national brands and ...


  • Kruger Products Ontario, Canada $72,000 - $96,000

    Kruger Products L.P. is most well known as the leading manufacturer and distributor of Canada's leading Tissue Brands (including Cashmere, Purex, Scotties', SpongeTowels, White Swan and Embassy) for both the consumer and away from home markets, but we're quickly becoming known a ...


  • J.D. Irving, Limited Toronto, ON, Canada

    Corporate Reliability Engineering Co-op – Toronto - Fall 2024 Corporate Reliability Engineering Co-op – Toronto - Fall 2024 · Irving Consumer Products is a leading manufacturer of premium tissue products – including national brands and private label. We pride ourselves in our co ...


  • Impala Canada Toronto, Canada

    Job Description: · Reporting to the Maintenance Planning and Reliability General Foreman, the successful candidate will work at the Lac des Iles mine site, rotation to be determined. · Responsibilities: · Ensure the performance of problematic equipment and propose sustainable so ...


  • Impala Canada Ontario, Canada Full time

    Who We Are: · Impala Canada is the owner and operator of the Lac des Iles Mine, located 90 minutes northwest of Thunder Bay, Ontario. In operation for 30 years, the LDI Mine is one of only two known pure palladium sources in North America. Palladium contributes to a cleaner globa ...

  • CSG Talent

    Reliability Engineer

    3 weeks ago


    CSG Talent Ontario, Canada Full time

    Join a leading mining company in Canada as a Reliability Engineer. This is the best opportunity to grow your career in the maintenance department with a large mining company with its global assets. · This is a residential role and it comes with a very attractive salary and a great re ...


  • Tata Consultancy Services Toronto, Canada

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity ...


  • Infotek Consulting Services Inc. Toronto, Canada

    Infotek Consulting is searching for a Site Reliability Engineer - this is a remote opportunity with some travel involved · Job Description: · Our EPM (Event and Performance Management) team is an availability, performance and reliability management discipline that supports the optimizat ...


  • Tata Consultancy Services Toronto, ON, Canada

    TCS is an equal opportunity employer, and embraces diversity in race, nationality, ethnicity, gender, age, physical ability, neurodiversity, and sexual orientation, to create a workforce that reflects the societies we operate in. Our continued commitment to Culture and Diversity ...


  • Riverside Natural Foods Toronto, ON, Canada Internship

    We are an organic, "better for you" snack company, manufacturing award-winning, nutritious, allergen-free snacks. Your kids know us as MadeGood, tasty school-safe snacks. And your pup begs for a Cookie Pal treat, our human-grade, vegan dog treats. · From our ingredients to our m ...


  • Paymentus Toronto, ON, Canada

    Summary Paymentus leads the North American marketplace in electronic bill payment solutions and is looking for high performers to join our development team building SaaS Fintech solutions across a range of industries. You will contribute to a massively scalable data platform, tha ...


  • OnX Canada Toronto, Canada

    OnX is looking for a Site Reliability Engineer for one of our clients in Toronto. · Client: Financial Services · Location: Toronto, mostly remote · Duration: 6 months with potential extension · JBoss in middleware experience is super important · Responsibilities: Following the senior technici ...


  • Kruger Products Ontario, Canada

    Kruger Products L.P. is most well known as the leading manufacturer and distributor of Canada's leading Tissue Brands (including Cashmere, Purex, Scotties', SpongeTowels, White Swan and Embassy) for both the consumer and away from home markets, but we're quickly becoming known as one of Canada's b ...


  • Infotek Consulting Services Inc. Toronto, Canada

    Infotek Consulting is searching for a Site Reliability Engineer - this is a remote opportunity with some travel involved · Job Description: · Our EPM (Event and Performance Management) team is an availability, performance and reliability management discipline that supports the opti ...


  • Akamai Toronto, ON, Canada

    Do you have a passion for cutting edge technologies and tackling system problems? · Are you a self-starting professional who thrives in a dynamic environment? Join our Site Reliability team. Our Team builds and delivers highly secure network security frameworks to protect our cus ...