Senior Site Reliability Engineer - Cambridge, Canada - NCR Corporation

    Full time
    Description

    About NCR VOYIX

    NCR VOYIX Corporation (NYSE: VYX) is a leading global provider of digital commerce solutions for the retail, restaurant and banking industries. NCR VOYIX is headquartered in Atlanta, Georgia, with approximately 16,000 employees in 35 countries across the globe. For nearly 140 years, we have been the global leader in consumer transaction technologies, turning everyday consumer interactions into meaningful moments. Today, NCR VOYIX transforms store, restaurant and digital banking experiences with cloud-based, platform-led SaaS and services capabilities.

    Not only are we the leader in the market segments we serve and the technology we deliver, but we create exceptional consumer experiences in partnership with the world's leading retailers, restaurants and financial institutions. We leverage our expertise, R&D capabilities and unique platform to help navigate, simplify and run our customers' technology systems.

    Our customers are at the center of everything we do. Our mission is to enable stores, restaurants and financial institutions to exceed their goals – from customer satisfaction to revenue growth, to operational excellence, to reduced costs and profit growth. Our solutions empower our customers to succeed in today's competitive landscape.

    Our unique perspective brings innovative, industry-leading tech to all the moving parts of business across industries. NCR VOYIX has earned the trust of businesses large and small — from the best-known brands around the world to your local favorite around the corner.

    SRE Ops Engineer / Designer

    About NCR Corporation
    NCR Corporation (NYSE: NCR) is a global technology company leading how the world connects, interacts and transacts with business. NCR's assisted- and self-service solutions and comprehensive support services address the needs of retail, financial, travel, healthcare, hospitality, and public sector organizations in more than 100 countries. NCR is headquartered in Atlanta, Georgia.

    Position Summary

    This SRE Ops Engineer role is responsible for driving next-generation service stability, reliability and performance of cloud services for the Financial Services business unit. The position requires a strong understanding of and practical experience with cloud platforms, preferably Google Cloud Platform. Experience running applications and services in the cloud, along with the associated processes and automation, is required to be successful in this role.

    This position reports to the Manager, SRE Ops for Financial Services, and works with Software Engineering, Enterprise Architecture, Platform Architecture and the SRE Observability Engineers to facilitate collaboration across all teams.

    The ideal candidate will have a strong working knowledge of cloud technologies and a background in ITIL processes (Incident, Problem, Change, Monitoring), both in theory and in practice. The candidate should be able to demonstrate knowledge of key processes and tools associated with Production Operations.

    Position responsibilities:

  • Ensure a "Cloud Ready" approach is taken to service availability, reliability and performance
  • 24x7x365 on call support (in rotation) to manage and execute on the Incident Management process
  • Execute on Problem Management, using modern tools for forensics and validation of root cause
  • Software Development in terms of automating repeatable Operations tasks (TOIL)
  • Define and manage the configuration management database (CMDB) to ensure accurate data is always available and in use for key production processes and automation
  • Enable communications to both technical and business/executive-facing audiences in the spirit of transparency
  • Capacity management for the cloud environments to ensure maximum availability and performance of services
  • SRE Metrics & Monitoring Strategy (SLI, SLO, etc.)
  • Accountable for the Backup, HA and DR implementation and exercise
  • Partner with the Engineering ARE teams and product teams
  • Responsible for SRE Ops Guidelines across all Clouds to ensure consistency in approach, execution and reporting.
  • Level 4 cloud infrastructure support
  • Schedule and lead all continuous improvement activities including Incident reviews, Change implementation reviews, TOIL automation candidate areas etc.
  • This position works closely with NCR's Global SRE team within the NCR Chief Technology Office, which guides the overall SRE strategy and direction for NCR

    Basic Requirements

  • BA/BS in Computer Science, MIS or related discipline
  • 10+ years of IT experience
  • 5+ years experience in ITIL Service Management processes and associated domain technologies
  • Exposure/experience with SRE as a discipline
  • 3+ years experience with Google, AWS and/or Azure cloud platforms and technologies (IaaS, SaaS and PaaS)
  • 5+ years experience in software development (object-oriented Java/.NET/similar)
  • 3+ years in Cloud deployment design with cost impact analysis and optimization
  • 5+ years datacenter technology, architecture, and operational experience
  • 5+ years experience supporting PCI-DSS, ISO27001, SOC2 certifications
  • Demonstrated history of innovative thinking and delivery, including disruptive innovation
  • 5+ years hands-on experience using modern operations tools such as ServiceNow, Dynatrace, AppDynamics, Splunk and similar
  • Highly organized
  • Working knowledge of CI/CD pipelines
  • Understanding of and experience with cloud-native databases in both GCP and Azure
  • Excellent communication, meeting facilitation and listening skills
  • Strong negotiation, team working and interpersonal skills
  • Proficient with PowerPoint, Word and Excel
  • Ability to travel both domestically and internationally if needed (note: should be minimal travel for this position)

    Preferred Requirements

  • Working knowledge of Terraform scripting to develop "infrastructure as code"
  • Working knowledge of other cloud automation tools like Ansible, Rundeck, and Chef
  • System admin-level experience in ServiceNow, Dynatrace, AppDynamics, etc.

    Offers of employment are conditional upon passage of screening criteria applicable to the job.