Senior Site Reliability Engineer (BB-E2549)

Found in: Talent CA

Description

As a Senior Network Site Reliability Engineer, you will solve exciting technical challenges by analysing, troubleshooting, and designing vital services, platforms, and infrastructure while always thinking about reliability, scalability, resilience, security, and performance.

As an SRE, you will understand the end-to-end configuration, technical dependencies, and overall behavioural characteristics of the production services you collaborate with. In partnership with your Development colleagues, you will be responsible for ensuring that services are designed and delivered to be mission critical, with a focus on security, resiliency, scale, and performance.

You will help support 24x7 uptime and availability of mission-critical, customer-facing production cloud services distributed across multiple regions. You will help create more consistent, automated, push-button environments across all tiers; proactively test and tune all aspects of the infrastructure; streamline CI/CD processes; monitor and respond to system notifications and alerts; and continually work to optimize and improve the performance, security, and reliability of our systems.

What you will do

• Help build a Site Reliability Engineering culture across the organization by sharing your best practices, approaches, documentation, and code with other engineering teams.
• Apply automation and software to any tasks or parts of the system that are performed manually or would otherwise benefit from it.
• Troubleshoot complicated, cross-platform operating system issues in cloud-based SaaS and on-premises environments; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices.
• Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation.
• Conduct system analysis and configuration management, and develop improvements to system software performance, availability, and reliability.
• Design, write, ship, and motivate the creation of software and systems to increase observability, product reliability, and organizational efficiency.
• Work closely with software engineers and testers to ensure the system meets non-functional requirements such as performance, security, and availability.
• Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it.
• Maintain and monitor the deployment and orchestration of servers, Docker containers, databases, and general backend infrastructure.

Qualifications

Who you are:

• IT Operations experience in large-scale, mission-critical, heterogeneous enterprise infrastructure leveraging DevOps, SRE, and Agile methodologies, with a B.Tech./B.E. degree in Electronics & Telecomm or Computer Science
• Excellent skills in Windows Server 2012, 2016, and 2019 management; Group Policy design and configuration
• Significant experience in cloud computing infrastructure, in particular on the Microsoft Azure platform
• Ability to provide advice, best practices, and recommendations for the operation and deployment of Microsoft Azure
• Extensive experience supporting and managing hypervisor-based products and infrastructure (VMware, KVM, etc.)
• Experience with CI/CD in cloud environments and with container technology: Docker, Kubernetes, Docker Swarm
• Experience as a Linux systems administrator (e.g. CentOS, Red Hat) and with command-line system administration tools such as Bash, Vim, and SSH
• Experience in monitoring and analysing infrastructure performance using standard performance monitoring tools: Nagios, New Relic, Perfmon, PerfView, ProcDump, DebugDiag
• Strong understanding of Internet protocols and applications such as SMTP, DNS, HTTP, SSH, and SNMP
• Hands-on experience in configuration management of server farms (using tools such as Puppet, Chef, or Ansible)
• Demonstrated understanding of ITIL methodologies; ITIL v3 or v4 certification

What we offer

SITA’s workplace is all about diversity: many different countries and cultures are represented in our workforce, and colleagues who’ve been working here for decades collaborate with those just out of college and early in their careers. SITA is a place of change and constant improvement, where we’re always pushing ourselves to find better ways of doing things: smarter, quicker, easier, for us and our customers and for their customers too. And we offer all the good stuff you’d expect, like holidays, bonuses, flexible benefits, medical policy, pension plan, and access to world-class learning.

Welcome to SITA

SITA is the world’s leading specialist in air transport communications and information technology. We don’t just connect the global aviation industry.
We apply decades of experience and expertise to address almost every core business, operational, baggage, and passenger process in air transport. We design, build, and support technology solutions, all with one vision: to create easy air travel every step of the way. As an organization, we cover 95% of all international air travel destinations and work with over 2,800 air transport and government customers in every corner of the globe.

Are you ready to explore the opportunities?

SITA is an Employment Equity Employer and values a diverse workforce. In support of our Employment Equity Program, women, Aboriginal people, members of visible minorities, and/or persons with disabilities are encouraged to apply and self-identify in the application process.

Posted: 6 days ago

Employment type: Full-time

Location: Montreal, Canada

Company: SITA
