Application Observability Engineer - Mississauga - TD SYNNEX

    TD SYNNEX
    TD SYNNEX Mississauga

    3 days ago

    Full time
    Description

    Actual annual compensation offered will be based on several variables, including geographic location, work experience, education, and skills/achievements, and will be mutually agreed upon at the time of offer. The average compensation for this role is $85,000 to $105,000 CAD.

    About the Role

    As an Application Observability Engineer, you'll operate at the intersection of applications, infrastructure, and reliability. You will support and troubleshoot the internal platforms and microservices that power critical business systems, ensuring services are healthy, observable, and performant in a large-scale enterprise environment. You'll partner closely with application developers, platform engineers, operations teams, and system administrators to investigate production issues, validate deployments, and maintain stable environments. This role is ideal for someone early to mid-career (1–4 years of experience) who enjoys hands-on troubleshooting, learning how distributed systems work in practice, and supporting modern, containerized platforms using tools like Kibana, Grafana, Jaeger, VictoriaMetrics, Redis, and Kubernetes.

    What You'll Do

    Support and troubleshoot enterprise platforms

  • Monitor and support internal platforms and microservices running across server-based and containerized environments.
  • Investigate production issues by analyzing logs, metrics, and system health signals to identify root causes.
  • Troubleshoot application-level failures, performance issues, and connectivity problems across distributed systems.

    Work with containerized and microservices environments

  • View, manage, and troubleshoot containerized workloads, including application deployments and configuration changes.
  • Understand service lifecycles, health checks, and when corrective actions (such as restarts or escalations) are required.
  • Leverage AI-assisted tools to accelerate troubleshooting, analysis, and documentation while maintaining sound engineering judgment.

    Maintain configuration and application health

  • Review and maintain application configuration using centralized configuration management approaches.
  • Validate application health endpoints and diagnostic signals to ensure services are operating as expected.
  • Support application deployments to servers and platform environments following established processes.

    Troubleshoot supporting platform services

  • Investigate issues related to platform dependencies such as caching or in-memory data stores (e.g., Redis).
  • Identify common failure modes such as configuration errors, resource exhaustion, or network-related issues that impact application behavior.

    Test and validate APIs and services

  • Test, validate, and troubleshoot APIs using industry-standard tools to confirm expected behavior.
  • Work with development teams to reproduce issues and verify fixes before and after deployment.

    Collaborate across engineering and operations teams

  • Partner with developers, platform engineers, and operations teams to resolve incidents and improve platform stability.
  • Document troubleshooting steps, findings, and operational runbooks to improve team knowledge and response time.

    What We're Looking For

    Required:

  • 1–4 years of relevant experience in systems engineering, platform engineering, application support, DevOps support, or a related technical role.
  • Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
  • Experience troubleshooting applications in enterprise or distributed system environments.
  • Familiarity with containerized platforms and microservices architectures.
  • Experience using logging, monitoring, and observability tools to diagnose issues.
  • Coding or scripting knowledge (e.g., Java, Python, Bash) to assist with troubleshooting and automation.
  • Experience testing and validating APIs.
  • Working knowledge of networking concepts as they relate to applications and containerized workloads.
  • Strong analytical, problem-solving, and communication skills.

    Preferred:

  • Exposure to cloud-native environments and CI/CD pipelines.
  • Experience troubleshooting caching systems or in-memory data stores (e.g., Redis) and using logging tools like Kibana (Elasticsearch).
  • Familiarity with application health checks, diagnostics, and service monitoring patterns.
  • Experience working in large-scale enterprise environments with multiple teams and shared platforms.
  • Site Reliability Engineering (SRE) experience.

    Working Conditions & Flexibility

  • Work model: Remote / work-from-home in Canada or the U.S., with a preference for candidates based in Ontario (Canada) or North America time zones.
  • Schedule: Occasional non-standard hours or overtime may be required based on business needs; some on-call availability may also be necessary for critical production support.
  • Travel: May include local (in-country) and global travel for key meetings or team events.

    Key Skills

    Application Monitoring, Elasticsearch, Grafana, IT Production Support, Kubernetes, Site Reliability Engineering

    At TD SYNNEX, our values guide everything we do: Together, We Own It, We Dare to Go, We Grow and Win, and above all, We Do the Right Thing. These principles shape how we work with each other, our partners, and our communities as we drive innovation and create lasting impact.

    What's In It For You?

  • Elective Benefits: Our programs are tailored to your country to best accommodate your lifestyle.
  • Grow Your Career: Accelerate your path to success (and keep up with the future) with formal programs on leadership and professional development, and many more on-demand courses.
  • Elevate Your Personal Well-Being: Boost your financial, physical, and mental well-being through seminars, events, and our global Life Empowerment Assistance Program.
  • Diversity, Equity & Inclusion: It's not just a phrase to us; valuing every voice is how we succeed. Join us in celebrating our global diversity through inclusive education, meaningful peer-to-peer conversations, and equitable growth and development opportunities.
  • Make the Most of our Global Organization: Network with other new co-workers within your first 30 days through our onboarding program.
  • Connect with Your Community: Participate in internal, peer-led inclusive communities and activities, including business resource groups, local volunteering events, and more environmental and social initiatives.

    Don't meet every single requirement? Apply anyway.

    At TD SYNNEX, we're proud to be recognized as a great place to work and a leader in the promotion and practice of diversity, equity and inclusion. If you're excited about working for our company and believe you're a good fit for this role, we encourage you to apply. You may be exactly the person we're looking for.

