- Develop machine learning and AI models addressing key operational areas:
  - Incident prediction and early warning systems to prevent downtime.
  - Event correlation and noise reduction to streamline alerts.
  - Capacity forecasting and anomaly detection for optimal resource utilization.
- Shift IT operations from reactive analytics to proactive, predictive operations.
- Integrate AI models with IT operations platforms, including:
  - Observability systems such as Dynatrace (logs, metrics, traces).
  - ITSM platforms, particularly ServiceNow, to improve incident and problem management workflows.
  - Automation tools to enable intelligent operational actions and remediation pipelines.
- Deploy models as APIs, microservices, or batch jobs to support operations teams.
- Implement the full MLOps lifecycle: model versioning, CI/CD, drift monitoring, explainability, and reliability.
- Ensure models are production-ready, compliant, stable, and auditable, especially in banking environments.
- Execute AI workloads on leading cloud platforms: Azure, AWS, GCP.
- Optimize compute, storage, and inference costs for 24×7 operational environments.
- Proficiency in Python and machine learning engineering.
- Expertise in time-series analysis, anomaly detection, and forecasting.
- Hands-on experience with MLOps practices and tools.
- Familiarity with infrastructure telemetry, including logs, metrics, and events.
- Experience with cloud platforms and containerized environments (preferably Kubernetes).
- Strong collaboration with IT operations, observability, and automation teams.
- Experience in enterprise-scale IT infrastructure, particularly in banking or financial services.
- Ability to translate AI/ML insights into actionable operational improvements.
-
Toronto, Ontario M5V 3L9 · Posted March 5th, 2026 · Job Type: Full Time · Job Category: IT · Job Description · Role: AI Engineer – Intelligent Operations (Infrastructure) · Location: Toronto, ON · Employment Type: Full-Time (FT) · Wo ...
Toronto, ON · $80,000 - $145,000 (CAD) per year · 1 day ago
AI Engineer – AIOps, IT Infrastructure ML - Toronto - Astra North Infoteck Inc.
Description
AI Engineer – AIOps / IT Infrastructure ML
Role Overview
The AI Engineer – Intelligent Operations is responsible for designing and implementing production-ready AI and machine learning (AI/ML) solutions that enhance automation, optimize processes, and provide predictive capabilities within IT infrastructure and operations. The primary focus is integrating AI into IT operations workflows and collaborating with teams specializing in observability, IT service management (ITSM), cloud, and infrastructure.
Core Responsibilities
AI and AIOps Engineering
Integration with IT Operations Platforms
MLOps for Infrastructure
Cloud and Platform Engineering
Required Skills
Preferred Attributes