Site Reliability Engineer – GenAI Platform - MONTREAL & MIRABEL - Astra North Infoteck Inc.


    1 day ago

    Description

    Experience: 8+ years as a Site Reliability Engineer or in a similar role, with hands-on experience supporting IaaS platforms and solid networking and systems engineering knowledge.

    Roles and Responsibilities:

    • Operate, monitor, and maintain the infrastructure supporting GenAI applications (training, inference, feature store, data ingestion, model serving)

    • Design and build automation for core platform capabilities, reducing manual toil

    • Develop and maintain infrastructure-as-code (IaC) for provisioning and managing compute, storage, network, GPU clusters, Kubernetes / container orchestration, etc.

    • Establish, monitor, and enforce SLOs/SLIs/SLAs, error budgets, alerting, and dashboards

    • Lead incident response, root cause analysis (RCA), postmortems, and systemic remediation

    • Perform capacity planning, scaling strategies, workload scheduling, and resource forecasting

    • Optimize cost vs. performance tradeoffs in large-scale compute environments

    • Harden systems for security, compliance, auditability, and data governance

    • Collaborate across teams (cloud engineers, data engineers, infrastructure, security) to ensure safe deployment, rollout, rollback, and integration of new systems

    • Define disaster recovery (DR) strategies, backup/restore practices, and fault-tolerance mechanisms

    • Maintain runbooks, operational playbooks, documentation, and training materials

    • Participate in on-call rotations and respond to production incidents 24/7 as needed

    • Continuously evaluate and integrate new tools, frameworks, or technologies to enhance platform reliability

    Skills:

    • Production experience in SRE / infrastructure / operations for large-scale systems

    • Strong programming/scripting skills (Python, Go, Java, or equivalent)

    • Deep experience with containerization (Docker), orchestration (Kubernetes, etc.)

    • Infrastructure-as-code (Terraform, Helm, CloudFormation, Ansible, etc.)

    • Familiarity with GPU / AI compute clusters, high-performance data storage, and distributed architectures

    • Experience with monitoring / observability / logging / alerting tools (Prometheus, Grafana, ELK / EFK, Datadog, etc.)

    • Networking & systems engineering knowledge (TCP/IP, DNS, routing, load balancing, distributed storage)

    • Solid experience in capacity planning, performance tuning, scaling, and incident response

    • Demonstrated ability to lead RCAs, deploy fixes, and drive reliability improvements

    • Experience in regulated environments (financial services, compliance, audit, security) is a strong plus

    • Excellent communication, documentation, and cross-team collaboration skills

    • Proven track record of reducing operational toil via automation


