Member of Technical Staff, Evaluation - Montreal - Onixai

    Onixai
    Onixai Montreal

    1 day ago

    Full time
    Description

    You will own the quality bar for every AI agent Onix ships.

    Onix is building Personal Intelligence: AI that belongs to you, protects your data, and helps you grow with guidance from real experts. We work with world‑class physicians, researchers, and practitioners. Their knowledge gets turned into AI agents that users trust with real decisions about their health and performance. If an agent hallucinates, gives outdated advice, or drifts from what the expert actually believes, that is a serious problem. Your job is to make sure it does not happen.

    You will build the evaluation infrastructure that catches failures before users do. You will define what "aligned" means for an agent that represents a specific human expert, not a generic chatbot. This is not traditional ML safety work. This is applied alignment in a domain where accuracy has real consequences and the ground truth is a living, breathing expert with opinions.

    The hard part is not writing evals. The hard part is knowing what to eval for. You need to understand the difference between a confident wrong answer and a nuanced right one. Between an expert's actual position and a plausible‑sounding summary. Between safe and useful.

    What You Will Do

    • Design and build evaluation frameworks that measure agent accuracy, faithfulness, safety, and voice fidelity
    • Create automated and human‑in‑the‑loop pipelines for continuous agent quality assessment
    • Define alignment criteria specific to expert‑backed AI agents, not generic LLM benchmarks
    • Work with the agent building team to identify failure modes and build systematic defenses against them
    • Own the metrics that tell us whether an agent is ready to ship

    Who You Are

    You have spent real time thinking about how to measure whether an LLM is actually doing what it should. Not in the abstract, not as a research topic you follow on Twitter, but as something you have built systems around. You understand that evaluation is the hardest part of shipping reliable AI, and you are frustrated by how many teams treat it as an afterthought.

    You are technical enough to build eval pipelines in code and conceptual enough to define what "good" looks like when the answer is not in a test set. You have opinions about where RLHF falls short, why automated evals need human calibration, and how to measure things like tone and nuance that do not fit neatly into a metric.

    You have:

    • Deep experience with LLM evaluation: designing benchmarks, building scoring pipelines, analyzing failure modes
    • Strong Python skills and familiarity with ML tooling (model APIs, embedding systems, vector stores, evaluation frameworks)
    • Understanding of alignment techniques: RLHF, constitutional AI, red teaming, adversarial evaluation
    • Experience working with retrieval‑augmented generation systems and evaluating grounded outputs
    • Comfort reading research papers and translating ideas into practical systems

    You are:

    • Obsessed with correctness. You lose sleep over an agent giving subtly wrong advice
    • A systems thinker. You build frameworks, not one‑off tests
    • Skeptical of benchmarks that do not measure what matters. You ask "what are we actually testing?"
    • Self‑directed. You identify the gaps in quality before anyone tells you to look

    This Role is NOT For You If

    • You want to publish papers, not ship product
    • Your evaluation experience is limited to running standard benchmarks on public models
    • You need a large research team to be productive
    • You are not comfortable making judgment calls about quality in ambiguous domains
    • You want a remote job

    What Success Looks Like

    • Every agent ships with a clear evaluation report and a quantified confidence level
    • Failure modes are caught in evaluation, not by users
    • The eval framework scales to new experts without starting from scratch each time
    • You can explain to a non‑technical team member exactly why an agent passed or failed review
    • The quality of our agents is measurably better than anything built with prompt engineering alone

    How to Apply

    Submit your application and answer:

    • Describe an evaluation system you built for an LLM‑based product. What did you measure, and what did the metrics miss?
    • How would you evaluate whether an AI agent faithfully represents a specific human expert's views, not just general domain knowledge?
    • What is the most common way you have seen teams fool themselves into thinking their AI is working well when it is not?
