LLM Evaluation, Benchmarking

Only for registered members Canada

1 month ago

Location: Remote · Type: Contract or Full-Time · About the Role: We're looking for an LLM Evaluation, Benchmarking & Experimentation Engineer to rigorously test our proprietary LLM API and build the infrastructure for systematic model improvement. · Primary focus: Execute and ...



Similar jobs

  • Work in company Remote job

    LLM Evaluation and Benchmarking Mentor

    Only for registered members

    I'm seeking a technical mentor to help deepen my understanding of LLM evaluation and benchmarking, with particular attention to high-stakes applications (e.g., mental health), while developing a generalizable framework for reasoning about model performance across domains. · ...

    $50 - $150 (USD) per hour

    2 months ago

  • Work in company

    Senior Data Scientist

    Banque Nationale

    Senior Data Scientist · This position lets you have a positive impact on our organization through your expertise in risk quantification and management, as well as your analytical and interpersonal skills. · Assess the conceptual soundness of models (specifi ...

    Montreal

    6 hours ago

  • Work in company Remote job

    AI/ML Engineer

    Only for registered members

    We're looking for an ML engineer to help us evaluate and benchmark language models using proprietary datasets. · Assess how existing models perform against our specialized datasets · ...

    $85 - $125 (USD) per hour

    1 month ago

  • Work in company

    Data Annotator

    Only for registered members

    Evaluate LLM-generated responses on their ability to effectively answer user queries. Conduct fact-checking using trusted public sources and external tools. · ...

    Montreal

    1 month ago

  • Work in company

    Data Annotator

    Only for registered members

    About The Job · Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · Position: AI Model Evaluator · Type: ...

    Montreal $45,000 - $80,000 (CAD) per year

    1 week ago

  • Work in company

    Senior Financial Analyst | Up to 105/hr Hourly

    Only for registered members

    Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · ...

    Montreal

    1 month ago

  • Work in company

    Research Physicist

    Only for registered members

    We are seeking a Research Physicist to join our team of elite creative and technical talent. As a Physics AI Evaluator, you will write and refine prompts to guide model behavior in physics contexts. · ...

    Montreal

    1 month ago

  • Work in company

    LLM Evaluation Specialist

    Only for registered members

    Evaluate LLM-generated responses on their ability to effectively answer user queries. Conduct fact-checking using trusted public sources and external tools. Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuraci ...

    Montreal

    1 month ago

  • Work in company

    Financial Reporting Analyst | Up to 105/hr Hourly

    Only for registered members

    Job summary · Write and refine prompts to guide model behavior in financial contexts. · Responsibilities · Write and refine prompts to guide model behavior in financial contexts. · Evaluate LLM-generated responses to finance-related user queries for accuracy, reasoning quality, and cl ...

    Montreal

    1 month ago

  • Work in company

    LLM Evaluation Specialist

    Only for registered members

    Evaluate LLM-generated responses for effectiveness in answering user queries. Conduct fact-checking using trusted public sources and external tools. · ...

    Montreal

    1 month ago

  • Work in company

    Conversational AI Evaluator

    Only for registered members

    Mercor connects creative and technical talent with AI research labs. Evaluators are sought to assess responses generated by LLMs. · ...

    Montreal

    1 month ago

  • Work in company

    Linguist

    Only for registered members

    Evaluate LLM-generated responses for effectiveness in answering user queries. · Conduct fact-checking using trusted public sources and external tools. · Generate high-quality human evaluation data by ...

    Montreal

    1 month ago

  • Work in company

    Conversational AI Specialist

    Only for registered members

    Mercor connects elite creative and technical talent with leading AI research labs. · Evaluate LLM-generated responses for effectiveness in answering user queries. · Conduct fact-checking using trusted public sources and external tools. · Bachelor's degree · Native speaker or ILR ...

    Montreal

    1 month ago

  • Work in company

    Data Annotator

    Only for registered members

    Evaluate LLM-generated responses for effectiveness in answering user queries. · Evaluate whether model responses align with expected conversational behavior and system guidelines. ...

    Montreal

    1 month ago

  • Work in company

    Conversational AI Evaluator

    Only for registered members

    About · Mercor connects elite creative and technical talent with leading AI research labs. We are looking for an experienced Conversational AI Evaluator to join our team. · Evaluate LLM-generated responses for effectiveness in answering user queries. Conduct fact-checking using tru ...

    Montreal

    1 month ago

  • Work in company

    Content Reviewer

    Only for registered members

    Mercor connects elite creative and technical talent with leading AI research labs. · Bachelor's degree · Native speaker or ILR 5/primary fluency (C2 on the CEFR scale) in French · ...

    Montreal

    1 month ago

  • Work in company

    Sr. Quality Assurance Specialist Up to 45/hr

    Only for registered members

    About The Job · Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · Position: AI Model Evaluator · Type: ...

    Montreal

    1 week ago

  • Work in company

    Quality Assurance Specialist

    Only for registered members

    Evaluate LLM-generated responses for effectiveness in answering user queries. Conduct fact-checking using trusted public sources and external tools. Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies. · Eval ...

    Montreal

    1 month ago

  • Work in company

    Assistant Professor of Physics

    Only for registered members

    Write and refine prompts to guide model behavior in physics contexts. · Evaluate LLM-generated responses to physics-related queries for conceptual accuracy, mathematical correctness, and reasoning quality. · Conduct fact-checking using authoritative public sources and domain kn ...

    Montreal

    1 month ago

  • Work in company

    Data Annotator

    Only for registered members

    Mercor connects elite creative and technical talent with leading AI research labs. · Bachelor's degree · Significant experience using large language models (LLMs) · ...

    Montreal

    1 month ago