LLM Evaluation, Benchmarking

Only for registered members Canada

1 month ago

Location: Remote · Type: Contract or Full-Time · About the Role: We're looking for an LLM Evaluation, Benchmarking & Experimentation Engineer to rigorously test our proprietary LLM API and build the infrastructure for systematic model improvement. · Primary focus: · Execute and ...

Similar jobs

Work in company Remote job

LLM Evaluation and Benchmarking Mentor

Only for registered members

I'm seeking a technical mentor to help deepen my understanding of LLM evaluation and benchmarking, with particular attention to high-stakes applications (e.g., mental health), while developing a generalizable framework for reasoning about model performance across domains. · ...

$50 - $150 (USD) per hour

2 months ago

Work in company

Senior Data Scientist

Banque Nationale

Senior Data Scientist · This position lets you have a positive impact on our organization through your expertise in risk quantification and management, as well as your analytical and interpersonal skills. · Assess the conceptual soundness of models (specifi ...

Montreal

6 hours ago

Work in company Remote job

AI/ML Engineer

Only for registered members

We're looking for an ML engineer to help us evaluate and benchmark language models using proprietary datasets. · Assess how existing models perform against our specialized datasets · ...

$85 - $125 (USD) per hour

1 month ago

Work in company

Data Annotator

Only for registered members

Evaluate LLM-generated responses on their ability to effectively answer user queries. Conduct fact-checking using trusted public sources and external tools. · ...

Montreal

1 month ago

Work in company

Data Annotator

Only for registered members

About The Job · Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · Position: AI Model Evaluator · Type: ...

Montreal $45,000 - $80,000 (CAD) per year

1 week ago

Work in company

Senior Financial Analyst | Up to 105/hr Hourly

Only for registered members

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · ...

Montreal

1 month ago

Work in company

Research Physicist

Only for registered members

We are seeking a Research Physicist to join our team of elite creative and technical talent. As a Physics AI Evaluator, you will write and refine prompts to guide model behavior in physics contexts. · ...

Montreal

1 month ago

Work in company

LLM Evaluation Specialist

Only for registered members

Evaluate LLM-generated responses on their ability to effectively answer user queries. Conduct fact-checking using trusted public sources and external tools. Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuraci ...

Montreal

1 month ago

Work in company

Financial Reporting Analyst | Up to 105/hr Hourly

Only for registered members

Job summary · Write and refine prompts to guide model behavior in financial contexts. · Responsibilities · Write and refine prompts to guide model behavior in financial contexts. · Evaluate LLM-generated responses to finance-related user queries for accuracy, reasoning quality, and cl ...

Montreal

1 month ago

Work in company

LLM Evaluation Specialist

Only for registered members

Evaluate LLM-generated responses for effectiveness in answering user queries. Conduct fact-checking using trusted public sources and external tools. · ...

Montreal

1 month ago

Work in company

Conversational AI Evaluator

Only for registered members

Mercor connects creative and technical talent with AI research labs. Evaluators are sought to assess responses generated by LLMs. · ...

Montreal

1 month ago

Work in company

Linguist

Only for registered members

Evaluate LLM-generated responses for effectiveness in answering user queries. · Conduct fact-checking using trusted public sources and external tools. · Generate high-quality human evaluation data by ...

Montreal

1 month ago

Work in company

Conversational AI Specialist

Only for registered members

Mercor connects elite creative and technical talent with leading AI research labs. · Evaluate LLM-generated responses for effectiveness in answering user queries. · Conduct fact-checking using trusted public sources and external tools. · Bachelor's degree · Native speaker or ILR ...

Montreal

1 month ago

Work in company

Data Annotator

Only for registered members

Evaluate LLM-generated responses for effectiveness in answering user queries. · Evaluate whether model responses align with expected conversational behavior and system guidelines. ...

Montreal

1 month ago

Work in company

Conversational AI Evaluator

Only for registered members

About · Mercor connects elite creative and technical talent with leading AI research labs. We are looking for an experienced Conversational AI Evaluator to join our team. · Evaluate LLM-generated responses for effectiveness in answering user queries. Conduct fact-checking using tru ...

Montreal

1 month ago

Work in company

Content Reviewer

Only for registered members

Mercor connects elite creative and technical talent with leading AI research labs. · Bachelor's degree · Native speaker or ILR 5/primary fluency (C2 on the CEFR scale) in French · ...

Montreal

1 month ago

Work in company

Sr. Quality Assurance Specialist Up to 45/hr

Only for registered members

Montreal

1 week ago

Work in company

Quality Assurance Specialist

Only for registered members

Evaluate LLM-generated responses for effectiveness in answering user queries. Conduct fact-checking using trusted public sources and external tools. Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies. · Eval ...

Montreal

1 month ago

Work in company

Assistant Professor of Physics

Only for registered members

Write and refine prompts to guide model behavior in physics contexts. · Evaluate LLM-generated responses to physics-related queries for conceptual accuracy, mathematical correctness, and reasoning quality. · Conduct fact-checking using authoritative public sources and domain kn ...

Montreal

1 month ago

Work in company

Data Annotator

Only for registered members

Mercor connects elite creative and technical talent with leading AI research labs. · Bachelor's degree · Significant experience using large language models (LLMs) · ...

Montreal

1 month ago
