Job ID: 1560023 Last updated: April 1, 2026

AI QA Specialist (LLM Evaluation)

Hiring company: AI QA Specialist (LLM Evaluation)
Location: Tokyo (23 wards), Shinjuku-ku
Employment type: Permanent, full-time
Salary: 7 million yen ~ 14 million yen

Work Style

Casual dress code, side jobs OK, flextime

Position Overview

As an AI QA Specialist, you will lead the design, construction, and operation of the quality evaluation infrastructure for AI agents.

  • Own the entire process from evaluation metric selection and design to integrating automated evaluation pipelines into CI/CD
  • Plan and execute red teaming to detect safety risks before release
  • Quantitatively verify the effectiveness of quality improvements through A/B test analysis based on statistical experimental design
  • Feed evaluation signals back to the research and development teams, creating a compounding feedback loop for model improvement
  • Ensure the quality of products used in production by ~200 companies through a "science of quality" approach

Job Description

  • Evaluation Infrastructure Design & Development
    • Design, build, and maintain evaluation sets (synthetic data + real logs)
    • Select and design evaluation metrics (win rate, task success, factuality, harm detection)
    • Build automated evaluation pipelines and integrate them into CI/CD
    • Design agent harnesses (multi-turn, tool use, long-context support)
  • Safety & Quality Verification
    • Plan and execute red-teaming (adversarial testing)
    • Build safety and policy compliance verification frameworks
    • Design and run prompt/tool regression tests
    • Analyze and mitigate issues related to hallucination, bias, and output quality
  • Statistical Analysis & Reporting
    • Design and analyze statistical experiments (A/B tests, significance testing)
    • Create quality reports and improvement proposals
    • Visualize detected regressions and quality trends
    • Feed evaluation signals back to research and development teams
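
For illustration only, the significance testing mentioned under Statistical Analysis & Reporting might look like the following minimal sketch. It assumes a two-sided two-proportion z-test on task success counts from a baseline and a candidate agent. The function name `two_proportion_z_test` and all counts are hypothetical and do not describe the team's actual tooling or data.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in success rates between two variants."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled success rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; both variants are all-success or all-failure")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: tasks solved out of 500 evaluation cases per variant.
z, p = two_proportion_z_test(success_a=412, n_a=500, success_b=441, n_b=500)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A check like this could run in CI after each evaluation run, with the significance threshold and minimum sample size agreed in advance as part of the experimental design.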

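Application Requirements

Work experience: 6 years or more
Career level: Mid-career
English level: Business conversation level (English usage: about 25%)
Japanese level: None
Minimum education: Vocational school graduate
Visa status: Permission to work in Japan is required

Skills and Qualifications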

You May Be a Good Fit If You Have

  • Bachelor's degree or equivalent practical experience in Computer Science, Software Engineering, Artificial Intelligence, Machine Learning, Mathematics, Physics, or related fields
  • 3+ years of practical experience as a software engineer or QA engineer
  • Knowledge of LLM / generative AI evaluation methods (prompt evaluation, quantitative output quality measurement, hallucination detection, etc.)
  • Foundational knowledge of statistics and experimental design
  • Experience building evaluation pipelines in Python
  • Experience integrating tests into CI/CD pipelines
  • Experience designing prompt / tool regression tests

Strong Candidates May Also Have

  • NLP / ML evaluation benchmark design experience
  • Knowledge of AI safety / Responsible AI
  • Red teaming / penetration testing experience
  • Experience evaluating multi-agent workflows, tool use, and long-context scenarios
  • Large-scale data processing experience (Spark / BigQuery, etc.)
  • Ability to read, comprehend, and reproduce research papers
  • Technical communication ability in English

 

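Location

  • Tokyo (23 wards), Shinjuku-ku

Working Conditions

Employment type: Permanent, full-time
Salary: 7 million yen ~ 14 million yen
Working hours: 10:00 ~ 19:00
Industry: Internet / Web services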

Job Category

Company Overview

Company type: Small or medium-sized company (300 employees or fewer)
Non-Japanese ratio: About half non-Japanese