Company: AMD
Location: Helsinki, Uusimaa, Finland
Career Level: Mid-Senior Level
Industries: Technology, Software, IT, Electronics

Description



WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. 

AMD together we advance_



Join AMD Silo AI's evaluation team as a hands-on evaluation engineer. We need a strong engineer to implement, scale, and operationalize our evaluation frameworks for large-scale language model development in multilingual settings.

You'll be the technical implementation backbone of our evaluation strategy, translating research insights into robust, scalable evaluation systems. Working closely with the pre- and post-training teams, you'll focus on the engineering execution that makes high-quality LLM evaluation possible at scale.

The role offers significant technical ownership and the chance to shape how evaluation is done. You'll have the opportunity to work on cutting-edge LLM evaluation challenges while building systems and creating benchmarks that directly impact open-source model development decisions. 

Main responsibilities:

  • Extend and modernize our benchmark suite to ensure we are using the most relevant evaluations for base models and post-trained models, with an additional emphasis on expanding coverage of European and low-resource language evaluations
  • Publish code, benchmark datasets, and analysis notebooks under permissive licenses; engage with upstream tools and contribute fixes or extensions 
  • Optimize evaluation pipelines for distributed computing environments and multi-GPU setups 
  • Develop lightweight proxy tasks and ablation protocols that surface issues early in long training runs 

Collaboration with others:

  • Work closely with pre-training and post-training teams to surface the right information, and help drive decision making for training techniques, data mixes, and data pipelines 
  • Coordinate with dev infra on experiment tracking, reporting and logging, establishing requirements and driving needed changes 
  • Collaborate with the OpenEuroLLM project on evaluations for European languages

Main goals for the first 6 months:

  • Audit current evaluation infrastructure and identify technical bottlenecks and scalability issues
  • Framework analysis: evaluate existing evaluation tools and frameworks, documenting gaps between research needs and current technical capabilities
  • Take ownership of the technical side of maintaining the existing evaluation framework
  • Define a roadmap for extended experiment tracking capabilities  

Ideal candidate profile - Required Skills and Qualifications:

  • Python programming and software engineering best practices 
  • Experience with the PyTorch/Transformers ecosystem
  • Experience with evaluation of large machine learning models  
  • MLOps familiarity: experiment tracking, model versioning, automated pipelines 
  • Computer Science or Engineering background: BS/MS in related field  
  • We welcome candidates from mid-level to senior level depending on experience and demonstrated capabilities 

We would like to see:

  • Multilingual evaluation experience, particularly European languages 
  • Distributed computing experience (multi-GPU evaluation pipelines) 
  • Academic publications or industry blog posts on ML evaluation 
  • Experience with EleutherAI LM Evaluation Harness or similar frameworks 
  • Working knowledge of more than one language 
  • Strong communication and collaboration skills

 

#LI-DB1

#LI-HYBRID



Benefits offered are described in AMD benefits at a glance.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

