Description:
Meta was built to help people connect and share, and over the last decade our tools have played a critical part in changing how people around the world communicate with one another. With over a billion people using the service and more than fifty offices around the globe, a career at Meta offers countless ways to make an impact in a fast-growing organization. Meta is seeking LLM Evaluation Scientists to join our Meta Superintelligence Lab, focusing on the evaluation and benchmarking of large language models (LLMs) across language and multimodal domains. We are committed to advancing the field of artificial intelligence by developing rigorous methodologies and tools to assess and improve the capabilities, safety, and reliability of cutting-edge AI systems. We are looking for individuals passionate about LLM evaluation, benchmarking, prompt engineering, data analysis, and the development of robust evaluation frameworks. As an LLM Evaluation Scientist, you will have the opportunity to shape the future of AI by ensuring our models meet the highest standards of performance and safety at scale. Our internships are twelve (12) to sixteen (16) weeks or twenty-four (24) weeks long, and we have various start dates throughout the year.
- Design, implement, and maintain comprehensive evaluation protocols for large language models, including both automated and human-in-the-loop assessments.
- Develop and curate high-quality datasets and benchmarks to measure model performance, safety, fairness, and robustness across a variety of tasks and modalities.
- Analyze model outputs to identify strengths, weaknesses, and failure modes, and provide actionable insights to research and engineering teams.
- Collaborate with researchers, engineers, and cross-functional partners to define evaluation goals, communicate findings, and drive improvements in model quality.
- Develop tools and infrastructure to streamline and scale evaluation processes, including dashboards, annotation platforms, and reporting systems.
- Stay up-to-date with the latest research in LLM evaluation, benchmarking, and responsible AI, and incorporate best practices into Meta's workflows.
- Disseminate evaluation results through internal reports, presentations, and, when appropriate, external publications.
- Contribute to the development of evaluation methodologies that can be applied to Meta product development and deployment.
Your profile:
- Currently has or is in the process of obtaining a Ph.D. degree in Computer Science, Artificial Intelligence, Generative AI, or a relevant technical field
- Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
- Experience with Python, C++, C, Java, or other related languages
- Experience building systems based on machine learning and/or deep learning methods
- Intent to return to the degree program after the completion of the internship/co-op
- Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at leading workshops or conferences such as NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, ICCV, ECCV, or similar
- Experience working and communicating cross-functionally in a team environment
- Experience in advancing AI techniques, including core contributions to open-source libraries and frameworks in computer vision
- Publications or experience in machine learning, AI, computer vision, optimization, computer science, statistics, applied mathematics, or data science
- Experience solving analytical problems using quantitative approaches
- Experience setting up ML experiments and analyzing their results
- Experience manipulating and analyzing complex, large-scale, high-dimensionality data from varying sources
- Experience in utilizing theoretical and empirical research to solve problems
- Experience with deep learning frameworks
| Source: | Company website |
| Date: | 11 Dec 2025 (checked on 20 Dec 2025) |
| Job type: | Internship |
| Field: | IT |
| Language skills: | English |