Agentic AI, LLM Evaluation, and Trustworthy Systems Research Internship

Siemens

Work from home

Science/Research, English

Job description:

Job ID
510552

Posted since
16-Jun-2026

Organization
Foundational Technologies

Field of work
Internal Services

Company
Siemens Corporation

Experience level
Early Professional

Job type
Full-time

Work mode
Remote only

Employment type
Fixed Term

Location(s)

* Princeton - New Jersey - United States of America

Agentic AI, LLM Evaluation, and Trustworthy Systems Research Internship
Here at Siemens, we take pride in enabling sustainable progress through technology. We do this by empowering customers through combining the real and digital worlds, improving how we live, work, and move today and for the next generation. We know that the only way a business thrives is if our people are thriving. That's why we always put our people first. Our global, diverse team would be happy to support you and challenge you to grow in new ways.
Siemens Research & Predevelopment (RPD) is the central R&D department of Siemens and thus plays a key role in shaping the future of our products. RPD acts as a strategic partner supporting the executive units of Siemens. Consequently, the main research focus is on future technologies for industry, infrastructure, mobility, and healthcare. In this context, we are looking for an intern to support our Software Systems and Processes team in Princeton, NJ by researching and developing scalable intelligent systems using LLMs and semantic technologies.
Transform the everyday with us!
Are you passionate about ensuring the reliability and robustness of cutting-edge AI systems? We're looking for an innovative PhD intern to join our team and contribute to groundbreaking research focused on implementing a Verification and Validation (V&V) framework for multi-agent systems.
Modern software is rapidly moving from static applications to agentic AI systems that plan, reason, call tools, coordinate across agents, and adapt over multiple steps. As these LLM-powered systems enter industrial workflows, the critical challenge is no longer only building capable agents but also evaluating, verifying, and validating that they behave reliably, safely, and transparently in complex, uncertain environments. In this internship, you will research and prototype next-generation methods for LLM and multi-agent system evaluation, including benchmarks, guardrails, failure-mode analysis, runtime monitoring, formal methods, and testing technologies. You will help advance trustworthy AI for real-world industrial software systems where robustness, explainability, and dependable performance matter.
The internship provides a unique opportunity to contribute to innovative industrial applications while being mentored by experienced professionals in an international setting.
On-site work in Princeton, NJ is preferred for a hands-on and collaborative experience; however, remote candidates will be considered. The position is a full-time role for at least 3 months with the possibility of extension.
Key Responsibilities
* Research, design, and prototype V&V methods for multi-agent and agentic AI systems, with emphasis on reliability, safety, repeatability, explainability, and robustness under uncertain operating conditions.
* Develop evaluation harnesses, benchmarks, and test scenarios for LLM-based agents, including tool use, multi-step reasoning, orchestration, failure-mode analysis, and adversarial or edge-case behavior.
* Implement proof-of-concept prototypes in Python using modern AI and agent frameworks, formal methods, testing technologies, and retrieval-augmented or knowledge-grounded architectures where appropriate.
* Investigate verification strategies such as model checking, property-based testing, fuzz testing, static or dynamic analysis, runtime monitoring, guardrails, and trace-based observability for complex intelligent systems.
* Collaborate with researchers and engineers to define milestones, run experiments, analyze results, and translate research insights into scalable industrial software concepts.
* Document findings, contribute to scientific publications or technical reports, and present results clearly to internal and external technical audiences.
Basic Qualifications
* Currently enrolled in a PhD program in Computer Science, Artificial Intelligence, Machine Learning, Software Engineering, Formal Methods, or a closely related technical field.
* 3+ years of research or hands-on experience in AI, machine learning, generative AI, software engineering, formal methods, autonomous systems, or intelligent agent systems.
* Strong programming skills in Python and practical experience with modern ML or LLM tooling such as PyTorch, Hugging Face Transformers, LangChain, LangGraph, AutoGen, Semantic Kernel, CrewAI, or comparable frameworks.
* Hands-on experience building, evaluating, or testing LLM-powered applications, agentic workflows, multi-agent systems, or AI-enabled software engineering tools.
* Strong understanding of software architecture, software engineering principles, testing methodologies, experimentation, and empirical evaluation of complex systems.
* Demonstrated ability to conduct independent research, read and synthesize technical literature, analyze complex problems, prototype solutions, and communicate findings clearly.
* Proficient in English, both written and verbal.
* The position requires the person to be in the United States of America and hold a valid work permit in the US for the duration of the internship.
Preferred Skills
* Research experience in formal verification, model checking, theorem proving, runtime verification, AI safety, robust AI, explainable AI (XAI), or trustworthy machine learning.
* Experience with evaluation of LLMs or agents, including hallucination analysis, benchmark design, tool-use evaluation, prompt-injection testing, red teaming, or reliability metrics.
* Familiarity with RAG architectures, vector databases, knowledge graphs, semantic technologies, ontologies, or graph-based reasoning.
* Understanding of reinforcement learning, planning, reward modeling, preference optimization, or post-training approaches for LLMs and autonomous agents.
* Experience with cloud-native or distributed systems concepts, microservice architectures, APIs, CI/CD, Git, Docker, Kubernetes, Azure, AWS, or comparable platforms.
* Experience with testing frameworks for complex software systems, including property-based testing, fuzz testing, simulation-based testing, static analysis, or execution-based evaluation.
* Track record of research publications, open-source contributions, academic projects, or demonstrable prototypes related to AI, software engineering, formal methods, or agentic systems.
* Excellent problem-solving skills, attention to detail, and ability to quickly learn and apply new technologies, tools, and research methods.
* Strong written and verbal communication skills, with the ability to articulate complex technical concepts to research and engineering audiences.
About Siemens:
We are a global technology company focused on industry, infrastructure, transport, and healthcare. From more resource-efficient factories, resilient supply chains, and smarter buildings and grids, to sustainable transportation as well as advanced healthcare, we create technology with purpose, adding real value for customers. Learn more about Siemens here.
Our Commitment to Equity and Inclusion in our Diverse Global Workforce:
We value your unique identity and perspective. We are fully committed to providing equitable opportunities and building a workplace that reflects the diversity of society, while ensuring that we attract the best talent based on qualifications, skills, and experiences. We welcome you to bring your authentic self and transform the everyday with us.
#ArtificialIntelligence, #MachineLearning, #GenerativeAI

You'll Benefit From
Siemens offers a variety of health and wellness benefits to our employees. Details regarding our benefits can be found here: https://www.benefitsquickstart.com/siemens/index.html
The pay range for this position is $32-$47 per hour. The actual wage offered may be lower or higher depending on budget and candidate experience, knowledge, skills, qualifications and premium geographic location.

Equal Employment Opportunity Statement
Siemens is an Equal Opportunity Employer encouraging inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to their race, color, creed, religion, national origin, citizenship status, ancestry, sex, age, physical or mental disability unrelated to ability, marital status, family responsibilities, pregnancy, genetic information, sexual orientation, gender expression, gender identity, transgender, sex stereotyping, order of protection status, protected veteran or military status, or an unfavorable discharge from military service, and other categories protected by federal, state or local law.

EEO is the Law
Applicants and employees are protected from discrimination on the basis of race, color, religion, sex, national origin, or any characteristic protected by Federal or other applicable law.

Reasonable Accommodations
If you require a reasonable accommodation in completing a job application, interviewing, completing any pre-employment testing, or otherwise participating in the employee selection process, please fill out the Accommodation for disability form. If you're unable to complete the form, you can reach out to our AskHR team for support at 1-866-743-6367. Please note that our AskHR representatives do not have visibility into application or interview status.

Pay Transparency
Siemens follows Pay Transparency laws.

California Privacy Notice
California residents have the right to receive additional notices about their personal information. To learn more, click here.

Criminal History
Qualified applications with arrest or conviction records will be considered for employment in accordance with applicable local and state laws.

Source:	Company website
Posted on:	17 Jun 2026
Employment type:	Job
Sector:	Conglomerate
Duration:	3 months
Compensation:	47 USD
Languages:	English
