Job details

Student Researcher [Seed Vision - AI Platform] - 2026 Start (PhD)

TikTok
San Jose, United States
Internship, Science/Research, English

Job description:

The Seed Vision AI Platform team builds infrastructure and tooling to support large-scale training, evaluation, and deployment of vision foundation models. Our mission is to accelerate research and production through scalable, high-quality, and well-curated visual data pipelines, covering raw data processing, filtering, annotation, and training-ready formatting across video, image, and multimodal modalities.

PhD internships at ByteDance provide students with the opportunity to actively contribute to our products and research, and to the organization's future plans and emerging technologies. Our dynamic internship experience blends hands-on learning, enriching community-building and development events, and collaboration with industry experts.

Applications will be reviewed on a rolling basis - we encourage you to apply early. Please state your availability clearly in your resume (Start date, End date).

Responsibilities:
- Design and optimize data processing pipelines for large-scale image, video, and multimodal datasets used in model pretraining and fine-tuning.
- Conduct research on data deduplication, filtering, and quality evaluation to maximize training signal efficiency.
- Collaborate with model teams to close the loop between data characteristics and downstream performance.
- Explore data-centric machine learning methods, including synthetic data generation, dataset pruning, and active data selection.
- Build high-throughput systems for dataset tracking, versioning, and feedback-based iteration.

Candidate requirements:

Minimum Qualifications:
- Currently pursuing a PhD in Computer Vision, Machine Learning, Systems, or a related field.
- Research experience in data-centric ML, vision data pipelines, or training dataset optimization.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and data processing stacks (e.g., Spark, Ray, DALI).
- Strong engineering skills in Python and/or distributed data systems.

Preferred Qualifications:
- Experience working with large-scale visual datasets (e.g., LAION, WebVid, ImageNet, Ego4D).
- Background in data evaluation, synthetic data curation, or auto-labeling systems.
- Familiarity with vision foundation model pretraining workflows (e.g., CLIP, DINO, EVA, InternImage).
- Understanding of data-model alignment loops and evaluation-driven dataset iteration.

Source: Company website
Published: 27 Aug 2025 (verified 15 Dec 2025)
Position type: Internship
Industry: Internet / New Media
Languages: English