Student Researcher [Seed Vision - AI Platform] - 2026 Start (PhD)

TikTok
San Jose, United States
Internship, Science/Research, English

Job description:

The Seed Vision AI Platform team builds infrastructure and tooling to support large-scale training, evaluation, and deployment of vision foundation models. Our mission is to accelerate research and production through scalable, high-quality, and well-curated visual data pipelines, covering raw data processing, filtering, annotation, and training-ready formatting across video, image, and multimodal modalities.

PhD internships at ByteDance provide students with the opportunity to actively contribute to our products and research, and to the organization's future plans and emerging technologies. Our dynamic internship experience blends hands-on learning, enriching community-building and development events, and collaboration with industry experts.

Applications will be reviewed on a rolling basis - we encourage you to apply early. Please state your availability clearly in your resume (Start date, End date).

Responsibilities:
- Design and optimize data processing pipelines for large-scale image, video, and multimodal datasets used in model pretraining and fine-tuning.
- Conduct research on data deduplication, filtering, and quality evaluation to maximize training signal efficiency.
- Collaborate with model teams to close the loop between data characteristics and downstream performance.
- Explore data-centric machine learning methods, including synthetic data generation, dataset pruning, and active data selection.
- Build high-throughput systems for dataset tracking, versioning, and feedback-based iteration.

Candidate requirements:

Minimum Qualifications:
- Currently pursuing a PhD in Computer Vision, Machine Learning, Systems, or a related field.
- Research experience in data-centric ML, vision data pipelines, or training dataset optimization.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and data processing stacks (e.g., Spark, Ray, DALI).
- Strong engineering skills in Python and/or distributed data systems.

Preferred Qualifications:
- Experience working with large-scale visual datasets (e.g., LAION, WebVid, ImageNet, Ego4D).
- Background in data evaluation, synthetic data curation, or auto-labeling systems.
- Familiarity with vision foundation model pretraining workflows (e.g., CLIP, DINO, EVA, InternImage).
- Understanding of data-model alignment loops and evaluation-driven dataset iteration.

Source: Company website
Published: 27 Aug 2025 (verified 14 Dec 2025)
Employment type: Internship
Sector: Internet / New Media
Languages: English