
Student Researcher [Seed Vision - AI Platform] - 2026 Start (PhD)

TikTok
San Jose, United States
Internship, Science/Research, English

Job description:

The Seed Vision AI Platform team builds infrastructure and tooling to support large-scale training, evaluation, and deployment of vision foundation models. Our mission is to accelerate research and production through scalable, high-quality, and well-curated visual data pipelines, covering raw data processing, filtering, annotation, and training-ready formatting across video, image, and multimodal data.

PhD internships at ByteDance provide students with the opportunity to actively contribute to our products and research, and to the organization's future plans and emerging technologies. Our dynamic internship experience blends hands-on learning, enriching community-building and development events, and collaboration with industry experts.

Applications will be reviewed on a rolling basis, so we encourage you to apply early. Please state your availability clearly in your resume (Start date, End date).

Responsibilities:
- Design and optimize data processing pipelines for large-scale image, video, and multimodal datasets used in model pretraining and fine-tuning.
- Conduct research on data deduplication, filtering, and quality evaluation to maximize training signal efficiency.
- Collaborate with model teams to close the loop between data characteristics and downstream performance.
- Explore data-centric machine learning methods, including synthetic data generation, dataset pruning, and active data selection.
- Build high-throughput systems for dataset tracking, versioning, and feedback-based iteration.

Candidate requirements:

Minimum Qualifications:
- Currently pursuing a PhD in Computer Vision, Machine Learning, Systems, or a related field.
- Research experience in data-centric ML, vision data pipelines, or training dataset optimization.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and data processing stacks (e.g., Spark, Ray, DALI).
- Strong engineering skills in Python and/or distributed data systems.

Preferred Qualifications:
- Experience working with large-scale visual datasets (e.g., LAION, WebVid, ImageNet, Ego4D).
- Background in data evaluation, synthetic data curation, or auto-labeling systems.
- Familiarity with vision foundation model pretraining workflows (e.g., CLIP, DINO, EVA, InternImage).
- Understanding of data-model alignment loops and evaluation-driven dataset iteration.

Source: Company website
Posted: 27 Aug 2025 (checked 03 Jan 2026)
Listing type: Internship
Sector: Internet / New Media
Languages: English