
Student Researcher [Seed Vision - AI Platform] - 2026 Start (PhD)

TikTok
San Jose, United States
Internship, Science/Research, English

Job Description:

The Seed Vision AI Platform team builds infrastructure and tooling to support large-scale training, evaluation, and deployment of vision foundation models. Our mission is to accelerate research and production through scalable, high-quality, and well-curated visual data pipelines, covering raw data processing, filtering, annotation, and training-ready formatting across video, image, and multimodal modalities.

PhD internships at ByteDance provide students with the opportunity to actively contribute to our products and research, and to the organization's future plans and emerging technologies. Our dynamic internship experience blends hands-on learning, enriching community-building and development events, and collaboration with industry experts.

Applications will be reviewed on a rolling basis - we encourage you to apply early. Please state your availability clearly in your resume (Start date, End date).

Responsibilities:
- Design and optimize data processing pipelines for large-scale image, video, and multimodal datasets used in model pretraining and fine-tuning.
- Conduct research on data deduplication, filtering, and quality evaluation to maximize training signal efficiency.
- Collaborate with model teams to close the loop between data characteristics and downstream performance.
- Explore data-centric machine learning methods, including synthetic data generation, dataset pruning, and active data selection.
- Build high-throughput systems for dataset tracking, versioning, and feedback-based iteration.

Candidate Requirements:

Minimum Qualifications:
- Currently pursuing a PhD in Computer Vision, Machine Learning, Systems, or a related field.
- Research experience in data-centric ML, vision data pipelines, or training dataset optimization.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and data processing stacks (e.g., Spark, Ray, DALI).
- Strong engineering skills in Python and/or distributed data systems.

Preferred Qualifications:
- Experience working with large-scale visual datasets (e.g., LAION, WebVid, ImageNet, Ego4D).
- Background in data evaluation, synthetic data curation, or auto-labeling systems.
- Familiarity with vision foundation model pretraining workflows (e.g., CLIP, DINO, EVA, InternImage).
- Understanding of data-model alignment loops and evaluation-driven dataset iteration.

Source: Company website
Posted on: 27 Aug 2025 (verified 14 Dec 2025)
Type of offer: Internship
Industry: Internet / New Media
Languages: English