Seminar @ Cornell Tech: CS Candidate Mark Zhao
In this talk, Zhao will emphasize the importance of building scalable systems across the entire ML pipeline. In particular, Zhao will explore how large-scale ML training pipelines, including those deployed at Meta, require distributed data storage and ingestion systems to manage massive training datasets. Optimizing these data systems is essential as data demands continue to grow. To achieve this, Zhao will demonstrate how synergistic optimizations across the training data pipeline can unlock performance and efficiency gains beyond what isolated system optimizations can achieve.