CV

General Information

Full Name Sidong Zhang
Languages English, Mandarin

Education

  • Sep. 2020 - present
    Doctor of Philosophy
    UMass Amherst’s College of Information and Computer Sciences, Amherst, MA, USA
    • Teaching assistant for the graduate-level courses CS 589 (Machine Learning) and CS 651 (Optimization)
    • Research assistant in the Information Fusion Lab, currently funded by an NIH R03 grant
  • Sep. 2018 - Jan. 2021
    Master of Science
    UMass Amherst’s College of Information and Computer Sciences, Amherst, MA, USA
  • Sep. 2018 - Jan. 2021
    Bachelor of Engineering
    Nanjing University, Software Institute, Nanjing, China

Experience

  • Feb. 2024 - Sep. 2024
    Audio-Visual Speech Separation via Bottleneck Iterative Network
    UMass Amherst’s College of Information and Computer Sciences & Dolby Laboratories
    • Accepted to ICML 2025 Workshop on Machine Learning for Audio
    • We work on audio-visual speech separation over noisy audio mixtures from the NTCD-TIMIT and LRS3 datasets
    • We propose a novel multimodal fusion framework for the audio and video modalities built on a bottleneck iterative structure
    • Our proposed model outperforms state-of-the-art models on SI-SDRi while requiring only 50% of the SOTA training time
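SI-SDRi (scale-invariant signal-to-distortion ratio improvement) is the separation metric referenced above. As a minimal illustration of how the metric is defined, not the evaluation code used in the project, a NumPy sketch:

```python
import numpy as np

def si_sdr(est, ref):
    """Scale-invariant signal-to-distortion ratio in dB."""
    ref = ref - ref.mean()
    est = est - est.mean()
    # Project the estimate onto the reference to get the scaled target
    alpha = np.dot(est, ref) / np.dot(ref, ref)
    target = alpha * ref
    noise = est - target
    return 10 * np.log10(np.dot(target, target) / np.dot(noise, noise))

def si_sdri(est, mix, ref):
    """Improvement of the separated estimate over the unprocessed mixture."""
    return si_sdr(est, ref) - si_sdr(mix, ref)

# Toy usage: a cleaner estimate scores higher than the raw mixture
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)          # hypothetical clean source
mix = ref + 0.5 * rng.standard_normal(16000)   # noisy mixture
est = ref + 0.1 * rng.standard_normal(16000)   # model output stand-in
print(f"SI-SDRi: {si_sdri(est, mix, ref):.1f} dB")
```

A higher SI-SDRi means the model removed more of the mixture's interference relative to the input.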
  • Sep. 2020 - July 2025
    Longitudinal Multimodal Modeling for Alzheimer’s Early Detection in the Wild
    UMass Amherst’s College of Information and Computer Sciences
    • We introduce information-theoretic unsupervised representation learning on brain MRIs as a complement to risk factors, using an estimated mutual information value to measure the strength of the dependency between representations and risk factors
    • We both train a CNN from scratch and fine-tune foundation models as two representation learning approaches
    • We optimize the training procedure of the Alzheimer's forecasting model into two stages with partial freezing of model parameters, yielding more stable validation performance during training
    • We construct a representative subset of the ADNI dataset aligned with the TADPOLE challenge, consisting of 1,183 patients for training, 169 for validation, and 336 for testing
    • We evaluate forecasting performance using both the micro F1-score on CN/MCI/AD forecasting and the precision of capturing the timing of the MCI-to-AD transition, across 100 repeated experiments to ensure statistical significance
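As an illustration of the micro F1-score used above: true positives, false positives, and false negatives are pooled across the CN/MCI/AD classes before averaging. The integer label coding below is a hypothetical choice for the sketch, not taken from the project:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then average."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in classes:
        tp += sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical coding: CN=0, MCI=1, AD=2
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(micro_f1(y_true, y_pred))  # 0.666... (4 of 6 correct)
```

Note that for single-label multiclass problems like this one, micro F1 coincides with plain accuracy.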
  • Jan. 2025 – May 2025
    Encoding Domain Insights into Multi-modal Fusion: Improved Performance at the Cost of Robustness
    UMass Amherst’s College of Information and Computer Sciences
    • Accepted to ICML 2025 Workshop on Methods and Opportunities at Small Scale
    • We compare fusion methods with and without domain knowledge embedded in the model structure
    • We conduct experiments on the MMSD dataset with both the original sarcasm-detection task and a synthetic task that controls the level of domain knowledge, adding Gaussian noise to the inputs to examine robustness
    • We find that aligning fusion design with domain priors boosts clean-data accuracy when data is limited, but significantly diminishes robustness to noisy inputs
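The robustness check described above, adding Gaussian noise of increasing strength to the inputs of a fixed model and tracking accuracy, can be sketched on toy data. The linear model and synthetic two-class data below are stand-ins, not the MMSD setup:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a trained model: a fixed linear decision rule on 2-D inputs
w = np.array([1.0, -1.0])
def predict(x):
    return (x @ w > 0).astype(int)

# Synthetic, well-separated two-class data (hypothetical, not MMSD itself)
n = 500
x0 = rng.standard_normal((n, 2)) + [-2.0, 2.0]   # class 0
x1 = rng.standard_normal((n, 2)) + [2.0, -2.0]   # class 1
x = np.vstack([x0, x1])
y = np.array([0] * n + [1] * n)

# Robustness curve: accuracy as input noise grows
for sigma in [0.0, 0.5, 1.0, 2.0, 4.0]:
    x_noisy = x + sigma * rng.standard_normal(x.shape)
    acc = (predict(x_noisy) == y).mean()
    print(f"sigma={sigma:.1f}  accuracy={acc:.3f}")
```

Sweeping the noise scale like this turns "robustness" into a measurable curve rather than a single clean-data number.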
  • Feb. 2019 - May. 2019
    Clustered Vertical Attention for Irregular Time Series Modelling
    UMass Amherst’s College of Information and Computer Sciences
    • Results were submitted to the ICML 2019 Time Series Workshop
    • We worked on the PhysioNet Challenge 2012 dataset
    • We improved the accuracy of predicting in-hospital survival of ICU patients via existing imputation methods
    • We ran a Minimum Spanning Tree algorithm to determine the most correlated clusters of imputed features
    • We trained separate attention models per cluster and made final predictions with a Long Short-Term Memory attention model
    • The model achieved accuracy improvements of 1.5%, 1.3%, and 1.8% across 3 different imputation methods
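The MST-based clustering step above can be sketched as follows: treat one minus the absolute correlation as an edge cost, build a minimum spanning tree with Kruskal's algorithm, and cut the costliest edges to leave k feature clusters. The correlation matrix and cluster count below are toy assumptions, not the PhysioNet configuration:

```python
import numpy as np

def mst_edges(corr):
    """Kruskal's MST on a dense similarity matrix: (1 - |corr|) is the
    edge cost, so strongly correlated features are joined first."""
    n = corr.shape[0]
    edges = sorted(
        (1 - abs(corr[i, j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a
    mst = []
    for cost, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:            # keep only edges joining two components
            parent[ri] = rj
            mst.append((cost, i, j))
    return mst

def clusters_from_mst(mst, n, k):
    """Drop the k-1 costliest MST edges, leaving k connected components."""
    kept = sorted(mst)[: n - k]
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

# Toy correlation matrix: features {0,1} and {2,3} are highly correlated
corr = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])
print(clusters_from_mst(mst_edges(corr), 4, 2))  # [[0, 1], [2, 3]]
```

Each resulting cluster can then be handled by its own downstream model, as in the per-cluster attention models described above.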

Skills

  • {"Languages"=>"English (Proficient), Mandarin (Native)"}
  • {"Programming language"=>"Java, Python, C, Lisp, Markdown, Latex"}
  • {"Development tools"=>"Pytorch"}