Matthew Gwilliam

I am a fifth-year Ph.D. student in the Department of Computer Science at the University of Maryland (UMD), advised by Professor Abhinav Shrivastava. I study computer vision.

I completed my B.S. in Computer Science at Brigham Young University in 2019. During undergrad, I worked part-time at Qualtrics; after graduation, I worked there full-time until I started my Ph.D. in 2020.

While at BYU, I was fortunate to work with Ryan Farrell, who helped me grow as a researcher and a person and ultimately decide to pursue my graduate degree.

During my Ph.D., I have enjoyed opportunities to work as an intern at Amazon, SRI, and NVIDIA.

Email  /  CV  /  Google Scholar

Research

I am interested in computer vision models that learn without labels. More specifically, I am interested in methods that learn universal image representations in an unsupervised manner. Currently, that work focuses on models based on diffusion and implicit neural representations (INRs), mostly the latter. I work on the tasks these models are useful for: video retrieval, compression, and generation; image classification and clustering; and so on.

Do Text-free Diffusion Models Learn Discriminative Visual Representations?
Matthew Gwilliam*, Soumik Mukhopadhyay*, Yosuke Yamaguchi, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Abhinav Shrivastava
European Conference on Computer Vision (ECCV), 2024
Project Page | Paper

Explore diffusion models as unified unsupervised image representation learning models for many recognition tasks. Propose DifFormer and DifFeed, novel mechanisms for fusing diffusion features for image classification.

Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics
Shishira R Maiya*, Anubhav Gupta*, Matthew Gwilliam, Max Ehrlich, Abhinav Shrivastava
European Conference on Computer Vision (ECCV), 2024
Project Page | Paper

Develop implicit neural video models that perform well not only for compression, but also for retrieval, chat, and more.

Explaining the Implicit Neural Canvas (XINC): Connecting Pixels to Neurons by Tracing their Contributions
Namitha Padmanabhan*, Matthew Gwilliam*, Pulkit Kumar, Shishira Maiya, Max Ehrlich, Abhinav Shrivastava
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Project Page | Paper | Code

XINC dissects Implicit Neural Representation (INR) models to understand how neurons represent images and videos and to reveal the inner workings of INRs.

Elusive Images: Beyond Coarse Analysis for Fine-Grained Recognition
Connor Anderson, Matthew Gwilliam, Evelyn Gaskin, Ryan Farrell
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
Paper

Identify error types and image difficulty among state-of-the-art models for fine-grained visual categorization.

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval
Matthew Gwilliam, Michael Cogswell, Meng Ye, Karan Sikka, Abhinav Shrivastava, Ajay Divakaran
Under Review
Project Page | Paper

Propose the 10k Words problem as an alternative to paragraph-based long video retrieval: every video should be matchable with any valid description. We generate many descriptions for every video across 3 long video datasets, evaluate accordingly, and introduce novel finetuning for better performance.

Diffusion Models Beat GANs on Image Classification
Matthew Gwilliam*, Soumik Mukhopadhyay*, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Abhinav Shrivastava
Preprint only
Project Page | Paper

Show the potential of diffusion models as unified unsupervised image representation learners.

HNeRV: A Hybrid Neural Representation for Videos
Hao Chen, Matthew Gwilliam, Ser-Nam Lim, Abhinav Shrivastava
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page | Paper | Code

Combine the strengths of implicit (NeRV) and explicit (autoencoder) representations to create a hybrid neural representation for video with good properties for representation, compression, and editing.

CNeRV: Content-adaptive Neural Representation for Visual Data
Hao Chen, Matthew Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava
British Machine Vision Conference (BMVC), 2022 (ORAL)
Project Page | Paper

Make implicit video representation networks generalize to unseen data by swapping the time embedding for a content-aware embedding, computed as a unique summary of each frame.

Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning
Matthew Gwilliam, Abhinav Shrivastava
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Project Page | Paper | Code

Examine, compare, and contrast popular unsupervised image representation learning methods, showing that there are significant differences based on the specific algorithm used, and that "supervised vs. unsupervised" comparisons that neglect these differences tend to over-generalize.

Rethinking Common Assumptions to Mitigate Racial Bias in Face Recognition Datasets
Matthew Gwilliam, Srinidhi Hegde, Lade Tinubu, Alex Hanson
IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021
Paper | Code

Reveal the role of data in racial bias for facial recognition systems, and the flaws underlying the assumption that balanced data results in fair performance.

Fair Comparison: Quantifying Variance in Results for Fine-grained Visual Categorization
Matthew Gwilliam, Adam Teuscher, Connor Anderson, Ryan Farrell
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021
Paper

Uncover the large, often-ignored variance in FGVC systems across training runs, both at the dataset level and, more particularly, in the classification performance for individual classes.

Intelligent Image Collection: Building the Optimal Dataset
Matthew Gwilliam, Ryan Farrell
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020
Paper

Propose smart practices to optimize image curation, such that classification accuracy is maximized for a given constrained dataset size.

