About me

Greetings! I'm your captain for this conversation, Tzu-Heng (Brian), a 3rd-year CS Ph.D. student at the University of Wisconsin-Madison (UW-Madison), advised by Prof. Frederic Sala. Before joining UW-Madison, I earned my B.S. in CS from National Chengchi University (NCCU), where I was fortunate to be advised by Prof. Man-Kwan Shan (NCCU) and Dr. Ling-Jyh Chen (Academia Sinica), working on spatio-temporal machine learning in large-scale sensor networks. In 2019, I interned at Argonne National Laboratory on the Array of Things team, working with Dr. Charles Catlett and Dr. Rajesh Sankaran. In 2018, I ventured into human resource-focused machine learning with Prof. Changya Hu (NCCU).

I am passionate about advancing machine learning so that models can learn more with less supervision. Currently, I am focusing on data-centric AI, particularly on designing efficient methods for multimodal data curation, zero-cost labeling systems, and data-efficient LLM customization. These strategies are rooted in weak supervision frameworks and aim to build foundation models with a reduced need for human annotation. Additionally, I am developing a new concept, the parameter marketplace, to accelerate training while monetizing parameters as a second profit center.

Beyond my academic pursuits, I have taken on leadership roles, serving as the President (2022 - 2023) and the Vice President (2021 - 2022) of the Student Association of Taiwan (SAT) at UW-Madison.

Research Topic

  • Economic & Game Theory

    I am passionate about integrating economic insights into large-scale machine learning training and exploring provably effective multi-agent strategies from game theory.

  • Foundation Models

    I am interested in developing foundation models with efficient adaptation approaches and investigating their potential to reduce human labeling effort.

  • Weak Supervision

    I focus on weak supervision, a programming paradigm for machine learning that automatically learns more from less labeled data. Building on it, I research data-efficient training to customize foundation models.

  • Spatio-Temporal Modeling

    I have studied machine learning with spatio-temporal data for inference tasks, including time-series forecasting, anomaly detection, data-quality enhancement, and modeling sensor correlations.

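To give a feel for the weak supervision paradigm above, here is a toy sketch of my own (not code from any of my papers): several noisy, heuristic labeling functions vote on each data point, and the votes are aggregated into a pseudolabel. I use simple majority vote for illustration; real label models (e.g. in Snorkel-style systems) instead weight sources by their estimated accuracies. The spam task and heuristics are invented.

```python
# Toy weak supervision sketch: aggregate noisy labeling functions (LFs)
# into pseudolabels via majority vote. Task and LFs are made up.
from collections import Counter

ABSTAIN = -1  # an LF abstains when its heuristic does not apply

def lf_contains_cheap(text):
    return 0 if "cheap" in text else ABSTAIN  # 0 = spam

def lf_contains_meeting(text):
    return 1 if "meeting" in text else ABSTAIN  # 1 = not spam

def lf_many_exclamations(text):
    return 0 if text.count("!") >= 3 else ABSTAIN

LFS = [lf_contains_cheap, lf_contains_meeting, lf_many_exclamations]

def pseudolabel(text):
    """Majority vote over non-abstaining LF outputs; None if all abstain."""
    votes = [lf(text) for lf in LFS if lf(text) != ABSTAIN]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]

print(pseudolabel("cheap pills!!! buy now!!!"))  # -> 0 (spam)
print(pseudolabel("team meeting at 3pm"))        # -> 1 (not spam)
```

The point of the paradigm is that each source can be cheap and unreliable; aggregation across many such sources yields pseudolabels good enough to train a supervised model without hand-labeling.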
Recent News

  • Attending NeurIPS'23 in New Orleans!

    Dec. 2023

    "6 papers from our group (Sprocket Lab) will be presented at NeurIPS! Feel free to chat with me about parameter markets and their future."

  • Paper Accepted by NeurIPS'23

    Sep. 2023

    "Train 'n Trade: Foundations of Parameter Markets"

    Paper Abstract: Organizations typically train large models individually. This is costly and time-consuming, particularly for large-scale foundation models. Such vertical production is known to be suboptimal. Inspired by this economic insight, we ask whether it is possible to leverage others' expertise by trading the constituent parts in models, i.e., sets of weights, as if they were market commodities. While recent advances in aligning and interpolating models suggest that doing so may be possible, a number of fundamental questions must be answered to create viable parameter markets. In this work, we address these basic questions, propose a framework containing the infrastructure necessary for market operations to take place, study strategies for exchanging parameters, and offer means for agents to monetize parameters. Excitingly, compared to agents who train siloed models from scratch, we show that it is possible to mutually gain by using the market, even in competitive settings. This suggests that the notion of parameter markets may be a useful paradigm for improving large-scale model training in the future.

  • Paper Accepted by NeurIPS'23

    Sep. 2023

    "Geometry-Aware Adaptation for Pretrained Models"

    Paper Abstract: Machine learning models---including prominent zero-shot models---are often trained on datasets whose labels are only a small proportion of a larger label space. Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes---or, in the case of zero-shot prediction, to improve its performance---without any additional training. Our technique is a drop-in replacement for the standard prediction rule, swapping it with the Fréchet mean. We provide a comprehensive theoretical analysis for this approach, studying (i) learning-theoretic results trading off label space diameter, sample complexity, and model dimension, (ii) characterizations of the full range of scenarios in which it is possible to predict any unobserved class, and (iii) an optimal active learning-like next class selection procedure to obtain optimal training classes for when it is not possible to predict the entire range of unobserved classes. Empirically, using easily-available external metrics, our proposed approach, Loki, gains up to 29.7% relative improvement over SimCLR on ImageNet and scales to hundreds of thousands of classes. When no such metric is available, Loki can use self-derived metrics from class embeddings and obtains a 10.5% improvement on pretrained zero-shot models such as CLIP.

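As a toy illustration of the prediction rule in the abstract above (my own sketch, not the paper's code): given the model's scores over the classes it was trained on and a metric over the full label space, predict the label minimizing the probability-weighted sum of squared distances, i.e., the Fréchet mean. The label space, metric, and scores below are invented; note the predicted class was never observed in training.

```python
import numpy as np

# Full label space: 0..3, laid out on a line so d(i, j) = |i - j|.
labels = np.arange(4)
D = np.abs(labels[:, None] - labels[None, :]).astype(float)

observed = np.array([0, 3])   # model was trained only on classes 0 and 3
p = np.array([0.5, 0.5])      # model's probabilities over observed classes

# Fréchet mean over the full space: argmin_y sum_c p(c) * d(y, c)^2
cost = (p * D[:, observed] ** 2).sum(axis=1)   # one cost per candidate label
pred = int(labels[np.argmin(cost)])
print(pred)  # -> 1: an unobserved class, geometrically between 0 and 3
```

Because the rule only rescored the candidate labels, it is a drop-in replacement for the usual prediction step and requires no additional training.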
  • Paper Accepted by ICCV'23 DataComp Workshop

    Sep. 2023

    "Multimodal Data Curation via Object Detection and Filter Ensembles"

    Paper Abstract: We propose an approach for curating multimodal data that we used for our entry in the 2023 DataComp competition filtering track. Our technique combines object detection and weak supervision-based ensembling. In the first of two steps in our approach, we employ an out-of-the-box zero-shot object detection model to extract granular information and produce a variety of filter designs. In the second step, we employ weak supervision to ensemble filtering rules. This approach results in a 4% performance improvement when compared to the best-performing baseline, producing the top-ranking position in the small-scale track at the time of writing. Furthermore, in the medium-scale track, we achieve a noteworthy 4.2% improvement over the baseline by simply ensembling existing baselines with weak supervision.

  • Rank #1 in the DataComp'23 Competition

    Aug. 2023

    "Top-ranking position in the DataComp'23 competition (small-scale filtering track)"

  • Paper Accepted by ICLR'23 DL4C Workshop

    Mar. 2023

    "ScriptoriumWS: A Code Generation Assistant for Weak Supervision"

    Paper Abstract: Weak supervision is a popular framework for overcoming the labeled data bottleneck: the need to obtain labels for training data. In weak supervision, multiple noisy-but-cheap sources are used to provide guesses of the label and are aggregated to produce high-quality pseudolabels. These sources are often expressed as small programs written by domain experts—and so are expensive to obtain. Instead, we argue for using code-generation models to act as coding assistants for crafting weak supervision sources. We study prompting strategies to maximize the quality of the generated sources, settling on a multi-tier strategy that incorporates multiple types of information. We explore how to best combine hand-written and generated sources. Using these insights, we introduce ScriptoriumWS, a weak supervision system that, when compared to hand-crafted sources, maintains accuracy and greatly improves coverage.

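A minimal sketch of the ScriptoriumWS workflow as the abstract describes it (my own illustration, not the system's actual code): prompt a code-generation model for labeling-function source, materialize it, and combine it with a hand-written source. `ask_codegen_model` is a hypothetical stand-in; in practice you would call a real code LLM with the multi-tier prompt.

```python
# Hypothetical stand-in for a code LLM: returns canned labeling-function
# source. A real system would send the prompt to a code-generation model.
def ask_codegen_model(prompt):
    return (
        "def lf_generated(text):\n"
        "    return 0 if 'free' in text else -1\n"  # 0 = spam, -1 = abstain
    )

# Materialize the generated source into a callable labeling function.
namespace = {}
exec(ask_codegen_model("Write a spam labeling function (0=spam, -1=abstain)."),
     namespace)
lf_generated = namespace["lf_generated"]

# Hand-written source, to be combined with the generated one.
def lf_handwritten(text):
    return 0 if "winner" in text else -1

sources = [lf_handwritten, lf_generated]
votes = [lf("free tickets, winner!") for lf in sources]
print(votes)  # -> [0, 0]: both sources fire on this example
```

The generated functions then feed into the usual weak supervision pipeline, so coverage grows without extra hand-coding while the label model handles their noise.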
  • Paper Accepted by NeurIPS'22

    Sep. 2022

    "AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels"

    Paper Abstract: Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed by labeling functions (LFs). While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of constructing labeling functions for domains with complex or high-dimensional features. To address this, a handful of methods have proposed automating the LF design process using a small set of ground truth labels. In this work, we introduce AutoWS-Bench-101: a framework for evaluating automated WS (AutoWS) techniques in challenging WS settings---a set of diverse application domains on which it has been previously difficult or impossible to apply traditional WS techniques.

  • President of SAT at UW-Madison

    May 2022

    "The Student Association of Taiwan (SAT) at the University of Wisconsin–Madison serves as a forum for Taiwanese students, locals as well as international students to meet, develop friendships, and bond with each other in a friendly environment. Our main mission is to bring students together from all fields of study for recreational, academic, and cultural purposes. SAT hopes to serve as a channel of communication for all its members. We welcome students of all ethnicities and nationalities to participate in our organization. We hope to create delightful, precious memories for all Taiwanese students in Madison as well as local Americans and international students."

  • Joining CS Dept at UW-Madison

    Aug. 2021

    "Here is the start of my Ph.D. journey. Mamba mentality always."

Experience

Resume [PDF]

Education

  1. University of Wisconsin-Madison (UW-Madison)

    Aug. 2021 — Present (3rd-year)

    Ph.D. in Computer Science.

  2. National Chengchi University (NCCU)

    Sep. 2016 — Jul. 2020

    B.S. in Computer Science.

Publications

  1. Train 'n Trade: Foundations of Parameter Markets

    NeurIPS'23

    Tzu-Heng Huang, Harit Vishwakarma, Frederic Sala

    [PDF]
  2. Geometry-Aware Adaptation for Pretrained Models

    NeurIPS'23

    Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

    [PDF]
  3. Multimodal Data Curation via Object Detection and Filter Ensembles

    ICCV'23 Towards the Next Generation of Computer Vision Datasets (TNGCV) Workshop

    Tzu-Heng Huang*, Changho Shin*, Sui Jiet Tay, Dyah Adila, Frederic Sala

    [PDF]
  4. ScriptoriumWS: A Code Generation Assistant for Weak Supervision

    ICLR'23 Deep Learning for Code (DL4C) Workshop & 2023 Midwest Machine Learning Symposium

    Tzu-Heng Huang, Catherine Cao, Spencer Schoenberg, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

    [PDF]
  5. AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

    NeurIPS'22

    Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala

    [PDF]
  6. Key Sensor Discovery for Quality Audit of Air Sensor Networks

    MobiSys'20

    Tzu-Heng Huang, Cheng-Hsien Tsai, Man-Kwan Shan

    [PDF]

Experience

  1. Graduate Research Student, University of Wisconsin-Madison

    Feb. 2022 — Present, advised by Frederic Sala

  2. Research Intern, Argonne National Laboratory

    Jun. 2019 — Sep. 2019, advised by Charles Catlett and Rajesh Sankaran

  3. Research Assistant, National Chengchi University

    Sep. 2018 — Aug. 2021, advised by Man-Kwan Shan

  4. Research Intern, Academia Sinica

    Feb. 2018 — Jul. 2020, advised by Ling-Jyh Chen

  5. Research Assistant, National Chengchi University

    Jul. 2017 — Jul. 2020, advised by Changya Hu

Blog

Contact

Contact Form