About me

Greetings! I'm your captain for this conversation, Tzu-Heng (Brian), a 4th-year CS Ph.D. student at the University of Wisconsin-Madison (UW-Madison), working with Prof. Frederic Sala. I am currently interning at Apple AIML, working on large-scale multimodal models, advised by Dr. Javier Movellan and Manjot Bilkhu. Before joining UW-Madison, I earned my B.S. in CS from National Chengchi University (NCCU), where I was fortunate to be advised by Prof. Man-Kwan Shan (NCCU) and Dr. Ling-Jyh Chen (Academia Sinica), working on spatio-temporal machine learning and large-scale sensor networks. In 2019, I interned at Argonne National Laboratory on the Array of Things team, working with Dr. Charlie Catlett and Dr. Rajesh Sankaran.

I am passionate about advancing machine learning so that models can learn more with less supervision. My focus is Data-centric AI, particularly designing efficient methods for multimodal data curation, zero-cost labeling systems, and data-efficient LLM customization. These strategies are rooted in weak supervision frameworks and aim to build foundation models with fewer human annotations. Additionally, I am developing a new notion, parameter markets, to accelerate training while monetizing parameters as a second profit center.

Beyond my academic pursuits, I have taken on leadership roles, serving as President (2022 - 2023) and Vice President (2021 - 2022) of the Student Association of Taiwan (SAT) at UW-Madison.

In 2023, we established Awan.AI, a startup focusing on AI solutions for traditional Chinese medicine (TCM). We have built LLMs tailored for TCM and tongue syndrome diagnosis systems.

Recent News

  • NeurIPS'24 Spotlight Paper

    Sep. 2024

    "Alchemist has been accepcted in NeurIPS as a spotlight paper."

  • Paper Accepted by NeurIPS'24

    Sep. 2024

    "The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators"

    Paper Abstract: Large pretrained models can be used as annotators, helping replace or augment crowdworkers and enabling distilling generalist models into smaller specialist models. Unfortunately, this comes at a cost: employing top-of-the-line models often requires paying thousands of dollars for API calls, while the resulting datasets are static and challenging to audit. To address these challenges, we propose a simple alternative: rather than directly querying labels from pretrained models, we task models to generate programs that can produce labels. These programs can be stored and applied locally, re-used and extended, and cost orders of magnitude less. Our system, Alchemist, obtains comparable to or better performance than large language model-based annotation in a range of tasks for a fraction of the cost: on average, improvements amount to a 12.9% enhancement while the total labeling costs across all datasets are reduced by a factor of approximately 500×.
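
    A minimal sketch of this idea, not the paper's actual code: the task, the prompt, and the query_llm helper below are hypothetical stand-ins for any code-capable LLM API.

      def query_llm(prompt: str) -> str:
          """Hypothetical stand-in for a code-capable LLM API (returns a canned program here)."""
          return (
              "def label(text: str) -> int:\n"
              "    positive = {'delightful', 'great', 'wonderful'}\n"
              "    return int(any(w in text.lower() for w in positive))\n"
          )

      # Step 1: ask the model once for a labeling *program*, not for per-example labels.
      program_source = query_llm(
          "Write a Python function label(text: str) -> int that returns 1 if a "
          "movie review is positive and 0 otherwise. Return only the code."
      )

      # Step 2: materialize the program and reuse it locally -- no further API calls;
      # the source can be stored, audited, and extended.
      namespace = {}
      exec(program_source, namespace)
      label_fn = namespace["label"]

      pseudolabels = [label_fn(r) for r in ["A delightful surprise.", "Two hours I want back."]]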

  • Interning at Apple

    May. 2024

    "This summer, I join Apple as a AIML Research Intern, working with Javier Movellan and Manjot Bilkhu on data-centric AI for multimodal models."

  • Attending NeurIPS'23 in New Orleans!

    Dec. 2023

    "6 papers from our group (Sprocket Lab) are going to present in NeurIPS! Welcome to chat with me about parameter marketplace and its future."

  • Paper Accepted by NeurIPS'23

    Sep. 2023

    "Train 'n Trade: Foundations of Parameter Markets"

    Paper Abstract: Organizations typically train large models individually. This is costly and time-consuming, particularly for large-scale foundation models. Such vertical production is known to be suboptimal. Inspired by this economic insight, we ask whether it is possible to leverage others' expertise by trading the constituent parts in models, i.e., sets of weights, as if they were market commodities. While recent advances in aligning and interpolating models suggest that doing so may be possible, a number of fundamental questions must be answered to create viable parameter markets. In this work, we address these basic questions, propose a framework containing the infrastructure necessary for market operations to take place, study strategies for exchanging parameters, and offer means for agents to monetize parameters. Excitingly, compared to agents who train siloed models from scratch, we show that it is possible to mutually gain by using the market, even in competitive settings. This suggests that the notion of parameter markets may be a useful paradigm for improving large-scale model training in the future.
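
    A minimal, hypothetical sketch of a single trade under this framing; the paper's actual market mechanism (brokering, pricing, and gain estimation) is more involved, and every name below is illustrative only.

      import copy
      import torch

      def propose_trade(buyer, seller, layer_name, price, budget, val_score):
          """Buyer test-merges one parameter block bought from the seller and keeps it
          only if the price fits the budget and validation utility improves."""
          if price > budget:
              return buyer, False                      # cannot afford the parameters
          candidate = copy.deepcopy(buyer)
          with torch.no_grad():
              b = dict(candidate.named_parameters())[layer_name]
              s = dict(seller.named_parameters())[layer_name]
              b.copy_(0.5 * b + 0.5 * s)               # simple interpolation merge
          if val_score(candidate) > val_score(buyer):
              return candidate, True                   # accept: pay and keep the merge
          return buyer, False                          # decline: keep the original model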

  • Paper Accepted by NeurIPS'23

    Sep. 2023

    "Geometry-Aware Adaptation for Pretrained Models"

    Paper Abstract: Machine learning models---including prominent zero-shot models---are often trained on datasets whose labels are only a small proportion of a larger label space. Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes---or, in the case of zero-shot prediction, to improve its performance---without any additional training. Our technique is a drop-in replacement of the standard prediction rule, swapping argmax with the Fréchet mean. We provide a comprehensive theoretical analysis for this approach, studying (i) learning-theoretic results trading off label space diameter, sample complexity, and model dimension, (ii) characterizations of the full range of scenarios in which it is possible to predict any unobserved class, and (iii) an optimal active learning-like next class selection procedure to obtain optimal training classes for when it is not possible to predict the entire range of unobserved classes. Empirically, using easily-available external metrics, our proposed approach, Loki, gains up to 29.7% relative improvement over SimCLR on ImageNet and scales to hundreds of thousands of classes. When no such metric is available, Loki can use self-derived metrics from class embeddings and obtains a 10.5% improvement on pretrained zero-shot models such as CLIP.
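
    A minimal sketch of the prediction rule described above (an assumed formulation, not the paper's code): weight each observed class by the model's probability and return the class in the full label space that minimizes the resulting squared-distance cost, i.e., an empirical Fréchet mean.

      import numpy as np

      def frechet_mean_predict(probs: np.ndarray, dist: np.ndarray) -> int:
          """probs: (k,) model probabilities over the k observed classes.
          dist:  (m, k) distances between all m classes and the k observed classes.
          Returns the index in the full m-class space minimizing the weighted cost."""
          cost = (dist ** 2) @ probs        # shape (m,)
          return int(np.argmin(cost))

      # Toy example: 2 observed classes, 3 total; class 2 sits "between" 0 and 1,
      # so a near-even split of probability mass selects the unobserved class.
      probs = np.array([0.55, 0.45])
      dist = np.array([[0.0, 2.0],
                       [2.0, 0.0],
                       [1.0, 1.0]])
      print(frechet_mean_predict(probs, dist))   # -> 2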

  • Paper Accepted by ICCV'23 DataComp Workshop

    Sep. 2023

    "Multimodal Data Curation via Object Detection and Filter Ensembles"

    Paper Abstract: We propose an approach for curating multimodal data that we used for our entry in the 2023 DataComp competition filtering track. Our technique combines object detection and weak supervision-based ensembling. In the first of two steps in our approach, we employ an out-of-the-box zero-shot object detection model to extract granular information and produce a variety of filter designs. In the second step, we employ weak supervision to ensemble filtering rules. This approach results in a 4% performance improvement when compared to the best-performing baseline, producing the top-ranking position in the small scale track at the time of writing. Furthermore, in the medium scale track, we achieve a noteworthy 4.2% improvement over the baseline by simply ensembling existing baselines with weak supervision.
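
    A minimal sketch of the two-step recipe above (illustrative only): the filter designs and thresholds are invented, and a plain majority vote stands in for the weak-supervision label model.

      import numpy as np

      def filter_votes(num_detections: int, clip_score: float) -> list:
          """Turn per-image signals into binary keep/drop votes (hypothetical filters)."""
          return [
              int(num_detections >= 1),      # at least one object detected
              int(clip_score > 0.25),        # image-text similarity is high enough
              int(num_detections <= 10),     # not an overly cluttered scene
          ]

      def ensemble(votes: np.ndarray) -> np.ndarray:
          """Stand-in aggregator: keep a sample if most filters vote to keep it."""
          return (votes.mean(axis=1) >= 0.5).astype(int)

      votes = np.array([filter_votes(3, 0.31), filter_votes(0, 0.12)])
      keep_mask = ensemble(votes)            # [1, 0]: keep the first image, drop the second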

  • Rank #1 in the DataComp'23 competition

    Aug. 2023

    "Top-ranking position in the ICCV Datacomp'23 competition (small-scale filtering track)"

  • Paper Accepted by ICLR'23 DL4C Workshop

    Mar. 2023

    "ScriptoriumWS: A Code Generation Assistant for Weak Supervision"

    Paper Abstract: Weak supervision is a popular framework for overcoming the labeled data bottleneck: the need to obtain labels for training data. In weak supervision, multiple noisy-but-cheap sources are used to provide guesses of the label and are aggregated to produce high-quality pseudolabels. These sources are often expressed as small programs written by domain experts—and so are expensive to obtain. Instead, we argue for using code-generation models to act as coding assistants for crafting weak supervision sources. We study prompting strategies to maximize the quality of the generated sources, settling on a multi-tier strategy that incorporates multiple types of information. We explore how to best combine hand-written and generated sources. Using these insights, we introduce ScriptoriumWS, a weak supervision system that, when compared to hand-crafted sources, maintains accuracy and greatly improves coverage.
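
    A minimal sketch of this workflow (illustrative; generate_lf_source is a hypothetical stand-in for a code-generation model, and a simple vote over non-abstaining sources stands in for the label model).

      def generate_lf_source(prompt: str) -> str:
          """Stand-in for a code-generation model; returns a canned source here."""
          return (
              "def lf_prize(text: str) -> int:\n"
              "    # vote spam (1) if a prize is advertised, otherwise abstain (-1)\n"
              "    return 1 if 'won' in text.lower() or 'prize' in text.lower() else -1\n"
          )

      # Multi-tier prompt: task description + label semantics + an input/output example.
      prompt = (
          "Task: label SMS messages as spam (1) or ham (0). "
          "Write lf_prize(text) that votes 1 when the message advertises a prize, "
          "else -1 to abstain. Example: 'You WON a free cruise!' -> 1"
      )
      namespace = {}
      exec(generate_lf_source(prompt), namespace)

      # Keep hand-written sources alongside the generated one before aggregating.
      def lf_link(text: str) -> int:
          return 1 if "http://" in text else -1

      sources = [namespace["lf_prize"], lf_link]

      def pseudolabel(text: str) -> int:
          votes = [v for v in (lf(text) for lf in sources) if v != -1]
          return int(round(sum(votes) / len(votes))) if votes else 0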

  • Paper Accepted by NeurIPS'22

    Sep. 2022

    "AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels".

    Paper Abstract: Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed by labeling functions (LFs). While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of constructing labeling functions for domains with complex or high-dimensional features. To address this, a handful of methods have proposed automating the LF design process using a small set of ground truth labels. In this work, we introduce AutoWS-Bench-101: a framework for evaluating automated WS (AutoWS) techniques in challenging WS settings---a set of diverse application domains on which it has been previously difficult or impossible to apply traditional WS techniques.
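
    A minimal sketch of the automated-WS setting being benchmarked (illustrative only): use the small ground-truth set to auto-generate labeling functions, here one decision stump per feature, then aggregate their votes into pseudolabels.

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      def auto_lfs(X_small: np.ndarray, y_small: np.ndarray) -> list:
          """Fit one depth-1 tree per feature on the ~100 ground-truth labels."""
          return [
              (j, DecisionTreeClassifier(max_depth=1).fit(X_small[:, [j]], y_small))
              for j in range(X_small.shape[1])
          ]

      def pseudolabel(X_unlabeled: np.ndarray, lfs: list) -> np.ndarray:
          """Majority vote over the per-feature stumps stands in for the label model."""
          votes = np.stack([m.predict(X_unlabeled[:, [j]]) for j, m in lfs], axis=1)
          return (votes.mean(axis=1) >= 0.5).astype(int)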

  • President of SAT at UW-Madison

    May. 2022

    "Our main mission is to bring taiwanese students together from all fields of study for recreational, academic, and cultural purposes.

  • Joining CS Dept at UW-Madison

    Aug. 2021

    "Here is the start of my Ph.D. journey. Mamba mentality always."

Experience

Resume [PDF]

Education

  1. University of Wisconsin-Madison (UW-Madison)

    Aug. 2021 — Present (4th-year)

    Ph.D. in Computer Science. Minoring in Economics.

  2. National Chengchi University (NCCU)

    Sep. 2016 — Jul. 2020

    B.S. in Computer Science.

Publications

  1. The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators

    NeurIPS'24 (Spotlight)

    Tzu-Heng Huang, Catherine Cao, Vaishnavi Bhargava, Frederic Sala

    [PDF]
  2. MoRe Fine-Tuning with 10x Fewer Parameters

    ICML'24 Efficient Systems for Foundation Models (ES-FoMo) Workshop & ICML'24 Foundation Models in the Wild Workshop

    Wenxuan Tan, Nicholas Roberts, Tzu-Heng Huang, Jitian Zhao, John Cooper, Samuel Guo, Chengyu Duan, Frederic Sala

    [PDF]
  3. Train 'n Trade: Foundations of Parameter Markets

    NeurIPS'23

    Tzu-Heng Huang, Harit Vishwakarma, Frederic Sala

    [PDF]
  4. Geometry-Aware Adaptation for Pretrained Models

    NeurIPS'23

    Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

    [PDF]
  5. Multimodal Data Curation via Object Detection and Filter Ensembles

    ICCV'23 Towards the Next Generation of Computer Vision Datasets (TNGCV) Workshop

    Tzu-Heng Huang*, Changho Shin*, Sui Jiet Tay, Dyah Adila, Frederic Sala

    [PDF]
  6. ScriptoriumWS: A Code Generation Assistant for Weak Supervision

    ICLR'23 Deep Learning for Code (DL4C) Workshop & 2023 Midwest Machine Learning Symposium

    Tzu-Heng Huang, Catherine Cao, Spencer Schoenberg, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

    [PDF]
  7. AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

    NeurIPS'22

    Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala

    [PDF]
  8. Key Sensor Discovery for Quality Audit of Air Sensor Networks

    MobiSys'20

    Tzu-Heng Huang, Cheng-Hsien Tsai, Man-Kwan Shan

    [PDF]

Experience

  1. AIML Research Intern, Apple

    May. 2024 — Present, advised by Javier Movellan and Manjot Bilkhu

  2. Founder, Awan.AI LLC

    May. 2023 — Apr. 2024, established with Eric Lin and Jet Lin

  3. Graduate Research Student, University of Wisconsin-Madison

    Feb. 2022 — Present, advised by Frederic Sala

  4. Research Intern, Argonne National Laboratory

    Jun. 2019 — Sep. 2019, advised by Charlie Catlett and Rajesh Sankaran

  5. Research Assistant, National Chengchi University

    Sep. 2018 — Aug. 2021, advised by Man-Kwan Shan

  6. Research Intern, Academia Sinica

    Feb. 2018 — Jul. 2020, advised by Ling-Jyh Chen

  7. Research Assistant, National Chengchi University

    Jul. 2017 — Jul. 2020, advised by Changya Hu
