This is the reference implementation of SoundBubble (Daehwa Kim and Chris Harrison, "SoundBubble: Finger-Bound Virtual Microphone Using Headset/Glasses Beamforming," CHI 2026). SoundBubble combines a microphone array, acoustic beamforming, and hand tracking to capture acoustic signals coming from the hand during interaction, providing useful input sensing in XR.
Find details on our website: https://daehwakim.com/soundbubble
We provide a pipeline to listen to the beamformed audio inside a sound bubble, visualize the signal inputs to our models, and open weights for the swiping model that supports drawing in the world.
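The repo's beamforming pipeline itself is not reproduced here, but the core idea of steering a microphone array at a point (such as the tracked hand) can be sketched as a minimal delay-and-sum beamformer. This is an illustrative sketch with integer-sample delays, not the paper's implementation; the delays would normally be derived from the mic geometry and the focus point.

```python
import numpy as np

def delay_and_sum(frames, delays_samples):
    """Delay-and-sum beamformer: advance each channel by its steering
    delay so signals from the focus point align, then average.

    frames: (n_samples, n_channels) multichannel audio block
    delays_samples: per-channel integer delays in samples (illustrative;
    real delays come from array geometry and the target position)
    """
    n_samples, n_channels = frames.shape
    out = np.zeros(n_samples)
    for ch, d in enumerate(delays_samples):
        out += np.roll(frames[:, ch], -d)  # undo this channel's arrival delay
    return out / n_channels

# Toy example: the same pulse reaches 4 mics with different delays.
pulse = np.zeros(1024)
pulse[100] = 1.0
delays = [0, 3, 7, 12]
frames = np.stack([np.roll(pulse, d) for d in delays], axis=1)

focused = delay_and_sum(frames, delays)          # steered at the source
unfocused = delay_and_sum(frames, [0, 0, 0, 0])  # no steering
print(focused.max(), unfocused.max())  # steered peak is larger
```

When steered correctly, the four copies of the pulse add coherently, while signals from other directions stay misaligned and average down, which is what carves out the "bubble" around the hand.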
Attach the microphone to the Meta Quest 3S and follow the calibration instructions described in our paper.
Python is used for audio processing and machine learning, and Unity for hand tracking.
python3.9 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
export OPENBLAS_NUM_THREADS=1
Install Unity Hub and open SoundBubble_Unity. We used Editor version 6000.0.26f1.
If you are on macOS, change the serverIP (line 14) in SoundBubble_Unity/Assets/Scenes/ArrayReceiver.cs to your laptop's IP address. Then go to File > Build Profiles > Android and build the project to the Meta Quest.
If you are on Windows, change the serverIP (line 14) in SoundBubble_Unity/Assets/Scenes/ArrayReceiver.cs to localhost. Then press the Play button to run the Unity app on the Quest.
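The serverIP setting in ArrayReceiver.cs implies the Unity app receives data from the Python side over a socket. The exact protocol is defined by the repo's scripts; as a rough illustration of how such a link is commonly framed, here is a length-prefixed binary sketch. The 2-byte header and little-endian float32 payload are assumptions for the demo, not the repo's actual format.

```python
import socket
import struct
import threading

def recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf

def send_frame(sock, values):
    """Send floats as little-endian float32 with a 2-byte length header
    (framing is illustrative only)."""
    payload = struct.pack(f"<{len(values)}f", *values)
    sock.sendall(struct.pack("<H", len(payload)) + payload)

def recv_frame(sock):
    n = struct.unpack("<H", recv_exact(sock, 2))[0]
    return list(struct.unpack(f"<{n // 4}f", recv_exact(sock, n)))

# Loopback demo: a "laptop-side" server and a "headset-side" client.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def client():
    c = socket.create_connection(("127.0.0.1", port))
    send_frame(c, [0.1, 0.2, 0.3])
    c.close()

t = threading.Thread(target=client)
t.start()
conn, _ = server.accept()
received = recv_frame(conn)
print(received)
t.join()
conn.close()
server.close()
```

On a real setup, serverIP simply tells the headset which machine to reach; localhost works when the Quest runs through the editor on the same machine.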
python main.py
The script prints the available audio devices in the terminal and opens the SoundBubble user interface. To activate the interface, select the desired audio input and output. For example:
Available Input (Microphone) Devices:
Index 0: Daehwa’s iPhone Microphone
Input channels: 1
Sample rate: 48000.0
Index 1: UMA16v2
Input channels: 16
Sample rate: 44100.0
Index 2: MacBook Air Microphone
Input channels: 1
Sample rate: 48000.0
Available Output (Playback) Devices:
Index 1: UMA16v2
Output channels: 2
Sample rate: 44100.0
Index 3: MacBook Air Speakers
Output channels: 2
Sample rate: 48000.0
Select microphone device index: 1
Select headphone device index: 3
The input should be the UMA16v2 (our 16-channel microphone array); the output can be any playback device.
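Picking indices by hand each run is easy to get wrong; a small helper can find the array microphone automatically from a device list. The dictionaries below mimic the shape of python-sounddevice's device query results, which is an assumption about the underlying library; the helper itself is hypothetical, not part of main.py.

```python
def pick_input(devices, want_channels=16, name_hint="UMA16v2"):
    """Return the index of the first input device matching the array mic:
    prefer a name match with enough channels, fall back to channel count."""
    for i, d in enumerate(devices):
        if name_hint.lower() in d["name"].lower() and d["max_input_channels"] >= want_channels:
            return i
    for i, d in enumerate(devices):
        if d["max_input_channels"] >= want_channels:
            return i
    raise RuntimeError("16-channel array microphone not found")

# Device list shaped like the terminal output above.
devices = [
    {"name": "Daehwa's iPhone Microphone", "max_input_channels": 1},
    {"name": "UMA16v2", "max_input_channels": 16},
    {"name": "MacBook Air Microphone", "max_input_channels": 1},
]
print(pick_input(devices))  # → 1
```

This matches the example session above, where index 1 (the UMA16v2) is the correct input choice.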
Four panels in the interface show model inputs and predictions.

Use the function keys to trigger different effects in the Unity app, as shown in the Preview of Repo section.



