The Security Gap in Open-Source Audio Watermarks
Why we built Open Access for audio watermarking
Open research has accelerated audio watermarking. But open deployment introduces a harder security question. Audio is now easy to generate, easy to transform, and increasingly difficult to verify at scale.
Community efforts around open-sourced watermark algorithms have made the field more visible, more inspectable, and easier to benchmark. But they have also exposed a harder deployment problem: when a watermarking system is fully exposed, how much security remains once attackers can optimize directly against it?
As generative audio scales, the question is no longer only "can we embed a transparent watermark?" It is "can that watermark remain meaningful under attack, and can we verify it without turning the watermark system itself into an attack surface?"
That tension is what led us to build an Open Access infrastructure.
Introduction: the open-source security paradox
As generative audio has moved from research novelty to production reality, watermarking has become one of the most widely discussed tools for copyright protection, provenance, and AI-content authentication.
One of the most visible examples in this conversation is Meta's AudioSeal, which has helped make open-source audio watermarking more accessible to the broader research community.
That visibility matters. Research transparency matters. But transparency and deployment security are not the same thing.
When watermark systems are fully open—from encoder to decoder to detection logic—the same accessibility that helps inspection can also give attackers a precise optimization target. In practice, open-sourcing the full watermark stack can become equivalent to publishing the engineering blueprint of the defense itself.
To understand how serious this problem is, we evaluated mainstream open-source audio watermark systems under a white-box threat model. Our conclusion is direct: under white-box access, open-source audio watermark systems can be far more fragile than they appear in standard demos.
This article focuses on three attack classes:
- destruction — blinding the watermark recovery path
- grafting — transferring watermark identity to unrelated audio
- forgery — rewriting the hidden payload into an attacker-chosen message
The point is not that open research is bad. The point is that open source is not the same thing as secure deployment.
1. The Eraser: destruction under white-box access
The most basic attack asks a simple question: How hard is it to perturb the audio just enough that the watermark can no longer be recovered reliably?
We tested 500 watermarked audio files, each 10 seconds long, at AudioSeal’s native sampling rate of 16 kHz. The original files began with 100% recovery accuracy. We then applied a white-box adversarial perturbation: an imperceptible, optimized signal designed specifically to disrupt watermark recovery.
- Success rate: 100%
- Time to break: 2–3 optimization steps
- Runtime: about 2 seconds on a consumer GPU
In practice, the watermark recovery path was blinded almost immediately. Accuracy fell from 100% to below 50%—close to random guessing.
The implication is straightforward: once the model is exposed, watermark erasure becomes cheap, fast, and highly reliable.
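To make the mechanics concrete, here is a minimal sketch of a white-box erasure loop. It does not use AudioSeal or our evaluation code: the decoder is a toy linear correlator, and every name in it (`logits`, `embed`, `erase`, the block layout, the step size, and the `eps` budget) is a hypothetical stand-in chosen for illustration. Against a real neural detector the gradient would come from automatic differentiation rather than the closed form used here, but the loop has the same shape: read the payload through the open decoder, step the waveform against it, stay inside a small perturbation budget, and stop once recovery is at or below 50%.

```python
import random

NBITS, BLOCK = 16, 64          # toy sizes (assumed): 16-bit payload, 64 samples per bit
N = NBITS * BLOCK

def logits(x, carriers):
    """Toy linear decoder: correlate each block of samples with its +/-1 carrier."""
    return [sum(c * x[i * BLOCK + j] for j, c in enumerate(carriers[i]))
            for i in range(NBITS)]

def decode(x, carriers):
    return [1 if z > 0 else 0 for z in logits(x, carriers)]

def embed(x, bits, carriers, margins):
    """Shift each block so its logit equals +margin (bit 1) or -margin (bit 0)."""
    out, z = list(x), logits(x, carriers)
    for i, bit in enumerate(bits):
        shift = ((margins[i] if bit else -margins[i]) - z[i]) / BLOCK
        for j, c in enumerate(carriers[i]):
            out[i * BLOCK + j] += shift * c
    return out

def bit_accuracy(recovered, embedded):
    return sum(r == e for r, e in zip(recovered, embedded)) / len(embedded)

def erase(x_wm, carriers, step=0.001, eps=0.02, max_steps=50):
    """Untargeted sign-gradient attack under an L-infinity budget of eps."""
    bits = decode(x_wm, carriers)            # payload as read through the open decoder
    delta = [0.0] * N
    adv = list(x_wm)
    for k in range(1, max_steps + 1):
        for i, bit in enumerate(bits):
            s = 1.0 if bit else -1.0
            for j, c in enumerate(carriers[i]):
                grad = s * c                 # d(sum_i s_i * logit_i)/d(sample); always +/-1 here
                idx = i * BLOCK + j
                delta[idx] = max(-eps, min(eps, delta[idx] - step * grad))
        adv = [a + d for a, d in zip(x_wm, delta)]
        if bit_accuracy(decode(adv, carriers), bits) <= 0.5:
            return adv, k                    # recovery is at or below chance
    return adv, max_steps

def demo(seed=0):
    rng = random.Random(seed)
    carriers = [[rng.choice((-1.0, 1.0)) for _ in range(BLOCK)] for _ in range(NBITS)]
    host = [rng.uniform(-0.5, 0.5) for _ in range(N)]
    payload = [rng.randint(0, 1) for _ in range(NBITS)]
    margins = [rng.uniform(0.2, 1.0) for _ in range(NBITS)]
    x_wm = embed(host, payload, carriers, margins)
    adv, steps = erase(x_wm, carriers)
    return payload, x_wm, adv, steps, carriers

if __name__ == "__main__":
    payload, x_wm, adv, steps, carriers = demo()
    print("accuracy before:", bit_accuracy(decode(x_wm, carriers), payload))
    print("accuracy after: ", bit_accuracy(decode(adv, carriers), payload))
    print("steps:", steps)
```

Because the attacker can read the decoder's response exactly, every step moves in a useful direction. Nothing in the loop involves guessing.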
[Visualization 1: Eraser attack efficacy. Watermark recovery before vs. after the Eraser attack, showing the collapse in recovery.]
2. The Copy-Paste: grafting and identity transfer
The second attack asks a more troubling question: If a watermark signal can be isolated from one file, can it be transferred onto a completely different file and still be read as valid?
To test this, we extracted watermark masks from watermarked audio and overlaid them onto unrelated, previously unwatermarked clips.
- Overall viability: high
- Perfect transfers: 77 out of 500 reached 100% recovery
- Common range: many other transfers landed between 87.5% and 93.75%
This is not a positive outcome for the system. A high transfer success rate means that watermark-bearing identity can be moved from one asset to another unrelated asset and still be accepted as valid. In practical terms, that means copyright-linked or provenance-linked information can be transplanted onto audio that never originally carried it.
That is why grafting should be understood as a form of identity theft for audio provenance.
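The transfer operation and its scoring can be sketched in a few lines. This is an illustration under stated assumptions, not our evaluation code. It assumes the mask is the sample-wise difference between a watermarked clip and its aligned clean original (one simple way to obtain a mask), that samples are floats in [-1, 1], and that the payload is 16 bits long. The 16-bit assumption is at least consistent with the reported range, since 14/16 = 87.5% and 15/16 = 93.75%. The function names are hypothetical, and the decoder itself is left out because it is the system under test.

```python
from typing import List, Sequence

def extract_mask(watermarked: Sequence[float], original: Sequence[float]) -> List[float]:
    """Residual added by the embedder (assumes aligned clips of equal length)."""
    if len(watermarked) != len(original):
        raise ValueError("clips must be aligned and of equal length")
    return [w - o for w, o in zip(watermarked, original)]

def graft(mask: Sequence[float], host: Sequence[float]) -> List[float]:
    """Overlay a mask on an unrelated host clip, clipping to the valid range [-1, 1]."""
    n = min(len(mask), len(host))            # truncate to the shorter input
    return [max(-1.0, min(1.0, host[i] + mask[i])) for i in range(n)]

def bit_accuracy(recovered: Sequence[int], embedded: Sequence[int]) -> float:
    """Fraction of payload bits that the decoder recovered correctly."""
    if len(recovered) != len(embedded):
        raise ValueError("payloads must have the same length")
    return sum(r == e for r, e in zip(recovered, embedded)) / len(embedded)

if __name__ == "__main__":
    original = [0.0, 0.25, -0.5, 0.125]
    watermarked = [0.0625, 0.25, -0.4375, 0.0]
    host = [0.5, -0.5, 0.25, 0.0]            # unrelated, unwatermarked clip
    mask = extract_mask(watermarked, original)
    print(graft(mask, host))                 # [0.5625, -0.5, 0.3125, -0.125]

    embedded = [1, 0] * 8                    # hypothetical 16-bit payload
    recovered = list(embedded)
    recovered[3] ^= 1                        # two bits read back wrong
    recovered[9] ^= 1
    print(bit_accuracy(recovered, embedded)) # 0.875
```

Read this way, a grafted clip at 93.75% is one wrong bit away from a perfect transfer.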
[Visualization 2: Grafting attack distribution. Grafting accuracy distribution across 500 samples, showing how often transferred watermark content was still accepted as valid. A high transfer success rate indicates a severe identity-theft vulnerability.]
The problem is not that transfer sometimes works. The problem is that it works at all.
3. The Brainwash: targeted forgery of hidden payloads
The third attack is the most severe. Destroying a watermark is one thing. Transferring a watermark is worse. But the highest-risk scenario is targeted forgery: making the system recover a completely different hidden message chosen by the attacker.
We set the attack objective to rewrite the original hidden payload into a randomly chosen target message. The success criterion was strict: the recovered message had to match the attacker-chosen target perfectly.
- Success rate: 100%
- Time to break: 3–16 optimization steps (roughly 5 on average)
- Runtime: about 2–6 seconds on a consumer GPU
Under that threat model, the watermark system was fully compromised. With only a small allowed waveform disturbance, the attack consistently forced recovery of the attacker-chosen target watermark. This is no longer simple removal. It is payload forgery.
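A targeted version of the attack differs from erasure only in its objective: instead of pushing the decoder away from the embedded payload, it pulls every mismatched bit toward a chosen one. The sketch below shows that objective on a toy linear decoder. It is not AudioSeal and not our attack code; the decoder, the block layout, and the names `embed`, `forge`, `step`, and `eps` are hypothetical, and a real attack would obtain gradients from the neural detector through automatic differentiation.

```python
import random

NBITS, BLOCK = 16, 64            # toy sizes (assumed): 16-bit payload, 64 samples per bit
N = NBITS * BLOCK

def logits(x, carriers):
    """Toy linear decoder: correlate each block of samples with its +/-1 carrier."""
    return [sum(c * x[i * BLOCK + j] for j, c in enumerate(carriers[i]))
            for i in range(NBITS)]

def decode(x, carriers):
    return [1 if z > 0 else 0 for z in logits(x, carriers)]

def embed(x, bits, carriers, margins):
    """Shift each block so its logit equals +margin (bit 1) or -margin (bit 0)."""
    out, z = list(x), logits(x, carriers)
    for i, bit in enumerate(bits):
        shift = ((margins[i] if bit else -margins[i]) - z[i]) / BLOCK
        for j, c in enumerate(carriers[i]):
            out[i * BLOCK + j] += shift * c
    return out

def forge(x_wm, target, carriers, step=0.001, eps=0.02, max_steps=50):
    """Targeted sign-gradient attack: move only the bits that disagree with target."""
    delta = [0.0] * N
    adv = list(x_wm)
    for k in range(max_steps + 1):
        adv = [a + d for a, d in zip(x_wm, delta)]
        current = decode(adv, carriers)
        if current == target:                # strict criterion: every bit must match
            return adv, k
        for i, (have, want) in enumerate(zip(current, target)):
            if have == want:
                continue
            s = 1.0 if want else -1.0        # direction the logit has to move
            for j, c in enumerate(carriers[i]):
                idx = i * BLOCK + j
                delta[idx] = max(-eps, min(eps, delta[idx] + step * s * c))
    return adv, max_steps

def demo(seed=1):
    rng = random.Random(seed)
    carriers = [[rng.choice((-1.0, 1.0)) for _ in range(BLOCK)] for _ in range(NBITS)]
    host = [rng.uniform(-0.5, 0.5) for _ in range(N)]
    payload = [rng.randint(0, 1) for _ in range(NBITS)]
    target = [rng.randint(0, 1) for _ in range(NBITS)]   # attacker-chosen message
    margins = [rng.uniform(0.2, 1.0) for _ in range(NBITS)]
    x_wm = embed(host, payload, carriers, margins)
    adv, steps = forge(x_wm, target, carriers)
    return payload, target, x_wm, adv, steps, carriers

if __name__ == "__main__":
    payload, target, x_wm, adv, steps, carriers = demo()
    print("decodes to target:", decode(adv, carriers) == target)
    print("steps:", steps)
```

The success check inside `forge` mirrors the strict criterion above: the loop only stops early when the decoded message equals the target bit for bit.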
[Visualization 3: Forgery attack flow, forcing an attacker-chosen payload. Original audio → optimized perturbation toward target → forged audio outputs target message.]
Why white-box access changes the game
Why do these attacks work so well? Because under white-box access, the attacker is not guessing. The attacker can optimize directly against the watermark system itself.
For 16 kHz audio, every second of waveform gives the attacker 16,000 dimensions they can manipulate. Once the model is open, those dimensions become controllable knobs. Optimization can then use the model’s own behavior to find the fastest path toward destruction, transfer, or forgery.
That is the white-box problem in practice. We do not need to overcomplicate it. The important point is simple: when the system is fully exposed, the attack becomes dramatically easier to engineer.
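The arithmetic behind that search space, plus one rough way to put a number on a "small" perturbation, can be sketched as follows. The 16 kHz rate and 10-second clip length come from the tests described earlier. The mono assumption, the 440 Hz test tone, the ±0.001 perturbation, and the use of signal-to-noise ratio as a proxy for audibility are illustrative assumptions, not parameters of our attacks.

```python
import math

SAMPLE_RATE_HZ = 16_000   # sampling rate stated in the text
CLIP_SECONDS = 10         # clip length used in the Eraser test

def free_dimensions(sample_rate_hz: int, seconds: float) -> int:
    """Number of waveform samples an attacker can adjust in a mono clip."""
    return int(sample_rate_hz * seconds)

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def snr_db(signal, perturbation):
    """Signal-to-noise ratio in decibels: 20 * log10(rms(signal) / rms(noise))."""
    return 20.0 * math.log10(rms(signal) / rms(perturbation))

if __name__ == "__main__":
    print(free_dimensions(SAMPLE_RATE_HZ, 1))              # 16000
    print(free_dimensions(SAMPLE_RATE_HZ, CLIP_SECONDS))   # 160000

    # Hypothetical example: a 440 Hz tone at amplitude 0.5 and a +/-0.001 perturbation.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ)
            for n in range(SAMPLE_RATE_HZ)]
    perturbation = [0.001 if n % 2 == 0 else -0.001 for n in range(SAMPLE_RATE_HZ)]
    print(round(snr_db(tone, perturbation), 1))            # 51.0
```

A 10-second clip therefore exposes 160,000 adjustable values, and a perturbation can touch every one of them while keeping the signal roughly 51 dB above the added noise in this example.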
From vulnerability to design choice
Taken together, these three attacks point to the same underlying issue: the open-source approach creates real security problems in deployment.
Part of the reason is a design choice. Many open-source watermark systems are built first for transparency, availability, or research accessibility—not with security as the primary product constraint. That does not make them useless for research. But it does limit the kinds of real-world scenarios they can safely support.
We made a different decision at OfSpectrum. From the beginning, we treated security as a first-order design requirement. That is one of the key reasons we chose an Open Access model: to reduce the three risks shown above while still allowing serious external evaluation.
Our solution: Open Access over open source
We believe in real technical transparency and real community validation. But we do not believe that security-sensitive watermark infrastructure should expose every core implementation detail in ways that make white-box compromise trivial.
That is why we chose Open Access. Open Access means researchers, labs, and enterprise teams can evaluate the system, test real workflows, and pressure-test the API—without turning the entire core watermark stack into an always-available attack blueprint.
Our Open Access infrastructure is designed to support secure evaluation and production-oriented use cases through:
- Verified API access for encoding and decoding workflows
- Payload protection for watermark-linked provenance data
- Evolving implementation layers rather than a permanently exposed static core
- Scaling support for higher-volume production pipelines
In other words: we want evaluation to be more open without making security-critical infrastructure easier to weaponize.
Privacy, provenance, and a stronger community
We do not think privacy-preserving provenance should remain the concern of a small technical circle. It should be something researchers, builders, institutions, and the broader public can all benefit from.
Our goal is not simply to publish a system. It is to help create a healthier community around audio provenance, privacy, and trustworthy evaluation—one where more people can participate, test, and contribute without forcing security-sensitive infrastructure into the wrong access model.
If you care about audio watermarking, provenance, adversarial ML, privacy, or trustworthy generative media, we invite you to join that effort with us.
