mattermost/mattermost-plugin-calls · issue #1143 · branch fix/1143-warn-rfc1918-docker-ice
A plugin-layer pre-flight diagnostic that detects the exact failure shape #1143 reports: Mattermost Calls running in Docker, advertising a 172.x bridge IP as an ICE candidate, coturn rejecting that peer with 403, and the call dropping minutes later on allocation timeout. It changes no behavior: it emits a single loud log line at activation that names the offending IPs and the exact setting the operator should change.
The reporter ran Mattermost Calls inside Docker with a self-hosted coturn. Calls connected fine, media flowed for minutes, then dropped. The Mattermost plugin logs showed nothing. The clue was buried inside coturn's logs: peer 172.21.0.3 lifetime updated · CREATE_PERMISSION processed, error 403: Forbidden IP. The Docker bridge address was being advertised as an ICE host candidate, coturn refused to relay to a non-allowlisted private IP, and the call eventually died on allocation timeout.
The plugin only passes ICEHostOverride through to github.com/mattermost/rtcd/service/rtc.NewServer; the actual ICE candidate gathering lives there. When ICEHostOverride is empty and rtcd enumerates local interfaces, it picks up whatever the kernel hands it, including the Docker bridge address.
The operator has no signal that this is happening. The plugin logs say "activated"; the call starts; media flows; then on the first TURN relay refresh, coturn rejects the unreachable peer and the session falls apart. Diagnosing it requires SSHing into the coturn host, tailing its log, and matching peer addresses against the Docker network.
That diagnosis loop costs hours every time the misconfiguration ships. The fix below saves those hours by making the plugin itself shout, once, at startup.
sequenceDiagram
autonumber
participant Op as Operator
participant Plg as Calls plugin (in Docker)
participant Rtcd as rtcd ICE gather
participant Cli as Client
participant Cot as coturn
Op->>Plg: enable Calls, ICEHostOverride=""
Plg->>Rtcd: start with ICEHostOverride=""
Rtcd->>Rtcd: enumerate interfaces, see 172.21.0.3
Rtcd-->>Cli: ICE candidate 172.21.0.3
Cli-->>Cot: relay to peer 172.21.0.3
Cot-->>Cli: 403 Forbidden IP
Note over Cli,Cot: call drops, no log from plugin
A new file server/ice_diagnostics.go adds three small primitives that run once at activation, called from activate.go: an RFC1918/RFC6598/ULA predicate, an interface enumerator, and a container detector. The composing function checkICEDockerMisconfiguration emits a single LogError only when four conditions all hold; the operator is not warned in any other configuration.
The warning fires only when all of the following are true:
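- ICEHostOverride is empty (the operator did not opt out of the diagnostic)
- The plugin is running inside a container, detected via /.dockerenv and a /proc/1/cgroup scan for docker, containerd, or kubepods
- No local interface carries a routable IP
- At least one local interface carries a private IP

The log message names the offending IPs and tells the operator exactly which setting to change.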
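The container check described above could look roughly like the following. This is a minimal sketch, not the code on the branch: the names insideContainer, cgroupLooksContainerized, and containerMarkers are hypothetical, and only the two signals named in the text (/.dockerenv and the /proc/1/cgroup scan) are used.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// containerMarkers are the substrings the text says are scanned for.
var containerMarkers = []string{"docker", "containerd", "kubepods"}

// cgroupLooksContainerized reports whether cgroup file contents mention
// a known container runtime. Hypothetical helper name.
func cgroupLooksContainerized(content string) bool {
	for _, m := range containerMarkers {
		if strings.Contains(content, m) {
			return true
		}
	}
	return false
}

// insideContainer checks for /.dockerenv first, then falls back to
// scanning /proc/1/cgroup. Any read error is treated as "not a container".
func insideContainer() bool {
	if _, err := os.Stat("/.dockerenv"); err == nil {
		return true
	}
	data, err := os.ReadFile("/proc/1/cgroup")
	if err != nil {
		return false
	}
	return cgroupLooksContainerized(string(data))
}

func main() {
	fmt.Println("inside container:", insideContainer())
}
```

Treating read errors as "not a container" keeps the diagnostic quiet on platforms without /proc, which fits the goal of warning only in the one configuration that matters.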
A LAN-only deployment where every participant sits on the same Docker network is a legitimate setup. Silently dropping the RFC1918 candidate would break that. The diagnostic is purely additive: it does not change which candidates are advertised. It just makes the most common misconfiguration visible.
The real candidate filter, the ability to exclude RFC1918 host candidates from the ICE gather rather than warn about them, belongs in github.com/mattermost/rtcd/service/rtc.NewServer, not in the plugin. The README on the branch offers to send that second PR as a follow-up after this one merges.
flowchart TD
A[activate.go entered] --> B{ICEHostOverride empty?}
B -- no --> Z[return silently]
B -- yes --> C{Inside container?}
C -- no --> Z
C -- yes --> D[Enumerate interfaces]
D --> E{Any routable IP?}
E -- yes --> Z
E -- no --> F{Any private IP?}
F -- no --> Z
F -- yes --> G["LogError with named IPs
and the setting to change"]
G --> H[continue plugin start - no behaviour change]
style G fill:#1e0a0a,stroke:#ef4444,color:#fca5a5
style Z fill:#0a1e10,stroke:#10b981,color:#86efac
- Boundary ranges: 172.15.0.0/16 (just outside RFC1918) and 192.167.0.0/16 (just outside). Tested explicitly.
- IPv6: fc00::/7 ULA addresses are treated as private. Public IPv6 (2606:4700:...) is correctly treated as routable.

The test file covers the predicate against the full RFC1918 + RFC6598 + ULA + loopback + link-local + four public-internet boundary cases.
| Address | Expected | Why this case matters |
|---|---|---|
| 172.21.0.3 | private | The exact address in the bug report |
| 10.0.0.5 | private | RFC1918 10.0.0.0/8 |
| 192.168.1.1 | private | RFC1918 192.168.0.0/16 |
| 100.64.1.2 | private | RFC6598 carrier-grade NAT, cloud NAT gateways leak this |
| fd00::1 | private | IPv6 ULA fc00::/7 |
| 127.0.0.1 / ::1 | private | Loopback, always refuse-to-advertise |
| 169.254.1.1 | private | Link-local, should not leak |
| 172.15.0.1 | public | Must NOT be flagged, just outside 172.16/12 |
| 192.167.255.1 | public | Must NOT be flagged, just outside 192.168/16 |
| 8.8.8.8 / 1.1.1.1 | public | Public DNS resolvers, sanity |
| 2606:4700:4700::1111 | public | Public IPv6 (Cloudflare), IPv6 sanity |
Two additional tests pin the documented behaviour: nil is treated as private (refuse-to-advertise default), and the package-level CIDR slice parses all five entries correctly.
The branch is committed locally and ready to push. The mail framing is honest: this is the plugin-visible half of the bug; the candidate filter itself belongs in mattermost/rtcd, offered as a second PR.
Local branch fix/1143-warn-rfc1918-docker-ice, commit 7a1ea51. Three Go unit tests on isPrivateIP, full PR-shaped README.
Mattermost's review cadence is slower than HF's (corporate, multi-reviewer). Drafting the PR is right; expect 1-2 weeks of review traffic.
Mail body anchors on the branch URL and on HALCYON (the public proof of WebRTC mesh expertise). Compensation range stated: €40-70/h, €2,500-3,000/mo, part-time afternoon-evening CEST until July.
Realistic odds for this lead alone: ~30-50% conversation, ~5-15% contract. Mattermost's hiring cycle is longer than a startup's; this is more likely to produce a referral or a contract-with-procurement than an immediate retainer.