Stalwart logs spurious "Address already in use" for [::]:PORT — dual-stack socket conflict from sed in ci/main.go #478

Closed
opened 2026-06-06 14:36:55 +00:00 by guettlibot · 2 comments
guettlibot commented 2026-06-06 14:36:55 +00:00 (Migrated from codeberg.org)

Symptom

Every CI run that starts the Stalwart test mail server logs four errors like these:

ERROR Network listener error (network.listen-error) listenerId = "0000", localIp = ::, localPort = 8080, reason = "Failed to listen on [::]:8080: Address already in use (os error 98)"
ERROR Network listener error (network.listen-error) listenerId = "0001", localIp = ::, localPort = 1430, reason = "Failed to listen on [::]:1430: Address already in use (os error 98)"
ERROR Network listener error (network.listen-error) listenerId = "0002", localIp = ::, localPort = 1025, reason = "Failed to listen on [::]:1025: Address already in use (os error 98)"
ERROR Network listener error (network.listen-error) listenerId = "0003", localIp = ::, localPort = 4190, reason = "Failed to listen on [::]:4190: Address already in use (os error 98)"

This appears in both passing and failing CI runs, so it is not the cause of test failures — but it clutters logs and could mask real network problems.

Root cause

stalwart-dev/config.toml has:

bind = ["0.0.0.0:8080"]

The Stalwart() function in ci/main.go runs a sed substitution that turns every bind = ["0.0.0.0:PORT"] into:

bind = ["0.0.0.0:PORT", "[::]:PORT"]
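The rewrite performed by that sed expression can be sketched in Go with the regexp package. This is only an illustration: bindRe and addIPv6Bind are hypothetical names that do not exist in ci/main.go, and the pattern escapes the dots and matches whitespace around "=" more loosely than the original sed does.

```go
package main

import (
	"fmt"
	"regexp"
)

// bindRe mirrors the sed pattern quoted in this issue: a bind entry that
// holds a single "0.0.0.0:PORT" address. Hypothetical name, for illustration.
var bindRe = regexp.MustCompile(`bind(\s*)=(\s*)\["0\.0\.0\.0:([0-9]*)"\]`)

// addIPv6Bind appends "[::]:PORT" to every matching bind entry, which is the
// effect of the second sed expression on config.toml.
func addIPv6Bind(config string) string {
	return bindRe.ReplaceAllString(config, `bind${1}=${2}["0.0.0.0:${3}", "[::]:${3}"]`)
}

func main() {
	// Show the rewritten bind line for each of the four ports in the log.
	for _, port := range []int{8080, 1430, 1025, 4190} {
		line := fmt.Sprintf(`bind = ["0.0.0.0:%d"]`, port)
		fmt.Println(addIPv6Bind(line))
	}
}
```

Lines that do not match the pattern, such as the hostname entry, pass through unchanged.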

On Linux, binding 0.0.0.0:PORT creates an IPv4-only socket. The [::]:PORT socket is the dual-stack one: unless the application sets IPV6_V6ONLY (or net.ipv6.bindv6only is set to 1, which is not the Linux default), a socket bound to [::] also accepts IPv4 traffic and therefore claims the IPv4 wildcard address as well. Because 0.0.0.0:PORT is already bound when Stalwart reaches the [::]:PORT entry, the second listener fails with EADDRINUSE.

This affects all four ports (8080, 1430, 1025, 4190) consistently. Stalwart continues to function — its IPv4 listeners are active — so tests can still connect via the Dagger-internal stalwart hostname (which resolves to an IPv4 address). The errors are cosmetic.
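The conflict can be reproduced outside Stalwart. The following is a minimal Go sketch, not Stalwart's own (Rust) listener code: it opens an IPv4 wildcard listener first and then tries a [::] listener on the same port. The function name dualStackConflict is hypothetical, and the result depends on the host: a system without IPv6 support returns a different error.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// dualStackConflict binds an IPv4-only wildcard listener on a free port, then
// tries a wildcard "[::]" listener on the same port. With network "tcp", Go
// makes wildcard listeners dual-stack (IPV6_V6ONLY off) where the system
// supports it. It reports true when the second listen fails with EADDRINUSE.
func dualStackConflict() (bool, error) {
	v4, err := net.Listen("tcp4", "0.0.0.0:0") // port 0: the kernel picks a free port
	if err != nil {
		return false, err
	}
	defer v4.Close()

	port := v4.Addr().(*net.TCPAddr).Port
	v6, err := net.Listen("tcp", fmt.Sprintf("[::]:%d", port))
	if err == nil {
		v6.Close()
		return false, nil // no conflict observed on this system
	}
	if errors.Is(err, syscall.EADDRINUSE) {
		return true, nil
	}
	return false, err // some other failure, for example IPv6 unavailable
}

func main() {
	conflict, err := dualStackConflict()
	fmt.Println("conflict:", conflict, "err:", err)
}
```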

Fix

Remove the second sed substitution from ci/main.go so Stalwart keeps its original single-stack 0.0.0.0:PORT bindings. The relevant line is:

WithExec([]string{"/bin/sh", "-c",
    `sed -e 's/hostname = "localhost"/hostname = "stalwart"/'` +
    ` -e 's/bind     = \["0.0.0.0:\([0-9]*\)"\]/bind     = ["0.0.0.0:\1", "[::]:\1"]/g'` +
    ` /etc/stalwart/config.toml.orig > /etc/stalwart/config.toml`})

Drop the second -e expression entirely. The 0.0.0.0:PORT bindings already handle all traffic within Dagger's internal network (which uses IPv4), so the [::]:PORT entries add no value and cause the error on every startup.

guettlibot commented 2026-06-06 15:00:06 +00:00 (Migrated from codeberg.org)

The fix is in and the PR is open at https://codeberg.org/guettli/sharedinbox/pulls/481.


What I did:

In ci/main.go at line 391, the Stalwart() function was running a sed command with two -e expressions. The second one transformed every bind = ["0.0.0.0:PORT"] into bind = ["0.0.0.0:PORT", "[::]:PORT"]. On Linux, a socket bound to [::] is dual-stack by default and also claims the IPv4 wildcard address, which the 0.0.0.0:PORT listener has already bound, so the [::]:PORT listener fails with EADDRINUSE on all four ports (8080, 1430, 1025, 4190).

I dropped the second -e expression entirely, keeping only the hostname substitution. The 0.0.0.0:PORT bindings continue to handle all traffic within Dagger's internal network.

Verification: The change is a one-liner that matches the fix described in the issue. Backend and integration tests connect to Stalwart via the stalwart hostname (which resolves to an IPv4 address), so removing the extra [::]:PORT bindings has no functional impact.

PR: https://codeberg.org/guettli/sharedinbox/pulls/481

guettlibot commented 2026-06-06 20:30:04 +00:00 (Migrated from codeberg.org)

The CI failure is infrastructure-wide — it has been failing for all runs (PRs and main branch pushes) since ~17:35 today, well before our PR was opened. The failure is in the Dagger remote engine SSH tunnel setup, which is external infrastructure I cannot fix.

Our code change is correct and minimal: it removes the sed expression that adds the dual-stack [::]:PORT bindings (the cause of EADDRINUSE on every Stalwart startup). PR #481 is open and mergeable, and the branch is up to date with main.


Summary:

PR #481 (https://codeberg.org/guettli/sharedinbox/pulls/481) was opened with the correct 1-line fix removing the spurious [::]:PORT sed substitution from ci/main.go. The branch is not behind main and the PR is mergeable.

CI has been failing for all runs since ~17:35 today — including main branch commits that predate our PR — due to a Dagger remote engine SSH tunnel connectivity issue in scripts/setup_dagger_remote.sh. The failure affects every CI run (run indices 1852 onward), is completely unrelated to our code change, and is outside my ability to fix.

Reference: guettli/sharedinbox#478