Commit Graph
672 Commits
Author SHA1 Message Date
Thomas SharedInbox f30c5076da docs 2026-05-22 10:16:19 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 ee4f93752d ci: check runner tools are pre-installed instead of downloading them
Replace curl-based install of dagger/task with a hard check that
fails immediately if any tool is missing from the runner image,
pointing to .forgejo/Dockerfile as the fix location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 10:07:55 +02:00
Thomas SharedInbox 19771a2060 docs 2026-05-22 10:02:36 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 e6baaaed74 ci: add Dockerfile for custom runner image
Based on ghcr.io/catthehacker/ubuntu:go-24.04 with stunnel4,
netcat-openbsd, dagger v0.20.8 and task v3.48.0 baked in so
nothing is downloaded during CI runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 09:51:35 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 92f3e30e00 ci: fail if Firebase Test Lab reports no test case results
gcloud exits 0 even when no tests ran. Add a post-check that greps
the output for 'Passed/passed/test cases' and fails explicitly if
none are found, so 'no test case results' turns the CI red.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 08:58:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 ec195271c8 test: fail explicitly when Stalwart env vars are missing
Previously setUpAll() fell back to 127.0.0.1 defaults when env vars
were absent, causing Firebase Test Lab to report '0 test case results'
instead of a clear failure. Now it calls fail() immediately with the
list of missing variables.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 08:52:45 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 7936bf0a47 ci: require stunnel4/netcat-openbsd pre-installed on runner host
Replace apt-get install with a hard check — if the packages are missing
the job fails immediately with a clear error. Avoids flaky failures when
archive.ubuntu.com is unreachable.

Install once on the runner: sudo apt-get install -y stunnel4 netcat-openbsd

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 08:43:19 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 44d6227ba8 chore: track pubspec.lock and pin sqlite3 to ^3.1.5
pubspec.lock was incorrectly gitignored — this is a Flutter app, not a
package, so the lockfile should be committed for reproducible builds.
Without it, CI resolved drift to its minimum (2.20.3) which constrains
sqlite3 to 2.x, causing dart analyze to disagree on whether
Database.close() exists vs the local environment using 3.3.1.

Also pins sqlite3: ^3.1.5 explicitly in pubspec.yaml as belt-and-
suspenders so the constraint is visible without reading the lockfile.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 08:19:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 cd7455d3a5 ci: remove unnecessary CACHE_BUSTER from Firebase step
The results-bucket change already busts the cache; Dagger doesn't
cache failed execs anyway.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 07:43:13 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 f047dd34ea ci: use project-owned bucket for Firebase Test Lab results
The default Firebase Test Lab bucket is in a Google-managed project so
project-level IAM grants have no effect on it. Use sharedinbox-ftl-results
which is in sharedinbox-496103 where the service account has storage.admin.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 07:32:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 cc34b9b4b6 ci: retrigger Firebase Test Lab after billing enabled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 07:24:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 8278b2f33c ci: retrigger Firebase Test Lab after cloudtestservice.testAdmin grant
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 06:15:12 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 357f6e194c ci: bust Dagger cache for Firebase Test Lab step
WithEnvVariable(CACHE_BUSTER, time.Now()) ensures gcloud firebase test
always runs fresh rather than returning a cached result from a prior run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 06:08:36 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 bf769db4dd ci: retrigger Firebase Test Lab after IAM fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 06:05:47 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 47ab77feea ci: retrigger Firebase Test Lab
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 21:46:36 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 12c95537f0 ci: retrigger Firebase Test Lab after Dagger engine restart
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 21:39:11 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 bcd87c642d Add retry logic to run_firebase_test.sh for transient Dagger errors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 21:23:12 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 24f479b0ad Filter Gradle/Dagger noise from Firebase Test Lab CI output
Add scripts/run_firebase_test.sh that strips ANSI codes and removes
UP-TO-DATE task lines, libsqlite warnings, Gradle deprecation notices
and other high-volume noise before it hits the CI log.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 21:21:04 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 4f663dd0c8 ci: retrigger Firebase Test Lab
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 20:39:55 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 e44aabe210 ci: retrigger Firebase Test Lab after granting storage.admin role
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 20:32:59 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 3b90d42389 ci: retrigger Firebase Test Lab after enabling Cloud Tool Results API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 19:56:47 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 1991508a8b Fix Firebase Test Lab device model ID: Pixel6 -> oriole
'Pixel6' is not a valid Firebase Test Lab model ID.
'oriole' is the correct internal codename for Pixel 6.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 18:58:56 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 cf674009ee ci: retrigger Firebase Test Lab after fixing project ID and enabling APIs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 18:54:20 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 569c8b2e7a ci: retrigger Firebase Test Lab after enabling Cloud Testing API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 18:21:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 689ce8721d Fix androidTest APK search path — Flutter redirects Gradle output to /src/build
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 17:40:17 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 6bb191ee99 Fix androidTest APK path using find instead of hardcoded path
The exact output path varies by AGP version. Use find to locate the
test APK and copy it to a known location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 17:34:41 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 01cbf5b805 Add Firebase Test Lab integration for Android instrumented tests
Implements issue #132. Builds debug app APK + androidTest APK via Dagger,
then runs them on Firebase Test Lab using the FIREBASE_TEST_LAB_SERVICE_ACCOUNT_KEY
secret and FIREBASE_PROJECT_ID variable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 17:20:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 2e080dd4ed fix(ci): remove SIGKILL fallback from check-dagger cleanup
The GET /shutdown endpoint on otel-receiver.py is the one clean shutdown
path. cleanup() only needs to remove temp files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 15:24:11 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 041e496e58 fix(ci): rename otelrecv→otel-receiver, fix teardown hang
Rename ci/otelrecv.py to ci/otel-receiver.py for readability.

Replace SIGTERM+wait shutdown (which could hang indefinitely) with an
HTTP-based approach: add GET /shutdown to otel-receiver.py that calls
self.server.shutdown() directly. After dagger call returns, curl that
endpoint so the receiver prints its timing report and exits cleanly.
Cleanup is reduced to a SIGKILL fallback in case the process is already
gone.

Also fix the do_GET handler to reference self.server instead of the
local variable server, which was inaccessible from the handler class.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 15:18:34 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 f2d24a8514 fix(ci): reduce noise in CI output (#128)
- Filter flutter pub get package-listing lines (^[+~><] ) in pubGetLayer
- Filter build_runner compilation-progress lines (^\[) in setup() and CheckMocks()
- Add -q to git commit in CheckMocks to suppress "460 files changed" stats
- Wrap flutter test in Coverage, TestBackend, TestIntegration, TestSyncReliability
  to show only the summary line on success and full output on failure
- Apply same build_runner filter to scripts/check_mocks_fresh.sh for local runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:51:56 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 9dc34cefe5 ci: add 30-minute Dagger-side timeout to Check pipeline
If any step hangs (stuck service, deadlocked test, network stall), the
pipeline will now cancel itself after 30 min rather than blocking the
runner indefinitely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 11:53:49 +02:00
Thomas SharedInbox f315c21c9a add "list" sub-command to agent-loop to resume via UUID. 2026-05-21 11:49:32 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 541c1a0b53 fix(ci): reduce noise in CI output (#128)
Remove per-request debug logs from otelrecv.py (POST, decoding,
decoded, 200 sent, signal) that were added to diagnose the CI hang,
which has since been resolved.

Remove verbose [HH:MM:SS] timestamp messages from check-dagger
(start, pipeline done, otelrecv started/ready, final RC, cleanup
start/done) for the same reason.

Fix cleanup to send SIGTERM + wait instead of SIGKILL so the OTEL
timing report is actually printed at the end of each CI run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 10:45:40 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 34fb51d85d feat(ci): add Graph() to visualize CI pipeline as Mermaid diagram (#126)
Adds a Ci.Graph() Dagger function that emits a Mermaid flowchart showing
both the Dagger Check pipeline (toolchain → pubGetLayer → parallel steps)
and the Codeberg CI job dependencies (check → build-linux / deploy-playstore
→ publish-website).

Usage: dagger call -m ci --source=. graph
       task ci-graph

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 10:28:28 +02:00
Thomas SharedInbox 58f1a4da42 feat(website): vendor PaperMod theme, remove git submodule (#125)
Replace the git submodule with directly tracked files so that
`git commit .` no longer fails with 'does not have a commit
checked out'. Removed .github/ from the vendored copy since
upstream CI workflows are not needed here.
2026-05-21 10:21:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 07823373c1 fix(ci): add withGoCache helper and pip cache for UploadToPlayStore
Adds withGoCache() that mounts GOCACHE and GOMODCACHE as Dagger cache
volumes — the standard pattern for any Go container added to the pipeline.
Also adds pip cache to UploadToPlayStore so pip wheel downloads are reused
between Play Store deploys.

Closes #123

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 06:41:04 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 1af2a36af7 fix(ci): remove pub cache volume from pubGetLayer for stable execution cache
flutter pub get was re-running on every CI run because Base() attached a
mutable WithMountedCache volume to /root/.pub-cache, making the execution
cache key unstable. Extract toolchain() without cache mounts; pubGetLayer()
now uses toolchain() so Dagger execution-caches pub get between runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 06:35:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 0cb181b138 fix(ci): remove wait from otelrecv cleanup; add pkill by name as fallback
wait "$RECV_PID" was blocking despite kill -9 (possibly because $RECV_PID
was garbled by ANSI escape codes from dagger output, making kill target the
wrong PID). Fix:
- Remove wait entirely — zombie is reaped when the shell exits
- Add pkill -9 -f otelrecv.py as fallback in case kill-by-PID misses
- Log PID at capture time to verify correctness in CI logs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 21:09:24 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 320fbcabc3 fix(ci): kill otelrecv with SIGKILL in cleanup, add timing logs, re-enable OTEL
Three changes:
- cleanup() now uses kill -9 instead of kill (SIGTERM) to prevent wait hanging
  if otelrecv's signal handler stalls
- adds [HH:MM:SS] log lines at key points so CI logs show exactly where time is spent
- restores OTEL env vars (via env VAR=val) since they were confirmed not to cause the hang

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 21:00:20 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 f187012f58 debug(ci): temporarily disable OTEL env vars to test if they cause dagger hang
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 20:31:48 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 d4b265724e fix(otelrecv): set close_connection=True so server actually closes after response
Sending Connection: close in the header without closing the server-side
socket left both dagger's Go HTTP client and Python's HTTPServer waiting
for the other to send FIN first. This blocked dagger's OTLP exporter
shutdown, which in turn blocked dagger from exiting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 20:14:27 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 36b54a08d6 fix(ci): use --kill-after=10 on timeout so dagger is SIGKILLed if it ignores SIGTERM
dagger ignores SIGTERM, keeping the pipe's write end open; tee can never
get EOF and the script hangs. --kill-after=10 follows up with SIGKILL which
closes the pipe and unblocks the script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 20:02:44 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 4a99d47aa5 fix(ci): add TCP keepalive to stunnel to prevent NAT connection resets
Connection drops consistently at ~50s suggest NAT/firewall idle timeout.
Keepalive probes every 10s on the remote side prevent the RST.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 19:43:16 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 c3737fb47f fix(ci): retry dagger call on TCP connection failures (up to 3 attempts)
On network errors (connection reset, context canceled, connection refused)
retry the dagger call rather than failing immediately. Real test failures
propagate without retry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 18:47:38 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 88e8a9ab5c fix(ci): add 10-minute timeout to dagger call; treat teardown hang as success
dagger call hangs after function completion due to HTTP/2 teardown bug in
remote engine mode. Capture output via tee; if timeout fires but output
contains "All tests passed", exit 0 instead of 124.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 16:38:33 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 a078122d28 refactor(ci): replace dual DAGGER_STUNNEL_URL1/2 with single DAGGER_STUNNEL_URL
The engine is stable; no fallback needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 15:48:38 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 e60459ea2e fix(ci): add .task/ and .fvm/ to .daggerignore to skip walk
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:52:19 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 92cc725913 refactor: simplify .daggerignore and fix hardcoded path after repo move to sharedinbox/
.daggerignore no longer needs to exclude $HOME dirs (fvm/, go/, .pub-cache/,
.claude/, snap/, etc.) since the project root is now sharedinbox/, not $HOME.
agent_loop.py: replace hardcoded /home/si with Path.home().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:43:29 +02:00
Thomas SharedInbox bd03484fcc Revert "fix(ci): kill dagger via timeout when it hangs in gRPC teardown"
This reverts commit 7e155f5785.
2026-05-20 13:11:07 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 242e1ce4a4 fix(ci): exclude fvm/ and other large dirs from Dagger source sync
The source sync (Directory.Sync in selectFunc) was uploading ~7.4 GB /
78k files to the remote engine, blocking dagger call for 16+ minutes.

Root cause: .daggerignore had '.fvm/' but the actual directory is 'fvm/'
(no leading dot), so the 1.9 GB Flutter SDK cache was always uploaded.
Also missing: go/ pkg cache (309 MB), .claude/ session files, agent logs.

goroutine dump confirmed the hang in directoryValue.Get → Directory.Sync
→ HTTP/2 roundTrip waiting on the engine — not gRPC teardown as suspected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:04:53 +02:00