sharedinbox

Author	SHA1	Message	Date
Thomas SharedInboxandClaude Sonnet 4.6	3019fdf145	refactor(deploy_cron): trigger Forgejo Actions workflow via fgj instead of deploying locally Replace local `task publish-website` invocation with `fgj actions workflow run website.yml` so the deploy runs in CI rather than on the local machine. Remove failure-tracking state files and issue-creation logic — Forgejo Actions handles its own reporting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 17:42:20 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	6e22683f5b	fix(crash_screen): remove duplicate gitLine definition left by rebase conflict resolution Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 17:02:39 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	dc181d0d85	fix: add git hash to crash screen and extend DB path retries (#179 ) Two issues from #179: - crash_screen.dart now reads GIT_HASH compile-time constant and includes 'Git Commit: <hash>' in both the on-screen UI and the copied report, so crash reports always show the exact build that crashed. - _resolveDatabasePath() retry delays extended from [100, 300, 600] ms (total ~1 s, 4 attempts) to [200, 500, 1000, 2000, 4000] ms (total ~7.7 s, 6 attempts) to handle slow/non-standard Android devices where the path_provider Pigeon channel takes several seconds to become ready. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 16:53:07 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	47824c5711	Handle transient git fetch failures gracefully Exit cleanly instead of crashing so the next cron run retries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 14:13:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	5ad6599951	fix(agent_loop): match CI run to PR branch via event_payload, not head_branch The Forgejo workflow_runs API has no head_branch field. For pull_request events the branch lives in event_payload["pull_request"]["head"]["ref"]; for push events it is in prettyref. The old code used run.get("head_branch") which always returned None, causing _latest_ci_run_for_branch to never find the run and the loop to declare "no CI run after 15 min" and set the issue to State/Question — even when CI had already passed. Also fixes a pre-existing test mock that was missing the session_name kwarg. Adds TestLatestCiRunForBranch covering both event types and the regression. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 13:36:21 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	49176623b3	fix(ci): use file: prefix for SSH key in publish-website env:SSH_PRIVATE_KEY passes the key through shell $() which strips the trailing newline, causing dagger to write a truncated key that OpenSSH rejects with "error in libcrypto". Using file: reads it directly from disk, preserving exact content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 12:20:09 +02:00
Thomas SharedInbox	55c15177d8	fix(publish-website): survive SSH failure in generate_build_history (#164 ) The Dagger container running generate_build_history.py may not always reach the deployment server (network constraints on the Dagger engine). Rather than aborting the entire publish-website pipeline, log the SSH verbose output (already added in the previous debug commit) and return an empty file list so Hugo still builds and rsync still deploys the site — just without updated build-history pages. This unblocks the cron deploy that has been failing since `c259d2da`.	2026-05-23 12:17:58 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	54cd6623c4	debug(ci): add ssh -v to generate_build_history for exit-255 diagnosis Temporary: print verbose SSH output on failure to identify why the connection fails from inside the dagger container. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 12:13:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	7e234b4835	fix(ci): chmod 700 /root/.ssh in GenerateBuildHistory container Dagger mounts the secret file with 0600 but the parent directory may get created with world-readable permissions, causing SSH to refuse the key with exit 255. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 12:09:35 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	565b6f8e33	fix(publish-website): add -i to ssh call in generate_build_history.py All other ssh/scp calls in the dagger module use explicit -i /root/.ssh/id_ed25519. This one was missing it, causing exit 255 inside the dagger container. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 12:02:30 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	bf3accd676	deploy.sh: read SSH_PRIVATE_KEY from key file, not .env Dagger parses .env directly and fails on multiline quoted values. Move SSH_PRIVATE_KEY out of .env and export it from ~/.ssh/id_ed25519 in the wrapper instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 11:47:48 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	57902e8218	deploy: give up and open issue after 5 failures on same commit Tracks consecutive failure count in .fail_count. On the 5th failure for the same SHA, creates a Prio/High + State/Ready Codeberg issue. Before creating, checks local .last_issue_sha and queries Codeberg open issues to avoid duplicates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 11:37:57 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	c259d2dabe	deploy: create Codeberg issue when deploy fails and main is unchanged If the last deploy failed and origin/main has not advanced, opens a Prio/High + State/Ready issue via tea with the failing SHA, commit link, and captured deploy output. Skips duplicate issues (tracked by .last_issue_sha). Cron interval changed to */5. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 11:24:21 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	8d49a6b267	deploy.sh: source .env, add dagger to PATH from nix store Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 11:18:44 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	eecef1a4a8	add deploy.sh wrapper: finds task via nix store, short crontab line Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 11:17:30 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	ad150bce53	add deploy_cron.py: local 15-min cron deploy, skip if main unchanged Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 11:07:41 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	b6a2f91820	security: fix log/state file permissions, Firebase key on disk, TLS cleanup - agent_loop.py: create log dir with mode 0700 and enforce it on existing dirs; open log files with mode 0600; chmod state file to 0600 after every write. Prevents other local processes from reading agent output (which may contain credential paths) or tampering with the state file's pid field. - ci/main.go (TestAndroidFirebase): replace echo "$FIREBASE_SA_KEY" > /tmp/key.json with bash process substitution --key-file=<(echo "$FIREBASE_SA_KEY") The key is now passed via a file descriptor — it never touches disk, so it cannot be stranded by a failed gcloud auth call or snapshotted into the Dagger layer cache. - ci.yml / deploy.yml: add "Cleanup TLS credentials" step (if: always()) at the end of every job that calls setup_dagger_remote.sh. Removes /tmp/dagger-tls, /tmp/stunnel-dagger.conf, /tmp/stunnel.pid from the self-hosted runner after each job, so client certs do not accumulate between job runs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 10:54:53 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	509a0bc954	fix(ci): remove Gradle cache mount from pubGetLayer() flutter pub get is pure Dart — it never invokes Gradle. The mutable gradle-cache volume mount caused the same execution-cache instability we just fixed for the pub cache: Dagger sees a changed volume and cache-misses pubGetLayer() on every run. The Gradle cache stays in Base(), which is only used for steps that actually build Android code. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 10:15:39 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	6cfc3dfda4	fix(ci): remove pub cache volume from Base() and pubGetLayer() The mutable flutter-pub-cache volume made the execution cache key unstable — pub get cache-missed every run because the volume's mutable layer changed the snapshot hash. Removing the volume lets Dagger snapshot packages inside the execution-cache layer, which is stable and reclaimable via dagger prune. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 10:11:08 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	8a4ca223e9	fix: retry path_provider on PlatformException at database open (#153 , #157 ) On some Android versions the path_provider Pigeon channel ('dev.flutter.pigeon.path_provider_android.PathProviderApi.getApplicationSupportPath') is not ready when initDatabasePath() runs before runApp(). The existing code already catches PlatformException there, leaving _dbPath null — but the LazyDatabase callback called getApplicationSupportDirectory() a second time without any protection, causing an unhandled crash on those devices. Fix: extract _resolveDatabasePath() which retries three times with back-off (100 ms → 300 ms → 600 ms) before re-throwing with a descriptive error message. By the time the database is first accessed (after runApp()), the channel is almost always available; if it still isn't, the CrashScreen is shown with a clear explanation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 10:08:04 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	1a7b585dd4	fix(agent-loop): filter issues by author; comment when setting State/Question (#158 ) - Only pick up issues created by guettli, guettlibot, or guettlibot2 to prevent the loop from acting on external/bot issues. - Post an explanatory comment on the issue whenever the loop sets State/Question (agent killed, no CI run, no push detected), so the reason is visible without digging through cron logs. Closes #158. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 10:04:44 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	959ce92a69	fix(ci): drop false-positive 'error' grep in Firebase test check Firebase CLI emits "A non-retryable error occurred." even for passing runs. The grep -qwi 'error' triggered on this message despite gcloud exiting 0 and the result table showing Passed. The gcloud exit code, device-count, and Passed checks are sufficient to detect real failures. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 23:22:25 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	9cd18ba70e	feat: agent loop uses PRs; ci.yml fast-only; hourly deploy workflow (#156 ) - agent_loop.py: agents now create an `issue-N-fix` branch and open a PR; the loop discovers the PR via `fgj pr list`, tracks its CI run, squash-merges on green, and falls back to the global-CI path if no PR exists (backward compat). Adds `_find_pr_for_branch`, `_latest_ci_run_for_branch`, `_merge_pr` helpers. - .forgejo/workflows/ci.yml: strip to the single fast `check` job only (removes build-linux, deploy-playstore, publish-website). - .forgejo/workflows/deploy.yml (new, replaces android-emulator-tests.yml): scheduled hourly + workflow_dispatch; runs firebase tests, Play Store deploy, Linux build/deploy, website publish; on completion sets CI/Full-Pass or CI/Full-Fail label on the repo's DEPLOY_HEALTH_ISSUE tracking issue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 22:05:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	b48cb98813	fix(agent-loop): detect agent crash — do not close issue when no new CI run appeared If the agent exits immediately (e.g. rate-limit), the loop was closing the pending issue against the previous CI run, which was still green. Fix: record the latest CI run ID when an issue agent starts. If the run ID hasn't changed when the agent exits, the agent pushed nothing → set State/Question instead of closing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 21:52:02 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	acd9483e8b	chore: replace flutter_markdown with flutter_markdown_plus (#147 ) flutter_markdown 0.7.7+1 has been discontinued in favour of flutter_markdown_plus. Switch the dependency and update both import sites. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 16:44:10 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	7e3a63f507	ci: validate gcloud auth stderr, fail on 'error' in output, check test count (#145 ) - Capture gcloud auth stderr separately and fail on unexpected output; ignore the two known informational lines ("Activated service account credentials for: [...]" and "Updated property [core/project].") while keeping a strict "fail if unknown stderr" check for anything else. - Replace the narrow pattern grep (non-retryable error\|infrastructure_failure\| test execution failed) with a broad whole-word case-insensitive grep for 'error', so any infrastructure or Firebase error in the output causes CI failure. - Verify that the number of device result rows in the result table matches the expected device count (1), so a silent test-run failure cannot slip through. - Add scripts/test_firebase_check.sh with 18 unit tests for the three new bash patterns (auth stderr filter, error-word detection, device count). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 16:31:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	ea712bdda9	docs: document dagger.Secret usage for sensitive credentials (#142 ) All production secrets (SSH key, Android keystore, Play Store config, Firebase service account) are already typed as dagger.Secret and injected via WithMountedSecret / WithSecretVariable. Add a Secrets section to DAGGER.md to make this explicit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 16:07:21 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	e057e1f483	fix: set Owner: "ci" on gradle and pub cache mounts The gradle-cache volume was mounted without an owner, so the root-owned volume caused "Permission denied" when the ci user tried to create gradle-8.14-all.zip.lck during bundleRelease. Add Owner: "ci" to all three WithMountedCache calls so the ci user can write to the caches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 15:55:30 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	cc51abd1fa	fix: reduce CI noise from apt-get, sdkmanager, stunnel, and Gradle (#140 ) - Add -qq to apt-get update/install in Dagger toolchain to suppress verbose package-list output (hundreds of lines on cold cache) - Wrap sdkmanager in silent-on-success pattern — only shows output on failure, like the build_runner and flutter pub get steps - Set debug = warning in stunnel config to suppress LOG5 (info/notice) startup lines while keeping LOG4 (warning) and above - Add org.gradle.welcome=never to android/gradle.properties to suppress the "Welcome to Gradle N.NN!" banner - Filter SKIPPED Gradle tasks, Gradle Daemon startup messages, and gcloud support-page promo lines in run_firebase_test.sh Errors and warnings are preserved in all cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 15:37:12 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	9e4a36b330	fix: drop -u 1000 from useradd in Dagger toolchain — UID already taken in flutter image The cirruslabs/flutter:3.41.6 image already has UID 1000 assigned to another user, so `useradd -u 1000` exits with code 4 ("UID not unique") and the ci user is never created. Dagger then fails to resolve `owner: "ci"` on subsequent WithDirectory calls. Removing the explicit UID lets useradd pick the next available one. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 15:19:05 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	f9a5aa0372	fix: do not run Flutter as root in CI (#138 ) Create a non-root user 'ci' (UID 1000) in the Dagger toolchain container, transfer ownership of the Flutter SDK and Android SDK to that user, and switch to it with WithUser("ci"). Update all cache mount paths from /root/ to /home/ci/ and set Owner: "ci" on every WithDirectory call so Flutter can write build output. Flutter emits a strong warning when run as root; this change eliminates that warning by running the tool as a regular user. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 15:09:42 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	a1cd31a2eb	fix: survive PlatformException(channel-error) in registerBackgroundSync (#149 ) On some Android devices (e.g. Android S1RXS32.50-13-25) the WorkManager platform channel fails to connect at startup, throwing PlatformException(channel-error, ...). registerBackgroundSync() now catches PlatformException and MissingPluginException (plus any other unexpected failure) and silently disables background sync rather than crashing the app. Test added: test/unit/background_sync_test.dart verifies the function completes without throwing in the unit-test environment (where the native plugin is absent). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 14:23:40 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	78b3d40a70	fix(agent-loop): use fgj for writes; tea api silently ignores auth errors `tea api` exits 0 even on 401 responses, so `_close_issue` and `_set_labels` appeared to succeed but did nothing. Issues were never actually closed, causing them to be picked up again every cron tick. Switch all write operations (close issue, set labels) and issue-list reads to `fgj`, which has proper authentication. Keep `tea api` only for CI run fetches where `fgj` times out (504). Add ~/go/bin to the cron PATH so fgj is found. Also add an error check in `_tea_get` for API-level error responses, and strip State/InProgress when closing an issue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 14:22:07 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	f7d021c62a	fix: survive MissingPluginException on startup, fix crash report URL (#146 ) Two fixes: 1. notification_service.dart: initNotifications() now catches MissingPluginException (and any other init failure) so the app no longer crashes when flutter_local_notifications is unavailable on some Android devices. _initialized tracks success; showNewMailNotification skips the plugin call when it never initialised. 2. crash_screen.dart: "Report Issue on Codeberg" no longer puts the full report in the URL query string. Long stack traces exceeded browser URL-length limits and caused "create issue failed". The URL now carries only the pre-filled title; the user copies the full report via "Copy to Clipboard" and pastes it in the issue body. Tests added: - test/unit/notification_service_test.dart: verifies initNotifications() completes without throwing when the plugin channel is unavailable. - test/widget/crash_screen_test.dart: verifies the Codeberg URL contains the title but no &body= parameter. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 13:01:34 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	ea52e89934	fix: run build_runner once via shared codegenBase, fix CheckMocks staleness detection (#137 ) Previously build_runner compiled separately for each setup() variant (checkSrc, backendSrc, integrationSrc, etc.) since their differing source inputs produced distinct Dagger cache keys. CheckMocks also ran build_runner twice: once inside setup() and again explicitly — and the second run always compared two freshly-generated outputs, so stale mocks in the repo were never detected. Introduce codegenBase() that runs build_runner on the minimal common source (lib/, test/, assets/, pubspec.) excluding committed generated files. All setup() calls now share this single Dagger cache entry, so build_runner compiles only once per pipeline run instead of once per source variant. Fix CheckMocks to start from pubGetLayer() + committed source (including any stale .mocks.dart), commit that state as the git baseline, then run build_runner once. The subsequent git diff now correctly detects stale mocks in the repository, matching the behaviour of check_mocks_fresh.sh. Also update Graph() to reflect the new codegenBase node. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 12:23:52 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	d72df5086c	feat: close issues in Python loop after CI passes, not in agent (#134 ) Previously issue agents were instructed to close the issue via prompt text immediately after pushing. If CI then failed, the issue was already closed. Now the loop tracks a pending_issue across cron ticks: - When an agent finishes (issue or ci-fix), the issue number is extracted from state before it is cleared. - If CI is still running, a "pending-ci" state preserves the issue number. - If CI fails, the ci-fix agent is started with the issue number in state so it survives the fix cycle. - Once CI passes, _close_issue() is called from Python — never by the agent. The agent prompt no longer instructs the agent to close the issue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 12:02:16 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	e46dc2961f	feat(agent-loop): improve output format with header, URLs, and no prefix (#133 ) - Add `---------------------- Starting YYYY-MM-DD HH:MMZ` header at each run - Remove `[agent_loop]` prefix from all output lines - Show full Codeberg URL for CI runs instead of bare run ID - Show full issue URL and title when referencing issues - Store issue_title in state file so "still running" messages include the title Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 11:50:30 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	3bd38e7a69	fix(agent-loop): update AGENTS.md and fix test invocation for InProgress workflow (#131 ) State/Ready → State/InProgress is already set by agent_loop.py before the agent starts. Update AGENTS.md to reflect that agents invoked via the loop must not set InProgress themselves (only manual workflows need to). Also fix TestMain tests that called main() directly, which caused argparse to consume sys.argv; they now call _run_loop() instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 11:41:28 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	d36d9a679d	fix: fail Android CI when gcloud reports non-retryable error (#143 ) Previously, `gcloud firebase test android run` could exit 0 while printing "A non-retryable error occurred." in its output. The old check `&& echo "$out" \|\| { exit 1; }` only caught non-zero exit codes, and the success grep `'Passed\|passed\|test cases'` was too broad — "test cases" can appear in Firebase output before the error, giving a false positive. The fix captures gcloud's exit code explicitly via `rc=$?`, adds an explicit error-string check for known Firebase failure phrases (non-retryable error, infrastructure_failure, test execution failed), and tightens the success pattern to `'Passed\|passed'` only. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 11:30:56 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	23cbe4611c	fix: resolve startup crash and CrashScreen button crashes (#127 ) Two bugs caused the crash-at-startup report: 1. CrashScreen used the widget's build context (above its own MaterialApp) for ScaffoldMessenger.of() in button callbacks. When the screen is the root widget — the runApp() path after a startup crash — there is no ScaffoldMessenger above it, so both 'Copy to Clipboard' and 'Report Issue on Codeberg' crashed with a null check error. Fix: wrap Scaffold.body in Builder to obtain a context that is a descendant of the Scaffold. 2. path_provider_android 2.2.21 updated to Pigeon 26, which causes a channel-error on startup for some Android devices. Pin to <2.2.21 (resolves to 2.2.20, which uses the stable pre-Pigeon-26 implementation). Additionally, make initDatabasePath() catch PlatformException so a channel error at the very start of main() no longer hard-crashes the app; _openConnection()'s lazy fallback retries after runApp() completes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 11:16:09 +02:00
Thomas SharedInbox	c4e7042430	agent-loop: pick Prio/High issues first among Ready issues	2026-05-22 10:54:27 +02:00
Thomas SharedInbox	f30c5076da	docs	2026-05-22 10:16:19 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	ee4f93752d	ci: check runner tools are pre-installed instead of downloading them Replace curl-based install of dagger/task with a hard check that fails immediately if any tool is missing from the runner image, pointing to .forgejo/Dockerfile as the fix location. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 10:07:55 +02:00
Thomas SharedInbox	19771a2060	docs	2026-05-22 10:02:36 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	e6baaaed74	ci: add Dockerfile for custom runner image Based on ghcr.io/catthehacker/ubuntu:go-24.04 with stunnel4, netcat-openbsd, dagger v0.20.8 and task v3.48.0 baked in so nothing is downloaded during CI runs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 09:51:35 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	92f3e30e00	ci: fail if Firebase Test Lab reports no test case results gcloud exits 0 even when no tests ran. Add a post-check that greps the output for 'Passed/passed/test cases' and fails explicitly if none are found, so 'no test case results' turns the CI red. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 08:58:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	ec195271c8	test: fail explicitly when Stalwart env vars are missing Previously setUpAll() fell back to 127.0.0.1 defaults when env vars were absent, causing Firebase Test Lab to report '0 test case results' instead of a clear failure. Now it calls fail() immediately with the list of missing variables. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 08:52:45 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	7936bf0a47	ci: require stunnel4/netcat-openbsd pre-installed on runner host Replace apt-get install with a hard check — if the packages are missing the job fails immediately with a clear error. Avoids flaky failures when archive.ubuntu.com is unreachable. Install once on the runner: sudo apt-get install -y stunnel4 netcat-openbsd Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 08:43:19 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	44d6227ba8	chore: track pubspec.lock and pin sqlite3 to ^3.1.5 pubspec.lock was incorrectly gitignored — this is a Flutter app, not a package, so the lockfile should be committed for reproducible builds. Without it, CI resolved drift to its minimum (2.20.3) which constrains sqlite3 to 2.x, causing dart analyze to disagree on whether Database.close() exists vs the local environment using 3.3.1. Also pins sqlite3: ^3.1.5 explicitly in pubspec.yaml as belt-and- suspenders so the constraint is visible without reading the lockfile. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 08:19:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	cd7455d3a5	ci: remove unnecessary CACHE_BUSTER from Firebase step The results-bucket change already busts the cache; Dagger doesn't cache failed execs anyway. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 07:43:13 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	f047dd34ea	ci: use project-owned bucket for Firebase Test Lab results The default Firebase Test Lab bucket is in a Google-managed project so project-level IAM grants have no effect on it. Use sharedinbox-ftl-results which is in sharedinbox-496103 where the service account has storage.admin. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 07:32:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	cc34b9b4b6	ci: retrigger Firebase Test Lab after billing enabled Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 07:24:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	8278b2f33c	ci: retrigger Firebase Test Lab after cloudtestservice.testAdmin grant Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 06:15:12 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	357f6e194c	ci: bust Dagger cache for Firebase Test Lab step WithEnvVariable(CACHE_BUSTER, time.Now()) ensures gcloud firebase test always runs fresh rather than returning a cached result from a prior run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 06:08:36 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	bf769db4dd	ci: retrigger Firebase Test Lab after IAM fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 06:05:47 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	47ab77feea	ci: retrigger Firebase Test Lab Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 21:46:36 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	12c95537f0	ci: retrigger Firebase Test Lab after Dagger engine restart Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 21:39:11 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	bcd87c642d	Add retry logic to run_firebase_test.sh for transient Dagger errors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 21:23:12 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	24f479b0ad	Filter Gradle/Dagger noise from Firebase Test Lab CI output Add scripts/run_firebase_test.sh that strips ANSI codes and removes UP-TO-DATE task lines, libsqlite warnings, Gradle deprecation notices and other high-volume noise before it hits the CI log. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 21:21:04 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	4f663dd0c8	ci: retrigger Firebase Test Lab Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 20:39:55 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	e44aabe210	ci: retrigger Firebase Test Lab after granting storage.admin role Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 20:32:59 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	3b90d42389	ci: retrigger Firebase Test Lab after enabling Cloud Tool Results API Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 19:56:47 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	1991508a8b	Fix Firebase Test Lab device model ID: Pixel6 -> oriole 'Pixel6' is not a valid Firebase Test Lab model ID. 'oriole' is the correct internal codename for Pixel 6. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 18:58:56 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	cf674009ee	ci: retrigger Firebase Test Lab after fixing project ID and enabling APIs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 18:54:20 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	569c8b2e7a	ci: retrigger Firebase Test Lab after enabling Cloud Testing API Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 18:21:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	689ce8721d	Fix androidTest APK search path — Flutter redirects Gradle output to /src/build Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:40:17 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	6bb191ee99	Fix androidTest APK path using find instead of hardcoded path The exact output path varies by AGP version. Use find to locate the test APK and copy it to a known location. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:34:41 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	01cbf5b805	Add Firebase Test Lab integration for Android instrumented tests Implements issue #132. Builds debug app APK + androidTest APK via Dagger, then runs them on Firebase Test Lab using the FIREBASE_TEST_LAB_SERVICE_ACCOUNT_KEY secret and FIREBASE_PROJECT_ID variable. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:20:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	2e080dd4ed	fix(ci): remove SIGKILL fallback from check-dagger cleanup The GET /shutdown endpoint on otel-receiver.py is the one clean shutdown path. cleanup() only needs to remove temp files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 15:24:11 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	041e496e58	fix(ci): rename otelrecv→otel-receiver, fix teardown hang Rename ci/otelrecv.py to ci/otel-receiver.py for readability. Replace SIGTERM+wait shutdown (which could hang indefinitely) with an HTTP-based approach: add GET /shutdown to otel-receiver.py that calls self.server.shutdown() directly. After dagger call returns, curl that endpoint so the receiver prints its timing report and exits cleanly. Cleanup is reduced to a SIGKILL fallback in case the process is already gone. Also fix the do_GET handler to reference self.server instead of the local variable server, which was inaccessible from the handler class. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 15:18:34 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	f2d24a8514	fix(ci): reduce noise in CI output (#128 ) - Filter flutter pub get package-listing lines (^[+~><] ) in pubGetLayer - Filter build_runner compilation-progress lines (^\[) in setup() and CheckMocks() - Add -q to git commit in CheckMocks to suppress "460 files changed" stats - Wrap flutter test in Coverage, TestBackend, TestIntegration, TestSyncReliability to show only the summary line on success and full output on failure - Apply same build_runner filter to scripts/check_mocks_fresh.sh for local runs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 14:51:56 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	9dc34cefe5	ci: add 30-minute Dagger-side timeout to Check pipeline If any step hangs (stuck service, deadlocked test, network stall), the pipeline will now cancel itself after 30 min rather than blocking the runner indefinitely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 11:53:49 +02:00
Thomas SharedInbox	f315c21c9a	add "list" sub-command to agent-loop to resume via UUID.	2026-05-21 11:49:32 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	541c1a0b53	fix(ci): reduce noise in CI output (#128 ) Remove per-request debug logs from otelrecv.py (POST, decoding, decoded, 200 sent, signal) that were added to diagnose the CI hang, which has since been resolved. Remove verbose [HH:MM:SS] timestamp messages from check-dagger (start, pipeline done, otelrecv started/ready, final RC, cleanup start/done) for the same reason. Fix cleanup to send SIGTERM + wait instead of SIGKILL so the OTEL timing report is actually printed at the end of each CI run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 10:45:40 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	34fb51d85d	feat(ci): add Graph() to visualize CI pipeline as Mermaid diagram (#126 ) Adds a Ci.Graph() Dagger function that emits a Mermaid flowchart showing both the Dagger Check pipeline (toolchain → pubGetLayer → parallel steps) and the Codeberg CI job dependencies (check → build-linux / deploy-playstore → publish-website). Usage: dagger call -m ci --source=. graph task ci-graph Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 10:28:28 +02:00
Thomas SharedInbox	58f1a4da42	feat(website): vendor PaperMod theme, remove git submodule (#125 ) Replace the git submodule with directly tracked files so that `git commit .` no longer fails with 'does not have a commit checked out'. Removed .github/ from the vendored copy since upstream CI workflows are not needed here.	2026-05-21 10:21:09 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	07823373c1	fix(ci): add withGoCache helper and pip cache for UploadToPlayStore Adds withGoCache() that mounts GOCACHE and GOMODCACHE as Dagger cache volumes — the standard pattern for any Go container added to the pipeline. Also adds pip cache to UploadToPlayStore so pip wheel downloads are reused between Play Store deploys. Closes #123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 06:41:04 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	1af2a36af7	fix(ci): remove pub cache volume from pubGetLayer for stable execution cache flutter pub get was re-running on every CI run because Base() attached a mutable WithMountedCache volume to /root/.pub-cache, making the execution cache key unstable. Extract toolchain() without cache mounts; pubGetLayer() now uses toolchain() so Dagger execution-caches pub get between runs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 06:35:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	0cb181b138	fix(ci): remove wait from otelrecv cleanup; add pkill by name as fallback wait "$RECV_PID" was blocking despite kill -9 (possibly because $RECV_PID was garbled by ANSI escape codes from dagger output, making kill target the wrong PID). Fix: - Remove wait entirely — zombie is reaped when the shell exits - Add pkill -9 -f otelrecv.py as fallback in case kill-by-PID misses - Log PID at capture time to verify correctness in CI logs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 21:09:24 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	320fbcabc3	fix(ci): kill otelrecv with SIGKILL in cleanup, add timing logs, re-enable OTEL Three changes: - cleanup() now uses kill -9 instead of kill (SIGTERM) to prevent wait hanging if otelrecv's signal handler stalls - adds [HH:MM:SS] log lines at key points so CI logs show exactly where time is spent - restores OTEL env vars (via env VAR=val) since they were confirmed not to cause the hang Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 21:00:20 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	f187012f58	debug(ci): temporarily disable OTEL env vars to test if they cause dagger hang Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 20:31:48 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	d4b265724e	fix(otelrecv): set close_connection=True so server actually closes after response Sending Connection: close in the header without closing the server-side socket left both dagger's Go HTTP client and Python's HTTPServer waiting for the other to send FIN first. This blocked dagger's OTLP exporter shutdown, which in turn blocked dagger from exiting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 20:14:27 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	36b54a08d6	fix(ci): use --kill-after=10 on timeout so dagger is SIGKILLed if it ignores SIGTERM dagger ignores SIGTERM, keeping the pipe's write end open; tee can never get EOF and the script hangs. --kill-after=10 follows up with SIGKILL which closes the pipe and unblocks the script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 20:02:44 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	4a99d47aa5	fix(ci): add TCP keepalive to stunnel to prevent NAT connection resets Connection drops consistently at ~50s suggest NAT/firewall idle timeout. Keepalive probes every 10s on the remote side prevent the RST. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 19:43:16 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	c3737fb47f	fix(ci): retry dagger call on TCP connection failures (up to 3 attempts) On network errors (connection reset, context canceled, connection refused) retry the dagger call rather than failing immediately. Real test failures propagate without retry. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 18:47:38 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	88e8a9ab5c	fix(ci): add 10-minute timeout to dagger call; treat teardown hang as success dagger call hangs after function completion due to HTTP/2 teardown bug in remote engine mode. Capture output via tee; if timeout fires but output contains "All tests passed", exit 0 instead of 124. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 16:38:33 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	a078122d28	refactor(ci): replace dual DAGGER_STUNNEL_URL1/2 with single DAGGER_STUNNEL_URL The engine is stable; no fallback needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 15:48:38 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	e60459ea2e	fix(ci): add .task/ and .fvm/ to .daggerignore to skip walk Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:52:19 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	92cc725913	refactor: simplify .daggerignore and fix hardcoded path after repo move to sharedinbox/ .daggerignore no longer needs to exclude $HOME dirs (fvm/, go/, .pub-cache/, .claude/, snap/, etc.) since the project root is now sharedinbox/, not $HOME. agent_loop.py: replace hardcoded /home/si with Path.home(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:43:29 +02:00
Thomas SharedInbox	bd03484fcc	Revert "fix(ci): kill dagger via timeout when it hangs in gRPC teardown" This reverts commit `7e155f5785`.	2026-05-20 13:11:07 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	242e1ce4a4	fix(ci): exclude fvm/ and other large dirs from Dagger source sync The source sync (Directory.Sync in selectFunc) was uploading ~7.4 GB / 78k files to the remote engine, blocking dagger call for 16+ minutes. Root cause: .daggerignore had '.fvm/' but the actual directory is 'fvm/' (no leading dot), so the 1.9 GB Flutter SDK cache was always uploaded. Also missing: go/ pkg cache (309 MB), .claude/ session files, agent logs. goroutine dump confirmed the hang in directoryValue.Get → Directory.Sync → HTTP/2 roundTrip waiting on the engine — not gRPC teardown as suspected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:04:53 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	7e155f5785	fix(ci): kill dagger via timeout when it hangs in gRPC teardown After tests complete, dagger call hangs in gRPC connection close to the remote engine — OTEL shuts down cleanly (spans stop) but the process never exits. Wrapping with timeout 900s and treating exit 124 as success unblocks CI and lets the OTEL timing report print. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 12:36:13 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	95d114cc38	debug(otelrecv): add stderr logging to diagnose CI hang Log each POST request, decode step, 200 response, signal receipt, and server shutdown to understand where the hang occurs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 12:22:04 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	d5e3974d94	fix(otelrecv): send explicit Content-Length + Connection: close Without Content-Length the Go HTTP/1.1 client can't tell the response body is empty, causing dagger call to hang waiting for more data. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 12:07:57 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	1c27dc4f71	fix(ci): use http/protobuf OTEL protocol with binary protobuf receiver http/json is not supported by the Go OTEL SDK used in Dagger v0.20.8. Switch to http/protobuf (the SDK default) and rewrite the Python receiver to decode binary protobuf using stdlib struct — no pip required. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 11:46:58 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	691f2beec2	fix(ci): switch timing from OTEL receiver to --progress=plain pipe filter Dagger v0.20.8 only supports 'grpc' and 'http/protobuf' OTLP protocols; 'http/json' triggers a WARN and exports nothing. The new approach pipes dagger's --progress=plain output through a Python script that echoes it in real-time and prints a timing table at EOF. No HTTP server, no port files, no protocol issues — works locally and in CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 11:43:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	ac2178916e	refactor(ci): replace Go OTEL receiver with Python (stdlib, no deps) python3 is pre-installed on ubuntu-latest so the timing report now also runs in CI, not just locally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 11:30:08 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	b10696a41e	fix(ci): remove tmp timing file — receiver writes directly to stdout TIMINGFILE=$(mktemp) was an unnecessary /tmp path. The receiver already prints its report to stdout on shutdown; wait $RECV_PID captures it in place. Only PORTFILE remains in /tmp (unique via mktemp, deleted in cleanup). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 10:38:26 +02:00
Thomas SharedInboxandClaude Sonnet 4.6	3471e1fd2c	feat(ci): OTEL timing receiver for check-dagger Adds ci/otelrecv/main.go — a minimal OTLP HTTP/JSON trace receiver that listens on a random port (port 0) so parallel runs never collide. The check-dagger Taskfile task now starts the receiver in the background, passes the port via a mktemp file, runs dagger with OTEL env vars set, then prints a per-span timing report on shutdown. Falls back to plain dagger call when Go is not available (e.g. CI containers without Go). First run will show raw attribute keys so we can learn Dagger's exact telemetry format and refine the cached/live detection logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 10:27:57 +02:00
Thomas SharedInbox	f23328fd1f	ci: empty commit to verify cache stability	2026-05-20 09:39:47 +02:00