- Add merge verification to the pending_issue PR path (section 2), matching
the catch-up scan fix: if the PR is still open after _merge_pr returns, set
State/Question instead of claiming success and leaving the issue closed with
an unmerged PR.
- Replace _latest_ci_run() with _latest_main_ci_run() that filters to
non-pull_request events on the 'main' prettyref. The old limit=1 query
could return a PR-branch run, causing section 3 to misread CI as failed
and spawn a ci-fix agent when main was actually fine.
- Guard against redundant ci-fix agents: when the same main CI run ID has been
failing since the previous ci-fix started (agent pushed to a branch, not
main), check for any in-flight CI run before spawning another agent.
- Issue agent prompt: explicitly forbid "Closes #N" / "Fixes #N" in commit
messages. The loop is responsible for closing issues after CI passes;
commit-keyword auto-close would race with or bypass that logic.
- Global ci-fix prompt: restore "push directly to main" (ci-fix agents need to
land on main to clear the main CI run) and keep the "no issue references"
guard added in the previous commit.
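
The main-only filter from the second bullet might look roughly like the sketch below. It assumes each run is a dict with `event` and `prettyref` keys and that the list is ordered newest first. The function name `latest_main_ci_run` and the sample data are illustrative and are not taken from agent_loop.py.

```python
from typing import Optional


def latest_main_ci_run(runs: list[dict]) -> Optional[dict]:
    """Return the newest run triggered on main by a non-pull_request event.

    Assumption: `runs` is ordered newest first, and push runs carry the
    branch name in `prettyref` (field names taken from the commit text).
    """
    for run in runs:
        if run.get("event") == "pull_request":
            continue  # PR-branch runs must not be mistaken for main
        if run.get("prettyref") == "main":
            return run
    return None


# Made-up sample data: the newest run belongs to a PR branch and failed.
runs = [
    {"id": 12, "event": "pull_request", "prettyref": "pr-branch", "status": "failure"},
    {"id": 11, "event": "push", "prettyref": "main", "status": "success"},
]
assert latest_main_ci_run(runs)["id"] == 11
```

With a plain limit=1 query, run 12 would have been returned and main would have looked broken.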
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous PRs (#150, #179) added partial implementations that left duplicate
code via a rebase conflict: plain (non-linked) text above the stacktrace and a
clickable link section below it. This consolidates both into a single clickable
link above the stacktrace.
Also makes `gitHash` an injectable constructor parameter so tests can exercise
the link without needing a release build.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs fixed:
1. Catch-up scan (section 2b) called _merge_pr and immediately returned,
claiming success even when fgj exits 0 but the merge silently failed
(e.g. branch-protection rules not satisfied). PR #163 was retried 30+
times in a row because the PR stayed open after each attempt.
Fix: verify the PR is no longer open after the merge call; if it is still
open, set the issue to State/Question instead of looping forever.
2. ci-fix agents wrote "Closes #198" in commit messages, causing Forgejo to
auto-close issue #198 ("Unable to load asset: assets/changelog.txt") even
though the commit only fixed the unrelated Play Store upload.
Fix: both ci-fix prompts now explicitly forbid issue-number references in
commit messages and close operations. Also save ci_run_id_at_start in
the ci-fix state (was only done for issue agents) so future guard logic
can compare run IDs.
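
A minimal sketch of the verify-after-merge step from fix 1, assuming the merge call, the PR state lookup, and the label update are passed in as callables. All three parameters and the name `merge_and_verify` are hypothetical stand-ins for the real helpers.

```python
from typing import Callable


def merge_and_verify(
    pr_number: int,
    merge_pr: Callable[[int], None],
    pr_state: Callable[[int], str],
    set_label: Callable[[str], None],
) -> bool:
    """Merge a PR and confirm that it actually left the open state."""
    merge_pr(pr_number)  # may exit 0 even though nothing was merged
    if pr_state(pr_number) == "open":
        # Hand the issue to a human instead of retrying forever.
        set_label("State/Question")
        return False
    return True
```

Returning a boolean lets the caller report success only when the PR is really gone from the open list.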
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add catch-up scan in agent_loop that finds all open issue-N-fix PRs and
merges those with passed CI, using event-filtered API query (limit=50)
to cover weeks of history instead of the previous ~1.5 h window.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When flutter_secure_storage's platform channel is unavailable (e.g. on
certain Android devices), getPassword() throws MissingPluginException.
Previously this was not recognised as a permanent error, so the IMAP and
JMAP sync loops retried indefinitely with exponential back-off, filling
the sync log with repeated failures (as shown in the screenshot).
Treat MissingPluginException as a permanent error in both _AccountSync
and _JmapAccountSync so the loop stops immediately instead of retrying.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ci.yml: add paths filters to push and pull_request triggers so the full
Dagger check only runs when source-relevant files change (lib/, test/,
android/, linux/, scripts/, ci/, Taskfile.yml, etc.). Pure website,
docs, and assets/changelog.txt commits no longer trigger ci.yml.
- deploy.yml: add check-changes job that diffs HEAD~1..HEAD and outputs
android/linux booleans. On workflow_dispatch both are always true.
test-android-firebase, deploy-playstore, and deploy-apk are now
conditional on android==true; build-linux is conditional on linux==true.
label-deploy-health only fires when at least one build job actually ran
(not all skipped) and treats 'skipped' as acceptable in ALL_SUCCEEDED.
- ci/main.go Graph(): update Mermaid diagram to reflect the new
two-workflow structure (ci.yml fast-check + deploy.yml with change-gated jobs).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap the '## sharedinbox.de' heading in a markdown hyperlink to https://sharedinbox.de
- Add a dedicated 'Git Commit' table row with a clickable link to the commit on Codeberg when GIT_HASH is set
- Update clipboard test to assert the heading link is present in copied markdown
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Play Store AAB upload was failing with httplib2.error.RedirectMissingLocation
when Google's API returned a redirect during the resumable upload initiation.
Switched from google-api-python-client (which uses httplib2 internally) to
pure requests-based AuthorizedSession, which handles redirects correctly.
Closes #198
The previous tests patched google_auth_httplib2 and googleapiclient which
no longer exist in the new implementation. Rewrite to mock AuthorizedSession
and _upload_aab_resumable, covering the same scenarios: happy path, retry
on transient errors, backoff delays, and exhausted attempts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
httplib2 treats 308 Resume Incomplete responses (used by Google's
resumable upload API) as redirects and raises RedirectMissingLocation
when the response lacks a Location header. Switch to
google.auth.transport.requests.AuthorizedSession + direct HTTP calls
so the upload uses the requests library, which handles 308 correctly.
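
To illustrate why 308 is a progress report and not a redirect here, the sketch below interprets the reply to a resumable-upload request. It assumes the `Range: bytes=0-N` convention of Google's resumable upload protocol; the helper name `next_upload_offset` is hypothetical and is not part of deploy_playstore.py.

```python
import re
from typing import Mapping, Optional


def next_upload_offset(status: int, headers: Mapping[str, str]) -> Optional[int]:
    """Interpret the server's reply to a resumable-upload request.

    Returns the byte offset to resume from when the status is 308, or
    None for any other status (upload finished or failed).
    """
    if status != 308:
        return None
    match = re.fullmatch(r"bytes=0-(\d+)", headers.get("Range", ""))
    # A 308 without a Range header means the server has stored nothing yet.
    return int(match.group(1)) + 1 if match else 0
```

No Location header is involved at any point, which is why a client that insists on one for every 3xx status fails.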
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a 3-attempt retry loop around the resumable AAB upload that catches
httplib2.error.RedirectMissingLocation (a transient network error) and
retries with exponential backoff (10s, 20s). A fresh MediaFileUpload is
created on each attempt because resumable upload objects cannot be reused
after failure. Also adds TestUploadRetry covering the retry path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wrap the resumable bundle upload in a loop of up to _MAX_UPLOAD_ATTEMPTS (3)
attempts. On httplib2.error.RedirectMissingLocation, recreate MediaFileUpload
(resumable uploads cannot reuse the same object) and wait 10 s / 20 s before
retrying. After all attempts are exhausted, raise RuntimeError chained to the
last exception. Add tests covering the retry path, backoff delays, fresh
MediaFileUpload on each attempt, and exhaustion.
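
The retry loop described above could be sketched as follows. `make_media` and `upload` are hypothetical callables standing in for the MediaFileUpload construction and the actual upload call, and `retry_on` stands in for httplib2.error.RedirectMissingLocation so the sketch runs without httplib2 installed.

```python
import time
from typing import Callable, Type, TypeVar

M = TypeVar("M")
T = TypeVar("T")

_MAX_UPLOAD_ATTEMPTS = 3
_BACKOFF_SECONDS = (10, 20)  # wait before the 2nd and 3rd attempt


def upload_with_retry(
    make_media: Callable[[], M],
    upload: Callable[[M], T],
    retry_on: Type[BaseException] = Exception,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `upload` up to three times, building a fresh media object each time."""
    last_error = None
    for attempt in range(_MAX_UPLOAD_ATTEMPTS):
        media = make_media()  # a failed resumable upload object cannot be reused
        try:
            return upload(media)
        except retry_on as exc:
            last_error = exc
            if attempt < len(_BACKOFF_SECONDS):
                sleep(_BACKOFF_SECONDS[attempt])
    raise RuntimeError(
        f"upload failed after {_MAX_UPLOAD_ATTEMPTS} attempts"
    ) from last_error
```

Injecting `sleep` keeps tests fast: a test can pass a list's `append` method and assert on the recorded delays.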
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch deploy_playstore.py from requests/AuthorizedSession to the
googleapiclient.discovery client with google-auth-httplib2, so that
AuthorizedHttp(timeout=300) enforces a hard socket timeout on all
requests and num_retries=3 on every .execute() call enables automatic
retries for transient failures.
Update flake.nix and ci/main.go to install the new dependencies
(google-api-python-client, google-auth-httplib2, httplib2) instead of
the old google-auth + requests pair.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace local `task publish-website` invocation with `fgj actions workflow run website.yml`
so the deploy runs in CI rather than on the local machine. Remove failure-tracking state
files and issue-creation logic — Forgejo Actions handles its own reporting.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two issues from #179:
- crash_screen.dart now reads GIT_HASH compile-time constant and includes
'Git Commit: <hash>' in both the on-screen UI and the copied report, so
crash reports always show the exact build that crashed.
- _resolveDatabasePath() retry delays extended from [100, 300, 600] ms
(total ~1 s, 4 attempts) to [200, 500, 1000, 2000, 4000] ms (total
~7.7 s, 6 attempts) to handle slow/non-standard Android devices where
the path_provider Pigeon channel takes several seconds to become ready.
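
The totals quoted for the two delay schedules can be checked with a few lines of arithmetic. The helper name `retry_budget` is illustrative only. It assumes one initial attempt before the first delay, which matches the attempt counts given above.

```python
def retry_budget(delays_ms: list[int]) -> tuple[int, int]:
    """Return (total wait in ms, number of attempts) for a delay schedule.

    One attempt is made before the first delay, so the attempt count is
    one more than the number of delays.
    """
    return sum(delays_ms), len(delays_ms) + 1


assert retry_budget([100, 300, 600]) == (1000, 4)
assert retry_budget([200, 500, 1000, 2000, 4000]) == (7700, 6)
```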
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
- Increases the retry delays in `_resolveDatabasePath()` from `[100, 300, 600]` ms (~1 s) to `[200, 500, 1000, 2000]` ms (~3.7 s).
- Adds a regression test (`test/unit/database_path_test.dart`) that verifies `initDatabasePath()` does not throw when the `path_provider` channel is unavailable.
## Root cause
On some slow Android devices (e.g. the Motorola reported in #166), the `path_provider` Pigeon channel is not ready even several seconds after `runApp()` returns. The previous back-off budget of ~1 s was not enough, causing `_resolveDatabasePath()` to exhaust all retries and throw a `PlatformException`, crashing the app with the message shown in the issue.
## Test plan
- [ ] `flutter test test/unit/database_path_test.dart` passes (new regression test)
- [ ] `flutter test test/unit/` — all 325 unit tests pass
- [ ] `flutter analyze` — no issues
Fixes #166
Co-authored-by: Thomas SharedInbox <sharedinbox@thomas-guettler.de>
Reviewed-on: https://codeberg.org/guettli/sharedinbox/pulls/169
The Forgejo workflow_runs API has no head_branch field. For pull_request
events the branch lives in event_payload["pull_request"]["head"]["ref"];
for push events it is in prettyref. The old code used run.get("head_branch")
which always returned None, causing _latest_ci_run_for_branch to never find
the run and the loop to declare "no CI run after 15 min" and set the issue to
State/Question — even when CI had already passed.
Also fixes a pre-existing test mock that was missing the session_name kwarg.
Adds TestLatestCiRunForBranch covering both event types and the regression.
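
A rough sketch of the branch lookup for both event types. The field names come from the description above; the function name `run_branch` is hypothetical, and the handling of a JSON-encoded `event_payload` string is an extra assumption in case the API delivers the payload that way.

```python
import json
from typing import Optional


def run_branch(run: dict) -> Optional[str]:
    """Best-effort branch name for a workflow run dict."""
    if run.get("event") == "pull_request":
        payload = run.get("event_payload") or {}
        if isinstance(payload, str):  # assumption: may arrive JSON-encoded
            payload = json.loads(payload)
        return payload.get("pull_request", {}).get("head", {}).get("ref")
    # Push events: the branch name is in prettyref; head_branch does not exist.
    return run.get("prettyref")
```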
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
env:SSH_PRIVATE_KEY passes the key through shell command substitution
$(), which strips the trailing newline, so dagger writes a truncated key that OpenSSH
rejects with "error in libcrypto". Using file: reads it directly from
disk, preserving exact content.
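
The newline stripping can be reproduced outside Dagger. The sketch below round-trips a string through POSIX `sh` command substitution; it assumes `sh` is on PATH, and the helper name `through_command_substitution` is illustrative only.

```python
import subprocess


def through_command_substitution(text: str) -> str:
    """Round-trip `text` through shell $(...) and return what comes out."""
    script = 'value=$(printf "%s" "$1"); printf "%s" "$value"'
    result = subprocess.run(
        ["sh", "-c", script, "sh", text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Interior newlines survive; the trailing one is stripped.
assert through_command_substitution("line1\nline2\n") == "line1\nline2"
```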
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Dagger container running generate_build_history.py may not always
reach the deployment server (network constraints on the Dagger engine).
Rather than aborting the entire publish-website pipeline, log the SSH
verbose output (already added in the previous debug commit) and return
an empty file list so Hugo still builds and rsync still deploys the
site — just without updated build-history pages.
This unblocks the cron deploy that has been failing since c259d2da.
Temporary: print verbose SSH output on failure to identify why the
connection fails from inside the dagger container.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dagger mounts the secret file with 0600 but the parent directory may
get created with world-readable permissions, causing SSH to refuse
the key with exit 255.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All other ssh/scp calls in the dagger module use explicit -i /root/.ssh/id_ed25519.
This one was missing it, causing exit 255 inside the dagger container.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dagger parses .env directly and fails on multiline quoted values.
Move SSH_PRIVATE_KEY out of .env and export it from ~/.ssh/id_ed25519
in the wrapper instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tracks consecutive failure count in .fail_count. On the 5th failure
for the same SHA, creates a Prio/High + State/Ready Codeberg issue.
Before creating, checks local .last_issue_sha and queries Codeberg
open issues to avoid duplicates.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
If the last deploy failed and origin/main has not advanced, opens a
Prio/High + State/Ready issue via tea with the failing SHA, commit link,
and captured deploy output. Skips duplicate issues (tracked by
.last_issue_sha). Cron interval changed to */5.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- agent_loop.py: create log dir with mode 0700 and enforce it on
existing dirs; open log files with mode 0600; chmod state file
to 0600 after every write. Prevents other local processes from
reading agent output (which may contain credential paths) or
tampering with the state file's pid field.
- ci/main.go (TestAndroidFirebase): replace
echo "$FIREBASE_SA_KEY" > /tmp/key.json
with bash process substitution
--key-file=<(echo "$FIREBASE_SA_KEY")
The key is now passed via a file descriptor — it never touches
disk, so it cannot be stranded by a failed gcloud auth call or
snapshotted into the Dagger layer cache.
- ci.yml / deploy.yml: add "Cleanup TLS credentials" step
(if: always()) at the end of every job that calls
setup_dagger_remote.sh. Removes /tmp/dagger-tls,
/tmp/stunnel-dagger.conf, /tmp/stunnel.pid from the self-hosted
runner after each job, so client certs do not accumulate between
job runs.
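
A minimal sketch of the permission hardening from the first bullet, for POSIX systems. The helper names `ensure_private_dir` and `open_private_log` are hypothetical and are not the ones used in agent_loop.py.

```python
import os
from pathlib import Path


def ensure_private_dir(path: Path) -> None:
    """Create `path` with mode 0700 and tighten it if it already exists."""
    path.mkdir(mode=0o700, parents=True, exist_ok=True)
    # mkdir's mode is filtered by the umask and ignored for existing dirs,
    # so enforce the final mode explicitly.
    os.chmod(path, 0o700)


def open_private_log(path: Path):
    """Open a log file for appending, readable and writable by the owner only."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    os.fchmod(fd, 0o600)  # also tighten a file that already existed
    return os.fdopen(fd, "a", encoding="utf-8")
```

Passing the mode to `os.open` means a newly created file is never more permissive than 0600, unlike a plain `open()` followed by `chmod`.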
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
flutter pub get is pure Dart — it never invokes Gradle. The mutable
gradle-cache volume mount caused the same execution-cache instability
we just fixed for the pub cache: Dagger sees a changed volume and
cache-misses pubGetLayer() on every run.
The Gradle cache stays in Base(), which is only used for steps that
actually build Android code.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The mutable flutter-pub-cache volume made the execution cache key unstable —
pub get cache-missed every run because the volume's mutable layer changed the
snapshot hash. Removing the volume lets Dagger snapshot packages inside the
execution-cache layer, which is stable and reclaimable via dagger prune.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On some Android versions the path_provider Pigeon channel
('dev.flutter.pigeon.path_provider_android.PathProviderApi.getApplicationSupportPath')
is not ready when initDatabasePath() runs before runApp(). The existing code
already catches PlatformException there, leaving _dbPath null — but the
LazyDatabase callback called getApplicationSupportDirectory() a second time
without any protection, causing an unhandled crash on those devices.
Fix: extract _resolveDatabasePath() which retries three times with back-off
(100 ms → 300 ms → 600 ms) before re-throwing with a descriptive error
message. By the time the database is first accessed (after runApp()), the
channel is almost always available; if it still isn't, the CrashScreen is
shown with a clear explanation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>