Commit Graph
24 Commits
Author SHA1 Message Date
Bot of Thomas Güttler e2bb299300 fix(ci): exclude chaos_monkey_test from regular CI (#518) 2026-06-07 04:24:10 +02:00
Bot of Thomas Güttler f7fd30da15 feat(ci): add Print runner wait time step to all workflow jobs (#517) 2026-06-07 02:40:08 +02:00
29c2c7e96c fix: three deploy failures from run #1424 (#369)
## Summary

Fixes three distinct failures from CI deploy run #1424 and concurrent website update failures.

- **Play Store job**: `pip install google-auth requests` fails on Ubuntu 24.04 with PEP 668. Fixed by using `python3 -m venv` for an isolated install.
- **SSH key error (APK, Linux, website jobs)**: All SSH/rsync steps fail with `Load key "/root/.ssh/id_ed25519": error in libcrypto` inside the Dagger Alpine 3.21 container. This is the first time these jobs actually ran (all previous deploy runs had every job skipped). Two fixes:
  - `setup_dagger_remote.sh`: `export_secret` was appending an extra trailing newline to values (like SSH private keys) that already end with `\n`. Now only adds one when needed.
  - `ci/main.go` `Deployer`: mounts the key at a `.raw` path, strips Windows-style CRLF endings with `tr -d '\r'`, then writes the normalised key to `id_ed25519`. CRLF bytes cause "error in libcrypto" in Alpine's LibreSSL-backed openssh.

## Test plan
- [ ] Deploy run triggers after merge; all three deploy jobs complete
- [ ] Play Store verification step passes
- [ ] SSH commands in Alpine load the key without `error in libcrypto`

Closes #366

Co-authored-by: Thomas SharedInbox <sharedinbox@thomas-guettler.de>
Reviewed-on: https://codeberg.org/guettli/sharedinbox/pulls/369
2026-06-03 21:23:13 +02:00
6a097976d3 fix: correct LAST_DEPLOYED_SHA detection so Play Store always gets updated (#364)
Closes #361

Three bugs in the hourly deploy workflow's change-detection logic caused the Play Store to silently fall behind whenever a deploy failed or all-android jobs were skipped.

**Bug 1 (primary): commit_sha → head_sha**
Forgejo's API returns head_sha; commit_sha was always None. This meant LAST_DEPLOYED_SHA was always empty, so the diff fell back to HEAD~1..HEAD — only the single most recent commit was inspected. If android changes landed in an earlier commit, they were silently missed.

**Bug 2: Skipped runs counted as 'deployed'**
A workflow run where deploy-playstore was skipped (android=false) has status=success, so it was treated as a successful deploy. Now the code queries each run's job results and only trusts a run where the 'Build & Deploy to Play Store' job's own conclusion=success.

**Bug 3: Narrow fallback when SHA unknown**
When LAST_DEPLOYED_SHA could not be determined the workflow diffed HEAD~1..HEAD — potentially missing many commits. Now it defaults to android=true / linux=true (deploy everything) as the safe fallback.

Additional changes:
- ::error:: / ::warning:: / ::notice:: annotations so skip/failure reasons surface in the Actions UI.
- scripts/verify_playstore_deploy.py: new post-deploy check that queries the internal track and fails if the latest version code is more than 1 hour old. (Version codes are Unix timestamps set by ci/main.go's PublishAndroid.) Catches silent deploy failures the upload API did not reject.
- scripts/test_verify_playstore_deploy.py: 5 unit tests for the verify script (all pass).

Co-authored-by: Thomas SharedInbox <sharedinbox@thomas-guettler.de>
Reviewed-on: https://codeberg.org/guettli/sharedinbox/pulls/364
2026-06-03 19:26:00 +02:00
9605c5e3b7 ci: print explicit reason when deploy jobs are skipped (#357)
## Summary

- The \`Detect Changed Files\` step in \`deploy.yml\` previously set \`android=false\` / \`linux=false\` silently, leaving downstream jobs showing only "skipped" in CI with no visible cause
- Now each decision emits a clear one-liner in the step log:
  - \`Android deploy: SKIPPED (no android-relevant files changed)\`
  - \`Android deploy: TRIGGERED (android-relevant files changed)\`
  - \`Linux deploy: SKIPPED (no linux-relevant files changed)\`
  - or \`HEAD <sha> already successfully deployed — skipping all deploy jobs\`
- The skip reason is visible in the \`check-changes\` job output, which is the job that makes the decision

Closes #353

## Test plan

- [ ] Trigger the deploy workflow on a commit that only touches CI/docs files — \`check-changes\` step log should show "Android deploy: SKIPPED (no android-relevant files changed)"
- [ ] Trigger the deploy workflow on a commit touching \`lib/\` — log should show "Android deploy: TRIGGERED"
- [ ] Trigger a second run on the same commit — log should show "already successfully deployed — skipping all deploy jobs"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Thomas SharedInbox <sharedinbox@thomas-guettler.de>
Reviewed-on: https://codeberg.org/guettli/sharedinbox/pulls/357
2026-06-03 13:27:29 +02:00
Bot of Thomas Güttler 2747c4e63d chore: migrate CI secrets from Forgejo to SOPS (#354) 2026-06-03 06:37:07 +02:00
Thomas Güttler b0a09939c9 chore: migrate all workflows to SSH-based Dagger engine and remove stunnel legacy 2026-06-02 17:40:35 +02:00
Bot of Thomas Güttler 91083218d4 fix: diff from last deployed SHA to catch all changes since last deploy (#320) (#332) 2026-05-29 17:34:21 +02:00
Bot of Thomas Güttler adc4eb6f6d feat: remove publish-website from deploy.yml, schedule website.yml hourly (#325) (#330) 2026-05-29 12:53:18 +02:00
Thomas SharedInbox a8d6ec5861 fix: use commit_sha instead of head_sha to detect already-deployed commits
Forgejo's API returns head_sha=null in workflow run objects; the correct
field is commit_sha. The skip-check always got None, so every hourly
schedule triggered a full redeploy of the same commit.
2026-05-26 15:22:23 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 720c54433a feat: run Firebase tests once daily via dedicated workflow (#272)
Move Android Firebase instrumented tests out of deploy.yml into a new
firebase-tests.yml workflow that runs once per day (3 AM UTC) and only
when Firebase-relevant files changed in the last 24 hours. On failure,
the workflow automatically creates a Forgejo issue labelled "Ready" with
instructions to find the root cause and fix it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 08:48:10 +02:00
c97e3d505f fix: skip deploy when HEAD already successfully deployed (#264) (#265)
## Summary

- The hourly `deploy.yml` schedule re-deployed the same commit repeatedly because it always diffed `HEAD~1..HEAD` — once a commit touching `lib/`/`pubspec.*` became HEAD, every hourly tick would detect "android changes" and deploy again.
- Fix: at the start of the `check-changes` job, query the Forgejo workflow runs API for the last successful `deploy.yml` run. If its `head_sha` matches current HEAD, output `android=false` / `linux=false` immediately, skipping all downstream jobs.
- `workflow_dispatch` bypasses this check (always deploys), matching the existing behaviour.

## Test plan

- [ ] Verify the `check-changes` job exits early on the next scheduled run after a successful deploy of the same commit
- [ ] Verify a new commit still triggers deployment normally
- [ ] Verify `workflow_dispatch` still deploys unconditionally

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Thomas SharedInbox <sharedinbox@thomas-guettler.de>
Reviewed-on: https://codeberg.org/guettli/sharedinbox/pulls/265
2026-05-26 07:35:18 +02:00
Bot of Thomas Güttler a7783d46cf fix: disable Save button when no password available; fix changelog fetch-depth (#246, #229) (#248) 2026-05-25 14:47:25 +02:00
Bot of Thomas Güttler 32ba916cbf fix: trigger deploy on script changes, add changelog dep, deepen fetch (#228) (#233) 2026-05-24 21:05:10 +02:00
Thomas SharedInbox b2c11e0c63 Revert "feat: keep secrets in sync via age-encrypted master key (#208) (#223)"
This reverts commit 96b1660b59.
2026-05-24 18:39:23 +02:00
Bot of Thomas Güttler 96b1660b59 feat: keep secrets in sync via age-encrypted master key (#208) (#223) 2026-05-24 16:35:10 +02:00
Bot of Thomas Güttler 37eca207c6 fix: pin SSH host key via known_hosts instead of StrictHostKeyChecking=no (#161) (#181) 2026-05-24 13:00:04 +02:00
Bot of Thomas Güttler 30bcc8a314 fix: skip CI jobs when unrelated files change (#144) (#207) 2026-05-24 08:30:10 +02:00
Bot of Thomas Güttler 71ccf24d0c fix: survive permanently broken path_provider channel on Android (#192) (#194) 2026-05-24 03:50:07 +02:00
Bot of Thomas Güttler 833e8d49b0 fix: remove continue-on-error from CI workflows (#172) (#189) 2026-05-23 19:05:08 +02:00
Bot of Thomas Güttler 6adba9b001 perf: parallelize APK deploy and reduce fetch-depth in deploy.yml (#171) (#188) 2026-05-23 18:55:08 +02:00
Bot of Thomas Güttler 1b1f9788fd docs: explain why continue-on-error is intentional on deploy steps (#154) (#177) 2026-05-23 15:30:14 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 b6a2f91820 security: fix log/state file permissions, Firebase key on disk, TLS cleanup
- agent_loop.py: create log dir with mode 0700 and enforce it on
  existing dirs; open log files with mode 0600; chmod state file
  to 0600 after every write. Prevents other local processes from
  reading agent output (which may contain credential paths) or
  tampering with the state file's pid field.

- ci/main.go (TestAndroidFirebase): replace
    echo "$FIREBASE_SA_KEY" > /tmp/key.json
  with bash process substitution
    --key-file=<(echo "$FIREBASE_SA_KEY")
  The key is now passed via a file descriptor — it never touches
  disk, so it cannot be stranded by a failed gcloud auth call or
  snapshotted into the Dagger layer cache.

- ci.yml / deploy.yml: add "Cleanup TLS credentials" step
  (if: always()) at the end of every job that calls
  setup_dagger_remote.sh. Removes /tmp/dagger-tls,
  /tmp/stunnel-dagger.conf, /tmp/stunnel.pid from the self-hosted
  runner after each job, so client certs do not accumulate between
  job runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 10:54:53 +02:00
Thomas SharedInboxandClaude Sonnet 4.6 9cd18ba70e feat: agent loop uses PRs; ci.yml fast-only; hourly deploy workflow (#156)
- agent_loop.py: agents now create an `issue-N-fix` branch and open a PR;
  the loop discovers the PR via `fgj pr list`, tracks its CI run, squash-merges
  on green, and falls back to the global-CI path if no PR exists (backward compat).
  Adds `_find_pr_for_branch`, `_latest_ci_run_for_branch`, `_merge_pr` helpers.

- .forgejo/workflows/ci.yml: strip to the single fast `check` job only
  (removes build-linux, deploy-playstore, publish-website).

- .forgejo/workflows/deploy.yml (new, replaces android-emulator-tests.yml):
  scheduled hourly + workflow_dispatch; runs firebase tests, Play Store deploy,
  Linux build/deploy, website publish; on completion sets CI/Full-Pass or
  CI/Full-Fail label on the repo's DEPLOY_HEALTH_ISSUE tracking issue.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 22:05:09 +02:00