Agent loop + CI improvements: PRs, fast/slow split, scheduled deploy #156

Closed
opened 2026-05-22 19:53:01 +00:00 by guettlibot · 0 comments
guettlibot commented 2026-05-22 19:53:01 +00:00 (Migrated from codeberg.org)

Background / root cause found

Issue #153 (and likely #148, #150–#155) was closed without any fix. Root cause:

  1. Claude hit its org rate limit immediately — the agent exited after < 1 s.
  2. The loop saw: agent dead → pending_issue=153, latest CI green (from issue #147's push) → closed #153 without verifying a new CI run had occurred.

Fix already committed in b48cb98: the loop now records the CI run ID when the agent starts and only closes the issue if a newer run passes. Same CI run ID → agent pushed nothing → set State/Question instead.
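
The close rule described above can be sketched as follows. This is a minimal illustration and not the code from b48cb98: the `CiRun` type, the function name `decide_after_agent_exit`, the action strings, and the assumption that run IDs increase over time are all made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CiRun:
    """Hypothetical view of a CI run: an ID and a status string."""
    run_id: int
    status: str  # assumed values: "success", "failure", "running"


def decide_after_agent_exit(baseline_run_id: Optional[int],
                            latest: Optional[CiRun]) -> str:
    """Decide what to do with the pending issue once the agent has exited.

    baseline_run_id is the ID of the latest CI run recorded when the agent
    started (None if no run existed yet).
    """
    # No run at all: the agent cannot have pushed anything.
    if latest is None:
        return "set-state-question"
    # Same (or older) run than at agent start: the agent pushed nothing.
    # Assumes run IDs grow monotonically.
    if baseline_run_id is not None and latest.run_id <= baseline_run_id:
        return "set-state-question"
    # Only a newer, passing run justifies closing the issue.
    if latest.status == "success":
        return "close-issue"
    return "keep-open"
```

With this rule, the #153 scenario (agent exits immediately, latest run is still the one from #147's push) yields `set-state-question` rather than a close.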


Planned improvements

1. Use PRs instead of pushing directly to main

Currently agents push directly to main. This means:

  • There is no code-review step, not even a diff in the UI.
  • If CI fails on main, it blocks all other work.
  • Merge conflicts are more likely.

Proposed change:

  • Agents create a feature branch (issue-<N>-fix) and open a PR.
  • The loop tracks the PR number in state alongside the issue number.
  • When the PR's CI passes, the loop merges the PR (with --squash or --merge) and closes the issue.
  • If CI fails, the loop starts a fix-CI agent on the same branch.

Agent prompt addition: "Create a branch named issue-<N>-fix, push there, open a PR against main. Do NOT close the issue or merge."

Loop state addition: { "pr": 42, "issue": 153, ... }. _latest_ci_run switches to checking the PR's head commit status.
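
A sketch of how the PR-based state and decision might fit together, under stated assumptions: only the `pr` and `issue` keys and the `issue-<N>-fix` branch name come from the proposal; the helper names, status strings, and action names are hypothetical and not taken from the existing loop.

```python
from typing import Dict


def branch_name(issue: int) -> str:
    """Branch naming convention from the proposal: issue-<N>-fix."""
    return f"issue-{issue}-fix"


def next_action(state: Dict[str, int], head_status: str) -> str:
    """Map the CI status of the PR's head commit to a loop action.

    head_status is assumed to be one of "success", "failure", "pending".
    """
    if "pr" not in state:
        # The agent has not opened a PR yet, so there is nothing to merge.
        return "wait-for-pr"
    if head_status == "success":
        return "merge-pr-and-close-issue"
    if head_status == "failure":
        # The fix-CI agent works on the same branch as the original agent.
        return "start-fix-ci-agent"
    return "wait"


state = {"pr": 42, "issue": 153}
print(branch_name(state["issue"]), next_action(state, "success"))
```

Keeping the decision in a pure function like this makes the merge/fix/wait rule easy to unit-test without a live Forgejo instance.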

2. Fast tests only in CI (on push + PR)

ci.yml currently runs the full suite on every push and PR: unit tests, Android build, Play Store publish, Linux deploy, website publish. The Android build and the Play Store publish each take ~30–45 min.

Proposed split:

ci.yml (on: push + pull_request) — keep only the check job:

  • task check-dagger (unit tests, analysis, formatting)
  • Remove the build-linux, deploy-playstore, and publish-website jobs from this workflow.
  • Target: < 10 min total.

android-emulator-tests.yml — move to scheduled-only (remove push/pull_request triggers).

3. Scheduled hourly run on main for long tests + deploy

New workflow .forgejo/workflows/deploy.yml (or rename android-emulator-tests.yml):

on:
  schedule:
    - cron: '0 * * * *'   # every hour
  workflow_dispatch:

Jobs:

  • test-android-firebase (Firebase Test Lab)
  • deploy-playstore (publish to Play Store)
  • build-linux + deploy-linux
  • publish-website

4. Set a label on success after scheduled run

When the scheduled hourly run succeeds, mark success visibly so:

  • The agent loop can check whether the last full deploy passed.
  • Developers can see the deploy health at a glance.

Option A — label on a persistent tracking issue (e.g., a pinned "Deploy health" issue):

fgj issue edit <tracking-issue> --add-label "CI/Full-Pass" --remove-label "CI/Full-Fail"

This is easy to query from the loop.

Option B — post a comment + set a Forgejo commit status (via Forgejo API POST /repos/.../statuses).

Option C — create a git tag deploy/YYYYMMDD-HHMM on each successful full deploy.
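
For Option C, the tag name could be derived as in the sketch below. The function name `deploy_tag` is hypothetical, and normalising to UTC is an assumption; the proposal does not say which time zone the timestamp should use.

```python
from datetime import datetime, timezone


def deploy_tag(moment: datetime) -> str:
    """Format a deploy tag as deploy/YYYYMMDD-HHMM.

    The timestamp is converted to UTC first (an assumption); pass a
    timezone-aware datetime to avoid depending on the local time zone.
    """
    return moment.astimezone(timezone.utc).strftime("deploy/%Y%m%d-%H%M")


print(deploy_tag(datetime(2026, 5, 22, 19, 53, tzinfo=timezone.utc)))
# prints "deploy/20260522-1953"
```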

Recommendation: Option A (tracking issue label) is simplest and queryable by the loop.
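
On the loop side, reading Option A could look roughly like this. It is a sketch under assumptions: the function name `deploy_health` and the result strings are hypothetical, the label names of the tracking issue are assumed to have been fetched already, and treating a present `CI/Full-Fail` label as taking precedence is a design choice made for this example only.

```python
from typing import Iterable

PASS_LABEL = "CI/Full-Pass"
FAIL_LABEL = "CI/Full-Fail"


def deploy_health(label_names: Iterable[str]) -> str:
    """Summarise deploy health from the tracking issue's label names."""
    names = set(label_names)
    # If both labels are present (e.g. a half-applied edit), be pessimistic.
    if FAIL_LABEL in names:
        return "failing"
    if PASS_LABEL in names:
        return "passing"
    # Neither label set yet: no scheduled run has reported a result.
    return "unknown"


print(deploy_health(["CI/Full-Pass", "pinned"]))
```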


Acceptance criteria

  • Agent loop fix (b48cb98) deployed and verified — no more phantom closes.
  • Agents create PRs; loop merges on CI green.
  • ci.yml runs only fast tests (< 10 min) on push + PR.
  • New scheduled workflow runs full suite once per hour on main.
  • On scheduled success, set CI/Full-Pass label on a tracking issue.

Reference: guettli/sharedinbox#156