- Track a heartbeat timestamp in ~/.sharedinbox-agent-heartbeat at the
start of each _run_loop() invocation so we can tell when it last ran.
- Add `agent_loop.py monitor` subcommand that exits 1 with a WARNING
message if the heartbeat is missing, corrupted, or older than 2 hours.
- Add .forgejo/workflows/monitor.yml scheduled workflow that runs the
monitor check every 2 hours on the self-hosted runner; a CI failure
serves as the warning when the loop is stalled.
- Add 7 unit tests covering all monitor / heartbeat scenarios.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a catch-up PR merge fails (PR stays open after the merge command), the loop sets the issue to State/Question and comments on it. But on the next cron tick the same PR is still open with passing CI, so it tries again — spamming the issue with identical comments every minute.
Fix: before attempting a catch-up merge, fetch the issue's current labels via `_get_issue_labels()`. If `State/Question` is already set, skip the PR entirely.
Closes#239
Co-authored-by: Thomas SharedInbox <sharedinbox@thomas-guettler.de>
Reviewed-on: https://codeberg.org/guettli/sharedinbox/pulls/242
Forgejo reports deploy.yml (scheduled/dispatch) runs with event=push
and prettyref=main, identical to ci.yml push runs. The event-only
filter was insufficient — adding workflow_id == "ci.yml" prevents
deploy.yml runs from blocking or triggering false CI fix agents.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_latest_main_ci_run() was using event != pull_request which still
matched deploy.yml schedule runs when their prettyref == "main",
blocking the loop from picking up new issues.
_latest_ci_run_for_branch() had the same issue: the else branch matched
any non-pull_request event including schedule runs.
Both functions now explicitly filter for event == "push" only.
Tests updated: rename _latest_ci_run → _latest_main_ci_run, mock
_open_issue_prs to prevent real API calls in unit tests, and update
_find_pr_for_branch side_effect to reflect the upstream post-merge
PR-still-open verification check.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Forgejo workflow_runs API has no head_branch field. For pull_request
events the branch lives in event_payload["pull_request"]["head"]["ref"];
for push events it is in prettyref. The old code used run.get("head_branch")
which always returned None, causing _latest_ci_run_for_branch to never find
the run and the loop to declare "no CI run after 15 min" and set the issue to
State/Question — even when CI had already passed.
Also fixes a pre-existing test mock that was missing the session_name kwarg.
Adds TestLatestCiRunForBranch covering both event types and the regression.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously issue agents were instructed to close the issue via prompt text
immediately after pushing. If CI then failed, the issue was already closed.
Now the loop tracks a pending_issue across cron ticks:
- When an agent finishes (issue or ci-fix), the issue number is extracted
from state before it is cleared.
- If CI is still running, a "pending-ci" state preserves the issue number.
- If CI fails, the ci-fix agent is started with the issue number in state
so it survives the fix cycle.
- Once CI passes, _close_issue() is called from Python — never by the agent.
The agent prompt no longer instructs the agent to close the issue.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `---------------------- Starting YYYY-MM-DD HH:MMZ` header at each run
- Remove `[agent_loop]` prefix from all output lines
- Show full Codeberg URL for CI runs instead of bare run ID
- Show full issue URL and title when referencing issues
- Store issue_title in state file so "still running" messages include the title
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
State/Ready → State/InProgress is already set by agent_loop.py before
the agent starts. Update AGENTS.md to reflect that agents invoked via
the loop must not set InProgress themselves (only manual workflows need
to). Also fix TestMain tests that called main() directly, which caused
argparse to consume sys.argv; they now call _run_loop() instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add TestMain class covering the main() flow: asserts that _set_labels
is called with State/InProgress (and State/Ready removed) strictly
before _start_agent, and that no labels or agents are touched when
there are no ready issues.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the tmux-based agent launcher with a direct subprocess.Popen
call. Claude sessions can't be attached to anyway, so the tmux layer
added complexity with no benefit. State now tracks a PID instead of a
tmux session name; liveness is checked with os.kill(pid, 0).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously claude was launched with -p (print mode) which produces no
visible TUI. Attaching to the session with `tmux attach -t issue-NNN`
showed a blank terminal. Removing -p makes Claude run its interactive
TUI inside the tmux pane, so the session is fully watchable.
Add scripts/test_agent_loop.py covering _start_agent command
construction and state file round-trips.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>