Speed up agent loop and deploy #234
Closed
opened 2026-05-24 19:04:59 +00:00 by guettli
·
2 comments
Labels: State/Planned
Reference: guettli/sharedinbox#234
Look at the current way the agent loop works.
Create a plan to speed up the flow. The final goal is to run deploy.yml.
Where could caching help?
Where could running concurrently help?
Where could calling cron jobs more often help?
Think about other ways, too.
Implementation Plan: Speed up agent loop and deploy
After reading the issue and exploring the codebase (crontab, scripts/agent_loop.py, .forgejo/workflows/deploy.yml, scripts/deploy_cron.py, Taskfile.yml, ci/main.go), here is a detailed breakdown of where time is lost and how to recover it.
Current flow and where time is wasted
The end-to-end cycle for a single issue currently looks like this (excluding actual agent work):
That is up to ~84 minutes of pure polling delay, before any actual build/deploy time.
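The ~84 minute figure can be reproduced with a small back-of-the-envelope model. This is only a sketch: it assumes the total is made up of five loop transitions that each wait up to one 5-minute cron interval, plus up to 59 minutes waiting for the hourly deploy.yml schedule (both figures are taken from the options discussed below). The function name and parameters are hypothetical.

```python
def worst_case_polling_delay(cron_interval_min, transitions=5, deploy_wait_min=59):
    """Upper bound, in minutes, on time spent waiting for schedulers.

    ASSUMPTION: every state transition waits at most one cron interval, and
    the deploy waits at most deploy_wait_min for its next scheduled run.
    """
    return transitions * cron_interval_min + deploy_wait_min


current = worst_case_polling_delay(5)                       # 5 * 5 + 59
faster_cron = worst_case_polling_delay(1)                   # 5 * 1 + 59
no_deploy_wait = worst_case_polling_delay(1, deploy_wait_min=0)
print(current, faster_cron, no_deploy_wait)                 # prints: 84 64 5
```

Under this model the hourly deploy schedule dominates the delay, which is why removing that wait matters more than shortening the cron interval alone.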
1. Increase cron frequency for agent_loop.py (Quick win)
File: user crontab
Change: */5 * * * * → */1 * * * *
Each state transition in the loop costs up to one cron interval of idle waiting. Steps 1–5 above each burn up to 5 minutes. With 1-minute intervals, those five steps shrink to at most 5 minutes total (instead of 25).
Risk: 5× more invocations. Each run makes 2–4 Codeberg API calls via tea and fgj. This is negligible for a single-user instance and well within Codeberg's rate limits.
Note: The docstring in agent_loop.py still says "every 10 minutes", but the crontab already runs it every 5 minutes. The docstring should be updated to match.
2. Trigger deploy.yml immediately after PR merge (Highest impact)
File: scripts/agent_loop.py
Change: After every successful _merge_pr() call, trigger deploy.yml immediately via the Forgejo API.
deploy.yml is currently scheduled hourly (0 * * * *), meaning up to 59 minutes pass between a PR merging to main and apps being deployed. Triggering it from the agent loop eliminates this gap entirely.
Add a helper after the merge calls (there are two: in section 2b post-agent merge and in the catch-up section).
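A minimal sketch of what such a helper might look like, using only the standard library. Several things here are assumptions rather than facts about this repository: the function names are hypothetical, the endpoint path follows the workflow-dispatch route of recent Forgejo releases and should be checked against the instance's API documentation, and the cooldown uses a locally remembered timestamp rather than querying existing deploy.yml runs.

```python
import json
import time
import urllib.request


def build_dispatch_request(base_url, owner, repo, workflow, token, ref="main"):
    """Build (but do not send) a POST request that dispatches a workflow.

    ASSUMPTION: the URL path matches the workflow-dispatch route of the
    Forgejo API on the target instance; verify before relying on it.
    """
    url = (
        f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow}/dispatches"
    )
    body = json.dumps({"ref": ref}).encode("utf-8")
    headers = {
        "Authorization": f"token {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def cooldown_elapsed(last_trigger_ts, now_ts, min_gap_s=600):
    """True if no dispatch was recorded within the last min_gap_s seconds."""
    return last_trigger_ts is None or now_ts - last_trigger_ts >= min_gap_s


def trigger_deploy(request, last_trigger_ts=None, opener=urllib.request.urlopen):
    """Send the dispatch unless one was sent recently.

    Returns the timestamp of the most recent dispatch, so the caller can
    persist it (for example in the loop's state file) for the next tick.
    """
    now = time.time()
    if not cooldown_elapsed(last_trigger_ts, now):
        return last_trigger_ts  # skipped: a deploy was triggered recently
    with opener(request, timeout=30) as response:
        response.read()  # drain the response; HTTP errors raise before this
    return now
```

The opener parameter exists only so the helper can be exercised without network access; in the loop it would be left at its default and called right after each successful _merge_pr().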
Risk: If multiple PRs merge in quick succession, deploy.yml could be triggered several times within minutes. The check-changes job already skips redundant builds when nothing relevant changed, so this is mostly harmless overhead. Add a guard if desired: check whether a deploy.yml run started within the last N minutes before triggering again.
3. Add a push trigger to deploy.yml (Complementary to #2)
File: .forgejo/workflows/deploy.yml
Change: Add a push trigger with path filters alongside the existing schedule.
This is complementary to option #2 (the loop trigger fires even for non-source changes; the push trigger fires only when relevant files change). Either approach eliminates the hourly wait; both together give belt-and-suspenders coverage.
Risk: Forgejo evaluates the paths filter the same way as check-changes: if neither Android nor Linux source files changed, deploy.yml won't run. The existing hourly schedule still covers edge cases (e.g., infra/config changes not covered by the path filter).
4. Fix the hourly change-detection window in deploy.yml
File: .forgejo/workflows/deploy.yml, check-changes job
Current bug: git diff --name-only HEAD~1 HEAD compares only the last commit. On the hourly schedule, if two PRs merged since the last run, the first PR's file changes are invisible. Its Android/Linux changes will be silently skipped, and no build will fire for them.
Fix (simplest): For scheduled runs, always build. workflow_dispatch already sets both flags to true. Extend that logic to scheduled runs.
Dagger's caching means redundant builds are cheap: if nothing changed, Dagger replays from cache. The expensive steps (Gradle compilation, Flutter build) are fully cached on the self-hosted runner's Dagger engine volumes.
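The decision rule of the simplest fix can be sketched as follows. This is an illustration in Python, not the actual check-changes job (which presumably lives as a shell step inside the workflow). The function name and the path prefixes are hypothetical placeholders; the real path filters in deploy.yml may differ.

```python
def decide_build_flags(
    event_name,
    changed_files,
    android_prefixes=("android/", "lib/"),  # ASSUMPTION: placeholder paths
    linux_prefixes=("linux/", "lib/"),      # ASSUMPTION: placeholder paths
):
    """Return (build_android, build_linux) for one workflow run."""
    # Manual and scheduled runs always build; caching keeps no-op builds cheap.
    if event_name in ("workflow_dispatch", "schedule"):
        return True, True
    # Other events (for example push) build only what the changed files touch.
    build_android = any(f.startswith(android_prefixes) for f in changed_files)
    build_linux = any(f.startswith(linux_prefixes) for f in changed_files)
    return build_android, build_linux


print(decide_build_flags("schedule", []))
print(decide_build_flags("push", ["README.md"]))
print(decide_build_flags("push", ["android/app/build.gradle"]))
```

With this rule a scheduled run no longer depends on which commit happens to be last, so changes from an earlier PR in the same hour cannot be missed.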
Alternative fix: Increase fetch-depth and diff from the last successful deploy's SHA, stored in a file on the runner. More precise, more complex.
5. Allow concurrent planning + implementation agents (Throughput)
File: scripts/agent_loop.py
Change: Replace the single-agent state file with separate state tracking for plan agents and impl agents, allowing one of each to run simultaneously.
Currently the loop starts either a plan agent or an impl agent per tick, never both. When a planning agent is running, no implementation work happens, even if a separate issue with State/Ready is waiting.
Approach: Use two state files (~/.sharedinbox-plan-state.json and ~/.sharedinbox-impl-state.json), or store a list in the existing state file. The loop then checks each slot independently and starts a plan agent and an impl agent whenever the corresponding slot is free.
Risk: Two agents running task check concurrently both invoke Dagger. Since Dagger supports concurrent access to its cache volumes, this is generally safe but may slow individual runs on a resource-constrained runner. A ci-fix agent should remain mutually exclusive with impl agents (to avoid main-branch conflicts). This is the most complex change here.
Recommendation: Only worth doing once there is a persistent backlog of issues. For typical use (a few issues at a time), options 1–4 will provide more practical speed-up with far less complexity.
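The two-slot scheduling described in the approach above can be sketched as a pure decision function. This is a simplified illustration under stated assumptions: the function name, the agent kind strings, and the way "running" agents are represented are all hypothetical, and reading or writing the state files is left out.

```python
def agents_to_start(running, plan_ready, impl_ready):
    """Decide which agent kinds to launch on this tick.

    running:    set of agent kinds currently active, drawn from
                {"plan", "impl", "ci-fix"} (hypothetical labels).
    plan_ready: True if some issue is waiting for a plan agent.
    impl_ready: True if some issue is waiting for an impl agent.
    """
    to_start = []
    # The plan slot is independent of everything else.
    if plan_ready and "plan" not in running:
        to_start.append("plan")
    # The impl slot stays mutually exclusive with a ci-fix agent.
    if impl_ready and "impl" not in running and "ci-fix" not in running:
        to_start.append("impl")
    return to_start


print(agents_to_start(set(), True, True))        # both slots are free
print(agents_to_start({"plan"}, True, True))     # only the impl slot is free
print(agents_to_start({"ci-fix"}, False, True))  # impl blocked by ci-fix
```

Keeping the decision separate from process management makes the rule easy to unit-test before any state-file changes are made in agent_loop.py.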
6. Minor: decouple publish-website from Firebase test results
File: .forgejo/workflows/deploy.yml
Current: publish-website depends on [build-linux, deploy-playstore, deploy-apk]. If Firebase tests fail (a separate job), deploy-playstore skips/fails, which in turn blocks publish-website.
The website publish (Hugo build + rsync) is independent of whether Android tests pass. Consider gating publish-website only on the build jobs that actually produced deployable artifacts, not on test results.
Summary and recommended order
1. Cron interval */1 for agent_loop.py (quick win)
2. Trigger deploy.yml after merge (highest impact)
3. Push trigger in deploy.yml (complementary to #2)
4. Fix the change-detection window in deploy.yml
5. Concurrent plan and impl agents in agent_loop.py (throughput)
6. Decouple publish-website in deploy.yml (minor)
Start with 1, 2, 4: these are the highest-value, lowest-risk changes and can all be implemented in one small PR. Add option 3 if belt-and-suspenders deploy triggering is desired. Defer option 5 until there is a real throughput bottleneck.