security: verify PID belongs to claude before SIGKILL (PID recycling risk) #160
Closed
opened 2026-05-23 08:55:18 +00:00 by guettlibot
18 comments
Reference: guettli/sharedinbox#160
Why SIGKILL is needed
The agent loop kills long-running agents (> 1 h) to prevent a runaway Claude session from holding the loop hostage indefinitely. Without the kill, no new issue agent could ever start on that machine again.
The kill path in `_run_loop()` (agent_loop.py ~line 368):
```python
_kill_agent(state) # os.kill(pid, 9)
```
`_kill_agent` sends SIGKILL to the PID stored in `~/.sharedinbox-agent-state.json`.
The risk
Linux recycles PIDs. If the original `claude` process died between the previous cron tick and the current one, the stored PID may now belong to a completely different process — a system daemon, another user's process, or any other program.
`_agent_alive` (os.kill(pid, 0)) and `_kill_agent` (os.kill(pid, 9)) are two separate syscalls with no atomicity guarantee, so there is a TOCTOU window even when the process exists at check time.
Additionally, the state file was written group-writable and world-readable (`0664`), meaning any local process able to write to it (for example, one running in the file's group) could craft a `{"pid": 1, ...}` entry to make the loop send SIGKILL to init. (The state file permissions were fixed in b6a2f91, making this specific escalation moot, but the recycling race remains.)
Fix
Before calling `os.kill(pid, 9)`, verify the process is actually `claude`:
```python
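from pathlib import Path

def _is_claude_process(pid: int) -> bool:
    """Return True if /proc reports the PID's command name as claude (or node)."""
    try:
        comm = Path(f"/proc/{pid}/comm").read_text().strip()
        return comm in ("claude", "node")  # claude may run as node
    except OSError:
        # Process is gone or /proc is unreadable: treat it as "not claude".
        return False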
```
Then in `_kill_agent`:
```python
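import os

def _kill_agent(state: dict) -> None:
    pid = state.get("pid")
    if pid and _is_claude_process(pid):
        try:
            os.kill(pid, 9)  # 9 == SIGKILL
        except ProcessLookupError:
            pass  # process exited between the check and the kill
    elif pid:
        # The stored PID now names some other process: refuse to signal it.
        print(f"WARNING: pid {pid} is not a claude process; skipping kill to avoid hitting recycled PID")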
```
Also consider storing `/proc/{pid}/stat` field 22 (starttime) at agent launch and comparing it before killing, for a stronger identity check.
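The starttime idea could look roughly like the sketch below. It is not the code from PR #163. The helper names (`parse_starttime`, `read_starttime`, `is_same_process`) are hypothetical, and it assumes the loop records the value in the state file when it launches the agent. The one subtle point is that field 2 (comm) can contain spaces and parentheses, so the line is split on the last `)` rather than on whitespace.

```python
from pathlib import Path
from typing import Optional


def parse_starttime(stat_text: str) -> int:
    """Extract field 22 (starttime, in clock ticks since boot) from /proc/<pid>/stat text."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the LAST ')' instead of splitting on whitespace.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
    return int(rest[19])


def read_starttime(pid: int) -> Optional[int]:
    """Return the start time of the process, or None if the PID cannot be inspected."""
    try:
        return parse_starttime(Path(f"/proc/{pid}/stat").read_text())
    except (OSError, IndexError, ValueError):
        return None


def is_same_process(pid: int, recorded_starttime: int) -> bool:
    """True only if the PID still refers to the process recorded at launch."""
    return read_starttime(pid) == recorded_starttime
```

At launch the loop would store something like `"starttime": read_starttime(pid)` next to the PID (the key name is an assumption), and `_kill_agent` would skip the kill unless `is_same_process` agrees. A recycled PID gets a different starttime, so this catches the case where the new process also happens to be `node`. It still leaves a small window between the check and `os.kill`.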
Agent opened PR #163 but no CI run appeared on branch `issue-160-fix` after 17 min. The agent may not have pushed any commits. Please investigate and resume manually.
Agent opened PR #163 but no CI run appeared on branch `issue-160-fix` after 328 min. The agent may not have pushed any commits. Please investigate and resume manually.
Agent opened PR #163 but no CI run appeared on branch `issue-160-fix` after 392 min. The agent may not have pushed any commits. Please investigate and resume manually.
Automatic merge of PR #163 failed (PR is still open after the merge command). Please merge manually.
Automatic merge of PR #163 failed (PR is still open after the merge command). Please merge manually.
Reopened: this issue was incorrectly closed by the agent loop's catch-up scan, which called `_close_issue(160)` after a merge attempt that silently exited 0, but PR #163 was never actually merged (it is still open). The loop has been patched to verify the merge succeeded before closing the issue.
Automatic merge of PR #163 failed (PR is still open after the merge command). Please merge manually.