App in PlayStore was not updated #361

Closed
opened 2026-06-03 13:53:26 +00:00 by guettli · 2 comments
guettli commented 2026-06-03 13:53:26 +00:00 (Migrated from codeberg.org)

do plan in below comment.

guettlibot commented 2026-06-03 14:55:02 +00:00 (Migrated from codeberg.org)

Let me read the relevant project files to understand the CI/CD setup before writing a plan. Let me look at the deploy scripts and Taskfile to understand the full picture. Now I have everything I need. Here is the complete plan to post as a comment on issue #361.


Implementation Plan: Fix silent deploy failures in deploy.yml

Root cause analysis

The workflow shows green while the app is not deployed due to three compounding bugs:


Bug 1 (Critical): Wrong JSON field — commit_sha instead of head_sha

In check-changes, the Python snippet that determines LAST_DEPLOYED_SHA reads:

print(runs[0].get("commit_sha") or "")

Forgejo's API returns this field as head_sha (following GitHub's convention). commit_sha is always None, so LAST_DEPLOYED_SHA = "" on every run. The fallback diff is always HEAD~1..HEAD — only the single most recent commit's file changes are checked.

Consequence: If android changes landed in commit A and the deploy failed (workflow=failure), and then a non-android commit B lands, diff(A, B) shows no android changes → android=false → deploy skipped → workflow=success. The failure at A is silently abandoned and never retried.
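The effect of the wrong key can be reproduced in isolation. The following is a minimal sketch, assuming the runs endpoint wraps its results in a workflow_runs list; that wrapper key, the sample payload, and the helper name last_sha are illustrative assumptions, not taken from deploy.yml.

```python
import json

# Hypothetical, trimmed-down response body. Only the field names matter here;
# the "workflow_runs" wrapper key is an assumption for this sketch.
SAMPLE_RESPONSE = json.dumps({
    "workflow_runs": [
        {"id": 101, "status": "success", "head_sha": "a1b2c3d"},
    ]
})


def last_sha(body: str, field: str) -> str:
    """Return `field` of the newest run, or "" if it is missing or empty."""
    runs = json.loads(body).get("workflow_runs", [])
    if not runs:
        return ""
    return runs[0].get(field) or ""


# The wrong key silently yields "", the right key yields the SHA.
print(repr(last_sha(SAMPLE_RESPONSE, "commit_sha")))
print(repr(last_sha(SAMPLE_RESPONSE, "head_sha")))
```

Because dict.get returns None for a missing key instead of raising, the typo never produces an error, which is why the bug stayed invisible.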


Bug 2: Silent exception swallowing makes API failures invisible

except Exception as e:
    print(f"API check failed: {e}", file=sys.stderr)
    print("")

The error goes to stderr only — not surfaced as a CI annotation. LAST_DEPLOYED_SHA silently becomes "", and the CI step continues as if nothing went wrong.


Bug 3: "All-skipped" workflow runs count as "successfully deployed"

When android=false, all deploy jobs are skipped. The overall workflow run still has status=success. Once Bug 1 is fixed and LAST_DEPLOYED_SHA is correctly populated, it can be set to the SHA of such a skipped run. If a deploy fails at commit A and a later non-android commit B produces an all-skipped but successful run, B becomes the reference. Later diffs then start at B, so the android changes from A are no longer visible and the outstanding deploy is not retried.


Changes to .forgejo/workflows/deploy.yml

Step 1 — Fix commit_sha → head_sha

In the LAST_DEPLOYED_SHA Python block:

# Before
print(runs[0].get("commit_sha") or "")

# After
print(runs[0].get("head_sha") or "")

Also verify whether Forgejo uses status or conclusion for run completion state. If r.get("status") == "success" never matches, change to r.get("conclusion") == "success" and update the URL parameter status=success → conclusion=success.

Step 2 — Make the exception a visible CI annotation

# Before
except Exception as e:
    print(f"API check failed: {e}", file=sys.stderr)
    print("")

# After
except Exception as e:
    print(f"::error::LAST_DEPLOYED_SHA lookup failed ({type(e).__name__}: {e})", file=sys.stderr)
    print("")

The ::error:: prefix causes Forgejo Actions to surface this as a check annotation visible in the PR/run UI, not just a buried stderr line.

Additionally, change the fallback strategy: when LAST_DEPLOYED_SHA = "" due to an API failure, default to android=true (force-deploy) rather than HEAD~1..HEAD. This ensures no deploy is silently skipped because the detection machinery broke.
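The fail-safe fallback can be expressed as a small decision function. This is a sketch under stated assumptions: the function name should_deploy_android and the path prefixes in ANDROID_PREFIXES are hypothetical, and the real workflow defines its own list of android-relevant paths.

```python
from typing import Iterable, Optional

# Hypothetical path prefixes, for illustration only.
ANDROID_PREFIXES = ("android/", "lib/", "pubspec.yaml")


def should_deploy_android(last_deployed_sha: Optional[str],
                          changed_paths: Iterable[str]) -> bool:
    """Decide whether the android deploy should run.

    An empty or missing SHA means the lookup failed or there is no deploy
    history, so the safe answer is to deploy rather than to skip.
    """
    if not last_deployed_sha:
        return True
    # str.startswith accepts a tuple of prefixes.
    return any(path.startswith(ANDROID_PREFIXES) for path in changed_paths)


print(should_deploy_android("", ["README.md"]))         # unknown SHA: deploy
print(should_deploy_android("a1b2c3d", ["README.md"]))  # known SHA, no android change
```

The design choice is that an unnecessary deploy costs some CI time, while a wrongly skipped deploy leaves the store out of date without any visible signal.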

Step 3 — Only treat "actually deployed" runs as the last deployed SHA

The status=success filter includes runs where all deploy jobs were skipped. After fixing Bug 1, this becomes the primary source of incorrect "last deployed" detection.

Fix: after retrieving the candidate runs, check each run's job results via /api/v1/repos/{repo}/actions/runs/{run_id}/jobs and select only the first run where the deploy-playstore job (named "Build & Deploy to Play Store" in the snippet below) has conclusion == "success" (not "skipped").

for run in runs:
    run_id = run["id"]
    jobs_url = f"{server}/api/v1/repos/{repo}/actions/runs/{run_id}/jobs"
    jobs_req = urllib.request.Request(jobs_url, headers={"Authorization": f"token {token}"})
    with urllib.request.urlopen(jobs_req) as jr:
        jobs_data = json.loads(jr.read())
    for job in jobs_data.get("workflow_jobs", []):
        if job.get("name") == "Build & Deploy to Play Store" and job.get("conclusion") == "success":
            print(run.get("head_sha") or "")
            sys.exit(0)
print("")  # no actual deploy found in recent history
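To make this selection logic unit-testable without network access, the HTTP call can be passed in as a parameter. The sketch below assumes the same job name and the conclusion and head_sha fields used in the snippet above; the function name last_deployed_sha, the fetch_jobs callback, and the fake data are hypothetical.

```python
from typing import Callable, Iterable, Mapping, Sequence

DEPLOY_JOB_NAME = "Build & Deploy to Play Store"


def last_deployed_sha(
    runs: Iterable[Mapping],
    fetch_jobs: Callable[[int], Sequence[Mapping]],
    job_name: str = DEPLOY_JOB_NAME,
) -> str:
    """Return head_sha of the newest run whose deploy job really succeeded.

    `runs` is expected newest-first. `fetch_jobs(run_id)` returns the job
    list for one run; in CI it would wrap the jobs API call.
    """
    for run in runs:
        for job in fetch_jobs(run["id"]):
            if job.get("name") == job_name and job.get("conclusion") == "success":
                return run.get("head_sha") or ""
    return ""  # no actual deploy found in the inspected history


# Fake history: the newest run skipped the deploy, the older one deployed.
FAKE_RUNS = [{"id": 2, "head_sha": "bbb"}, {"id": 1, "head_sha": "aaa"}]
FAKE_JOBS = {
    2: [{"name": DEPLOY_JOB_NAME, "conclusion": "skipped"}],
    1: [{"name": DEPLOY_JOB_NAME, "conclusion": "success"}],
}

print(last_deployed_sha(FAKE_RUNS, lambda run_id: FAKE_JOBS[run_id]))
```

With this shape, the skipped run 2 is ignored and the reference falls back to run 1, which is the behavior Step 3 asks for.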

Step 4 — Add a post-deploy verification step in deploy-playstore

After task publish-android succeeds, add a new step that queries the Play Store API to confirm the new version code is present on the internal track. If it is not, fail loudly. The verification logic can reuse scripts/deploy_playstore.py's authentication pattern. A new script scripts/verify_playstore_deploy.py should:

  • Accept the expected version code as input (or read it from Dagger output)
  • Call the Play Store Edits or Tracks API and confirm the version is present
  • Exit non-zero if the version is absent

This catches the case where dagger call publish-android exits 0 despite an internal Play Store API failure.
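The core check of such a verification script might look like the sketch below. It deliberately leaves out authentication and the HTTP request, and assumes the track resource has already been fetched and decoded into a dict with a releases list whose entries carry versionCodes; the function name track_contains_version and the sample data are hypothetical.

```python
from typing import Mapping


def track_contains_version(track: Mapping, expected_version_code: int) -> bool:
    """Return True if any release on the track lists the expected version code.

    Version codes are compared as strings, since JSON APIs commonly encode
    64-bit integers that way.
    """
    wanted = str(expected_version_code)
    for release in track.get("releases", []):
        codes = {str(code) for code in release.get("versionCodes", [])}
        if wanted in codes:
            return True
    return False


# Hypothetical decoded response for the internal track.
SAMPLE_TRACK = {
    "track": "internal",
    "releases": [{"status": "completed", "versionCodes": ["1780494000"]}],
}

print(track_contains_version(SAMPLE_TRACK, 1780494000))
```

A wrapper script would call this after fetching the track and exit non-zero when it returns False, which turns a silent publish failure into a red job.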

Step 5 — Add explicit ::notice:: annotations when skipping

In check-changes, when outputting android=false, add an annotation so the skip reason is visible in the Actions UI:

echo "::notice::Android deploy SKIPPED — diff $LAST_DEPLOYED_SHA..HEAD has no android-relevant changes"

For the "already deployed" early exit:

echo "::notice::All deploys SKIPPED — HEAD $HEAD_SHA was already successfully deployed"
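Since the SHA lookup is already written in Python, the same annotations can be produced from there with a small formatter. This is a sketch only: the helper name annotation is hypothetical, and the percent-escaping follows the convention used by GitHub Actions workflow commands; that the Forgejo runner decodes it the same way is an assumption.

```python
ALLOWED_LEVELS = ("notice", "warning", "error")


def annotation(level: str, message: str) -> str:
    """Format a workflow command line such as '::notice::text'.

    Newlines would end the command early, so they are percent-escaped
    (assumed to match the runner's decoding).
    """
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported annotation level: {level!r}")
    escaped = (
        message.replace("%", "%25")
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )
    return f"::{level}::{escaped}"


print(annotation("notice", "Android deploy SKIPPED: no android-relevant changes"))
```

Keeping the formatting in one place avoids typos in the :: prefix, which would otherwise degrade an annotation into an ordinary log line.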

Summary of changes

| File | Change |
|---|---|
| .forgejo/workflows/deploy.yml | Fix commit_sha → head_sha; ::error:: on API exception; fallback to android=true on lookup failure; filter LAST_DEPLOYED_SHA to runs where deploy-playstore actually succeeded; add ::notice:: skip annotations |
| .forgejo/workflows/deploy.yml | New step after task publish-android: run scripts/verify_playstore_deploy.py |
| scripts/verify_playstore_deploy.py | New script: authenticate with PLAY_STORE_CONFIG_JSON, query internal track, fail if expected version code absent |
guettlibot commented 2026-06-03 17:00:06 +00:00 (Migrated from codeberg.org)

Let me read the issue details and project files to understand what needs to be done. Let me check the Forgejo runner Dockerfile and the recent git history to understand the issue better. Let me check a few more files for context on the issue plan. Let me try using the fgj tool to get the issue comments since the plan should be in a comment. Now I have the full plan from the issue comment. Let me implement the fixes:

  1. Fix commit_sha → head_sha (the primary bug)
  2. Filter to runs where deploy-playstore actually succeeded (not just skipped)
  3. Make exception visible as ::error:: annotation
  4. Fallback to deploy-all when SHA is unknown
  5. Add ::notice:: annotations for skip/trigger messages
  6. Add post-deploy verification script

Now let me implement the changes to deploy.yml. Now add the verify step after the Publish Android to Play Store step. Now create the verify_playstore_deploy.py script. Now let me also add a test for verify_playstore_deploy.py. Let me check if there are existing tests for similar scripts. Now let me verify the deploy.yml looks correct after the edits. I need to also install requests alongside google-auth since AuthorizedSession depends on it. Now let me run the existing unit tests to make sure nothing is broken. All tests pass. Now let me commit the changes and push.

The PR is open at https://codeberg.org/guettli/sharedinbox/pulls/364.

Here's what was done for issue #361 ("App in PlayStore was not updated"):

Root cause (from the issue plan comment): Three compounding bugs in .forgejo/workflows/deploy.yml caused the Play Store to silently fall behind.

Bug 1 — commit_sha → head_sha (primary bug)
Forgejo's API returns head_sha; commit_sha is always None. So LAST_DEPLOYED_SHA was always empty and the diff fell back to HEAD~1..HEAD — only the single most recent commit. Android changes in earlier commits were silently missed.

Bug 2 — Skipped runs counted as "deployed"
A workflow run where deploy-playstore was skipped (android=false) still has status=success overall. The code now queries each run's job results via the jobs API and only uses a run where the "Build & Deploy to Play Store" job's own conclusion=success.

Bug 3 — Narrow fallback when SHA unknown
When LAST_DEPLOYED_SHA couldn't be determined, the fallback HEAD~1..HEAD diff was dangerously narrow. Now it defaults to android=true / linux=true (deploy everything) as the safe default.

Additional changes:

  • ::error:: / ::warning:: / ::notice:: workflow annotations so skip/failure reasons appear in the Actions UI
  • scripts/verify_playstore_deploy.py: new post-deploy script that queries the internal track and fails if the latest version code is >1 hour old (version codes are Unix timestamps per ci/main.go)
  • scripts/test_verify_playstore_deploy.py: 5 unit tests (all passing)
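The freshness rule described for verify_playstore_deploy.py (version codes are Unix timestamps, and the newest one must be at most one hour old) can be sketched as a pure function. The name latest_is_fresh, the injectable now parameter, and the sample values are assumptions for illustration and are not taken from the actual script.

```python
import time
from typing import Iterable, Optional, Union

MAX_AGE_SECONDS = 3600  # one hour, per the description above


def latest_is_fresh(
    version_codes: Iterable[Union[int, str]],
    now: Optional[float] = None,
    max_age: int = MAX_AGE_SECONDS,
) -> bool:
    """Return True if the newest version code is at most `max_age` seconds old.

    Version codes are interpreted as Unix timestamps. An empty track is
    treated as stale so the check fails loudly.
    """
    codes = [int(code) for code in version_codes]
    if not codes:
        return False
    current = time.time() if now is None else now
    return current - max(codes) <= max_age


# A fixed `now` keeps the example deterministic.
print(latest_is_fresh(["1000", "5000"], now=5000 + 1800))
```

Passing now explicitly is what makes the rule testable without patching the clock.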
Reference: guettli/sharedinbox#361