# DEVIATIONS.md

Live catalog of drift between the paper text (`01_intro_method_v4.md` through `04_limitations_conclusion_appendix_v4.md`, draft v4.0.6; citation target v4.0.3) and the as-shipped public artifact set (GitHub release `v4.0.3` + `https://knowva.ai/CompressionV4`). Updated when drift changes.

## A. HF revision prefix mismatches

Paper §3.1 and Appendix A.2 list 8-character HF revision prefixes for the 12 corpus models. **7 of 12 prefixes do not match the actually-resolved full SHAs**: `distilbert-base-uncased`, `sentence-transformers/all-MiniLM-L6-v2`, and all five bartowski Q4_K_M GGUF rows. Authoritative SHAs are in `manifest/model-manifest.json` with `prefix_match: true/false` per model. The paper text already documents this drift (note under §3.1 table).
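The `prefix_match` flag can be recomputed from any (paper prefix, resolved SHA) pair. The sketch below is illustrative only: the keys `model`, `paper_prefix`, and `resolved_sha` are hypothetical and may not match the actual schema of `manifest/model-manifest.json`, and the SHAs shown are made up.

```python
def prefix_matches(paper_prefix: str, resolved_sha: str) -> bool:
    """True when the paper's short prefix is a prefix of the resolved full SHA."""
    return resolved_sha.lower().startswith(paper_prefix.lower())


def mismatched_models(entries):
    """Return model ids whose paper prefix does not match the resolved SHA.

    `entries` is a list of dicts with the hypothetical keys `model`,
    `paper_prefix`, and `resolved_sha`; the real manifest schema may differ.
    """
    return [
        e["model"]
        for e in entries
        if not prefix_matches(e["paper_prefix"], e["resolved_sha"])
    ]


# Made-up SHAs for illustration only, not real Hugging Face revisions.
example = [
    {"model": "model-a", "paper_prefix": "0123abcd", "resolved_sha": "0123abcd" + "0" * 32},
    {"model": "model-b", "paper_prefix": "deadbeef", "resolved_sha": "feedface" + "1" * 32},
]
print(mismatched_models(example))
```

Comparing case-insensitively is a design choice of this sketch; git SHAs are conventionally lowercase hex.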

## B. fp16 trained profile deferred

Paper §6 and §11.1 scope fp16 as a companion measurement and note that the OpenZL le-u16 trained-profile fp16 row in §6.2 / §10.1 is deferred until the 778 B fp16 profile is regenerated. The bf16 profile (`bf16.zl`, 617 B) is regenerated and shipped; fp16 is not.

## C. External methods not run under the strict contract

Paper §11.2 and §5.3 explicitly scope this out. DFloat11 (`LeanModels/DFloat11`) and Cloudflare Unweight (`cloudflareresearch/unweight-kernels`) require Hopper-class GPUs (H100/H200) for decode and were not run on the CPU-only reference. ZipNN (`zipnn/zipnn`) installs on CPU but fails the byte-exact roundtrip in its default configuration. `reproductions/{dfloat11,unweight,zipnn}/` ship pointer READMEs only.

## D. 30B / 70B entropy spot-check not run

Paper §5.8 explicitly notes scope is ≤7B (validated through Qwen2.5-7B-Instruct). No 30B/70B measurement is claimed or shipped.

## E. Inference sanity check verified only on gpt2

Paper §4.3 reports gpt2 as verified (149 bf16 tensors, 311 MiB, 8/8 token identity). The full 11-model `max_new_tokens=256` sweep is deferred to a GPU host and is not claimed in the paper. The byte-exact tensor-level roundtrip assertion (§4.2), verified across all 11,942 method-evaluations in `results/results.jsonl.zst`, mathematically implies token identity for the other 10 test models at temperature 0 / seed 42.
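For readers unfamiliar with the term, a byte-exact roundtrip check can be sketched as follows. This is a minimal illustration, not the harness behind §4.2: `zlib` stands in for whichever compression method is under test, the helper name `assert_byte_exact` is hypothetical, and the payload is a synthetic byte string rather than a real tensor.

```python
import hashlib
import zlib


def assert_byte_exact(original: bytes, compress, decompress) -> str:
    """Round-trip `original` through a codec and fail unless every byte survives.

    Returns the sha256 hex digest of the original bytes.
    """
    restored = decompress(compress(original))
    if restored != original:
        raise AssertionError("roundtrip is not byte-exact")
    return hashlib.sha256(original).hexdigest()


# Stand-in payload: 1024 synthetic bytes, not a real tensor.
payload = bytes(range(256)) * 4
digest = assert_byte_exact(payload, zlib.compress, zlib.decompress)
print(digest[:16])
```

If the restored bytes equal the original bytes for every tensor, the reloaded weights are bit-identical to the source weights, which is the premise of the token-identity argument above.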

## F. Smoke-test sha256 prefixes

Paper Appendix A.6 quotes three prefixes from the v4.0.3 run: `fc3544c35489eb94`, `c79972a64d11b717`, `1c6f077f8c228cf2`. `manifest/expected-checksums.json` carries these values; `reproducibility_smoke_test.sh` passes 3/3 against the v4.0.3 fixture set.
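A prefix comparison of this kind can be sketched in a few lines. The code below is an assumption-laden illustration, not the contents of `reproducibility_smoke_test.sh`: the helper names are hypothetical, the path-to-prefix mapping is an assumed shape for the expected values (the real layout of `manifest/expected-checksums.json` may differ), and the demo hashes a temporary file rather than a real fixture.

```python
import hashlib
import os
import tempfile


def sha256_prefix(path: str, length: int = 16) -> str:
    """Return the first `length` hex characters of the file's sha256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large fixtures are not loaded whole.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()[:length]


def check_prefixes(expected: dict) -> dict:
    """Map each path to True/False depending on whether its prefix matches.

    `expected` maps file path -> expected hex prefix (assumed shape).
    """
    return {
        path: sha256_prefix(path, len(prefix)) == prefix.lower()
        for path, prefix in expected.items()
    }


# Demo on a throwaway file; the expected prefix is computed, not quoted.
with tempfile.TemporaryDirectory() as tmp:
    fixture = os.path.join(tmp, "fixture.bin")
    with open(fixture, "wb") as f:
        f.write(b"hello")
    expected = {fixture: hashlib.sha256(b"hello").hexdigest()[:16]}
    results = check_prefixes(expected)

print(all(results.values()))
```

The prefix length is taken from each expected value, so the 16-character prefixes quoted above would be compared against the first 16 hex characters of each digest.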

## G. Docker image not built

Paper Appendix A.1 notes that no hosted Docker image exists. The `Dockerfile` ships in-repo for local builds. The image is not published to `ghcr.io` because the build account lacks the `write:packages` scope.

## H. Archival identifiers

- **Canonical:** GitHub release v4.0.3 (`https://github.com/NimoRotem/llm-compression-limits/releases/tag/v4.0.3`). Immutable git tag; release ZIP byte-identical to the published artifact.
- **Supplementary persistent identifier:** Software Heritage SWHID `swh:1:rel:55d910f5af170c22719cc9346f4d8a5029f09164` (SH save-task 2330392, archived 2026-05-14). Resolvable via the SH API (`archive.softwareheritage.org/api/1/release/55d910f5af170c22719cc9346f4d8a5029f09164/`); the SH browse UI may not surface the release tree immediately.
- **No Zenodo DOI:** the Zenodo↔GitHub webhook installation did not complete on this repository. We use the GitHub release as the verifiable archive and the SWHID as the citation-grade persistent identifier.
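The mapping from the SWHID to the API endpoint quoted above can be sketched as follows. The function name `release_api_url` is hypothetical, the `https://` scheme is an assumption (the text gives the host and path only), and the sketch handles release (`rel`) identifiers only. It builds the URL without making a network request.

```python
import re

# Core SWHID shape: swh:1:<object type>:<40 lowercase hex characters>.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})$")


def release_api_url(swhid: str) -> str:
    """Build the release API URL for a `swh:1:rel:...` identifier."""
    match = SWHID_RE.match(swhid)
    if match is None:
        raise ValueError(f"not a well-formed SWHID: {swhid!r}")
    object_type, object_id = match.groups()
    if object_type != "rel":
        raise ValueError("this sketch only handles release (rel) SWHIDs")
    # Host and path follow the pattern quoted in the list above.
    return f"https://archive.softwareheritage.org/api/1/release/{object_id}/"


swhid = "swh:1:rel:55d910f5af170c22719cc9346f4d8a5029f09164"
print(release_api_url(swhid))
```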

## Maintenance

This file is the authoritative live catalog of drift, per paper Appendix A. Update it whenever an item changes, for example when the fp16 profile is regenerated, the Docker image is published, or the full inference sweep is run.