Add benchmark
@@ -18,6 +18,14 @@ returning a slice of the bytes written.
`CodeGeneratorResponse` to stdout. `protoc` then writes those `.rs` files to disk. The generated
files are included directly in the crate that uses the protocol buffers.

Sample usage:

```sh
protoc -Iproto/ proto/hackers.proto --plugin=./target/debug/protoc-gen-roto --roto_out=src/
```

This generates a single file, `src/hackers.rs`.

## Generated code

For each protobuf message roto generates two types:
@@ -117,6 +125,81 @@ using `FieldIterator` and records the byte offset of each field's tag. Subsequen
call `ProtoAccessor::get_value_at(offset)`, with no re-scanning. For repeated fields, the start and
end offsets of the field range are recorded to bound iteration efficiently.

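
The offset-recording idea described above can be sketched in a few lines of Rust. This is a minimal illustration of the technique, not roto's actual implementation: `FieldEntry`, `read_varint`, `index_fields`, and `get_varint_at` are hypothetical names, and only wire types 0 (varint), 1 (64-bit), 2 (length-delimited), and 5 (32-bit) are handled. For brevity the sketch collects offsets into a `Vec`, which allocates.

```rust
/// Hypothetical index entry: a field number plus the byte offset of its tag.
#[derive(Debug, PartialEq, Clone, Copy)]
struct FieldEntry {
    field_number: u32,
    tag_offset: usize,
}

/// Decode a base-128 varint starting at `pos`.
/// Returns the value and the position of the byte after it.
fn read_varint(buf: &[u8], mut pos: usize) -> Option<(u64, usize)> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(pos)?;
        pos += 1;
        if shift >= 64 {
            return None; // malformed: longer than 10 bytes
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, pos));
        }
        shift += 7;
    }
}

/// One linear pass: record where each field's tag starts and skip its payload
/// without decoding it.
fn index_fields(buf: &[u8]) -> Option<Vec<FieldEntry>> {
    let mut entries = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        let tag_offset = pos;
        let (tag, after_tag) = read_varint(buf, pos)?;
        let field_number = (tag >> 3) as u32;
        pos = match tag & 0x7 {
            0 => read_varint(buf, after_tag)?.1, // varint payload
            1 => after_tag.checked_add(8)?,      // fixed 64-bit payload
            2 => {
                // length-delimited: a varint length, then that many bytes
                let (len, after_len) = read_varint(buf, after_tag)?;
                after_len.checked_add(usize::try_from(len).ok()?)?
            }
            5 => after_tag.checked_add(4)?,      // fixed 32-bit payload
            _ => return None,                    // groups are not handled here
        };
        if pos > buf.len() {
            return None; // payload runs past the end of the buffer
        }
        entries.push(FieldEntry { field_number, tag_offset });
    }
    Some(entries)
}

/// Decode a varint field on demand from a previously recorded tag offset.
fn get_varint_at(buf: &[u8], tag_offset: usize) -> Option<u64> {
    let (tag, after_tag) = read_varint(buf, tag_offset)?;
    if tag & 0x7 != 0 {
        return None; // not a varint field
    }
    read_varint(buf, after_tag).map(|(value, _)| value)
}
```

After `index_fields` has run once, a reader looks up the entry for the field it wants and calls `get_varint_at` with the stored offset, so the buffer is never re-scanned from the start. Bounding a repeated field then amounts to remembering the first and last offsets that share a field number.
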
## Benchmarks

Two benchmark suites, one for roto (Rust, criterion) and one for upb (C), share
the same binary data files and the same four measurement groups:

| Group           | What is timed                                           |
| --------------- | ------------------------------------------------------- |
| `shallow_parse` | Become ready to read any field (one scan / full decode) |
| `deep_parse`    | Walk the full tree: Campaign → Operations → Hackers     |
| `field_access`  | Read individual fields on an already-parsed message     |
| `iterate`       | Count top-level and nested repeated fields              |

### 1 - Generate the shared data files (do this once)

Data files are written to `data/bench/`.

```sh
cargo run --release --bin gen_bench_data -- --preset tiny
cargo run --release --bin gen_bench_data -- --preset small
cargo run --release --bin gen_bench_data -- --preset medium
cargo run --release --bin gen_bench_data -- --preset large
```

For even larger inputs use `--preset huge` (~500 MB) or set the knobs
directly:

```sh
# ~50 MB: 500 operations × 100 KB stolen_data each
cargo run --release --bin gen_bench_data -- --ops 500 --stolen-kb 100 --output data/bench/50mb.pb
```

### 2 - Rust benchmark (criterion)

```sh
cargo bench --bench hackers_bench
```

HTML reports are written to `target/criterion/`. To run a single group, pass
its name as a filter:

```sh
cargo bench --bench hackers_bench -- shallow_parse
```

### 3 - C / upb benchmark

Requires protobuf ≥ 21 with `protoc-gen-upb` (ships with modern `protoc`).

```sh
cd upb_test
make          # compiles hackers_bench from the pre-generated upb files
./hackers_bench
```

To regenerate the upb C files from `proto/hackers.proto`:

```sh
cd upb_test && make regen
```

### Interpreting the comparison

The two libraries have fundamentally different models:

- **roto `shallow_parse`** does one linear scan recording byte offsets, with no
  allocation and no field decoding. Subsequent field reads decode on demand at
  the stored offset.
- **upb `Campaign_parse`** fully decodes the entire message tree into
  arena-allocated structs upfront. Subsequent field reads are direct struct
  member lookups (~1 ns).

The result: roto's parse is faster and allocation-free; upb's field access
after parsing is faster. For workloads that read every field, the advantage
shifts toward upb, because its upfront decode is amortized over many cheap
reads; for workloads that read a handful of fields from large messages, roto
wins.

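
The trade-off above can be made concrete with a simple cost model: total time is the parse cost plus the number of field reads multiplied by the per-read cost. The sketch below is illustrative only. The function names are hypothetical, and any numbers passed to them are placeholders rather than measurements from these benchmarks.

```rust
/// Total cost of one parse followed by `reads` field accesses.
/// All costs are in the same (arbitrary) time unit.
fn total_cost(parse: f64, per_read: f64, reads: u64) -> f64 {
    parse + per_read * reads as f64
}

/// Smallest number of reads at which an eager decoder (expensive parse, cheap
/// reads) is at least as cheap overall as a lazy one (cheap parse, costlier
/// reads). Returns `None` if the eager decoder never catches up.
fn break_even_reads(
    lazy_parse: f64,
    lazy_read: f64,
    eager_parse: f64,
    eager_read: f64,
) -> Option<u64> {
    if eager_parse <= lazy_parse {
        return Some(0); // eager is already no worse before any read happens
    }
    if lazy_read <= eager_read {
        return None; // lazy reads are not slower, so the gap never closes
    }
    let reads = (eager_parse - lazy_parse) / (lazy_read - eager_read);
    Some(reads.ceil() as u64)
}
```

Plugging in the per-group timings reported by the two suites for a given data file gives a rough estimate of how many field reads per message it takes before the eager model pays off.
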
## Literature

https://protobuf.dev/programming-guides/encoding/