# roto
Zero-allocation Rust protobuf reader and writer.
## Overview
Instead of deserializing binary protobuf data into Rust structs, roto scans a message _once_ on
construction, recording the byte offset of each field, then reads fields on demand directly from
the original bytes. No heap allocation, no data copying, no full deserialization upfront.

Writing works the same way: you provide a fixed buffer and a builder writes fields directly into it,
returning a slice of the bytes written.
## Design
`protoc` generates a `CodeGeneratorRequest` message; `protoc-gen-roto` (in
`src/bin/protoc-gen-roto.rs`) reads it from stdin, generates Rust source files, and writes a
`CodeGeneratorResponse` to stdout. `protoc` then writes those `.rs` files to disk. The generated
files are included directly in the crate that uses the protobuf messages.

Sample usage:
```sh
protoc -Iproto/ proto/hackers.proto --plugin=./target/debug/protoc-gen-roto --roto_out=src/
```
This generates a single file, `src/hackers.rs`.
## Generated code
For each protobuf message roto generates two types:
- **Reader struct** `MessageName<'a>`: borrows the original byte slice, zero-copy.
- **Builder struct** `MessageNameBuilder<'b>`: writes into a caller-provided `&mut [u8]`.

Nested message types are placed in a `pub mod message_name { ... }` module (snake_case of the
parent message name) within the same generated file.
## Sample usage
Given this proto definition:
```proto
message Hello {
  string hello_world = 1;
  message InnerWorld {
    string thought = 1;
  }
  InnerWorld inner_world = 2;
}
```
### Reading
```rust
fn parse_proto(data: &[u8]) -> roto::Result<String> {
    // Scan the data once, recording field offsets
    let hello = Hello::new(data)?;
    // String fields return &str borrowed from the original bytes (zero-copy)
    let hello_world: &str = hello.hello_world()?;
    // Nested message fields return &[u8]; construct the nested reader from those bytes
    let inner_bytes: &[u8] = hello.inner_world()?;
    let inner_world = hello::InnerWorld::new(inner_bytes)?;
    let thought: &str = inner_world.thought()?;
    Ok(format!("{} is about {}", hello_world, thought))
}
```
Fields absent from the binary data return `Err(roto::RotoError::FieldNotFound)`.
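
Because an absent field surfaces as an error rather than as the proto3 default, callers that want proto3-style defaults can map that one error to `Default::default()`. The following is a minimal sketch of the idea, not part of roto: `RotoError` here is a local stand-in so the block compiles on its own (only `FieldNotFound` comes from the text above; `Malformed` is hypothetical), and `or_default` is a hypothetical helper name.

```rust
// Local stand-in for roto's error type so the sketch compiles on its own.
// Only `FieldNotFound` is taken from the text above; `Malformed` is hypothetical.
#[derive(Debug, PartialEq)]
enum RotoError {
    FieldNotFound,
    Malformed,
}

type Result<T> = std::result::Result<T, RotoError>;

/// Maps an absent field to the default value of its type and passes every
/// other outcome (a present value or a different error) through unchanged.
fn or_default<T: Default>(field: Result<T>) -> Result<T> {
    match field {
        Err(RotoError::FieldNotFound) => Ok(T::default()),
        other => other,
    }
}
```

With a helper like this, a call such as `or_default(hello.hello_world())?` would yield `""` for a missing string field while still propagating any other error.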
### Writing
Nested messages must be serialized into a scratch buffer first, then embedded as raw bytes in the
outer builder.
```rust
fn build_proto(buf: &mut [u8]) -> roto::Result<&[u8]> {
    // Serialize the inner message first
    let mut inner_buf = [0u8; 256];
    let inner_bytes = hello::InnerWorldBuilder::builder(&mut inner_buf)
        .thought("some thought")?
        .finish()?;
    // Build the outer message, embedding the serialized inner bytes
    let written = HelloBuilder::builder(buf)
        .hello_world("some world")?
        .inner_world(inner_bytes)?
        .finish()?; // &'b mut [u8]: the written portion of buf
    // Reborrow as a shared slice to match the declared return type
    Ok(&*written)
}
```
Builder methods consume `self` and return `Result<Self>`, enabling `?`-based chaining.
`finish()` returns `Result<&'b mut [u8]>`: a slice of the portion of the buffer that was written.
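
To make the consume-and-return shape concrete, here is a minimal, self-contained sketch of a builder that writes into a caller-provided buffer. It is not roto's generated code: `SketchBuilder`, `BuildError`, and `put` are hypothetical names, and the builder appends raw bytes rather than encoded protobuf fields.

```rust
// Hypothetical error and builder types; they mirror the shape described
// above but are not roto's actual definitions.
#[derive(Debug, PartialEq)]
enum BuildError {
    BufferTooSmall,
}

struct SketchBuilder<'b> {
    buf: &'b mut [u8],
    len: usize, // number of bytes written so far
}

impl<'b> SketchBuilder<'b> {
    fn builder(buf: &'b mut [u8]) -> Self {
        SketchBuilder { buf, len: 0 }
    }

    /// Appends raw bytes, consuming the builder and handing it back on success.
    fn put(mut self, bytes: &[u8]) -> Result<Self, BuildError> {
        let end = self
            .len
            .checked_add(bytes.len())
            .ok_or(BuildError::BufferTooSmall)?;
        if end > self.buf.len() {
            return Err(BuildError::BufferTooSmall);
        }
        self.buf[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(self)
    }

    /// Returns only the written prefix of the caller's buffer.
    fn finish(self) -> Result<&'b mut [u8], BuildError> {
        let SketchBuilder { buf, len } = self;
        Ok(&mut buf[..len])
    }
}

/// Chains two writes with `?` and returns the written bytes as a shared slice.
fn build(buf: &mut [u8]) -> Result<&[u8], BuildError> {
    let written = SketchBuilder::builder(buf).put(b"ab")?.put(b"cd")?.finish()?;
    Ok(&*written)
}
```

Taking `self` by value means a failed write drops the builder, so a half-written builder cannot be used again by accident.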
### Updating messages
You can read a message, modify specific fields, and use `.with()` to copy the remaining fields from the original binary.
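```rust
// Two reference parameters defeat lifetime elision, so the returned slice is
// tied to `buf` explicitly.
fn update_proto<'b>(data: &[u8], buf: &'b mut [u8]) -> roto::Result<&'b [u8]> {
    let msg = Message::new(data)?;
    let mut builder = MessageBuilder::builder(buf);
    // Override `foo` only when it currently holds "bar"
    if msg.foo()? == "bar" {
        builder = builder.foo("foosbar")?;
    }
    // Copy every field that was not set above from the original bytes
    let written = builder.with(&msg)?.finish()?;
    Ok(&*written)
}
```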
### Repeated fields
Repeated fields return a `RepeatedFieldIterator<'a>`. Each item is a `Result<(&[u8], WireType)>`.
```rust
// Assumes the message declares a `repeated` field named `tags`
let hello = Hello::new(data)?;
for item in hello.tags() {
    let (value_bytes, _wire_type) = item?;
    // decode value_bytes according to the expected wire type
}
```
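
The "decode according to the expected wire type" step might look like the sketch below for two common cases, a string element and a `float` element. This is an illustration under stated assumptions, not roto's API: `WireType` is a local stand-in whose variant names are guesses, `value_bytes` is assumed to hold only the payload (no length prefix), and both helper functions are hypothetical.

```rust
use std::convert::TryInto;
use std::str;

// Local stand-in for roto's `WireType`; the variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WireType {
    Varint,
    Fixed64,
    LengthDelimited,
    Fixed32,
}

/// Interprets one repeated item as a UTF-8 string, borrowing from the input.
/// Returns None on a wire-type mismatch or invalid UTF-8.
fn decode_string_item(value_bytes: &[u8], wire_type: WireType) -> Option<&str> {
    if wire_type != WireType::LengthDelimited {
        return None;
    }
    str::from_utf8(value_bytes).ok()
}

/// Interprets one repeated item as a little-endian `float`.
/// Returns None on a wire-type mismatch or if the item is not exactly 4 bytes.
fn decode_f32_item(value_bytes: &[u8], wire_type: WireType) -> Option<f32> {
    if wire_type != WireType::Fixed32 {
        return None;
    }
    let raw: [u8; 4] = value_bytes.try_into().ok()?;
    Some(f32::from_le_bytes(raw))
}
```

Checking the wire type before decoding guards against reading a field whose schema changed between writer and reader.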
## Runtime API
The core runtime in `src/lib.rs` provides:
- `ProtoAccessor<'a>`: scans a message's fields and reads values at recorded offsets.
- `ProtoBuilder<'a>`: writes fields into a provided `&mut [u8]` buffer.
- `FieldIterator<'a>` / `RepeatedFieldIterator<'a>`: iterators over fields and repeated fields.
- `Tag`, `WireType`: protobuf encoding primitives.
- `read_varint`, `write_varint`, `skip_value`: low-level wire-format helpers.
- `RotoError`, `Result<T>`: error type and alias.
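
For readers unfamiliar with the wire format, the varint helpers named above implement protobuf's base-128 encoding: seven payload bits per byte, least-significant group first, with the high bit set on every byte except the last. The sketch below shows that encoding with allocation-free functions. The signatures are assumptions chosen for illustration and may not match roto's `read_varint` and `write_varint`.

```rust
/// Encodes `value` as a base-128 varint into `buf`.
/// Returns the number of bytes written, or None if `buf` is too small.
fn write_varint(mut value: u64, buf: &mut [u8]) -> Option<usize> {
    let mut i = 0;
    loop {
        let slot = buf.get_mut(i)?;
        let low = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            *slot = low; // last byte: continuation bit clear
            return Some(i + 1);
        }
        *slot = low | 0x80; // more bytes follow
        i += 1;
    }
}

/// Decodes a base-128 varint from the front of `data`.
/// Returns (value, bytes consumed), or None if the input is truncated
/// or longer than the 10 bytes a u64 can occupy.
fn read_varint(data: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in data.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

For example, 300 encodes to the two bytes `0xac 0x02`, and `u64::MAX` takes the full ten bytes.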
## High-level design
On construction (`MessageName::new(data)`), the generated reader struct iterates the binary once
using `FieldIterator` and records the byte offset of each field's tag. Subsequent field accesses
call `ProtoAccessor::get_value_at(offset)` and never re-scan. For repeated fields, the start and
end offsets of the field range are recorded to bound iteration efficiently.
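
The scan-once, read-on-demand approach can be sketched in a few dozen lines. The code below is an illustration of the technique only, not roto's implementation: `OffsetIndex`, `value_len`, the accessor names, and the `MAX_FIELD` cap are all hypothetical, groups are not handled, and the start/end range tracking for repeated fields is omitted.

```rust
use std::convert::TryFrom;

const MAX_FIELD: usize = 16; // hypothetical cap on field numbers for this sketch

/// Decodes a base-128 varint, returning (value, bytes consumed).
fn read_varint(data: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in data.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Encoded length of the value that follows a tag, derived from its wire type.
fn value_len(wire_type: u64, rest: &[u8]) -> Option<usize> {
    match wire_type {
        0 => read_varint(rest).map(|(_, n)| n), // varint
        1 => Some(8),                           // fixed64
        2 => {
            // length-delimited: prefix length plus payload length
            let (len, n) = read_varint(rest)?;
            n.checked_add(usize::try_from(len).ok()?)
        }
        5 => Some(4), // fixed32
        _ => None,    // groups are not handled in this sketch
    }
}

struct OffsetIndex<'a> {
    data: &'a [u8],
    tag_offsets: [Option<usize>; MAX_FIELD + 1], // indexed by field number
}

impl<'a> OffsetIndex<'a> {
    /// Single pass over `data`: record where each field's tag starts,
    /// skipping over values without decoding them.
    fn new(data: &'a [u8]) -> Option<Self> {
        let mut tag_offsets: [Option<usize>; MAX_FIELD + 1] = [None; MAX_FIELD + 1];
        let mut pos = 0;
        while pos < data.len() {
            let (tag, tag_len) = read_varint(&data[pos..])?;
            let value_start = pos + tag_len;
            let end = value_start.checked_add(value_len(tag & 7, &data[value_start..])?)?;
            if end > data.len() {
                return None; // value runs past the end of the buffer
            }
            let number = usize::try_from(tag >> 3).ok()?;
            if let Some(slot) = tag_offsets.get_mut(number) {
                *slot = Some(pos); // later occurrences overwrite earlier ones
            }
            pos = end;
        }
        Some(OffsetIndex { data, tag_offsets })
    }

    /// Reads a varint field by jumping straight to its recorded tag offset.
    fn varint(&self, number: usize) -> Option<u64> {
        let offset = (*self.tag_offsets.get(number)?)?;
        let (tag, tag_len) = read_varint(&self.data[offset..])?;
        if tag & 7 != 0 {
            return None;
        }
        read_varint(&self.data[offset + tag_len..]).map(|(value, _)| value)
    }

    /// Returns a length-delimited field's payload, borrowed from the input.
    fn bytes(&self, number: usize) -> Option<&'a [u8]> {
        let offset = (*self.tag_offsets.get(number)?)?;
        let (tag, tag_len) = read_varint(&self.data[offset..])?;
        if tag & 7 != 2 {
            return None;
        }
        let (len, n) = read_varint(&self.data[offset + tag_len..])?;
        let start = offset + tag_len + n;
        self.data.get(start..start.checked_add(usize::try_from(len).ok()?)?)
    }
}
```

The index is a fixed-size array on the stack, so construction allocates nothing, and a nested message's payload is skipped in one step using its length prefix.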
## Benchmarks
Two benchmark suites (Rust with criterion, and C with upb) share the same binary data files and
the same five measurement groups:

| Group | What is timed |
| ------------------- | ------------------------------------------------------- |
| `shallow_parse` | Become ready to read any field (one scan / full decode) |
| `deep_parse` | Walk the full tree: Campaign → Operations → Hackers |
| `field_access` | Read individual fields on an already-parsed message |
| `iterate` | Count top-level and nested repeated fields |
| `read_update_write` | Parse, update a field, and serialize back to a buffer |
### 1 — Generate the shared data files (do this once)
Data files are written to `data/bench/`.
``` sh
cargo run --release --bin gen_bench_data -- --preset tiny
cargo run --release --bin gen_bench_data -- --preset small
cargo run --release --bin gen_bench_data -- --preset medium
cargo run --release --bin gen_bench_data -- --preset large
```
For even larger inputs use `--preset huge` (~500 MB) or set the knobs
directly:
``` sh
# ~50 MB: 500 operations × 100 KB stolen_data each
cargo run --release --bin gen_bench_data -- --ops 500 --stolen-kb 100 --output data/bench/50mb.pb
```
### 2 — Rust benchmark (criterion)
``` sh
cargo bench --bench hackers_bench
```
HTML reports are written to `target/criterion/`. To run a single group:
``` sh
cargo bench --bench hackers_bench -- shallow_parse
```
### 3 — C / upb benchmark
Requires protobuf ≥ 21 with `protoc-gen-upb` (ships with modern `protoc`).
``` sh
cd upb_test
make # compiles hackers_bench from the pre-generated upb files
./hackers_bench
```
To regenerate the upb C files from `proto/hackers.proto`:
``` sh
cd upb_test && make regen
```
### 4 — Results
Measured on Linux x86-64 with the four standard presets. Rust times are
criterion medians; C/upb times are the custom runner's mean over ≥ 0.5 s.
#### `shallow_parse` — cost to become ready to read any field
| Size | Bytes | roto (ns) | upb (ns) | roto speedup |
| ------ | ----------: | --------: | -----------: | -----------: |
| tiny | 588 | 32.7 | 606.2 | **18.5×** |
| small | 20,265 | 182.9 | 22,619.2 | **123.7×** |
| medium | 2,071,053 | 16,632.0 | 5,346,977.2 | **321×** |
| large | 102,608,384 | 1,618.6 | 41,132,079.7 | **25,411×** |
> roto's cost is O(number of top-level fields): it records field offsets by
> jumping past nested blobs using their length prefixes. upb fully decodes the
> entire tree — including all nested messages and raw byte payloads — into
> arena-allocated structs.
#### `deep_parse` — parse + walk Campaign → Operations → every Hacker handle
| Size | Bytes | roto (ns) | upb (ns) | roto speedup |
| ------ | --------: | ----------: | ----------: | -----------: |
| tiny | 588 | 385.3 | 596.8 | **1.55×** |
| small | 20,265 | 13,374.0 | 22,321.6 | **1.67×** |
| medium | 2,071,053 | 1,454,400.0 | 4,227,384.3 | **2.91×** |
> roto pays one extra `::new()` scan per nesting level; upb's walk is pure
> pointer-chasing because everything was decoded upfront. roto is still
> faster overall because its per-level scans cost less than upb's full decode.
#### `field_access` — individual field reads on a pre-parsed message (`small` preset)
| Field | roto (ns) | upb (ns) | upb speedup |
| ------------------------------ | --------: | -------: | ----------: |
| `campaign::name` | 14.3 | 1.11 | **12.9×** |
| `campaign::total_bytes_stolen` | 7.1 | 1.74 | **4.1×** |
| `operation::codename` | 13.8 | 1.76 | **7.8×** |
| `operation::timestamp` | 9.7 | 1.40 | **6.9×** |
| `operation::successful` | 7.0 | 1.13 | **6.1×** |
| `hacker::handle` | 14.4 | 1.56 | **9.2×** |
| `hacker::skill_level` (f32) | 7.7 | 1.76 | **4.4×** |
| `hacker::is_elite` (bool) | 7.5 | 1.14 | **6.6×** |
| `worm::polymorphic` (bool) | 7.5 | 1.76 | **4.2×** |
| `worm::payload` (bytes) | 16.6 | 1.75 | **9.5×** |
> After parsing, upb field reads are direct struct-member lookups (~1–2 ns).
> roto re-decodes the value at its pre-recorded byte offset on every call
> (~7–17 ns). This is the one area where upb holds a clear advantage.
#### `iterate` — count repeated fields (parse included in every iteration)
| Benchmark | Size | roto (ns) | upb (ns) | roto speedup |
| ------------------ | ------ | --------: | ----------: | -----------: |
| `count_operations` | tiny | 50.0 | 600.2 | **12.0×** |
| `count_operations` | small | 393.7 | 22,702.9 | **57.7×** |
| `count_operations` | medium | 36,628.0 | 4,193,874.0 | **114.5×** |
| `count_all_crew` | tiny | 235.3 | 610.2 | **2.6×** |
| `count_all_crew` | small | 4,369.5 | 23,109.0 | **5.3×** |
| `count_all_crew` | medium | 444,930.0 | 4,151,181.5 | **9.3×** |
> `count_operations` includes parsing; upb's O(1) array-length read is
> dominated by its full-decode cost, so roto still wins by a wide margin
> (12–115×), though a smaller one than in `shallow_parse`, since roto also has
> to iterate the repeated field. `count_all_crew` additionally parses each
> `Operation` sub-message; roto's per-level scans remain cheaper than upb's
> full decode.
#### `read_update_write` — parse, update a field, and serialize back to a buffer
| Size | Bytes | roto (ns) | upb (ns) | roto speedup |
| ------ | --------: | --------: | ----------: | -----------: |
| tiny | 588 | 153.8 | 1,120.3 | **7.3×** |
| small | 20,265 | 1,301.8 | 42,089.6 | **32.3×** |
| medium | 2,071,053 | 302,090.0 | 9,233,397.9 | **30.5×** |
> roto's `with()` method copies unmodified fields directly from the original
> binary without decoding them, so an update avoids both a full decode and a
> full re-encode. upb must fully parse the message into structs and then
> re-serialize the entire tree.
### Interpreting the comparison
The two libraries have fundamentally different models:
- **roto `shallow_parse`** does one linear scan recording byte offsets, with no
  allocation and no field decoding. Subsequent field reads decode on demand at
  the stored offset.
- **upb `Campaign_parse`** fully decodes the entire message tree into
  arena-allocated structs upfront. Subsequent field reads are direct struct
  member lookups (~1–2 ns).

The result: roto's parse is faster and allocation-free; upb's field access
after parsing is faster. For workloads that read a handful of fields from
large messages roto wins; for workloads that read fields of the same parsed
message many times over, upb's cheaper reads can eventually outweigh its
parse cost.
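
The trade-off can be made concrete with a simple break-even calculation. The sketch below assumes, as a simplification, that total cost is one parse plus a fixed cost per field read and ignores the extra `::new()` scans roto needs for nested messages; `break_even_reads` is a hypothetical helper, not part of either benchmark suite.

```rust
/// Number of field reads after which a "cheap parse, costly read" design
/// stops being faster than a "costly parse, cheap read" design.
/// All arguments are in nanoseconds. Returns None when there is no
/// crossover because one design is at least as good on both counts.
fn break_even_reads(
    lazy_parse: f64,
    lazy_read: f64,
    eager_parse: f64,
    eager_read: f64,
) -> Option<f64> {
    let parse_saving = eager_parse - lazy_parse; // paid once
    let read_penalty = lazy_read - eager_read; // paid on every read
    if parse_saving <= 0.0 || read_penalty <= 0.0 {
        return None;
    }
    Some(parse_saving / read_penalty)
}
```

Plugging in the `small` preset figures from the tables above (`shallow_parse` at 182.9 ns vs 22,619.2 ns, and `campaign::name` reads at 14.3 ns vs 1.11 ns) gives a crossover at roughly 1,700 reads of that field per parsed message.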
## Protobuf Spec Validation
The goal is to validate roto's implementation against the Proto3 specification.
### Supported Features
- **Scalar Types**: `double`, `float`, `int32`, `int64`, `uint32`, `uint64`, `sint32`, `sint64`, `fixed32`, `fixed64`, `sfixed32`, `sfixed64`, `bool`, `string`, `bytes`.
- **Messages**: Top-level and nested message definitions.
- **Enums**: Enum definitions with `from_i32` conversion.
- **Field Labels**: Singular and `repeated` fields.
### Unsupported Features
- **Default Values**: Missing fields return `RotoError::FieldNotFound` instead of the Proto3 default value (e.g., `0` or `""`).
- **Field Presence**: No `has_field()` methods are generated to distinguish between a field being absent and a field being set to its default value.
- **`oneof` Fields**: Not currently supported in code generation.
- **`map` Fields**: Not currently supported.
- **Packed Repeated Fields**: `roto` expects individual tags for all repeated elements; it does not support the packed encoding used for scalar numeric types in Proto3.
- **Reserved Fields**: `reserved` statements are ignored.
- **Services**: `service` and `rpc` definitions are ignored.
- **Options**: Field and message options are ignored.
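
For context on the packed-encoding limitation listed above: a packed repeated scalar arrives as a single length-delimited field whose payload is the element values laid end to end. A caller that can get hold of that raw payload could decode it by hand, along the lines of the sketch below. This is an illustration only; `decode_packed_varints` is a hypothetical helper, and how the payload would be obtained from roto is left open.

```rust
/// Decodes a base-128 varint, returning (value, bytes consumed).
fn read_varint(data: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in data.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Decodes the payload of one packed repeated varint field into `out`.
/// Returns how many values were decoded, or None if the payload is
/// truncated or `out` is too small to hold every value.
fn decode_packed_varints(payload: &[u8], out: &mut [u64]) -> Option<usize> {
    let mut pos = 0;
    let mut count = 0;
    while pos < payload.len() {
        let (value, consumed) = read_varint(&payload[pos..])?;
        *out.get_mut(count)? = value;
        count += 1;
        pos += consumed;
    }
    Some(count)
}
```

Writing into a caller-provided `&mut [u64]` keeps the helper allocation-free, in keeping with the rest of the library.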
### Tasks
- [x] Analyze `roto/codegen` to determine which protobuf constructs are supported during code generation.
- [x] Analyze `roto/runtime` to determine which wire types and protobuf types are supported during reading and writing.
- [x] Compare findings with the Proto3 spec (https://protobuf.dev/reference/protobuf/proto3-spec/).
- [x] Document supported and unsupported features in the README.