fastjson 0.1.0 benchmark baseline

A PHP extension that wraps yyjson behind a fastjson_* API mirroring ext/json. Side-by-side numbers against stock ext/json and simdjson_php on a 21-file corpus. Baseline is vanilla PHP 8.6.0-dev built from current master. For encode we also include ext/json with PR-120 applied (the SIMD encode patch is currently open against PHP master, not yet merged) so the patch's effect is visible.

Hardware: i9-13950HX
Build: release, both PHP and extensions -O2
Iterations: 200 per case, slowest 10% dropped
Corpus: 21 files: 15 large (14.81 MB) + 6 small (64.6 KB), from simdjson_php's jsonexamples

Decode (vanilla PHP 8.6.0-dev)

Three decoders, one PHP build. PR-120 doesn't affect decode (encode-only patch), so this section is the same regardless of whether the patch is applied.

ext/json (vanilla 8.6.0-dev): 212 MB/s
fastjson 0.1.0 (vanilla 8.6.0-dev): 578 MB/s
simdjson_php 4.0.1dev (vanilla 8.6.0-dev): 745 MB/s
Aggregate decode → stdClass on the 14.81 MB large corpus. fastjson is 2.73× ext/json, and simdjson is another 1.29× ahead of fastjson. On the small corpus the order flips: fastjson reaches 907 MB/s vs simdjson's 837 MB/s (see the small-corpus table below).

Per-file decode, large corpus

File                   Size      ext/json MB/s  fastjson MB/s  simdjson MB/s  fastjson vs ext/json
apache_builds.json     124 KB    364            1,047          897            2.88×
canada.json            2.15 MB   98             434            609            4.43×
citm_catalog.json      1.65 MB   428            1,153          1,112          2.70×
github_events.json     64 KB     382            1,260          1,262          3.30×
gsoc-2018.json         3.17 MB   345            1,066          1,847          3.09×
instruments.json       215 KB    353            909            862            2.58×
marine_ik.json         2.85 MB   178            302            400            1.70×
mesh.json              707 KB    187            478            675            2.56×
mesh.pretty.json       1.50 MB   254            842            1,217          3.32×
numbers.json           147 KB    240            1,001          959            4.18×
random.json            498 KB    242            500            504            2.07×
stringifiedphp.json    140 KB    350            2,669          3,213          7.62×
twitter.json           617 KB    394            981            1,020          2.49×
twitterescaped.json    549 KB    305            841            683            2.75×
update-center.json     521 KB    260            617            620            2.37×
aggregate              14.81 MB  212            578            745            2.73×

Per-file decode, small corpus

fastjson takes the small-corpus aggregate ahead of simdjson (907 vs 837 MB/s). Per-call setup cost dominates at this scale; fastjson's startup is lighter than simdjson's tape-format parser.

File                   Size      ext/json MB/s  fastjson MB/s  simdjson MB/s  fastjson ns/call  fastjson vs ext/json
flatadversarial.json   64 B      126            246            228            248               1.95×
adversarial.json       80 B      165            330            317            231               2.00×
demo.json              387 B     322            782            686            472               2.43×
repeat.json            11.1 KB   432            947            1,009          11,400            2.19×
truenull.json          11.7 KB   200            889            805            12,900            4.45×
twitter_timeline.json  41.2 KB   304            910            816            44,200            2.99×
aggregate              64.6 KB   291            907            837            -                 3.12×
fastjson wins 5 of 6 small files; simdjson overtakes it only on the 11 KB repeat.json case.

Encode (PR-120 effect visible)

This section adds a third column: ext/json on PHP 8.6.0-dev with PR-120 applied. The patch SIMD-accelerates long-string encoding; on a JSON-shaped corpus the SIMD setup cost amortizes only on inputs with large string payloads.

ext/json (vanilla 8.6.0-dev): 178 MB/s
ext/json + PR-120 (8.6.0-dev with SIMD encode patch): 159 MB/s
fastjson 0.1.0 (vanilla 8.6.0-dev): 1,034 MB/s
Aggregate encode on the 14.81 MB corpus. fastjson is 5.80× vanilla ext/json, 6.51× ext/json with PR-120. PR-120 lands at 0.89× vanilla on this corpus shape (mostly small fields).
PR-120's effect is workload-shape dependent. The SIMD encoder shines on inputs with long string payloads. The microbenchmark json_encode(str_repeat('a', 1048576)) reports 10,032 MB/s with PR-120 vs 1,597 MB/s on vanilla 8.6 — a 6.28× SIMD win. On JSON-shaped corpora with many small string fields, per-call SIMD setup dominates and the aggregate lands below vanilla. The per-file table makes the workload split visible.
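The long-string microbenchmark quoted above can be approximated with a few lines of plain PHP. This is a minimal sketch, not the script that produced the published numbers: the iteration count and the plain median are assumptions, since the text does not say how this particular microbenchmark was timed.

```php
<?php
// Long-string encode microbenchmark: a single 1 MiB string of 'a'.
$payload = str_repeat('a', 1048576);

// Assumption: iteration count chosen for illustration only.
$iterations = 50;

$samples = [];
for ($i = 0; $i < $iterations; $i++) {
    $t0 = hrtime(true);                 // monotonic clock, nanoseconds
    $out = json_encode($payload);
    $samples[] = hrtime(true) - $t0;
}

sort($samples);
$medianNs = $samples[intdiv(count($samples), 2)];

// bytes per nanosecond × 1000 = MB/s (1 MB = 10^6 bytes).
$mbPerSec = strlen($out) / $medianNs * 1000;

printf("%d bytes out, %.0f MB/s\n", strlen($out), $mbPerSec);
```

Running the same script under the vanilla build and the patched build gives the two figures to compare; absolute numbers will differ by machine.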

Per-file encode, large corpus

File                   Size      ext/json MB/s  ext/json+PR-120 MB/s  fastjson MB/s  PR-120 vs vanilla  fastjson vs vanilla ext/json
apache_builds.json     124 KB    1,098          1,051                 1,643          0.96×              1.50×
canada.json            2.15 MB   60             54                    717            0.89×              11.84×
citm_catalog.json      1.65 MB   2,280          2,153                 3,139          0.94×              1.38×
github_events.json     64 KB     1,187          1,093                 2,141          0.92×              1.80×
gsoc-2018.json         3.17 MB   716            903                   1,310          1.26×              1.83×
instruments.json       215 KB    1,760          1,678                 2,183          0.95×              1.24×
marine_ik.json         2.85 MB   131            113                   657            0.86×              5.00×
mesh.json              707 KB    90             78                    745            0.87×              8.30×
mesh.pretty.json       1.50 MB   199            173                   1,690          0.87×              8.49×
numbers.json           147 KB    57             46                    667            0.81×              11.65×
random.json            498 KB    591            484                   831            0.82×              1.41×
stringifiedphp.json    140 KB    725            1,527                 2,921          2.11×              4.03×
twitter.json           617 KB    1,066          893                   1,692          0.84×              1.59×
twitterescaped.json    549 KB    949            922                   1,469          0.97×              1.55×
update-center.json     521 KB    804            887                   1,076          1.10×              1.34×
aggregate              14.81 MB  178            159                   1,034          0.89×              5.80×
fastjson is fastest on every file. PR-120 beats vanilla ext/json on 3 files and loses on 12. Its clearest wins are stringifiedphp (long stringified PHP source, 2.11×) and gsoc-2018 (date-heavy strings, 1.26×).

Per-file encode, small corpus

File                   Size      ext/json MB/s  fastjson MB/s  fastjson ns/call  ext/json ns/call  fastjson vs ext/json
flatadversarial.json   64 B      343            459            133               178               1.34×
adversarial.json       80 B      687            631            121               111               0.92×
demo.json              387 B     1,318          1,557          237               280               1.18×
repeat.json            11.1 KB   990            1,745          6,200             10,900            1.76×
truenull.json          11.7 KB   2,071          2,356          4,900             5,500             1.14×
twitter_timeline.json  41.2 KB   1,091          1,707          23,600            36,900            1.56×
aggregate              64.6 KB   1,169          1,794          -                 -                 1.53×
ext/json takes adversarial.json (80 B) by ~10 ns/call: fastjson's per-call entry is heavier on the smallest inputs.

Validate (vanilla PHP 8.6.0-dev)

fastjson's edge comes from vendor patch P-002 (YYJSON_READ_VALIDATE_ONLY), which adds a no-tree validate entry point to yyjson and drops peak memory 2.7× vs the stock read path. simdjson holds the validate crown via tape-format short-circuit.

ext/json (vanilla 8.6.0-dev): 244 MB/s
fastjson 0.1.0 (vanilla 8.6.0-dev): 1,344 MB/s
simdjson_php 4.0.1dev (vanilla 8.6.0-dev): 1,970 MB/s
Aggregate validate on the 14.81 MB large corpus. fastjson is 5.51× ext/json; simdjson is another 1.47× ahead.

Per-file validate, large corpus

File                   Size      ext/json MB/s  fastjson MB/s  simdjson MB/s  fastjson vs ext/json
apache_builds.json     124 KB    385            2,291          3,958          5.95×
canada.json            2.15 MB   106            906            1,259          8.55×
citm_catalog.json      1.65 MB   565            2,407          4,038          4.26×
github_events.json     64 KB     444            3,015          4,720          6.79×
gsoc-2018.json         3.17 MB   373            1,550          4,250          4.16×
instruments.json       215 KB    437            2,129          3,559          4.87×
marine_ik.json         2.85 MB   235            895            1,225          3.81×
mesh.json              707 KB    207            1,292          1,269          6.24×
mesh.pretty.json       1.50 MB   262            1,696          2,033          6.48×
numbers.json           147 KB    251            1,482          1,585          5.91×
random.json            498 KB    310            1,515          2,538          4.90×
stringifiedphp.json    140 KB    360            2,849          3,337          7.91×
twitter.json           617 KB    482            2,736          3,908          5.67×
twitterescaped.json    549 KB    353            2,376          1,736          6.73×
update-center.json     521 KB    316            2,150          3,078          6.81×
aggregate              14.81 MB  244            1,344          1,970          5.51×
fastjson is fastest on 2 files (mesh and twitterescaped); simdjson's tape-based validator is fastest on the other 13.

Per-file validate, small corpus

File                   Size      ext/json MB/s  fastjson MB/s  simdjson MB/s  fastjson ns/call  fastjson vs ext/json
flatadversarial.json   64 B      157            492            581            124               3.15×
adversarial.json       80 B      195            694            727            110               3.55×
demo.json              387 B     341            1,670          1,717          221               4.89×
repeat.json            11.1 KB   548            2,098          4,016          5,200             3.83×
truenull.json          11.7 KB   222            2,596          2,445          4,400             11.68×
twitter_timeline.json  41.2 KB   380            2,971          3,113          13,600            7.82×
aggregate              64.6 KB   352            2,674          3,040          -                 7.60×

Memory peak

Single-call peak heap delta, aggregated across the 15-file large corpus on vanilla 8.6. simdjson and ext/json land identical for decode because the bench uses simdjson's eager simdjson_decode (builds a full PHP value tree like ext/json). simdjson's lazy walk would lower this, but it's a different programming model and not a fair like-for-like.
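A single-call peak heap delta of this kind can be taken in userland PHP roughly as follows. This is a sketch under assumptions: peakDelta() is a hypothetical helper name, it needs PHP 8.2+ for memory_reset_peak_usage(), and it only sees memory allocated through the Zend memory manager. Whether bench/run.php measures it this way is not stated here.

```php
<?php
/**
 * Peak heap growth, in bytes, during one call to $op($input).
 * Hypothetical helper; requires PHP 8.2+ (memory_reset_peak_usage).
 */
function peakDelta(callable $op, string $input): int
{
    gc_collect_cycles();            // clear collectable garbage first
    memory_reset_peak_usage();      // peak now equals current usage
    $before = memory_get_usage();

    $result = $op($input);          // hold the result so it counts toward the peak
    $peak = memory_get_peak_usage();

    unset($result);
    return $peak - $before;
}

// Usage: compare decoders on the same in-memory document.
$json = json_encode(range(1, 10000));
$delta = peakDelta('json_decode', $json);
printf("json_decode peak delta: %d bytes\n", $delta);
```

Memory a library obtains directly from malloc() is invisible to these counters, so a figure taken this way is a lower bound for any extension that does not route its allocations through PHP's allocator.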

Decode → stdClass

ext/json: 56.4 MB
fastjson: 97.8 MB (1.74×)
simdjson: 56.4 MB (1.00×)
fastjson holds yyjson's full doc alongside the zval output until the walker finishes; simdjson and ext/json each free the parser state as they emit zvals.

Encode

ext/json: 11.2 MB
fastjson: 11.9 MB (1.06×)
simdjson: no encode
Direct-write encoder: walks zvals straight into smart_str using yyjson primitives (yyjson_write_number, yyjson_write_string_to_buf). Near-parity with ext/json.

Validate

ext/json: ~80 B
fastjson: 14.9 MB (101×)
simdjson: ~0 B
The ext/json and simdjson validators stream the input. fastjson uses vendor patch P-002 for no-tree validate (2.7× less memory than stock yyjson) but still copies the input buffer; it is not yet streaming.

Where each tool lands

ext/json

Always there. No install. Streaming validate at ~80 bytes of state. PR-120 (currently open) accelerates encoder long-string paths but is a net loss on JSON-shaped corpora today.

fastjson 0.1.0

Drop-in replacement. fastjson_encode/decode/validate match json_* argument-for-argument; json_last_error-compatible. Coexists with ext/json so adoption is per call site. Best encode by 5-6×, best small-input decode.

simdjson_php

Decode + validate only. Separate simdjson_* API with different semantics around lazy parsing. Fastest large-input decode and validate; loses to fastjson on small inputs. Not a drop-in for code calling json_*.
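Per-call-site adoption of fastjson, as described above, could look like the wrapper below. It is a sketch that assumes only what the text states: that fastjson_decode() accepts the same arguments as json_decode(). The wrapper name decodeJson() is hypothetical, and the code falls back to ext/json when the extension is not loaded.

```php
<?php
declare(strict_types=1);

/**
 * Decode with fastjson when the extension is loaded, else with ext/json.
 * Assumption: fastjson_decode() mirrors json_decode() argument-for-argument.
 */
function decodeJson(string $json, ?bool $assoc = null, int $depth = 512, int $flags = 0): mixed
{
    if (function_exists('fastjson_decode')) {
        return fastjson_decode($json, $assoc, $depth, $flags);
    }
    return json_decode($json, $assoc, $depth, $flags);
}

// Only the call sites that switch to decodeJson() change behavior;
// everything still calling json_decode() directly is untouched.
$data = decodeJson('{"id": 7, "tags": ["a", "b"]}', true);
```

Because both extensions can be loaded side by side, a hot path can be moved over first and the rest of the codebase left on json_decode().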

Methodology

What "200 iterations" actually measures

Each (file, operation) cell is the median of 200 timed runs of the operation on the in-memory JSON, with the slowest 10% dropped (warmup + jitter filter). Throughput in MB/s is bytes-in / median-ns × 1000 for decode and validate, and bytes-out / median-ns × 1000 for encode (1 byte/ns = 1,000 MB/s). Timing uses hrtime(true).
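The per-cell arithmetic can be sketched as below. This is one reading of the description above, not the code in bench/run.php: the function names are hypothetical, and it assumes the slowest 10% of samples are discarded before the median is taken.

```php
<?php
declare(strict_types=1);

/**
 * Median of the samples that remain after dropping the slowest 10%.
 * @param list<int> $samplesNs per-iteration timings in nanoseconds (non-empty)
 */
function trimmedMedianNs(array $samplesNs): float
{
    if ($samplesNs === []) {
        throw new InvalidArgumentException('no samples');
    }
    sort($samplesNs);
    $keep = max(1, intdiv(count($samplesNs) * 9, 10)); // fastest 90%
    $kept = array_slice($samplesNs, 0, $keep);
    $mid = intdiv($keep, 2);

    return $keep % 2 === 1
        ? (float) $kept[$mid]
        : ($kept[$mid - 1] + $kept[$mid]) / 2;
}

/** bytes per nanosecond × 1000 = MB/s (1 MB = 10^6 bytes). */
function throughputMBs(int $bytes, float $medianNs): float
{
    return $bytes / $medianNs * 1000;
}

/** Time $op($input) for $iterations runs with hrtime(true). */
function benchCell(callable $op, string $input, int $iterations = 200): float
{
    $samples = [];
    for ($i = 0; $i < $iterations; $i++) {
        $t0 = hrtime(true);
        $op($input);
        $samples[] = hrtime(true) - $t0;
    }
    return trimmedMedianNs($samples);
}
```

For a decode cell, something like throughputMBs(strlen($json), benchCell('json_decode', $json)) gives the MB/s figure; for encode, the byte count would be the length of the encoded output instead.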

How the PR-120 column was produced

PR-120 (php/php-src#20120) is currently open against PHP master, not merged. The vanilla 8.6 baseline is built from php-src master without the patch; the PR-120 column is built from the same tree with the three patched files (ext/json/json_encoder.c, ext/standard/html.c, ext/standard/html.h) applied. Both are release builds (--disable-debug, -O2); same configure flags. fastjson and simdjson_php are rebuilt against each PHP install.

Why simdjson is faster on float-heavy and large inputs

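canada.json is geographic coordinates: an array of arrays of doubles. simdjson's SSE-driven number parser hits multi-GB/s on doubles where yyjson's portable C99 number parser is slower. Similar gaps show up on marine_ik (mesh vertex data) and gsoc-2018 (large date strings). numbers.json is the exception for decode: fastjson edges ahead there (1,001 vs 959 MB/s), while simdjson keeps a small lead on validate (1,585 vs 1,482 MB/s).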

fastjson catches simdjson on object-heavy inputs (citm_catalog, instruments, twitterescaped) because constructing PHP stdClass / array bodies dominates over raw parse, and fastjson writes object property tables via Z_OBJPROP_P directly, bypassing the per-property write-handler dispatch.

Reproducing this
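# Build the extension against the PHP that phpize belongs to
git clone https://github.com/iliaal/fastjson
cd fastjson
phpize && ./configure --enable-fastjson && make -j$(nproc)
# Fetch the benchmark data
./bench/fetch-data.sh

# Run the benchmark on the vanilla 8.6 build with both extensions loaded
PHP=$HOME/php-install-PHP-8.6-vanilla/bin/php
$PHP -d extension=$(pwd)/modules/fastjson.so \
     -d extension=/path/to/simdjson.so \
     bench/run.php bench/data 200 > bench/baseline.md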

Both PHP and fastjson must be release-built (-O2, no --enable-debug). phpize inherits the running PHP's CFLAGS, so building fastjson against a debug PHP gives a debug .so and 2-3× slower throughput. Full recipe at bench/README.md.
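A quick guard against benchmarking on a debug build can be written with the PHP_DEBUG constant, which is 1 when PHP was configured with --enable-debug and 0 otherwise. This is a sketch with a hypothetical helper name; it checks the PHP binary only and cannot tell what optimization level an extension was compiled with.

```php
<?php
/**
 * True when the running PHP binary is a release (non-debug) build.
 * Hypothetical helper; relies on the built-in PHP_DEBUG constant.
 */
function isReleaseBuild(): bool
{
    return PHP_DEBUG === 0;
}

if (!isReleaseBuild()) {
    // Numbers from a debug build are not comparable with the tables above.
    fwrite(STDERR, "warning: debug PHP build detected\n");
}
```

Since phpize inherits the running PHP's CFLAGS, a release PHP is a necessary first check before trusting the throughput of an extension built against it.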

fastjson github.com/iliaal/fastjson · BSD 3-Clause · vendors yyjson 0.12.0 (MIT) with three local patches documented in vendor/yyjson/PATCHES.md