Benchmarks, metrics, and optimization strategies for high-performance JSONL processing
- Constant memory usage regardless of file size: process 100GB files with ~10MB of RAM.
- Begin processing immediately without loading the entire file: under a millisecond to the first record.
- Append records in O(1) time: no file rewriting required, unlike JSON arrays (see the sketch below).
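As a minimal illustration of the O(1) append property, the sketch below writes one record per line in append mode; the file name and fields are hypothetical.

```python
import json

def append_record(path, record):
    """Append one record to a JSONL file; cost is independent of file size."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# A JSON array would require parsing and rewriting the entire file to add one element.
append_record("events.jsonl", {"event": "login", "user_id": 42})
```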
Test setup: 100,000 records, ~50MB file, Python 3.11, MacBook Pro M1
| Format | Library | Parse Time | Memory Peak | Time to First Record |
|---|---|---|---|---|
| JSON Array | json (stdlib) | 1,450ms | 280MB | 1,450ms |
| JSONL | json (stdlib) | 1,380ms | 8MB | <1ms |
| JSONL | orjson | 410ms | 8MB | <1ms |
| JSONL.gz | orjson + gzip | 680ms | 12MB | 15ms |
Key takeaways:

- JSONL parses in roughly the same time as a JSON array with the standard library, but peaks at 8MB of memory instead of 280MB.
- Streaming JSONL delivers the first record in under a millisecond; the array must be fully parsed before any record is available.
- Switching to orjson cuts parse time by more than 3x.
- gzip-compressed JSONL adds modest decompression overhead while keeping memory usage nearly flat.
```python
# Benchmark code
import gzip  # used by the JSONL.gz variant (not shown here)
import json
import time
import tracemalloc

import orjson

def process(record):
    """Placeholder for per-record work."""
    pass

def benchmark_json_array():
    tracemalloc.start()
    start = time.time()
    with open('data.json', 'r') as f:
        data = json.load(f)  # Load entire array into memory
    for record in data:
        process(record)
    elapsed = time.time() - start
    peak = tracemalloc.get_traced_memory()[1] / 1024 / 1024
    tracemalloc.stop()
    print(f"JSON Array: {elapsed*1000:.0f}ms, {peak:.0f}MB")

def benchmark_jsonl_orjson():
    tracemalloc.start()
    start = time.time()
    with open('data.jsonl', 'rb') as f:
        for line in f:
            record = orjson.loads(line)
            process(record)
    elapsed = time.time() - start
    peak = tracemalloc.get_traced_memory()[1] / 1024 / 1024
    tracemalloc.stop()
    print(f"JSONL (orjson): {elapsed*1000:.0f}ms, {peak:.0f}MB")
```
Test setup: 100,000 records, ~50MB file, Node.js 20.x, MacBook Pro M1
| Approach | Parse Time | Memory Peak | Throughput |
|---|---|---|---|
| JSON.parse() entire file | 2,100ms | 350MB | 47 rec/ms |
| readline + JSON.parse() | 1,820ms | 12MB | 55 rec/ms |
| ndjson stream | 1,240ms | 10MB | 81 rec/ms |
```javascript
// Fast JSONL streaming with ndjson
const fs = require('fs');
const ndjson = require('ndjson');

// Use a named handler; `process` is Node's global object and cannot be called.
function handleRecord(record) {
  // Process each record as it arrives
}

fs.createReadStream('data.jsonl')
  .pipe(ndjson.parse())
  .on('data', handleRecord)
  .on('end', () => {
    console.log('Complete');
  });
```
Test setup: 1,000,000 records, ~500MB file, Go 1.21, MacBook Pro M1, 10 cores
| Approach | Parse Time | Throughput | CPU Usage |
|---|---|---|---|
| Single-threaded | 3,200ms | 313 rec/ms | ~100% (1 core) |
| 10 goroutines | 420ms | 2,381 rec/ms | ~900% (9 cores) |
7.6x speedup with parallel processing. JSONL's line-based format is trivially parallelizable.
```go
// Go: Parallel JSONL processing
package main

import (
	"bufio"
	"encoding/json"
	"log"
	"os"
	"sync"
)

func processChunk(lines [][]byte, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, line := range lines {
		var record map[string]interface{}
		if err := json.Unmarshal(line, &record); err != nil {
			continue // skip malformed lines
		}
		// Process record...
	}
}

func main() {
	file, err := os.Open("data.jsonl")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	const chunkSize = 10000
	var chunk [][]byte
	var wg sync.WaitGroup

	for scanner.Scan() {
		// Copy the line: scanner.Bytes() is reused on the next Scan call.
		chunk = append(chunk, append([]byte(nil), scanner.Bytes()...))
		if len(chunk) >= chunkSize {
			wg.Add(1)
			go processChunk(chunk, &wg)
			chunk = nil
		}
	}
	if len(chunk) > 0 {
		wg.Add(1)
		go processChunk(chunk, &wg)
	}
	wg.Wait()
}
```
Memory consumption for various file sizes (streaming JSONL vs loading JSON array):
| File Size | Records | JSON Array Memory | JSONL Memory | Savings |
|---|---|---|---|---|
| 10 MB | 20,000 | ~60 MB | ~5 MB | 92% |
| 100 MB | 200,000 | ~550 MB | ~8 MB | 99% |
| 1 GB | 2,000,000 | ~5.5 GB | ~10 MB | 99.8% |
| 10 GB | 20,000,000 | ~55 GB (OOM) | ~12 MB | 99.98% |
Why the difference? Parsing a JSON array materializes every record as an in-memory object before the first one can be processed; with Python's per-object overhead, that works out to roughly 5-6x the on-disk size. A streaming JSONL reader only ever holds the current line and its decoded object, so memory stays essentially flat no matter how large the file grows.
Measure memory usage in your own applications:
```python
# Python: Profile memory usage
import json
import tracemalloc

def process(obj):
    """Placeholder for per-record work."""
    pass

tracemalloc.start()

# Your processing code here
with open('data.jsonl', 'r') as f:
    for line in f:
        obj = json.loads(line)
        process(obj)

current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 1024 / 1024:.1f} MB")
print(f"Peak: {peak / 1024 / 1024:.1f} MB")
tracemalloc.stop()
```
JSONL compresses extremely well due to repeated field names. Test file: 100MB JSONL with typical user records.
| Compression | Compressed Size | Space Saved | Compress Time | Decompress Time |
|---|---|---|---|---|
| None | 100.0 MB | - | - | - |
| gzip -6 | 12.4 MB | 87.6% | 3.2s | 0.8s |
| gzip -9 | 11.8 MB | 88.2% | 8.1s | 0.8s |
| bzip2 | 8.9 MB | 91.1% | 12.4s | 4.2s |
| xz | 7.2 MB | 92.8% | 28.6s | 1.9s |
| zstd -3 | 11.2 MB | 88.8% | 0.6s | 0.3s |
Recommendations:

- zstd offers the best balance: near-gzip compression at a fraction of the compress and decompress time.
- gzip remains the safe default when broad tooling compatibility matters (zcat, HTTP content encoding, every language's standard library).
- xz and bzip2 only pay off for cold archives where compression time is irrelevant and every megabyte counts.
Process compressed JSONL without decompressing entire file first:
```python
import gzip
import json

# Stream-decompress and process
with gzip.open('data.jsonl.gz', 'rt') as f:
    for line in f:
        obj = json.loads(line)
        # Memory stays constant!
```
```javascript
const fs = require('fs');
const zlib = require('zlib');
const readline = require('readline');

(async () => {
  // Feed the gunzipped stream into readline; createInterface is not a pipe target.
  const rl = readline.createInterface({
    input: fs.createReadStream('data.jsonl.gz').pipe(zlib.createGunzip()),
    crlfDelay: Infinity,
  });

  for await (const line of rl) {
    const obj = JSON.parse(line);
    // Process...
  }
})();
```
```bash
# Decompress and process on-the-fly with jq
zcat data.jsonl.gz | jq '.name'

# Decompress, filter, compress again
zcat input.jsonl.gz | grep '"status":"active"' | gzip > filtered.jsonl.gz
```
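Given the zstd numbers above, here is a minimal sketch of stream-decompressing zstd-compressed JSONL in Python, assuming the third-party `zstandard` package and a hypothetical `data.jsonl.zst` file:

```python
import io
import json

import zstandard  # pip install zstandard

# Stream-decompress a .zst file line by line; memory stays constant.
with open('data.jsonl.zst', 'rb') as raw:
    dctx = zstandard.ZstdDecompressor()
    with dctx.stream_reader(raw) as reader:
        for line in io.TextIOWrapper(reader, encoding='utf-8'):
            obj = json.loads(line)
            # Process...
```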
Performance impact: decompression adds roughly 40-60% CPU overhead, but reading far fewer bytes often makes the pipeline faster end to end, especially when disk or network bandwidth is the bottleneck.
How quickly can you start processing data? Test: 1GB file with 2M records.

| Format | Time to First Record | Why |
|---|---|---|
| JSON Array | 12.4 seconds | Must parse entire file before accessing first record |
| JSONL | <1 millisecond | First record available immediately after reading first line |

That is roughly 12,000x faster startup, which is critical for real-time processing and interactive applications.
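A rough sketch of how you might measure time-to-first-record yourself, assuming `data.json` holds the array variant and `data.jsonl` the line-delimited one:

```python
import json
import time

# JSON array: the whole file must be parsed before the first record exists.
start = time.perf_counter()
with open('data.json', 'r') as f:
    first_array_record = json.load(f)[0]
print(f"JSON array: {(time.perf_counter() - start) * 1000:.1f} ms")

# JSONL: only the first line is needed.
start = time.perf_counter()
with open('data.jsonl', 'r') as f:
    first_jsonl_record = json.loads(next(f))
print(f"JSONL: {(time.perf_counter() - start) * 1000:.3f} ms")
```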
Records processed per second across different languages and approaches:
| Language | Library | Records/sec | Notes |
|---|---|---|---|
| Python | json (stdlib) | 72,000 | Baseline |
| Python | orjson | 244,000 | 3.4x faster |
| Node.js | JSON.parse() | 55,000 | Single-threaded |
| Node.js | ndjson | 81,000 | Optimized streaming |
| Go | encoding/json | 313,000 | Single goroutine |
| Go | encoding/json (parallel) | 2,381,000 | 10 goroutines |
| Rust | serde_json | 520,000 | Zero-copy parsing |
| Command line | jq | 45,000 | General purpose |
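If you want comparable numbers on your own hardware, a small harness along these lines will do; the file name and the orjson dependency are assumptions:

```python
import time

import orjson

def measure_throughput(path):
    """Return records parsed per second for a JSONL file."""
    count = 0
    start = time.perf_counter()
    with open(path, 'rb') as f:
        for line in f:
            orjson.loads(line)
            count += 1
    return count / (time.perf_counter() - start)

print(f"{measure_throughput('data.jsonl'):,.0f} records/sec")
```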
Benchmark: HTTP endpoint returning 100,000 records over network.

| Approach | Time to First Record | Total Time | Notes |
|---|---|---|---|
| JSON array response | 18.2s | 18.2s | Client waits for the full response before parsing anything |
| JSONL streaming response | 0.15s | 18.5s | Processing starts immediately while the transfer continues |

Key benefit: the user sees results in 150ms instead of 18 seconds, even though total transfer time is similar. Perceived performance is roughly 120x better.
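A sketch of consuming such a streaming endpoint with the `requests` library; the URL is a placeholder:

```python
import json

import requests

# Stream the response and handle each record as soon as its line arrives.
with requests.get('https://example.com/events.jsonl', stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:  # skip keep-alive blank lines
            continue
        record = json.loads(line)
        # The first record is usable long before the transfer finishes.
```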
Task: Process 50GB daily application logs (25M records), extract errors, write to database
JSON Array Approach
JSONL Approach
With gzip compression, file size reduced to 6GB, processing time 6.5 minutes.
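A sketch of what the streaming side of such a log pipeline could look like; the field names, file path, and `insert_batch` database helper are hypothetical:

```python
import gzip
import json

BATCH_SIZE = 1000

def insert_batch(rows):
    """Hypothetical helper that writes a batch of error records to the database."""
    pass

batch = []
with gzip.open('app-logs.jsonl.gz', 'rt') as f:
    for line in f:
        record = json.loads(line)
        if record.get('level') == 'error':  # hypothetical field name
            batch.append(record)
            if len(batch) >= BATCH_SIZE:
                insert_batch(batch)
                batch = []
if batch:
    insert_batch(batch)
```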
Task: Transform 10M training examples for GPT fine-tuning
In-Memory Processing
Streaming Pipeline
2.2x faster, 225x less memory. Can run on small instance instead of memory-optimized.
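A minimal sketch of the streaming-pipeline idea: transform one example at a time and write it straight out, so memory stays constant regardless of dataset size. The file names and the prompt/response field names are assumptions.

```python
import json

def to_finetune_format(example):
    """Hypothetical per-example transform into a chat-style fine-tuning record."""
    return {
        "messages": [
            {"role": "user", "content": example["prompt"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }

# One record in, one record out; nothing is held beyond the current line.
with open('raw_examples.jsonl', 'r') as src, open('training.jsonl', 'w') as dst:
    for line in src:
        dst.write(json.dumps(to_finetune_format(json.loads(line))) + "\n")
```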
Task: Display live events in web dashboard as they arrive
Polling JSON Endpoint
JSONL Streaming
50x lower latency, 90% less bandwidth usage.
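A sketch of the server side of such a streaming setup, using Flask as an assumed framework and a hypothetical `next_event()` event source:

```python
import json
import time

from flask import Flask, Response

app = Flask(__name__)

def next_event():
    """Hypothetical blocking call that returns the next dashboard event."""
    time.sleep(1)
    return {"type": "heartbeat", "ts": time.time()}

@app.route('/events')
def stream_events():
    def generate():
        while True:
            # One JSON object per line; the client can render each event on arrival.
            yield json.dumps(next_event()) + "\n"
    return Response(generate(), mimetype='application/x-ndjson')
```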