JSONL Examples

Real-world use cases and code samples showing when and why to use JSONL

JSONL vs JSON

Traditional JSON

[
  {
    "id": 1,
    "name": "Alice",
    "email": "[email protected]"
  },
  {
    "id": 2,
    "name": "Bob",
    "email": "[email protected]"
  }
]

Usually must be loaded and parsed in full

Cannot append easily

Memory intensive for large data

JSONL Format

{"id": 1, "name": "Alice", "email": "[email protected]"}
{"id": 2, "name": "Bob", "email": "[email protected]"}

Stream line-by-line

Append-friendly

Memory efficient
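
To make the append-friendly and line-by-line points concrete, here is a minimal sketch. The helper names `append_record` and `read_records` and the temporary file path are assumptions made for illustration, not part of any library.

```python
import json
import os
import tempfile

def append_record(path, record):
    """Append one record as a single JSON line (hypothetical helper)."""
    # Append mode leaves existing lines untouched; nothing is re-parsed.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def read_records(path):
    """Yield records one at a time instead of loading the whole file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

# Illustrative usage with a temporary file
path = os.path.join(tempfile.mkdtemp(), "users.jsonl")
append_record(path, {"id": 1, "name": "Alice"})
append_record(path, {"id": 2, "name": "Bob"})
names = [r["name"] for r in read_records(path)]
```

With a JSON array, the same append would normally mean rewriting the file so the closing bracket stays at the end.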

When to Use JSONL

Machine Learning & AI Training Data

Widely used to supply training data to ML platforms (e.g., OpenAI fine-tuning, Google Vertex AI). Each line represents one training example, which makes it easy to stream massive datasets.

{"prompt": "What is AI?", "response": "Artificial Intelligence is..."}
{"prompt": "Explain ML", "response": "Machine Learning is..."}

Streaming Data

Process data as it arrives without loading everything into memory.

{"event": "click", "ts": 1234567890}
{"event": "view", "ts": 1234567891}
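
One way to process events as they arrive is a generator that parses each line lazily. This is a sketch only: `iter_events` is a hypothetical helper, and `io.StringIO` stands in for a live source such as standard input or a socket file object.

```python
import io
import json

def iter_events(lines):
    """Lazily parse an iterable of JSONL lines, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# io.StringIO stands in for a live source such as sys.stdin
source = io.StringIO(
    '{"event": "click", "ts": 1234567890}\n'
    '{"event": "view", "ts": 1234567891}\n'
)

# Only one event is held in memory at a time
clicks = sum(1 for e in iter_events(source) if e["event"] == "click")
```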

Application Logging

A natural fit for log files. Each log entry is a JSON object, and new entries are simply appended to the end of the file, so there is no need to parse and rewrite existing content.

{"ts": 1699123456, "level": "info", "msg": "Server started", "port": 8080}
{"ts": 1699123460, "level": "error", "msg": "DB connection failed", "retry": 3}
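
As a rough sketch of how an application might emit such logs, Python's standard `logging` module can be given a formatter that serializes each record as one JSON line. The `JsonlFormatter` class, the logger name, and the file path are assumptions for illustration; extra fields such as `port` or `retry` are left out to keep the example short.

```python
import json
import logging
import os
import tempfile

class JsonlFormatter(logging.Formatter):
    """Format each log record as a single JSON line (illustrative)."""
    def format(self, record):
        entry = {
            "ts": int(record.created),
            "level": record.levelname.lower(),
            "msg": record.getMessage(),
        }
        return json.dumps(entry)

log_path = os.path.join(tempfile.mkdtemp(), "app.jsonl")
handler = logging.FileHandler(log_path, encoding="utf-8")  # opens in append mode
handler.setFormatter(JsonlFormatter())

logger = logging.getLogger("jsonl_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep demo output out of the root logger

logger.info("Server started")
logger.error("DB connection failed")
handler.close()

# Read the log back, one entry per line
with open(log_path, encoding="utf-8") as f:
    entries = [json.loads(line) for line in f]
```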

Big Data Pipelines

A common format for ingesting, exporting, and processing data in big data systems such as Apache Spark and Hadoop, and in data warehouses.

{"user_id": 1, "status": "active", "last_login": "2025-01-10"}
{"user_id": 2, "status": "inactive", "last_login": "2024-12-15"}
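
A pipeline stage over JSONL can often be written as a line-by-line filter. The sketch below uses only the standard library rather than Spark or Hadoop; `filter_jsonl` is a hypothetical helper, and `io.StringIO` objects stand in for real input and output files.

```python
import io
import json

def filter_jsonl(src, dst, predicate):
    """Copy records matching predicate from src to dst, one line at a time."""
    kept = 0
    for line in src:
        if not line.strip():
            continue
        record = json.loads(line)
        if predicate(record):
            dst.write(json.dumps(record) + "\n")
            kept += 1
    return kept

src = io.StringIO(
    '{"user_id": 1, "status": "active", "last_login": "2025-01-10"}\n'
    '{"user_id": 2, "status": "inactive", "last_login": "2024-12-15"}\n'
)
dst = io.StringIO()
kept = filter_jsonl(src, dst, lambda r: r["status"] == "active")
```

Because each record is independent, stages like this can be chained or split across workers without coordinating on a shared document structure.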

Analytics Events

Track user events efficiently with one event per line.

{"action": "signup", "user": "alice"}
{"action": "purchase", "user": "bob"}

Streaming APIs

APIs that need to send a large, indeterminate number of results can stream them as JSONL, allowing the client to process results as they arrive without waiting for the entire response.

{"product_id": 1, "name": "Widget A", "price": 99.99, "stock": 45}
{"product_id": 2, "name": "Widget B", "price": 149.99, "stock": 12}
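
On the client side, network chunks rarely line up with line boundaries, so a consumer typically buffers bytes and splits on newlines. The following is a sketch under that assumption: `iter_jsonl_chunks` is a hypothetical helper, and the hard-coded `chunks` list stands in for whatever an HTTP client would yield.

```python
import json

def iter_jsonl_chunks(chunks):
    """Yield parsed records from byte chunks whose boundaries may fall mid-line."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently in the buffer
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # Handle a final record that lacks a trailing newline
    if buffer.strip():
        yield json.loads(buffer)

# Simulated response body, deliberately split in awkward places
chunks = [
    b'{"product_id": 1, "name": "Wid',
    b'get A"}\n{"product_id": 2, ',
    b'"name": "Widget B"}\n',
]
products = [p["name"] for p in iter_jsonl_chunks(chunks)]
```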

Code Examples

Python - Read and Write JSONL

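import json

# Read a JSONL file line by line (memory efficient)
with open('data.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        data = json.loads(line)
        print(data['name'])

# Write a JSONL file: one JSON object per line
items = [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]  # sample records
with open('output.jsonl', 'w', encoding='utf-8') as f:
    for item in items:
        f.write(json.dumps(item) + '\n')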

JavaScript - Parse JSONL

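const fs = require('fs');
const readline = require('readline');

// Read a JSONL file line by line with streams
async function readJsonl(path) {
  const fileStream = fs.createReadStream(path, { encoding: 'utf8' });
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity // treat \r\n as a single line break
  });

  // for await needs an async context in CommonJS modules
  for await (const line of rl) {
    if (!line.trim()) continue; // skip blank lines
    const data = JSON.parse(line);
    console.log(data.name);
  }
}

readJsonl('data.jsonl').catch(console.error);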

Bash - Process JSONL with jq

# Filter JSONL with jq (-c keeps one compact object per line)
jq -c 'select(.age > 30)' data.jsonl

# Convert JSONL to CSV
jq -r '[.name, .age] | @csv' data.jsonl

# Count records (assumes every record ends with a newline)
wc -l data.jsonl

Best Practices

Do

  • Use UTF-8 encoding
  • Write one valid JSON object per line
  • Use \n (LF) line endings
  • Use stream processing for large files
  • Compress with gzip for storage
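
Compression and streaming can be combined. As a minimal sketch, Python's standard `gzip` module can open a compressed file in text mode so records are still written and read one line at a time; the file name and sample records here are assumptions for illustration.

```python
import gzip
import json
import os
import tempfile

records = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
path = os.path.join(tempfile.mkdtemp(), "data.jsonl.gz")

# "wt" opens the gzip stream in text mode; newline="\n" forces LF endings
with gzip.open(path, "wt", encoding="utf-8", newline="\n") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Reading decompresses incrementally, so lines can be streamed as usual.
# A list is built here only to keep the demonstration short.
with gzip.open(path, "rt", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```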

Don't

  • Don't split a JSON value across multiple lines
  • Don't use trailing commas
  • Don't load the entire file when you can stream it
  • Don't mix JSON and JSONL formats
  • Don't forget error handling
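
For error handling, one possible approach is to catch parse failures per line and record the line number, so a single bad record does not abort the whole run. The `parse_jsonl` helper and the sample input below are hypothetical; results are collected into lists only to keep the demonstration small.

```python
import json

def parse_jsonl(lines):
    """Parse JSONL lines, returning (records, errors) with 1-based line numbers."""
    records, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Keep going; note which line failed and why
            errors.append((lineno, exc.msg))
    return records, errors

sample = [
    '{"id": 1, "name": "Alice"}\n',
    '{"id": 2, "name": "Bob",}\n',  # trailing comma: invalid JSON
    '\n',
    '{"id": 3, "name": "Carol"}\n',
]
records, errors = parse_jsonl(sample)
```

Whether to skip, log, or stop on a malformed line depends on the application; the sketch simply shows where that decision can be made.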