JSON Formatting Best Practices

Published · 9 min read

JSON has become the default wire format for the web, and for good reason: it is small, readable by humans, and every language has a battle-tested parser. But “readable” only stays true if you actually follow a few conventions. This guide covers the practices that keep JSON maintainable, the errors that silently break parsers, and the situations where you should compress it all away on purpose.

Start with valid JSON, not valid JavaScript

The single most common mistake is treating JSON as a subset of JavaScript. It almost is, but not quite. JSON is a strict format defined by RFC 8259, and the parsers that ship with your language enforce every rule. Get one wrong and you get a cryptic Unexpected token error at 3am.

The fastest way to find these problems before they reach production is to run your data through a JSON formatter and validator. A formatter re-serializes with the same library your parser uses, so if it can pretty-print your input, the input was valid.

Structure for the reader, not the database

A nested structure that mirrors your domain is usually clearer than a flat key-value bag, but deeply nested JSON is painful to consume. As a rule of thumb, three levels of nesting is comfortable; beyond five, consumers start writing fragile path expressions like data.users[0].profile.addresses[2].zip that shatter the moment you rename a field.

Prefer arrays of flat-ish objects over a single giant object keyed by id:

// Easier to iterate, map, and validate
{
  "users": [
    { "id": "u1", "name": "Alice", "role": "admin" },
    { "id": "u2", "name": "Bob",   "role": "editor" }
  ]
}

// Harder to consume — keys are data, and order is not guaranteed
{
  "u1": { "name": "Alice", "role": "admin" },
  "u2": { "name": "Bob",   "role": "editor" }
}

Use objects as maps when the keys truly are arbitrary identifiers and you need O(1) lookup, but remember that JSON object key order is implementation-defined. Do not rely on insertion order surviving a round-trip unless you sort keys explicitly.

Be consistent about types

JSON has six data types: string, number, boolean, null, array, and object. It has no date type, no integer/float distinction, and no way to say a number is a currency. You have to pick conventions and stick to them.

Null versus missing versus empty

Three subtly different states, all legal JSON, and a frequent source of bugs. "middleName": null means the field exists and is explicitly null. Omitting the key means “not provided.” "middleName": "" means “provided and empty.” Consumers using if (obj.middleName) treat all three the same; consumers using if ("middleName" in obj) treat them differently. Decide which semantics your API uses and document it. A common, defensive pattern: always include the key, use null for “unknown,” and reserve omission for “this field does not apply to this resource.”

When to minify, when to pretty-print

Two valid serializations exist for the same data. Pretty-printed JSON adds indentation and newlines so a human can read it; minified JSON strips every byte of whitespace. They are equivalent to a parser, so the choice is about audience.

Minify on the wire. For HTTP responses shipped to browsers or mobile clients, minify. Whitespace adds up: a 2 MB JSON payload pretty-printed with 2-space indent can balloon 20–40%. Most servers gzip or Brotli-compress responses, which erases much of the difference, but you still pay for it in serialization time and in the uncompressed bytes logged by intermediaries.

Pretty-print for humans. Config files, API documentation, log lines you will grep, examples in a README — anything a person reads should be indented. Use 2 spaces by convention; 4 is acceptable but wastes horizontal space when nesting is deep.

You can switch between the two instantly with a formatter tool. The JSON Formatter on this site does both, and it also surfaces the first syntax error with a line and column pointer, which is far more useful than a generic SyntaxError.

Validation before parsing

Parsing is not validation. JSON.parse will happily accept {"age": "twenty"} even when your code expects a number. If the data crosses a trust boundary — a webhook, a file upload, a third-party API — validate the shape before you trust it.

In JavaScript, reach for a schema validator like Zod or Ajv (JSON Schema). In Python, Pydantic. The cost is a few milliseconds of parsing; the benefit is that malformed input fails loudly at the boundary instead of silently producing undefined three functions deep.

Convert carefully to and from CSV

JSON and CSV serve different audiences. CSV is for spreadsheets and tabular tools; JSON is for nested and heterogeneous data. The conversion is lossy in both directions. A CSV row is flat, so a JSON object with nested arrays has to be flattened (losing structure) or escaped with quoted fields containing JSON strings (losing the spreadsheet’s value). When you convert, decide first whether your data is actually tabular. If every row has the same columns and no nesting, the CSV ↔ JSON converter will round-trip cleanly. If it is nested, expect to lose information and design accordingly.

Large payloads and precision

Two more traps wait in the numbers. First, JSON numbers are defined as decimal but say nothing about precision. A value like 0.1 cannot be represented exactly in IEEE 754 floating point, so 0.1 + 0.2 becomes 0.30000000000000004. If you send coordinates or computed quantities, decide whether the consumer should treat them as floats or as scaled integers. Second, JavaScript’s JSON.parse parses every number as a double, which can only represent integers exactly up to 253−1. Any integer above that — a large database ID, a Twitter snowflake, a nanosecond timestamp — is silently rounded. The only safe fix is to send large integers as strings and parse them with a BigInt-aware library on the consumer side.

For genuinely large payloads — anything over a few megabytes — consider whether the consumer needs the whole document at once. Parsing a 50 MB JSON file with JSON.parse allocates a comparable amount of memory and blocks the main thread in a browser. Streaming parsers (oboe.js in the browser, ijson in Python) read the document incrementally and fire callbacks as they encounter elements, which keeps memory flat and lets the UI stay responsive. If the data is tabular and large, JSON is often the wrong format entirely — Arrow or Parquet will be an order of magnitude smaller and faster to load.

Encoding, characters, and BOM

JSON is defined as UTF-8. Period. RFC 8259 says implementations MUST use UTF-8 and MUST NOT emit a byte-order mark (BOM). A leading BOM (EF BB BF) is technically allowed on input but is pointless and routinely confuses legacy parsers, so never write one. If you are seeing  at the start of a parsed string, you have a BOM; strip it. Similarly, do not use UTF-16 or Latin-1 for JSON files — some servers will mis-serve them and your carefully encoded Korean or emoji arrives as question marks.

A short checklist

None of this is exotic. The reason JSON-related bugs are so common is that the format looks forgiving — it lets you write almost-JSON that works in one parser and breaks in another. Spend thirty seconds formatting and validating before you ship, and most of those bugs never happen.

Related tools

← Back to blog