JSON Formatting Best Practices
Published · 9 min read
JSON has become the default wire format for the web, and for good reason: it is small, readable by humans, and every language has a battle-tested parser. But “readable” only stays true if you actually follow a few conventions. This guide covers the practices that keep JSON maintainable, the errors that silently break parsers, and the situations where you should compress it all away on purpose.
Start with valid JSON, not valid JavaScript
The single most common mistake is treating JSON as a subset of JavaScript. It almost is, but not quite.
JSON is a strict format defined by RFC 8259,
and the parsers that ship with your language enforce every rule. Get one wrong and you get a cryptic
Unexpected token error at 3am.
- Keys must be double-quoted.
{name: "Alice"}is valid JavaScript and invalid JSON. JSON requires{"name": "Alice"}. Single quotes are not allowed either — only double quotes. - No trailing commas.
{"a": 1, "b": 2,}fails to parse. The trailing comma after the last element of an object or array is forbidden, even though many linters allow it in JavaScript source. - Strings use double quotes and escape properly. Inside a string, escape
"as\", backslash as\\, newlines as\n, and control characters with\uXXXX. Raw tabs and newlines inside a string literal are invalid. - No comments.
// noteand/* note */are not legal JSON, even though some lax parsers accept them. If you need comments, use JSON5 for config and strip them before sending data over the wire. - The top level is one value. A JSON document is a single object or array. Two adjacent objects are two documents, not one.
The fastest way to find these problems before they reach production is to run your data through a JSON formatter and validator. A formatter re-serializes with the same library your parser uses, so if it can pretty-print your input, the input was valid.
Structure for the reader, not the database
A nested structure that mirrors your domain is usually clearer than a flat key-value bag, but
deeply nested JSON is painful to consume. As a rule of thumb, three levels of nesting is
comfortable; beyond five, consumers start writing fragile path expressions like
data.users[0].profile.addresses[2].zip that shatter the moment you rename a field.
Prefer arrays of flat-ish objects over a single giant object keyed by id:
// Easier to iterate, map, and validate
{
"users": [
{ "id": "u1", "name": "Alice", "role": "admin" },
{ "id": "u2", "name": "Bob", "role": "editor" }
]
}
// Harder to consume — keys are data, and order is not guaranteed
{
"u1": { "name": "Alice", "role": "admin" },
"u2": { "name": "Bob", "role": "editor" }
}
Use objects as maps when the keys truly are arbitrary identifiers and you need O(1) lookup, but remember that JSON object key order is implementation-defined. Do not rely on insertion order surviving a round-trip unless you sort keys explicitly.
Be consistent about types
JSON has six data types: string, number, boolean, null, array, and object. It has no date type, no integer/float distinction, and no way to say a number is a currency. You have to pick conventions and stick to them.
- Dates as ISO 8601 strings.
"2026-06-21T14:30:00Z"sorts correctly as a string, is unambiguous across timezones, and parses natively withnew Date()in JavaScript anddatetime.fromisoformatin Python. - Money as strings or integer minor units. Floating point cannot represent
0.1 + 0.2exactly. Send"19.99"or1999cents — never a raw float. - IDs as strings. Many IDs (Twitter snowflakes, some database primary keys) exceed
2^53, the limit where JavaScript can represent integers exactly. A numeric ID of9007199254740993arrives as9007199254740992in the browser. Send IDs as strings. - Booleans for booleans. Do not encode truthiness as
0/1or"true"/"false"strings unless you have a specific reason.
Null versus missing versus empty
Three subtly different states, all legal JSON, and a frequent source of bugs. "middleName": null
means the field exists and is explicitly null. Omitting the key means “not provided.”
"middleName": "" means “provided and empty.” Consumers using
if (obj.middleName) treat all three the same; consumers using
if ("middleName" in obj) treat them differently. Decide which semantics your API
uses and document it. A common, defensive pattern: always include the key, use null for
“unknown,” and reserve omission for “this field does not apply to this resource.”
When to minify, when to pretty-print
Two valid serializations exist for the same data. Pretty-printed JSON adds indentation and newlines so a human can read it; minified JSON strips every byte of whitespace. They are equivalent to a parser, so the choice is about audience.
Minify on the wire. For HTTP responses shipped to browsers or mobile clients, minify. Whitespace adds up: a 2 MB JSON payload pretty-printed with 2-space indent can balloon 20–40%. Most servers gzip or Brotli-compress responses, which erases much of the difference, but you still pay for it in serialization time and in the uncompressed bytes logged by intermediaries.
Pretty-print for humans. Config files, API documentation, log lines you will grep, examples in a README — anything a person reads should be indented. Use 2 spaces by convention; 4 is acceptable but wastes horizontal space when nesting is deep.
You can switch between the two instantly with a formatter tool. The
JSON Formatter on this site does both, and it also surfaces
the first syntax error with a line and column pointer, which is far more useful than a generic
SyntaxError.
Validation before parsing
Parsing is not validation. JSON.parse will happily accept
{"age": "twenty"} even when your code expects a number. If the data crosses a trust
boundary — a webhook, a file upload, a third-party API — validate the shape before you trust it.
In JavaScript, reach for a schema validator like Zod
or Ajv (JSON Schema). In
Python, Pydantic.
The cost is a few milliseconds of parsing; the benefit is that malformed input fails loudly at the
boundary instead of silently producing undefined three functions deep.
Convert carefully to and from CSV
JSON and CSV serve different audiences. CSV is for spreadsheets and tabular tools; JSON is for nested and heterogeneous data. The conversion is lossy in both directions. A CSV row is flat, so a JSON object with nested arrays has to be flattened (losing structure) or escaped with quoted fields containing JSON strings (losing the spreadsheet’s value). When you convert, decide first whether your data is actually tabular. If every row has the same columns and no nesting, the CSV ↔ JSON converter will round-trip cleanly. If it is nested, expect to lose information and design accordingly.
Large payloads and precision
Two more traps wait in the numbers. First, JSON numbers are defined as decimal but say nothing about
precision. A value like 0.1 cannot be represented exactly in IEEE 754 floating point,
so 0.1 + 0.2 becomes 0.30000000000000004. If you send coordinates or
computed quantities, decide whether the consumer should treat them as floats or as scaled integers.
Second, JavaScript’s JSON.parse parses every number as a double, which can only
represent integers exactly up to 253−1. Any integer above that — a large
database ID, a Twitter snowflake, a nanosecond timestamp — is silently rounded. The only safe
fix is to send large integers as strings and parse them with a BigInt-aware library on the consumer
side.
For genuinely large payloads — anything over a few megabytes — consider whether the consumer
needs the whole document at once. Parsing a 50 MB JSON file with JSON.parse allocates
a comparable amount of memory and blocks the main thread in a browser. Streaming parsers
(oboe.js
in the browser, ijson in Python) read the document incrementally and fire callbacks as
they encounter elements, which keeps memory flat and lets the UI stay responsive. If the data is
tabular and large, JSON is often the wrong format entirely — Arrow or Parquet will be an order of
magnitude smaller and faster to load.
Encoding, characters, and BOM
JSON is defined as UTF-8. Period. RFC 8259 says implementations MUST use UTF-8 and MUST NOT emit a
byte-order mark (BOM). A leading BOM (EF BB BF) is technically allowed on input but is
pointless and routinely confuses legacy parsers, so never write one. If you are seeing
at the start of a parsed string, you have a BOM; strip it. Similarly, do not use
UTF-16 or Latin-1 for JSON files — some servers will mis-serve them and your carefully encoded
Korean or emoji arrives as question marks.
A short checklist
- Double-quote every key and string. No single quotes, no bare keys.
- No trailing commas. No comments. No raw control characters in strings.
- Send IDs and big numbers as strings. Send dates as ISO 8601. Send money as strings or integer minor units.
- Pick one convention for null versus missing versus empty, and document it.
- Keep nesting shallow. Flatten arrays of objects when you can.
- Minify on the wire, pretty-print for humans.
- Validate shape at trust boundaries. Parsing is not validation.
None of this is exotic. The reason JSON-related bugs are so common is that the format looks forgiving — it lets you write almost-JSON that works in one parser and breaks in another. Spend thirty seconds formatting and validating before you ship, and most of those bugs never happen.
Related tools
- JSON Formatter & Validator — pretty-print, minify, and find syntax errors instantly.
- CSV ↔ JSON Converter — move tabular data between formats without losing structure.