How to Work With Large JSON Files
A small JSON file is trivial: read it, parse it, done. But a file that is hundreds of megabytes or several gigabytes is a different problem. The usual approach (read the whole file into memory and call JSON.parse()) can exhaust RAM and crash the process. This guide covers the techniques that let you process large JSON safely.
Why the naive approach fails
JSON.parse() (and its equivalents in other languages) is a DOM-style parser: it builds the entire data structure in memory at once. Parsing a 1 GB file can easily need several gigabytes of RAM, because the in-memory object graph is larger than the text: each parsed value typically carries per-object overhead (headers, pointers, hash tables) on top of its content. Editors and browsers have practical limits too; most text editors struggle past a few hundred megabytes.
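To get a feel for the overhead, here is a rough sketch that compares the size of some JSON text with the memory Python holds after parsing it. The helper name parsed_memory_ratio and the sample record shape are made up for illustration, and the exact ratio will vary by language, runtime version and data shape.

```python
import json
import tracemalloc


def parsed_memory_ratio(num_records: int = 10_000) -> float:
    """Return (bytes held by the parsed objects) / (bytes of JSON text).

    Hypothetical helper: the record layout below is illustrative only.
    """
    # Build the JSON text first so its own memory is not counted below.
    text = json.dumps(
        [{"id": i, "name": f"user-{i}"} for i in range(num_records)]
    )

    tracemalloc.start()
    data = json.loads(text)  # builds the full object graph in memory
    held_bytes, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    assert len(data) == num_records  # keep `data` alive while measuring
    return held_bytes / len(text.encode("utf-8"))


if __name__ == "__main__":
    print(f"parsed objects use about {parsed_memory_ratio():.1f}x the text size")
```

A ratio well above 1 is what makes "just parse it" fail long before the file size itself reaches the amount of installed RAM.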
Technique 1: Newline-delimited JSON (NDJSON)
If you control the format, the single best decision is to store many records as one JSON object per line rather than one giant array. This is called NDJSON or JSON Lines.
{"id": 1, "name": "Ada"}
{"id": 2, "name": "Grace"}
{"id": 3, "name": "Edsger"}

Now you can read and parse the file line by line, holding only one record in memory at a time. Logs, data exports and streaming pipelines often use this format for exactly this reason.
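As a minimal sketch of that line-by-line loop, using only the Python standard library: the function name iter_ndjson is an assumption for illustration, and the choice to skip blank lines and report the failing line number is a design decision, not part of any NDJSON requirement.

```python
import json
from typing import Any, Iterator


def iter_ndjson(path: str) -> Iterator[Any]:
    """Yield one parsed record per non-empty line of an NDJSON file."""
    with open(path, "r", encoding="utf-8") as f:
        # Iterating over the file object reads one line at a time,
        # so only the current record is held in memory.
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines (an assumption of this sketch)
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc})"
                ) from exc
```

Because it is a generator, a caller can write `for record in iter_ndjson("events.ndjson"): ...` (the file name is hypothetical) and stop early without reading the rest of the file.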
Technique 2: Streaming parsers
When you are stuck with one enormous JSON array, use a streaming (SAX-style) parser that emits events as it reads, instead of building the whole tree. In Node.js, libraries like stream-json or clarinet do this; in Python, ijson yields items one at a time.
# Python with ijson: memory stays roughly constant as the file grows
import ijson

with open("huge.json", "rb") as f:
    # "item" selects each element of the top-level array
    for record in ijson.items(f, "item"):
        process(record)  # placeholder: handle one element at a time

The key idea is that memory use stays roughly constant regardless of file size, because you never hold more than the current record.
Technique 3: jq for the command line
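jq is a scriptable command-line JSON processor that can filter, reshape and extract data without you writing a program. By default it parses each top-level value completely before running the filter, so the commands below still need enough memory to hold the whole array. For files that do not fit in memory, jq has a --stream mode that emits path/value events incrementally instead.

# pull two fields from every record in a huge array
jq -c '.[] | {id, name}' huge.json > slim.ndjson
# count records without writing any code
jq 'length' huge.json

Technique 4: Split the file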
Sometimes the simplest fix is to break a giant file into manageable chunks (by top-level key, by record range, or by line count for NDJSON), then process each chunk independently and in parallel. Many data tools and the Unix split command make this easy for line-based files.
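If you would rather split from a script than from the shell, the line-count case can be sketched as follows. The function name split_ndjson and the chunk-00000.ndjson naming scheme are assumptions for illustration; the sketch only handles line-based (NDJSON) input, not one giant array.

```python
from pathlib import Path
from typing import List


def split_ndjson(path: str, lines_per_chunk: int, out_dir: str) -> List[Path]:
    """Split a line-based file into chunks of at most `lines_per_chunk` lines."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    chunk_paths: List[Path] = []
    chunk_file = None
    try:
        with open(path, "r", encoding="utf-8") as src:
            for index, line in enumerate(src):
                if index % lines_per_chunk == 0:
                    # Start a new chunk file every `lines_per_chunk` lines.
                    if chunk_file is not None:
                        chunk_file.close()
                    chunk_path = out / f"chunk-{len(chunk_paths):05d}.ndjson"
                    chunk_paths.append(chunk_path)
                    chunk_file = open(chunk_path, "w", encoding="utf-8")
                chunk_file.write(line)  # lines are copied through unchanged
    finally:
        if chunk_file is not None:
            chunk_file.close()
    return chunk_paths
```

Each chunk is itself valid NDJSON, so the returned paths can be handed to separate worker processes.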
Practical tips
- Prefer NDJSON for anything that grows over time (logs, exports, event streams).
- Never load a multi-gigabyte file into a browser tab or GUI editor; it will freeze.
- If you only need a few fields, project them out early to shrink memory and downstream work.
- Compress at rest (gzip) and stream-decompress; JSON compresses extremely well because it is repetitive text.
- Validate a small sample first so you do not spend an hour processing a malformed file.
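Two of these tips, stream-decompressing and projecting fields early, combine naturally. The sketch below assumes a gzip-compressed NDJSON file whose lines are JSON objects; the function name project_gzipped_ndjson is hypothetical.

```python
import gzip
import json
from typing import Any, Dict, Iterable, Iterator


def project_gzipped_ndjson(
    path: str, fields: Iterable[str]
) -> Iterator[Dict[str, Any]]:
    """Stream a .gz NDJSON file, keeping only the requested fields."""
    wanted = tuple(fields)
    # gzip.open in text mode decompresses incrementally as lines are read,
    # so the uncompressed file never exists in memory or on disk.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            # Drop everything except the wanted keys as early as possible.
            yield {key: record[key] for key in wanted if key in record}
```

Fields that are missing from a record are simply omitted from its output rather than raising, which is one possible choice; a stricter pipeline might prefer to fail.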
Frequently asked questions
How big a JSON file can I open in a browser tool?
It depends on the device's memory, but as a rough guide, browser-based editors handle a few megabytes comfortably and start to struggle in the tens of megabytes. For anything larger, use a streaming parser or jq on the command line rather than a GUI.
What is the difference between JSON and NDJSON?
A JSON file is a single value (often one big array). NDJSON is many independent JSON values, one per line, with no enclosing array. NDJSON is far easier to stream because each line is parsed on its own.
Can I convert a big JSON array to NDJSON?
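Yes. jq -c '.[]' big.json emits each array element on its own line, producing NDJSON you can then stream line by line. jq reads the whole array into memory to do this, so run the one-time conversion on a machine with enough RAM.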
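When the array is too large for that (jq's default mode parses the whole value first), the conversion can also be done incrementally. What follows is a rough standard-library sketch, not a full streaming parser: the names iter_array_items and array_to_ndjson are made up, it is lenient about commas, and a single malformed element makes it buffer the rest of the input before failing. A dedicated library such as ijson is the more robust choice.

```python
import json
from typing import IO, Any, Iterator

_AFTER_VALUE = " \t\r\n,]"  # characters allowed right after an array element


def iter_array_items(src: IO[str], chunk_size: int = 65536) -> Iterator[Any]:
    """Yield elements of one top-level JSON array, reading `src` in chunks."""
    decoder = json.JSONDecoder()
    buf, pos, eof, started = "", 0, False, False

    while True:
        # Skip whitespace and commas between elements (not strictly validated).
        while pos < len(buf) and buf[pos] in " \t\r\n,":
            pos += 1

        if pos < len(buf):
            if not started:
                if buf[pos] != "[":
                    raise ValueError("expected a top-level JSON array")
                started, pos = True, pos + 1
                continue
            if buf[pos] == "]":
                return  # end of the array
            try:
                value, end = decoder.raw_decode(buf, pos)
            except json.JSONDecodeError:
                end = len(buf)  # element is probably cut off; read more
            # Accept a value only when a delimiter follows it, so a number
            # split across two chunks (e.g. "1." + "5") is not emitted early.
            if end < len(buf) and buf[end] in _AFTER_VALUE:
                yield value
                pos = end
                continue

        # Need more input: drop what was consumed and append the next chunk.
        if eof:
            raise ValueError("truncated or malformed JSON array")
        chunk = src.read(chunk_size)
        eof = not chunk
        buf, pos = buf[pos:] + chunk, 0


def array_to_ndjson(src: IO[str], dst: IO[str], chunk_size: int = 65536) -> int:
    """Write each array element of `src` as one line of `dst`; return the count."""
    count = 0
    for item in iter_array_items(src, chunk_size):
        dst.write(json.dumps(item, separators=(",", ":")) + "\n")
        count += 1
    return count
```

Both arguments are text-mode file objects, for example `open("big.json", encoding="utf-8")` and `open("big.ndjson", "w", encoding="utf-8")`. Memory use is bounded by the chunk size plus the largest single element, rather than by the size of the whole array.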
For everyday-sized files
If your JSON is small enough to open comfortably, the JSON Wallet editor will format, validate and let you explore it in tree view in the browser. Use the Size analyzer to see which properties dominate the byte size β often the first step before deciding what to strip out. To learn the format itself, see What is JSON?