What Is JSON Schema? A Complete Guide With Examples
Learn what JSON Schema is, why it prevents production data bugs, every core keyword explained with examples, a complete real-world schema, and how to validate JSON in JavaScript, Python, Ruby, and Go.
When you build an API or receive data from an external service, you assume the structure will always be consistent. That assumption breaks in production. Fields go missing. Types change. Arrays become null. JSON Schema solves this problem by letting you define exactly what valid JSON looks like and automatically check incoming data against that definition before it ever touches your business logic.
What Is JSON Schema?
JSON Schema is a vocabulary that allows you to annotate and validate JSON documents. It lets you describe the structure of a JSON object in precise, machine-readable terms: which fields must be present, what type each field must be, what values are acceptable, and how nested objects and arrays should be structured.
Think of JSON Schema as a blueprint for your data. The blueprint says what valid JSON looks like. Any JSON that matches the blueprint passes validation. Any JSON that does not match is invalid and can be rejected immediately at the entry point of your application, before it causes any bugs deeper in your system.
Crucially, JSON Schema is itself written in JSON. There is no special language to learn: you describe your data structure using JSON objects with specific keywords that the JSON Schema specification defines. Consider the simplest possible example: a schema for an object with the fields name, email, and age.
This schema says: the JSON must be an object with three required fields. name must be a string between 1 and 100 characters. email must be a string matching the email format. age must be an integer between 0 and 150. Any JSON that violates any of these rules fails validation with a specific error message pointing to exactly which rule was broken.
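A schema matching that description might look like the sketch below. It is written as a Python dict so it can be serialised straight to JSON; the rules come from the paragraph above, while the variable names and the choice to declare Draft 2020-12 are assumptions made for illustration.

```python
import json

# Sketch of the schema described in the text. The property names and limits
# come from the prose; the "$schema" declaration is an assumption.
person_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "name": {"type": "string", "minLength": 1, "maxLength": 100},
        "email": {"type": "string", "format": "email"},
        "age": {"type": "integer", "minimum": 0, "maximum": 150},
    },
    "required": ["name", "email", "age"],
}

# Serialising the dict produces the schema as an ordinary JSON document.
schema_json = json.dumps(person_schema, indent=2)
```

Printing schema_json gives the JSON text you would hand to a validator in any language.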
JSON Schema is the contract between the producer and consumer of JSON data. Validate at the boundary and invalid data never enters your system.
Why JSON Schema Matters: Data Problems Without Validation
Without schema validation, data quality problems are discovered late and in the worst possible way. A missing required field crashes at the point where the field is first accessed. A field that was a number but is now returned as a string causes silent arithmetic bugs. An array that became null breaks a map operation with a confusing TypeError. These bugs are difficult to trace because the failure occurs far from where the bad data entered the system.
With JSON Schema validation at the entry point of your application, these problems are caught immediately with a precise, actionable error message that points to exactly which field failed and which rule it violated. The invalid data never propagates further. The bug is a clear validation error, not a mysterious crash deep in business logic.
JSON Schema Types
The type keyword is the foundation of every JSON Schema. It restricts the value to one or more of the seven type names JSON Schema defines: string, number, integer, boolean, object, array, and null. Every schema should specify a type unless it genuinely accepts any type of value.
In most APIs, some fields are optional and may be null when not set. If your schema specifies "type": "string" for such a field, a null value will fail validation. Use "type": ["string", "null"] for fields that can legitimately be null. This forces your code to handle both cases and eliminates an entire class of null reference errors.
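To make the type rules concrete, here is a minimal sketch of how the type keyword can be evaluated against parsed JSON in Python. The function name matches_type and the lookup table are hypothetical, written for this guide rather than taken from any library. Two details are easy to get wrong: Python treats booleans as integers, and JSON Schema (Draft 6 and later) counts a number with a zero fractional part, such as 3.0, as an integer.

```python
def _is_number(value):
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def _is_integer(value):
    # 3.0 counts as an integer; 3.14 does not.
    return _is_number(value) and (isinstance(value, int) or value.is_integer())

_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": _is_number,
    "integer": _is_integer,
    "boolean": lambda v: isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "null": lambda v: v is None,
}

def matches_type(value, type_spec):
    """Check a parsed JSON value against a "type" keyword (one name or a list)."""
    names = [type_spec] if isinstance(type_spec, str) else type_spec
    return any(_TYPE_CHECKS[name](value) for name in names)
```

With this sketch, matches_type(None, "string") is False while matches_type(None, ["string", "null"]) is True, which is exactly the nullable-field behaviour described above.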
The Core JSON Schema Keywords
JSON Schema has a rich vocabulary of validation keywords. The following are the ones you will use in the vast majority of schemas. Understanding these keywords covers almost every real-world validation requirement.
String Validation Keywords
minLength and maxLength: Constrain how many characters a string value may contain. Both are inclusive boundaries.
pattern: Validates the string against a regular expression. The regex uses ECMAScript syntax and is not implicitly anchored, so add ^ and $ when the whole string must match.
format: Validates against a named format. Common values: email, uri, date, time, date-time, uuid, ipv4, ipv6. Many validators only enforce format when explicitly configured to do so.
enum: Restricts the value to an explicit set of allowed values. Works with any type. The most reliable way to validate status fields and categories.
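The sketch below shows how the string keywords behave, using a small hand-written checker rather than a real validation library. The schemas, field names, and the function violated_keywords are all hypothetical. Python's re module stands in for an ECMAScript regex engine here, which is fine for simple patterns but is not identical in general.

```python
import re

# Hypothetical schemas for a username and an order status field.
username_schema = {
    "type": "string",
    "minLength": 3,
    "maxLength": 20,
    "pattern": "^[a-z][a-z0-9_]*$",
}
status_schema = {"enum": ["pending", "paid", "shipped"]}

def violated_keywords(value, schema):
    """Return the names of the keywords in `schema` that `value` violates."""
    failed = []
    # Python's `in` is adequate for string enums; mixed-type enums need more care.
    if "enum" in schema and value not in schema["enum"]:
        failed.append("enum")
    if isinstance(value, str):  # string keywords only constrain strings
        if "minLength" in schema and len(value) < schema["minLength"]:
            failed.append("minLength")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            failed.append("maxLength")
        # re.search rather than re.fullmatch: patterns are unanchored by default.
        if "pattern" in schema and re.search(schema["pattern"], value) is None:
            failed.append("pattern")
    return failed
```

For example, violated_keywords("Paid", status_schema) reports the enum keyword because enum comparison is case-sensitive.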
Number and Integer Validation Keywords
minimum and maximum: Inclusive numeric range boundaries. The value must be greater than or equal to minimum and less than or equal to maximum.
exclusiveMinimum and exclusiveMaximum: Exclusive boundaries. The value must be strictly greater than exclusiveMinimum and strictly less than exclusiveMaximum.
multipleOf: The value must be an exact multiple of the given number. Useful for currency values in smallest units or quantities with fixed increments.
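The numeric keywords can be illustrated the same way. The following is a hand-written sketch, not a library implementation; price_schema and numeric_violations are hypothetical names. It uses Decimal for the multipleOf check because a naive floating-point remainder is unreliable for steps such as 0.01. Real validators each have their own strategy for this, so test decimal steps against the library you actually use.

```python
from decimal import Decimal

# Hypothetical schema for a price: positive, at most 10000, two decimal places.
price_schema = {
    "type": "number",
    "exclusiveMinimum": 0,
    "maximum": 10000,
    "multipleOf": 0.01,
}

def numeric_violations(value, schema):
    """Return the numeric keywords in `schema` that `value` violates."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return []  # numeric keywords only constrain numbers
    failed = []
    if "minimum" in schema and value < schema["minimum"]:
        failed.append("minimum")
    if "maximum" in schema and value > schema["maximum"]:
        failed.append("maximum")
    if "exclusiveMinimum" in schema and value <= schema["exclusiveMinimum"]:
        failed.append("exclusiveMinimum")
    if "exclusiveMaximum" in schema and value >= schema["exclusiveMaximum"]:
        failed.append("exclusiveMaximum")
    if "multipleOf" in schema:
        # Go through str() so 19.99 becomes Decimal("19.99"), not its binary expansion.
        remainder = Decimal(str(value)) % Decimal(str(schema["multipleOf"]))
        if remainder != 0:
            failed.append("multipleOf")
    return failed
```

Note the difference between the two kinds of boundary: 10000 passes the inclusive maximum, while 0 fails the exclusive minimum.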
Array Validation Keywords
items: Defines the schema that every item in the array must match. Can reference any valid schema, including another object schema for arrays of objects.
minItems and maxItems: Constrain the number of items in an array. Use minItems: 1 on required arrays to prevent empty arrays from passing validation.
uniqueItems: When set to true, all items in the array must be unique. Useful for tag lists, category IDs, or any field where duplicates are invalid.
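Here is a comparable sketch for the array keywords. Again this is hand-written illustration code with hypothetical names (tags_schema, array_violations), and it is deliberately simplified: it only understands an items schema of the form {"type": "string"}, and its uniqueness test compares serialised items, which treats 1 and 1.0 as different even though JSON Schema considers them equal.

```python
import json

# Hypothetical schema for a list of product tags.
tags_schema = {
    "type": "array",
    "items": {"type": "string"},
    "minItems": 1,
    "maxItems": 5,
    "uniqueItems": True,
}

def array_violations(value, schema):
    """Return the array keywords in `schema` that `value` violates."""
    if not isinstance(value, list):
        return []  # array keywords only constrain arrays
    failed = []
    if "minItems" in schema and len(value) < schema["minItems"]:
        failed.append("minItems")
    if "maxItems" in schema and len(value) > schema["maxItems"]:
        failed.append("maxItems")
    if schema.get("uniqueItems"):
        # Serialise items so unhashable lists and dicts can be compared too.
        distinct = {json.dumps(item, sort_keys=True) for item in value}
        if len(distinct) != len(value):
            failed.append("uniqueItems")
    # Simplification: only an "items" schema of {"type": "string"} is checked.
    if schema.get("items", {}).get("type") == "string":
        for index, item in enumerate(value):
            if not isinstance(item, str):
                failed.append(f"items/{index}")
    return failed
```

The items/1 style entry mirrors how real validators report the index of the offending element.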
Object Validation Keywords
required: An array of property names that must be present in the object. A field listed in required but absent from the data fails validation.
properties: Defines a schema for each named property. A property listed in properties but not in required is optional: it only needs to match the schema if it is present.
additionalProperties: Controls whether properties not listed in properties are allowed. Set to false to reject any unexpected fields, which is useful for strict API contracts.
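The interaction between required, properties, and additionalProperties is easiest to see with a small example. The sketch below is a hand-written illustration, not a library: user_schema and object_violations are hypothetical, and the check ignores related keywords such as patternProperties that a full validator would also consult.

```python
# Hypothetical schema: "id" is required, "nickname" is optional, nothing else is allowed.
user_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "nickname": {"type": "string"},
    },
    "required": ["id"],
    "additionalProperties": False,
}

def object_violations(data, schema):
    """Report missing required properties and unexpected extra properties."""
    if not isinstance(data, dict):
        return []  # object keywords only constrain objects
    failed = []
    for name in schema.get("required", []):
        if name not in data:
            failed.append(f"required: {name}")
    if schema.get("additionalProperties") is False:
        declared = schema.get("properties", {})
        for name in data:
            if name not in declared:
                failed.append(f"additionalProperties: {name}")
    return failed
```

An object containing only id passes, because nickname is declared in properties but not listed in required.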
A Real-World JSON Schema: E-Commerce Order
The best way to understand JSON Schema is to see a complete, realistic example. Consider a schema that validates an order object in an e-commerce application. It combines nested objects, arrays, enums, formats, and range constraints to define every rule the order data must satisfy.
This schema catches a comprehensive set of data quality problems before they reach your business logic: orders with no items, negative totals, prices with more than two decimal places, invalid status values, malformed email addresses, non-UUID order IDs, and any unexpected fields injected by a misbehaving client. Each violation produces a specific error pointing to the exact path in the JSON where the problem occurred.
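A schema with the properties described above might look like the following sketch, written as a Python dict that serialises directly to JSON. The rules (at least one item, non-negative totals, two-decimal prices, a status enum, email and UUID formats, no unexpected fields) come from the text; every property name and the specific enum values are assumptions chosen for illustration.

```python
import json

# Hypothetical order schema. Property names and enum values are assumptions;
# the constraints mirror the problems listed in the paragraph above.
order_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "orderId": {"type": "string", "format": "uuid"},
        "status": {"enum": ["pending", "paid", "shipped", "delivered", "cancelled"]},
        "customer": {
            "type": "object",
            "properties": {
                "email": {"type": "string", "format": "email"},
                "name": {"type": "string", "minLength": 1},
            },
            "required": ["email"],
            "additionalProperties": False,
        },
        "items": {
            "type": "array",
            "minItems": 1,  # rejects orders with no items
            "items": {
                "type": "object",
                "properties": {
                    "sku": {"type": "string", "minLength": 1},
                    "quantity": {"type": "integer", "minimum": 1},
                    "price": {"type": "number", "minimum": 0, "multipleOf": 0.01},
                },
                "required": ["sku", "quantity", "price"],
                "additionalProperties": False,
            },
        },
        "total": {"type": "number", "minimum": 0, "multipleOf": 0.01},
    },
    "required": ["orderId", "status", "customer", "items", "total"],
    "additionalProperties": False,
}

order_schema_json = json.dumps(order_schema, indent=2)
```

Setting additionalProperties to false at each object level is what rejects injected fields; applying it only at the top level would still let extra fields through inside customer or the line items.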
Before writing a JSON Schema, always inspect the actual JSON you are trying to validate. Paste it into the JSON Formatter to see the full structure clearly: nesting levels, field names, data types, and which fields appear to be nullable. Understanding the real data is the essential first step to writing an accurate schema. Documentation examples are often simplified and the real payload may have additional fields or different types than described.
Common Validation Errors and What They Mean
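When JSON Schema validation fails, the library returns an error object describing which keyword was violated and at which path in the document. The messages in the table below follow Ajv's wording; other libraries phrase them differently but report the same keywords. Understanding how to read these errors makes debugging integration issues significantly faster: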
| Error | Keyword | Cause | Fix |
|---|---|---|---|
| must have required property 'X' | required | A field listed in required is absent from the data | Add the missing field to the JSON, or remove it from required if it is truly optional |
| must be string | type | The field exists but contains the wrong type. Often a number where a string is expected, or null where a string is expected | Check the actual value type. If null is valid, use ["string", "null"] |
| must be integer | type | A decimal number was provided where an integer is required (e.g. 3.14 for a quantity field) | Round the value, or change the type to number if decimals are valid |
| must be >= N | minimum | A numeric value is below the allowed minimum | Validate the value before submission, or adjust the minimum constraint if the business rule was wrong |
| must match format "email" | format | The string does not match the expected format. Could be a missing @ in an email, an invalid date string, or a malformed UUID | Validate the format on the client before sending, or check how format validation is enabled in your library |
| must be equal to one of the allowed values | enum | The value is not in the allowed enum list. Often a casing issue ("Active" vs "active") or a value from an outdated enum list | Check the exact allowed values in the schema. Confirm casing matches exactly. |
| must be array | type | The field is null, an object, or a primitive where an array is expected. A very common API inconsistency | Ensure the API always returns an empty array [] rather than null when there are no items |
| must NOT have additional properties | additionalProperties | The JSON contains a property not listed in properties, and the schema uses additionalProperties: false | Add the field to properties, or remove additionalProperties: false if extra fields are acceptable |
Validation Libraries by Language
JSON Schema is supported by mature, well-maintained libraries in most major languages. You do not need to write your own validation logic: the library handles all the keyword evaluation and produces detailed error messages automatically.
In Ajv version 8, format validation (email, uuid, date-time, etc.) is not bundled with the core package, whichever draft you target. You must install the ajv-formats package separately and call addFormats(ajv) after creating the Ajv instance. If format validation appears to silently pass invalid values, this missing step is almost always the cause. Similarly, the Python jsonschema library only checks formats when you pass a format_checker to the validator.
7-Step Workflow: Adding JSON Schema Validation to an Integration
Follow this workflow when adding JSON Schema validation to an existing API integration or building a new one from scratch:
- Collect real API responses before writing any schema. Make actual requests to the API endpoint and capture the raw JSON responses for multiple cases: a typical response, a response with optional fields absent, and a response with the maximum number of nested objects. Paste each into the JSON Formatter to see the full structure clearly before writing a single line of schema.
- Identify required vs optional fields from real data, not documentation. Compare multiple real responses and note which fields appear in all of them (candidates for required) and which only appear sometimes (optional). API documentation frequently lists fields as required that the API sometimes omits, or vice versa. Real data is the ground truth.
- Write the schema bottom-up: nested objects first. If your JSON has nested objects or arrays of objects, write the schema for the innermost types first. Then compose them into the parent schema, either by defining them under $defs and referencing them with $ref or by inlining them. Starting from the outside and working in makes it easy to lose track of nesting depth.
- Use the $schema keyword and target Draft 2020-12. Always declare the schema version with the "$schema": "https://json-schema.org/draft/2020-12/schema" keyword. Different draft versions have different behaviours for some keywords. Specifying the version eliminates ambiguity and ensures your validator uses the correct evaluation rules.
- Test against both valid and invalid data. Write test cases that confirm validation passes for correct data and fails with specific expected errors for each type of invalid data: missing required fields, wrong types, out-of-range values, invalid enum values, and invalid formats. Testing only the happy path misses most of the value of having a schema.
- Validate API responses as well as requests. Apply schema validation to both the request bodies your service receives and the response bodies you get back from external APIs. Validating the response confirms that the third-party API is returning what its documentation claims. When an API changes its response format, your validation will catch it immediately rather than letting it produce silent data corruption.
- Use the Text Diff Checker when API response formats change. When an external API releases a new version or updates its response structure, paste the old example response and the new example response side by side into the Text Diff Checker. The highlighted differences show exactly which fields were added, removed, or changed type, making it fast to update your schema and integration code accurately.
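The second step of this workflow, separating required from optional fields using real data, can be partly automated. The sketch below compares the top-level keys of several captured responses; classify_fields and the sample payloads are hypothetical, and the result is only a starting point because a field that happens to appear in every sample may still be optional.

```python
def classify_fields(samples):
    """Compare top-level keys across sample responses (a list of dicts)."""
    if not samples:
        raise ValueError("at least one sample response is needed")
    key_sets = [set(sample) for sample in samples]
    seen_anywhere = set().union(*key_sets)
    seen_everywhere = set.intersection(*key_sets)
    # Fields that were null at least once probably need ["<type>", "null"].
    seen_null = {
        key for sample in samples for key, value in sample.items() if value is None
    }
    return {
        "required_candidates": sorted(seen_everywhere),
        "optional": sorted(seen_anywhere - seen_everywhere),
        "nullable": sorted(seen_null),
    }

# Hypothetical captured responses from the same endpoint.
captured = [
    {"id": 1, "name": "A", "coupon": None},
    {"id": 2, "name": "B"},
    {"id": 3, "name": None, "coupon": "SPRING"},
]
report = classify_fields(captured)
```

Here report lists id and name as candidates for required, coupon as optional, and both coupon and name as nullable.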
Frequently Asked Questions About JSON Schema
Is JSON Schema the same as a TypeScript interface?
They serve a similar purpose but operate at different times and in different contexts. TypeScript interfaces check structure at compile time and only within TypeScript code. JSON Schema validates data at runtime, including data arriving from external sources like APIs, webhooks, and user input. A TypeScript interface cannot validate data coming from an untrusted external source because the type information is erased at runtime. JSON Schema is specifically designed for runtime validation. Use both together: TypeScript interfaces for compile-time safety within your codebase, and JSON Schema for validating data at the boundaries where external data enters your system.
What is the difference between JSON Schema and OpenAPI?
OpenAPI (formerly Swagger) is a specification for describing entire REST APIs: endpoints, request formats, response formats, authentication, and more. OpenAPI 3.0 uses an extended subset of JSON Schema to describe the structure of request and response bodies, while OpenAPI 3.1 aligns with JSON Schema Draft 2020-12. JSON Schema is the underlying validation vocabulary that OpenAPI builds on. If you are building a REST API, OpenAPI is the appropriate tool for documentation and client generation. If you need to validate individual JSON documents at runtime in your application code, use a JSON Schema library directly. Many OpenAPI tools can also extract the JSON Schema definitions from an OpenAPI spec for use with validation libraries.
Can JSON Schema validate nested objects and arrays?
Yes, this is one of JSON Schema's core strengths. Nested objects are validated by defining a schema in the properties keyword where the value is another schema object. Arrays of objects are validated using the items keyword with an object schema as its value. This nesting can go as deep as your data structure requires. The validation error output includes the full JSON Pointer path to the failing field (for example /items/2/price), making it easy to identify exactly which item in an array failed and which field within it was invalid.
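When an error reports a path such as /items/2/price, it helps to be able to look up the offending value programmatically. The sketch below follows a JSON Pointer (RFC 6901) into parsed JSON; resolve_pointer and the sample order are hypothetical, written for illustration rather than taken from any validation library.

```python
def resolve_pointer(document, pointer):
    """Follow a JSON Pointer such as "/items/2/price" into parsed JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer.split("/")[1:]:
        # RFC 6901 escapes: "~1" means "/", "~0" means "~" (decode in that order).
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current

# Hypothetical order in which the third item has a string where a number belongs.
order = {"items": [{"price": 5.0}, {"price": 7.5}, {"price": "9.99"}]}
bad_value = resolve_pointer(order, "/items/2/price")
```

Here bad_value is the string "9.99", the value a validator would flag with a type error at that path.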
Which JSON Schema draft should I use?
Draft 2020-12 is the current version and the one to target for new schemas. The key differences from Draft 7, some of which first appeared in the intermediate Draft 2019-09, are: $defs replaces definitions for reusable schemas, prefixItems replaces the array form of items for tuple validation, and format validation is officially annotation-only by default (validators may still enforce it). Draft 2019-09 also introduced unevaluatedProperties and unevaluatedItems for more precise validation of unknown fields, and Draft 2020-12 retains them. Always declare the draft with the $schema keyword so the validator applies the correct rules. Many existing schemas use Draft 7 and still work correctly: Ajv version 8 supports both drafts.
How do I reuse a schema in more than one place?
Use the $defs keyword (Draft 2020-12) or definitions (Draft 7) to define reusable schemas in one place, then reference them elsewhere in the same schema using $ref. For example, an address schema used in both a shipping address and billing address field can be defined once in $defs.Address and referenced with {"$ref": "#/$defs/Address"} in each location. For schemas shared across multiple files, you can reference external schema files with a full URI in the $ref value. This avoids copy-pasting schemas and ensures that updating the definition updates all references automatically.
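The address example can be sketched as follows. The schema is written as a Python dict; the property names inside Address are assumptions, and resolve_local_ref is a hypothetical helper that only handles references within the same document, which is enough to show that both fields point at a single definition.

```python
# Hypothetical customer schema that defines Address once and references it twice.
customer_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "shippingAddress": {"$ref": "#/$defs/Address"},
        "billingAddress": {"$ref": "#/$defs/Address"},
    },
    "required": ["shippingAddress"],
    "$defs": {
        "Address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
                "postalCode": {"type": "string"},
            },
            "required": ["street", "city"],
        }
    },
}

def resolve_local_ref(root_schema, ref):
    """Resolve a same-document reference such as "#/$defs/Address"."""
    if not ref.startswith("#/"):
        raise ValueError("this sketch only handles same-document references")
    target = root_schema
    for token in ref[2:].split("/"):
        target = target[token.replace("~1", "/").replace("~0", "~")]
    return target

shipping = resolve_local_ref(
    customer_schema, customer_schema["properties"]["shippingAddress"]["$ref"]
)
billing = resolve_local_ref(
    customer_schema, customer_schema["properties"]["billingAddress"]["$ref"]
)
```

Both lookups return the same Address definition, so a change made under $defs applies to the shipping and billing fields at once.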
Where in my application should validation happen?
The ideal approach is to validate as early as possible, which usually means in middleware or at the API gateway level, before the request reaches any business logic. This gives you a single, consistent validation layer that all endpoints benefit from without duplicating validation logic in every handler. Many frameworks support this directly: Express.js can use Ajv in middleware, FastAPI validates request bodies automatically using Pydantic, and NestJS uses class-validator for DTO validation. For validating external API responses, validate in the service layer immediately after the HTTP response is received and parsed, before passing the data to any other function. This ensures that validation failures produce clear errors at the integration boundary rather than propagating invalid data.
Validate at the Boundary. Trust Nothing That Enters Your System.
JSON Schema is the most direct solution to one of the most common sources of production bugs: data that does not match what your code expects. Validating at the entry point of your application means invalid data produces a clear, immediate error message rather than a confusing crash deep in business logic. It means API contract violations are caught the moment they arrive, not hours later when a customer reports a failure.
The core keywords in this guide cover the majority of real-world validation needs: type, required, properties, enum, format, minimum, maximum, minLength, maxLength, items, and additionalProperties. The real-world order schema demonstrates how these keywords compose into a complete, production-quality validator. The seven-step workflow gives you a repeatable process for adding validation to any new or existing integration.
Start every schema by formatting the actual JSON you need to validate with the JSON Formatter. Understanding the real data structure before writing the schema produces more accurate constraints and avoids writing rules that reject legitimate data. When the API updates and the response format changes, the Text Diff Checker shows you exactly what changed so you can update your schema and code with confidence.