contentEncoding : String

The string instance should be interpreted as encoded binary data and decoded using the encoding named by this property.

Value: This keyword should be set to a standard encoding name (to increase interoperability), such as those defined in RFC 4648 and RFC 2045.
Hint: Use the jsonschema metaschema and jsonschema lint commands to catch keywords set to invalid values.
Kind: Annotation
Applies To: String
Base Dialect: 2020-12
Changed In: None
Introduced In: Draft 7
Vocabulary: Content
Specification: https://json-schema.org/draft/2020-12/json-schema-validation.html#section-8.3
Metaschema: https://json-schema.org/draft/2020-12/meta/content
Official Tests: draft2020-12/content.json
Default: None
Annotation: String. The content encoding name set by this keyword.
Hint: Use the jsonschema validate command to collect annotations from the command line.
Affected By: None
Affects: None

The contentEncoding keyword signifies that a string instance value (such as a specific object property) should be considered binary data serialised using the given encoding. This keyword does not affect validation, but the evaluator will collect its value as an annotation. Using this keyword together with the related content keywords is a common technique to encode and describe arbitrary binary data (such as images, audio, and video) in JSON.

RFC 4648 (The Base16, Base32, and Base64 Data Encodings) and RFC 2045 (Format of Internet Message Bodies) define the following standard encodings. In the interest of interoperability, avoid defining new content encodings. While the JSON Schema specification does not provide explicit guidance on this, RFC 2045 Section 6.3 states that a custom content encoding, if one is really needed, must be prefixed with x-. For example, x-my-new-encoding.

| Encoding | Description | Reference |
|---|---|---|
| "base64" | Encoding scheme using a 64-character alphabet | RFC 4648 Section 4 |
| "base32" | Encoding scheme using a 32-character alphabet | RFC 4648 Section 6 |
| "base16" | Encoding scheme using a 16-character hexadecimal alphabet | RFC 4648 Section 8 |
| "7bit" | Encoding scheme that restricts data to 7-bit ASCII: it disallows octets greater than 127, disallows NUL, and restricts CR and LF to CRLF sequences | RFC 2045 Section 2.7 |
| "8bit" | Encoding scheme that permits octets greater than 127, disallows NUL, and restricts CR and LF to CRLF sequences | RFC 2045 Section 2.8 |
| "binary" | Encoding scheme where any sequence of octets is allowed | RFC 2045 Section 2.9 |
| "quoted-printable" | Encoding scheme that preserves ASCII printable characters and escapes the rest using a simple algorithm based on a hexadecimal alphabet | RFC 2045 Section 6.7 |
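
Because the keyword only produces an annotation, decoding is left to the consuming application. The following is a minimal sketch of how an application might map some of these encoding names to decoders from the Python standard library. The names DECODERS and decode_content are hypothetical and not part of JSON Schema or any particular validator. The "7bit", "8bit", and "binary" encodings are left out because how their octets map onto a JSON string is an application decision.

```python
import base64
import quopri

# Hypothetical lookup table (an assumption for illustration):
# encoding name -> decoder built on the Python standard library.
DECODERS = {
    # validate=True rejects characters outside the Base 64 alphabet
    "base64": lambda s: base64.b64decode(s, validate=True),
    "base32": lambda s: base64.b32decode(s),
    # b16decode expects the uppercase alphabet by default
    "base16": lambda s: base64.b16decode(s),
    "quoted-printable": lambda s: quopri.decodestring(s.encode("ascii")),
}


def decode_content(value: str, content_encoding: str) -> bytes:
    """Decode a string instance using the encoding named by the annotation."""
    decoder = DECODERS.get(content_encoding)
    if decoder is None:
        # Custom (x- prefixed) or unsupported encodings end up here
        raise ValueError(f"Unsupported content encoding: {content_encoding}")
    return decoder(value)


print(decode_content("SGVsbG8gV29ybGQ=", "base64"))
```

Rejecting unknown names explicitly, rather than passing the string through unchanged, keeps a custom encoding such as x-my-new-encoding from being silently treated as raw data.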

Remember that JSON Schema is a constraint-driven language. Therefore, non-string instances successfully validate against this keyword. If needed, make use of the type keyword to constrain the accepted type accordingly.

Examples

A schema that describes arbitrary data encoded using Base 64
Schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "contentEncoding": "base64"
}
Valid: A string value encoded in Base 64 is valid and an annotation is emitted
Instance
"SGVsbG8gV29ybGQ=" // Hello World
Annotations
{ "keyword": "/contentEncoding", "instance": "", "value": "base64" }
Valid: A string value that does not represent a Base 64 encoded value is valid and an annotation is still emitted
Instance
"This is not Base 64"
Annotations
{ "keyword": "/contentEncoding", "instance": "", "value": "base64" }
Valid: A non-string value is valid but no annotations are emitted
Instance
1234
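
All three instances above pass validation, so an application that needs the decoded bytes has to check the content itself. The sketch below shows one way this could look in Python for the "base64" case, using only the standard library. The function name try_decode_base64 is hypothetical, and returning None on failure is an assumption made for illustration; an application could just as well raise an error.

```python
import base64


def try_decode_base64(instance):
    """Return the decoded bytes, or None when the instance is not a
    string or is not valid Base 64 (hypothetical helper)."""
    if not isinstance(instance, str):
        # contentEncoding only applies to strings, so there is nothing to decode
        return None
    try:
        # validate=True rejects characters outside the Base 64 alphabet
        return base64.b64decode(instance, validate=True)
    except ValueError:
        # binascii.Error is a subclass of ValueError
        return None


# The three instances from the examples above
for instance in ("SGVsbG8gV29ybGQ=", "This is not Base 64", 1234):
    print(repr(instance), "->", try_decode_base64(instance))
```

Only the first instance yields bytes (b'Hello World'); the other two yield None even though they are valid against the schema.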