For Coding Agents¶
Dense, prescriptive reference for AI coding agents working on or integrating with iscc-schema.
Architecture Map¶
File Layout¶
| Path | Contents | Editable? |
|---|---|---|
| `iscc_schema/__init__.py` | Public API exports: `IsccMeta`, `Signature`, `ISBN`, `ISRC`, `TDM`, `GenAI`, `recover_context` | Yes |
| `iscc_schema/base.py` | Custom `BaseModel`: serialization (`dict`, `json`, `jcs`), empty-string-to-None coercion | Yes |
| `iscc_schema/fields.py` | RFC 3986-compliant `AnyUrl` type with regex validation | Yes |
| `iscc_schema/recovery.py` | `recover_context()`: reconstruct `@context` from `$schema` or `@type` | Yes |
| `iscc_schema/aliases.json` | Maps `@context`→`context_`, `@type`→`type_`, `$schema`→`schema_` | Yes |
| `iscc_schema/schema.py` | Generated: main `IsccMeta` Pydantic v2 model | No |
| `iscc_schema/generator.py` | Generated: OpenAPI models for ISCC Generator API | No |
| `iscc_schema/seed_isbn.py` | Generated: `ISBN` model | No |
| `iscc_schema/seed_isrc.py` | Generated: `ISRC` model | No |
| `iscc_schema/service_tdm.py` | Generated: `TDM` model | No |
| `iscc_schema/service_genai.py` | Generated: `GenAI` model | No |
| `iscc_schema/contexts.py` | Generated: JSON-LD context mappings + `TYPE_SCHEMAS` dispatch | No |
| `iscc_schema/models/*.yaml` | Source of truth: OpenAPI 3.1.0 schema definitions | Yes |
| `iscc_schema/models/iscc-all.yaml` | Composition manifest (`allOf` + `$ref` to individual schemas) | Yes |
| `tools/build_code.py` | Code generation: YAML → datamodel-code-generator → Pydantic + post-processing | Yes |
| `tools/build_json_schema.py` | Flatten individual YAML schemas into merged `iscc.json` | Yes |
| `tools/build_json_ld_context.py` | Generate JSON-LD context files from model metadata | Yes |
| `tools/build_terms.py` | Generate vocabulary markdown from `x-iscc-context` fields | Yes |
| `tools/build_docs.py` | Generate documentation pages from YAML schemas + README/CHANGELOG | Yes |
| `tools/format_yaml.py` | Reformat YAML (2-space indent, 88 width, LF) | Yes |
Schema Composition¶
IsccMeta (iscc-all.yaml composes via allOf + $ref):
├── iscc-jsonld.yaml → @context, @type, $schema
├── iscc-minimal.yaml → iscc
├── iscc-basic.yaml → name, description, meta
├── iscc-embeddable.yaml → creator, license, credit, rights, acquire
├── iscc-extended.yaml → media_id, iscc_id, image, keywords, form, version, tdm, genai
├── iscc-technical.yaml → mode, filename, filesize, datasize, mediatype, duration, fps, width, height, created
├── iscc-crypto.yaml → tophash, metahash, datahash, nonce, signature
├── iscc-nft.yaml → external_url, animation_url, properties, attributes, nft
└── iscc-declaration.yaml → original, redirect, chain, wallet, credentials, verifications
Standalone (NOT in IsccMeta):
├── isbn.yaml → ISBN (seed metadata)
├── isrc.yaml → ISRC (seed metadata)
├── tdm.yaml → TDM (service metadata; also inline in IsccMeta.tdm)
└── genai.yaml → GenAI (service metadata; also inline in IsccMeta.genai)
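The composition above can be pictured as merging the `properties` of each referenced schema into one flat object. The following is a minimal sketch under that assumption; `merge_all_of` is a hypothetical helper (not the project's build tool), and the three input dicts are abbreviated stand-ins for already-resolved schema files.

```python
def merge_all_of(parts):
    """Merge the `properties` of several object schemas, in order."""
    merged = {"type": "object", "properties": {}}
    for part in parts:
        for name, definition in part.get("properties", {}).items():
            # A property defined in two component schemas is treated as an error here.
            if name in merged["properties"]:
                raise ValueError(f"duplicate property: {name}")
            merged["properties"][name] = definition
    return merged


# Abbreviated stand-ins for three of the composed schema files.
jsonld = {"properties": {"@context": {}, "@type": {}, "$schema": {}}}
minimal = {"properties": {"iscc": {"type": "string"}}}
basic = {"properties": {"name": {}, "description": {}, "meta": {}}}

flat = merge_all_of([jsonld, minimal, basic])
print(list(flat["properties"]))
```

Rejecting duplicates is a design choice of this sketch; whether the real pipeline does the same is not stated in this document.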
Import Dependency Flow¶
iscc_schema.__init__
→ iscc_schema.schema (generated) → iscc_schema.base → pydantic.BaseModel
→ iscc_schema.fields (AnyUrl)
→ iscc_schema.seed_isbn (generated) → iscc_schema.base
→ iscc_schema.seed_isrc (generated) → iscc_schema.base
→ iscc_schema.service_tdm (generated) → iscc_schema.base
→ iscc_schema.service_genai (generated) → iscc_schema.base
→ iscc_schema.recovery → iscc_schema.contexts (generated)
Public API Exports¶
from iscc_schema import IsccMeta # Main metadata model (all fields)
from iscc_schema import Signature # Cryptographic signature (nested)
from iscc_schema import ISBN # Seed metadata
from iscc_schema import ISRC # Seed metadata
from iscc_schema import TDM # Service metadata
from iscc_schema import GenAI # Service metadata
from iscc_schema import recover_context # JSON-LD context recovery
Decision Dispatch¶
Which model to use?¶
| Use case | Model | Source YAML |
|---|---|---|
| General ISCC metadata | `IsccMeta` | `iscc-all.yaml` (composed) |
| ISBN-based Meta-Code generation | `ISBN` | `isbn.yaml` |
| ISRC-based Meta-Code generation | `ISRC` | `isrc.yaml` |
| TDM reservation signals | `TDM` | `tdm.yaml` |
| GenAI disclosure signals | `GenAI` | `genai.yaml` |
| API request/response models | `generator.py` models | `iscc-generator.yaml` |
Which serialization method?¶
| Need | Method | Behavior |
|---|---|---|
| JSON-LD output | `meta.json()` | `by_alias=True` → `@context`, `@type`, `$schema` |
| Dict for processing | `meta.dict()` | `exclude_none=True`, `exclude_unset=True`, `by_alias=True` |
| Canonical bytes for signing | `meta.jcs()` | JCS-canonicalized JSON bytes |
| Pydantic v2 native | `meta.model_dump()` / `meta.model_dump_json()` | Used internally by `dict()` / `json()` |
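To see why signing needs canonical bytes rather than ordinary JSON output, the sketch below approximates canonicalization with the standard library only. It is not the library's `jcs()` implementation: full JCS (RFC 8785) also prescribes number formatting and a specific key ordering, so this approximation should only be trusted for simple data with ASCII keys and string values. `canonical_bytes` is a hypothetical name.

```python
import json


def canonical_bytes(data):
    """Approximate canonical JSON: sorted keys, no whitespace, UTF-8."""
    text = json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


a = {"name": "Example", "iscc": "ISCC:KACYPXW445FTYNJ3"}
b = {"iscc": "ISCC:KACYPXW445FTYNJ3", "name": "Example"}

# Key order in the input does not affect the canonical bytes,
# so a signature computed over them is stable.
same = canonical_bytes(a) == canonical_bytes(b)
print(same, canonical_bytes(a))
```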
Which build command?¶
| Changed | Run |
|---|---|
| Any YAML schema | uv run poe all (full pipeline) |
| Only Python code (base.py, fields.py) | uv run pytest |
| Only build tool scripts | uv run poe all |
| Quick code regen only | uv run poe buildcode |
| Quick JSON Schema only | uv run poe buildschema |
Constraints and Invariants¶
Model Configuration¶
All models inherit from iscc_schema.base.BaseModel with:
- `extra="forbid"`: unknown fields raise `ValidationError`
- `validate_assignment=True`: assignment validates at runtime
- `use_enum_values=True`: enums serialize to string values
- `populate_by_name=True`: accept both `context_` and `@context`
Empty String Coercion¶
The @model_validator(mode="before") in base.py converts empty strings to None for all fields. Combined with exclude_none=True in serialization, empty strings are silently dropped.
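The combined effect of coercion and `exclude_none=True` can be illustrated on plain dicts. This is a sketch of the behavior described above, not the code in `base.py`; both helper names are hypothetical.

```python
def coerce_empty_strings(data):
    """Return a copy of `data` with empty-string values replaced by None."""
    return {key: (None if value == "" else value) for key, value in data.items()}


def drop_none(data):
    """Mimic exclude_none=True: keep only keys whose value is not None."""
    return {key: value for key, value in data.items() if value is not None}


raw = {"iscc": "ISCC:KACYPXW445FTYNJ3", "name": "", "description": "Example"}
result = drop_none(coerce_empty_strings(raw))
print(result)
```

The `name` key disappears entirely, which is why consumers cannot distinguish "set to empty" from "never set" in serialized output.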
URL Validation¶
AnyUrl in fields.py validates against RFC 3986. Accepts any scheme (http, https, ipfs, data, urn). Empty string → None via the empty-string coercion.
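As a rough illustration of scheme-agnostic acceptance, the sketch below checks only that a value starts with a syntactically valid RFC 3986 scheme. This is far looser than full RFC 3986 validation, and the actual regex in `fields.py` is not shown in this document; `has_uri_scheme` is a hypothetical helper.

```python
import re

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def has_uri_scheme(value):
    """Return True if `value` begins with a syntactically valid URI scheme."""
    return bool(SCHEME_RE.match(value))


samples = [
    "https://example.com/a",
    "ipfs://example-cid/file.png",
    "data:text/plain;base64,SGVsbG8=",
    "urn:isbn:9783161484100",
    "/relative/path",
]
checked = {sample: has_uri_scheme(sample) for sample in samples}
print(checked)
```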
Field Aliases¶
Three fields use aliases because @ and $ are invalid in Python identifiers:
| JSON property | Python field | Alias |
|---|---|---|
| `@context` | `context_` | `@context` |
| `@type` | `type_` | `@type` |
| `$schema` | `schema_` | `$schema` |
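The alias table amounts to a two-way key rename. A minimal sketch on plain dicts, assuming the mapping shown above (the helper names are hypothetical, and the `@type` value is an arbitrary placeholder):

```python
# JSON property -> Python field name, as listed in the table above.
ALIASES = {"@context": "context_", "@type": "type_", "$schema": "schema_"}
FIELD_TO_ALIAS = {field: alias for alias, field in ALIASES.items()}


def to_python_names(data):
    """Rename JSON-LD keys to valid Python identifiers."""
    return {ALIASES.get(key, key): value for key, value in data.items()}


def to_json_names(data):
    """Rename Python field names back to their JSON-LD aliases."""
    return {FIELD_TO_ALIAS.get(key, key): value for key, value in data.items()}


doc = {"@type": "PlaceholderType", "iscc": "ISCC:KACYPXW445FTYNJ3"}
internal = to_python_names(doc)
print(internal)
print(to_json_names(internal) == doc)
```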
Extension Fields¶
| Extension | Values | Purpose |
|---|---|---|
| `x-iscc-context` | IRI string | JSON-LD context mapping for the property |
| `x-iscc-status` | `stable` / `draft` | Field maturity indicator |
| `x-iscc-standard` | Standard name | ISO/IPTC standard reference |
| `x-iscc-schema-doc` | Text | Original schema.org definition |
| `x-iscc-embed` | Text | Media embedding guidance |
Generated File Post-Processing¶
build_code.py applies two patches after code generation:
- Import swap: `pydantic.BaseModel` → `iscc_schema.base.BaseModel`, `pydantic.AnyUrl` → `iscc_schema.fields.AnyUrl`
- URL versioning: `http://purl.org/iscc/context` → `http://purl.org/iscc/context/{version}.jsonld`, `http://purl.org/iscc/schema` → `http://purl.org/iscc/schema/{version}.json`
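The two patches can be sketched as plain string replacements. This is an illustration only: the exact import line emitted by the generator, the function name `patch_generated`, and the version string `1.2.3` are assumptions, and the real `build_code.py` may work differently.

```python
def patch_generated(source, version):
    """Apply the import swap and URL versioning to generated module text."""
    # Import swap (assumes the generator emitted exactly this import line).
    source = source.replace(
        "from pydantic import AnyUrl, BaseModel",
        "from iscc_schema.base import BaseModel\nfrom iscc_schema.fields import AnyUrl",
    )
    # URL versioning: match the closing quote so versioned URLs are not patched twice.
    source = source.replace(
        '"http://purl.org/iscc/context"',
        f'"http://purl.org/iscc/context/{version}.jsonld"',
    )
    source = source.replace(
        '"http://purl.org/iscc/schema"',
        f'"http://purl.org/iscc/schema/{version}.json"',
    )
    return source


generated = (
    "from pydantic import AnyUrl, BaseModel\n"
    'context_ = "http://purl.org/iscc/context"\n'
    'schema_ = "http://purl.org/iscc/schema"\n'
)
patched = patch_generated(generated, "1.2.3")
print(patched)
```

Matching the quoted form keeps the patch idempotent, which matters if the build is re-run on already-patched output.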
Side Effects Catalog¶
| Method / Function | Effect |
|---|---|
| `IsccMeta(...)` | Validates all fields, coerces empty strings to None |
| `meta.dict()` | Pure: returns new dict, no mutation |
| `meta.json()` | Pure: returns JSON string, no mutation |
| `meta.jcs()` | Pure: returns canonical bytes, no mutation |
| `recover_context(data)` | Mutates input dict: adds `@context` key |
| `poe buildcode` | Overwrites `schema.py`, `generator.py`, seed/service modules |
| `poe buildschema` | Overwrites `docs/schema/*.json`, `iscc_schema/contexts.py` |
| `poe buildcontext` | Overwrites `docs/context/*.jsonld` |
| `poe buildterms` | Overwrites `docs/includes/terms-*.md` |
| `poe builddocs` | Overwrites `docs/index.md`, `docs/changelog.md`, `docs/schema/*.md`, `docs/context/index.md` |
Task Recipes¶
Add a new field to IsccMeta¶
- Choose the appropriate YAML schema file in `iscc_schema/models/` based on the field's category
- Add the property with type, description, and `x-iscc-*` extension fields
- Add the field name to the `priority` list in `tools/build_json_schema.py` for property ordering
- Run `uv run poe all` to regenerate everything and run tests
- Verify the field appears in `schema.py`, `docs/schema/iscc.json`, and vocabulary docs
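The role of the `priority` list in the steps above can be sketched as follows: listed properties are emitted in list order, and anything not listed falls to the bottom in its original order. `order_properties` is a hypothetical helper, not the function in `tools/build_json_schema.py`, and `new_field` is a made-up property name.

```python
def order_properties(properties, priority):
    """Order properties by a priority list; unlisted names go last, unsorted."""
    rank = {name: index for index, name in enumerate(priority)}
    listed = sorted((name for name in properties if name in rank), key=rank.__getitem__)
    unlisted = [name for name in properties if name not in rank]
    return {name: properties[name] for name in listed + unlisted}


priority = ["@context", "iscc", "name"]
properties = {"name": {}, "new_field": {}, "iscc": {}, "@context": {}}

ordered = order_properties(properties, priority)
print(list(ordered))
```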
Add a new standalone schema (seed or service)¶
- Create `iscc_schema/models/{name}.yaml` with title, type, properties
- Add to `SEED_SCHEMAS` or `SERVICE_SCHEMAS` dict in `tools/build_code.py`
- Add to `SEED_SCHEMAS` or `SERVICE_SCHEMAS` lists in `tools/build_json_schema.py`
- Add to the corresponding lists in `tools/build_json_ld_context.py`
- Add to the corresponding lists in `tools/build_terms.py`
- Add to the corresponding lists in `tools/build_docs.py`
- Export the new model from `iscc_schema/__init__.py`
- Run `uv run poe all`
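Because the recipe above touches five build scripts, a crude textual check can catch a forgotten one. The sketch below takes the script sources as a dict so it runs without the repository; `scripts_missing` and the schema name `example` are hypothetical, and a plain substring match is only a heuristic.

```python
BUILD_SCRIPTS = [
    "build_code.py",
    "build_json_schema.py",
    "build_json_ld_context.py",
    "build_terms.py",
    "build_docs.py",
]


def scripts_missing(name, sources):
    """Return build scripts whose source text does not mention `name`."""
    return [script for script in BUILD_SCRIPTS if name not in sources.get(script, "")]


# Stand-in sources; in a checkout these would be read from tools/*.py.
sources = {
    "build_code.py": 'SEED_SCHEMAS = {"isbn": 1, "example": 2}',
    "build_json_schema.py": 'SEED_SCHEMAS = ["isbn", "example"]',
    "build_json_ld_context.py": 'SEED_SCHEMAS = ["isbn"]',
    "build_terms.py": 'SEED_SCHEMAS = ["isbn", "example"]',
    "build_docs.py": 'SEED_SCHEMAS = ["isbn", "example"]',
}
missing = scripts_missing("example", sources)
print(missing)
```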
Recover JSON-LD context from plain JSON¶
from iscc_schema import recover_context
data = {"iscc": "ISCC:KACYPXW445FTYNJ3", "name": "Example"}
data_with_context = recover_context(data)
# Adds @context from SCHEMA_CONTEXTS[SCHEMA_ISCC]
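Since `recover_context` is documented as mutating its input, callers that need the original dict unchanged can work on a copy. The wrapper below is a sketch that accepts any in-place mutating callable; `recover_copy` is a hypothetical helper, and `fake_recover` is a stand-in that mimics only the documented side effect (its context value is a placeholder). In real use, `recover_context` would be passed instead.

```python
import copy


def recover_copy(data, recover):
    """Run a mutating `recover` callable on a deep copy, leaving `data` untouched."""
    clone = copy.deepcopy(data)
    recover(clone)  # relies only on the in-place mutation, not the return value
    return clone


def fake_recover(data):
    # Hypothetical stand-in for recover_context: adds an @context key in place.
    data["@context"] = "http://purl.org/iscc/context"
    return data


original = {"iscc": "ISCC:KACYPXW445FTYNJ3", "name": "Example"}
recovered = recover_copy(original, fake_recover)
print("@context" in original, "@context" in recovered)
```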
Bypass validation (downstream pattern)¶
meta = IsccMeta.model_construct(iscc="ISCC:...", name="...")
# No validation, no coercion — use only for trusted internal data
Change Playbook¶
If modifying a YAML schema property¶
- Run `uv run poe all` (regenerates code, JSON Schema, context, and docs, then runs tests)
- Check generated `schema.py` for correct field type and alias
- Check `docs/schema/iscc.json` for correct property definition
- If property has `x-iscc-context`: check `docs/context/iscc.jsonld` and `contexts.py`
If modifying base.py¶
- Run `uv run pytest` (tests cover serialization, coercion, JCS)
- Check that `dict()`, `json()`, `jcs()` defaults still match expectations
- Downstream impact: iscc-sdk subclasses `IsccMeta` and uses `.construct()` extensively
If modifying fields.py (AnyUrl)¶
- Run `uv run pytest` (tests cover URL validation patterns)
- Check that valid URIs (http, ipfs, data, urn) are still accepted
- Check that empty string → None coercion still works
If modifying build_code.py¶
- Run `uv run poe buildcode && uv run pytest`
- Verify post-processing patches applied correctly in generated files
- Check import statements in `schema.py` reference `iscc_schema.base.BaseModel`
If modifying build_json_schema.py¶
- Run `uv run poe buildschema`
- Verify `docs/schema/iscc.json` property order matches `priority` list
- Verify `contexts.py` regenerated with correct mappings
- Verify `@context` property accepts both URI string and inline object
If adding a new x-iscc-* extension field¶
- Update all `tools/build_*.py` scripts that inspect extension fields
- Update `tools/build_docs.py` to render the new extension in documentation
- Run `uv run poe all`
Common Mistakes¶
NEVER edit schema.py, generator.py, seed_*.py, service_*.py, or contexts.py directly — they are overwritten by poe buildcode / poe buildschema. ALWAYS edit the source YAML schemas and run the build pipeline.
NEVER add a new YAML schema field without adding it to the priority list in tools/build_json_schema.py. Fields missing from this list end up unsorted at the bottom of the generated JSON Schema.
NEVER use pydantic.BaseModel or pydantic.AnyUrl in hand-written code that extends iscc-schema. ALWAYS use iscc_schema.base.BaseModel and iscc_schema.fields.AnyUrl — they provide JSON-LD serialization, empty-string coercion, and RFC 3986 validation.
NEVER assume fields are present in serialized output. exclude_none=True and exclude_unset=True mean only explicitly set, non-None fields appear. ALWAYS check for key existence when consuming IsccMeta dicts.
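A consumer of such a dict might look like the sketch below, which uses `dict.get` and membership tests instead of direct indexing. `describe` and its fallback strings are hypothetical; the input stands in for the output of `meta.dict()`.

```python
def describe(meta_dict):
    """Build a one-line label without assuming optional keys exist."""
    iscc = meta_dict.get("iscc", "(no iscc)")
    name = meta_dict.get("name", "(no name)")
    label = f"{iscc}: {name}"
    if "description" in meta_dict:
        label += f" ({meta_dict['description']})"
    return label


sparse = {"iscc": "ISCC:KACYPXW445FTYNJ3"}
full = {"iscc": "ISCC:KACYPXW445FTYNJ3", "name": "Example", "description": "Demo"}
print(describe(sparse))
print(describe(full))
```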
NEVER pass by_alias=False to dict() or json() unless you specifically need Python field names (context_, type_, schema_). The default by_alias=True produces JSON-LD compatible output with @context, @type, $schema.
NEVER add a standalone schema without updating ALL five build scripts (build_code.py, build_json_schema.py, build_json_ld_context.py, build_terms.py, build_docs.py). Missing any one causes incomplete output.