Our Data Schemas and object model are still evolving, and there may be
discrepancies or incompleteness between our documentation and the latest in
development. If you have any questions, please contact us on the Speckle
Community Forum: Developers.
Before diving into the structure, here are the essential terms you’ll encounter:
- Collection - An organizational container that creates hierarchy by grouping objects and other collections. Collections don’t contain data themselves; they organize it.
- DataObject - A semantic object representing a BIM element or domain entity (wall, column, door) with properties and geometry stored in displayValue.
- Proxy - A relationship container stored at the root level that links objects to shared resources (materials, levels, groups) or encodes instance definitions.
- Definition - A proxy type that stores geometry for reuse by multiple Instance objects.
- Instance - An object that references a Definition proxy and applies a transform, representing repeated geometry (blocks, components).
- DisplayValue - An array property on DataObjects containing geometry primitives (meshes, lines, points) that represent the visual appearance.
- Root Collection - The top-level container for a data package, holding the collections hierarchy, proxies, info, and closures table.
- Info - Metadata fields at the root level that apply to the entire data package (views, transforms, analysis results).
What is the Data Schema?
The Data Schema is the structuring of properties and geometries at the most granular level. It defines how data from any source application (Revit, Rhino, AutoCAD, etc.) is organized, starting from the smallest building blocks and building up into larger structures.
Design Principles
Speckle’s schema follows these core principles:
- Unified structure - Same schema across all connectors, enabling code that works with any Speckle data
- Tree hierarchy - Directed tree structure prevents cycles and enables predictable traversal
- Separation of concerns - Collections organize spatially/categorically; Proxies encode cross-cutting relationships without duplication
- Content-based identity - Objects identified by content hash for deduplication; applicationId provides stable version tracking
- Interoperable geometry - Geometry stored as minimum viable primitives (Meshes, Lines, Points) for maximum receiver compatibility
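The content-based identity principle can be illustrated with a small sketch. The hashing scheme here (SHA-256 over canonical JSON) and the `content_id` helper are assumptions chosen for illustration; this page does not specify the algorithm Speckle actually uses.

```python
import hashlib
import json


def content_id(obj: dict) -> str:
    """Hash an object's content (illustrative scheme, not Speckle's actual one)."""
    # Ignore any existing "id" so the hash depends only on the content.
    payload = {k: v for k, v in obj.items() if k != "id"}
    # Sorting keys makes the serialization independent of key order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


wall_v1 = {"applicationId": "wall-001", "name": "Wall A", "properties": {"height": 3.0}}
wall_v1_copy = {"name": "Wall A", "properties": {"height": 3.0}, "applicationId": "wall-001"}
wall_v2 = {"applicationId": "wall-001", "name": "Wall A", "properties": {"height": 3.5}}

# Identical content gives an identical id, so the object needs storing only once.
same = content_id(wall_v1) == content_id(wall_v1_copy)
# Edited content gives a new id, while applicationId still names the same element.
changed = content_id(wall_v1) != content_id(wall_v2)
tracked = wall_v1["applicationId"] == wall_v2["applicationId"]
```

The two identifiers answer different questions: the content hash says whether the data is the same, while applicationId says whether it is the same element across versions.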
Projects, Models, and Versions as Addresses
Projects, Models, and Versions form an addressing system that organizes and locates data in Speckle.
Projects are top-level containers that organize related work and control access.
Models are user-facing semantic labels that provide organizational meaning (e.g., “structural”, “MEP”, “design-option-a”). They group related versions together for easier navigation and understanding.
Versions are immutable snapshots of data at a specific point in time. Each version references a Root Collection by its object ID, creating a discoverable entry point to the data.
Think of this as an addressing system: Project → Model → Version provides a unique “address” to locate a specific package of information, where Models add semantic meaning to help users organize and find their work.
The Data Schema Structure
The Data Schema organizes data from the most granular level upward:
- Properties and Geometries - The atomic building blocks (key-value pairs and geometric primitives), all structured as Base Objects
- Objects - Combine properties and geometries into meaningful units (Geometry, DataObject, Instance)
- Collections - Organize objects into hierarchies (layers, levels, categories)
- Root Collection - The top-level container that holds Collections, Proxies, Info, and the closures table
- Proxies - Encode relationships between objects and shared resources (materials, levels, groups)
- Info - Metadata that applies to the entire data package
The Relationship Model
The Speckle data schema is structured as a directed tree. This is the core relationship model that defines how objects relate to each other:
- Tree structure: Data flows from Root Collection → Collections → Objects in a hierarchical, one-way relationship
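- No cycles: Every object has exactly one parent in the hierarchy, so circular references cannot occur. This is stricter than a general directed acyclic graph (DAG), where a node may have several parents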
- Flat object graph (best practice): As a general rule, objects don’t nest within other objects; they’re organized in collections and referenced via proxies. Some DataObjects (e.g., Revit curtain walls) may use an elements property when required by the source application.
- Proxies add cross-cutting relationships: While the collection hierarchy is a tree, proxies create additional relationships (materials, levels, groups) that reference objects by applicationId

- Collections contain objects and other collections (via elements property)
- DataObjects contain geometry in displayValue (and may contain child DataObjects via elements in some cases)
- Proxies reference objects by applicationId (stored at root level)
- Instances reference Definition proxies by applicationId
The schema allows elements properties on both Collections and DataObjects to handle connector-specific exceptions.
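Because the hierarchy is a tree, a short recursive walk is enough to visit every object. The sketch below is a hedged illustration over plain dictionaries: the `kind` key and the `walk_data_objects` helper are hypothetical names introduced here, and only the elements, displayValue, and applicationId fields come from the description above.

```python
from typing import Iterator


def walk_data_objects(node: dict, path: tuple = ()) -> Iterator[tuple]:
    """Yield (path, object) for every non-collection object under node."""
    for child in node.get("elements", []):
        if child.get("kind") == "Collection":  # "kind" is a hypothetical marker
            yield from walk_data_objects(child, path + (child["name"],))
        else:
            yield path, child
            # Exception case: some DataObjects carry child DataObjects in elements.
            if child.get("elements"):
                yield from walk_data_objects(child, path + (child["name"],))


root = {
    "kind": "Collection",
    "name": "Root",
    "elements": [
        {
            "kind": "Collection",
            "name": "Level 1",
            "elements": [
                {
                    "kind": "DataObject",
                    "name": "Curtain Wall",
                    "applicationId": "cw-1",
                    "displayValue": [],
                    "elements": [
                        {
                            "kind": "DataObject",
                            "name": "Panel",
                            "applicationId": "p-1",
                            "displayValue": [],
                        }
                    ],
                }
            ],
        }
    ],
}

found = [("/".join(path), obj["name"]) for path, obj in walk_data_objects(root)]
# found == [("Level 1", "Curtain Wall"), ("Level 1/Curtain Wall", "Panel")]
```

No visited-set is needed in the walk: with a single parent per node, each object is reached exactly once.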
Conceptual Walkthrough
Consider a simple building model containing two walls on Level 1. Here’s how Speckle structures this conceptually:
- Root Collection serves as the entry point, containing everything.
- Collections organize the walls: a “Level 1” collection contains a “Walls” collection, which contains the two wall DataObjects.
- DataObjects represent each wall: each has a name, properties (material, dimensions), and a displayValue array containing mesh geometry.
- Proxies at the root level create additional relationships:
  - A RenderMaterial proxy links both walls to a “Concrete” material resource
  - A Level proxy links both walls to the “Level 1” level resource
- Info might contain a reference point transform for coordinate system alignment.
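Each wall is stored once in the collection hierarchy; the proxies attach the material and level to it through applicationId references.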
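The walkthrough above can be sketched as plain Python data. This is a simplified illustration, not the exact wire format: the `proxies`, `type`, `value`, and `objects` keys, the `resources_for` helper, and the placeholder mesh strings are all assumptions made for the example.

```python
root = {
    "name": "Root",
    "elements": [
        {
            "name": "Level 1",
            "elements": [
                {
                    "name": "Walls",
                    "elements": [
                        {
                            "name": "Wall A",
                            "applicationId": "wall-a",
                            "properties": {"length": 5.0, "height": 3.0},
                            "displayValue": ["<mesh for Wall A>"],
                        },
                        {
                            "name": "Wall B",
                            "applicationId": "wall-b",
                            "properties": {"length": 4.0, "height": 3.0},
                            "displayValue": ["<mesh for Wall B>"],
                        },
                    ],
                }
            ],
        }
    ],
    # Hypothetical layout: each proxy lists the applicationIds it applies to.
    "proxies": [
        {"type": "RenderMaterial", "value": {"name": "Concrete"}, "objects": ["wall-a", "wall-b"]},
        {"type": "Level", "value": {"name": "Level 1"}, "objects": ["wall-a", "wall-b"]},
    ],
}


def resources_for(root: dict, application_id: str) -> dict:
    """Map proxy type to resource name for one applicationId."""
    return {
        proxy["type"]: proxy["value"]["name"]
        for proxy in root["proxies"]
        if application_id in proxy["objects"]
    }


wall_a_resources = resources_for(root, "wall-a")
# wall_a_resources == {"RenderMaterial": "Concrete", "Level": "Level 1"}
```

The "Concrete" material appears once, in the proxy, rather than being copied into each wall.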
The Complete Structure
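Putting it all together, a complete data package is a Root Collection that holds the collections hierarchy, the proxies, the info fields, and the closures table.
Learning Path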
This documentation is organized as a progressive learning experience:
- Concepts - Core ideas: Collections, Objects, Proxies, Info
- Object Schema - How objects are structured and what fields they contain
- Geometry Schema - How geometry is stored and organized
- Proxy Schema - How proxies encode relationships
- Connector Index - Overview of connector-specific behaviors
Why is the structure the same across all connectors?
Speckle uses a unified Data Schema so that data from any source can be
consumed by any receiver. This means you can write code that works with
Speckle models without needing to know which application created them. The
connector-specific differences are captured in the object properties and
hierarchy structure, not in the fundamental Data Schema organization.
What's the difference between a Speckle hierarchy and the source application's hierarchy?
The source application’s hierarchy (e.g., Revit’s levels/categories,
Rhino’s layers) is preserved in the Collections structure. However, Speckle
also imposes its own organizational structure through Proxies, which can
create additional relationships that don’t exist in the source application.
This allows objects to be organized in multiple ways simultaneously without
duplication.
What about orphaned objects?
Objects are scoped to projects: when an object is sent to a Speckle server, it’s stored within a project’s scope. However, without being referenced by a version, the object is effectively unreachable through normal workflows. These “orphaned” objects exist in the database and can be retrieved if you have the object ID directly, but they cannot be browsed or discovered through the UI. To make objects discoverable, they must be referenced by a version.
Behind the Scenes (API):