Overview

Our Data Schemas and object model are still evolving, so this documentation may be incomplete or out of step with the latest development version. If you have any questions, please contact us on the Speckle Community Forum: Developers.

This page introduces the fundamental structure of Speckle’s Data Schema—the conceptual framework that organizes how model data is structured, stored, and related. Understanding this structure is essential before working with any Speckle SDK, as it defines the relationships between objects, their organization, and how they reference each other.

Before diving into the structure, here are the essential terms you’ll encounter:

Collection - An organizational container that creates hierarchy by grouping objects and other collections. Collections don’t contain data themselves; they organize it.
DataObject - A semantic object representing a BIM element or domain entity (wall, column, door) with properties and geometry stored in displayValue.
Proxy - A relationship container stored at the root level that links objects to shared resources (materials, levels, groups) or encodes instance definitions.
Definition - A proxy type that stores geometry for reuse by multiple Instance objects.
Instance - An object that references a Definition proxy and applies a transform, representing repeated geometry (blocks, components).
DisplayValue - An array property on DataObjects containing geometry primitives (meshes, lines, points) that represent the visual appearance.
Root Collection - The top-level container for a data package, holding the collections hierarchy, proxies, info, and closures table.
Info - Metadata fields at the root level that apply to the entire data package (views, transforms, analysis results).

What is the Data Schema?

The Data Schema defines how properties and geometries are structured at the most granular level. It describes how data from any source application (Revit, Rhino, AutoCAD, etc.) is organized, starting from the smallest building blocks and building up into larger structures.

Design Principles

Speckle’s schema follows these core principles:

Unified structure - Same schema across all connectors, enabling code that works with any Speckle data
Tree hierarchy - Directed tree structure prevents cycles and enables predictable traversal
Separation of concerns - Collections organize spatially/categorically; Proxies encode cross-cutting relationships without duplication
Content-based identity - Objects are identified by a hash of their content, which enables deduplication; applicationId provides a stable identifier for tracking the same element across versions
Interoperable geometry - Geometry stored as minimum viable primitives (Meshes, Lines, Points) for maximum receiver compatibility
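
To make the content-based identity principle concrete, here is a minimal sketch. It assumes SHA-256 over canonical JSON purely for illustration; the hashing scheme Speckle actually uses is not described on this page, and `content_id` is a hypothetical helper rather than an SDK function.

```python
import hashlib
import json


def content_id(obj: dict) -> str:
    """Hash an object's content; key order must not change the result."""
    # Assumption: canonical JSON + SHA-256 stands in for the real hashing scheme.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


wall_v1 = {"applicationId": "wall-001", "name": "Wall A", "height": 3.0}
wall_copy = {"height": 3.0, "name": "Wall A", "applicationId": "wall-001"}
wall_v2 = {"applicationId": "wall-001", "name": "Wall A", "height": 3.5}

# Identical content produces an identical id, so it is stored only once.
store = {content_id(o): o for o in (wall_v1, wall_copy, wall_v2)}

same_content = content_id(wall_v1) == content_id(wall_copy)
changed_content = content_id(wall_v1) != content_id(wall_v2)
# applicationId is unchanged by the edit, so both hashes map to one element.
same_element = wall_v1["applicationId"] == wall_v2["applicationId"]
```

The two roles are complementary: the content hash changes whenever the wall changes, while applicationId stays fixed and lets a consumer recognize the edited wall as the same element.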

Projects, Models, and Versions as Addresses

Projects, Models, and Versions form an addressing system that organizes and locates data in Speckle:

Projects - Top-level containers that organize related work and control access.
Models - User-facing semantic labels that provide organizational meaning (e.g., “structural”, “MEP”, “design-option-a”). They group related versions together for easier navigation and understanding.
Versions - Immutable snapshots of data at a specific point in time. Each version references a Root Collection by its object ID, creating a discoverable entry point to the data.

Together, Project → Model → Version provides a unique “address” for locating a specific package of information, with Models adding the semantic meaning that helps users organize and find their work.
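
The addressing idea can be sketched as a nested lookup that ends at a Root Collection's object ID. Everything here is hypothetical and in-memory: the `projects` index, the `Version` class, the ids, and `resolve` are illustrative names, not the Speckle server API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a version is an immutable snapshot
class Version:
    version_id: str
    root_object_id: str  # object ID of the Root Collection it points to


# Hypothetical index: project -> model -> list of versions.
projects = {
    "office-tower": {
        "structural": [Version("v1", "obj-aaa"), Version("v2", "obj-bbb")],
        "MEP": [Version("v1", "obj-ccc")],
    }
}


def resolve(project: str, model: str, version_id: str) -> str:
    """Follow Project -> Model -> Version to the Root Collection's object ID."""
    for version in projects[project][model]:
        if version.version_id == version_id:
            return version.root_object_id
    raise KeyError(f"{project}/{model}/{version_id} not found")


root_id = resolve("office-tower", "structural", "v2")
```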

The Data Schema Structure

The Data Schema organizes data from the most granular level upward:

Properties and Geometries - The atomic building blocks (key-value pairs and geometric primitives), all structured as Base Objects
Objects - Combine properties and geometries into meaningful units (Geometry, DataObject, Instance)
Collections - Organize objects into hierarchies (layers, levels, categories)
Root Collection - The top-level container that holds Collections, Proxies, Info, and the closures table
Proxies - Encode relationships between objects and shared resources (materials, levels, groups)
Info - Metadata that applies to the entire data package

Together, these components form a tree structure. See the Concepts page for detailed explanations of each component.

The Relationship Model

The Speckle data schema is structured as a directed tree. This is the core relationship model that defines how objects relate to each other:

Tree structure: Data flows from Root Collection → Collections → Objects in a hierarchical, one-way relationship
No cycles: A tree is stricter than a general directed acyclic graph (DAG). Besides ruling out circular references, it gives every item exactly one parent in the collection hierarchy
Flat object graph (best practice): As a general rule, objects don’t nest within other objects—they’re organized in collections and referenced via proxies. Some DataObjects (e.g., Revit curtain walls) may use an elements property when required by the source application.
Proxies add cross-cutting relationships: While the collection hierarchy is a tree, proxies create additional relationships (materials, levels, groups) that reference objects by applicationId

Key relationship rules:

Collections contain objects and other collections (via elements property)
DataObjects contain geometry in displayValue (and may contain child DataObjects via elements in some cases)
Proxies reference objects by applicationId (stored at root level)
Instances reference Definition proxies by applicationId
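
The last rule, Instances referencing Definition proxies by applicationId, can be sketched as follows. The field names `definitionId`, `points`, and `transform`, and the row-major 4x4 matrix layout, are assumptions made for this example; check the Proxy Schema and Geometry Schema pages for the actual fields.

```python
def apply(matrix: list, point: tuple) -> tuple:
    """Apply a row-major 4x4 transform (flat list of 16 numbers) to a 3D point."""
    x, y, z = point
    rows = [matrix[i:i + 4] for i in range(0, 12, 4)]  # only the first three rows matter
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in rows)


# Definition proxies keyed by applicationId; geometry is stored once here.
definitions = {
    "def-chair": {"name": "Chair", "points": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]},
}

# Instances carry only a reference and a transform (hypothetical field names).
instances = [
    {"applicationId": "chair-1", "definitionId": "def-chair",
     "transform": [1, 0, 0, 10,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]},
    {"applicationId": "chair-2", "definitionId": "def-chair",
     "transform": [1, 0, 0, 0,  0, 1, 0, 5,  0, 0, 1, 0,  0, 0, 0, 1]},
]


def resolve_instance(instance: dict) -> list:
    """Look up the referenced definition and place its geometry."""
    definition = definitions[instance["definitionId"]]
    return [apply(instance["transform"], p) for p in definition["points"]]


placed = {i["applicationId"]: resolve_instance(i) for i in instances}
```

Both chairs reuse the same two points; only the transforms differ, which is what makes repeated geometry cheap to store.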

Why a tree? This structure makes traversal predictable and efficient—you can walk from root to leaf without encountering cycles or complex reference resolution. The generally flat object graph (DataObjects typically don’t nest) keeps the model simple and prevents deep nesting issues. Proxies solve the “overlapping hierarchies” problem—a wall can belong to a level, a group, and use a material simultaneously without duplication. Note: Traversal code should check for elements properties on both Collections and DataObjects to handle connector-specific exceptions.
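
As a rough illustration of the traversal note above, the sketch below walks plain dictionaries and follows `elements` on every node, whether it is a Collection or a DataObject. The `type` key and the sample model are hypothetical stand-ins; real Speckle objects carry more fields and this is not SDK code.

```python
from typing import Iterator


def walk(node: dict, path: tuple = ()) -> Iterator[tuple]:
    """Yield (path, node) for every node, depth first, parents before children."""
    here = path + (node.get("name", "?"),)
    yield here, node
    # Follow `elements` on any node, not only on collections, so nested
    # DataObjects (e.g. curtain wall panels) are not missed.
    for child in node.get("elements") or []:
        yield from walk(child, here)


model = {
    "name": "Root", "type": "Collection",
    "elements": [
        {"name": "Level 1", "type": "Collection", "elements": [
            {"name": "Curtain Wall", "type": "DataObject", "elements": [
                {"name": "Panel", "type": "DataObject"},
            ]},
            {"name": "Wall", "type": "DataObject"},
        ]},
    ],
}

data_objects = ["/".join(p) for p, n in walk(model) if n["type"] == "DataObject"]
```

Because the structure is a tree, the walk needs no visited set: each node is reached through exactly one path.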

Conceptual Walkthrough

Consider a simple building model containing two walls on Level 1. Here’s how Speckle structures this conceptually:

Root Collection serves as the entry point, containing everything.
Collections organize the walls: a “Level 1” collection contains a “Walls” collection, which contains the two wall DataObjects.
DataObjects represent each wall: each has a name, properties (material, dimensions), and a displayValue array containing mesh geometry.
Proxies at the root level create additional relationships:
- A RenderMaterial proxy links both walls to a “Concrete” material resource
- A Level proxy links both walls to the “Level 1” level resource
Info might contain a reference point transform for coordinate system alignment.

The key insight: the walls exist once as DataObjects in the collections hierarchy, but participate in multiple organizational systems (spatial via collections, material via proxy) without duplication. This structure avoids cycles—you can traverse from root to leaf without encountering circular references—while preserving provenance through stable applicationId references.
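
The walkthrough can be written out as plain data. This is a simplified sketch: the flat `proxies` list, the `kind` and `objects` keys, and the `referencePointTransform` field are assumed names for illustration, and the meshes are placeholders.

```python
wall_a = {"applicationId": "wall-a", "name": "Wall A", "displayValue": ["<mesh>"]}
wall_b = {"applicationId": "wall-b", "name": "Wall B", "displayValue": ["<mesh>"]}

root = {
    "name": "Root",
    # Collections hierarchy: Level 1 > Walls > the two wall DataObjects.
    "elements": [
        {"name": "Level 1", "elements": [
            {"name": "Walls", "elements": [wall_a, wall_b]},
        ]},
    ],
    # Proxies live at the root and point at objects by applicationId.
    "proxies": [
        {"kind": "RenderMaterial", "name": "Concrete", "objects": ["wall-a", "wall-b"]},
        {"kind": "Level", "name": "Level 1", "objects": ["wall-a", "wall-b"]},
    ],
    "info": {"referencePointTransform": None},  # placeholder value
}


def count_data_objects(node: dict) -> int:
    """Count nodes that carry a displayValue, i.e. the DataObjects."""
    own = 1 if "displayValue" in node else 0
    return own + sum(count_data_objects(c) for c in node.get("elements", []))


def memberships(root: dict, application_id: str) -> list:
    """List the proxies that reference a given object."""
    return sorted(
        f'{p["kind"]}: {p["name"]}'
        for p in root["proxies"]
        if application_id in p["objects"]
    )


wall_count = count_data_objects(root)
wall_a_memberships = memberships(root, "wall-a")
```

Each wall appears once in the hierarchy, yet `memberships` reports both its material and its level, which is the no-duplication property the walkthrough describes.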

The Complete Structure

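Putting it all together, a complete data package consists of a Root Collection that holds the collections hierarchy (with its objects), the proxies, the info fields, and the closures table.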

Learning Path

This documentation is organized as a progressive learning experience:

Concepts - Core ideas: Collections, Objects, Proxies, Info
Object Schema - How objects are structured and what fields they contain
Geometry Schema - How geometry is stored and organized
Proxy Schema - How proxies encode relationships
Connector Index - Overview of connector-specific behaviors

Why is the structure the same across all connectors?

Speckle uses a unified Data Schema so that data from any source can be consumed by any receiver. This means you can write code that works with Speckle models without needing to know which application created them. The connector-specific differences are captured in the object properties and hierarchy structure, not in the fundamental Data Schema organization.

What's the difference between a Speckle hierarchy and the source application's hierarchy?

The source application’s hierarchy (e.g., Revit’s levels/categories, Rhino’s layers) is preserved in the Collections structure. However, Speckle also imposes its own organizational structure through Proxies, which can create additional relationships that don’t exist in the source application. This allows objects to be organized in multiple ways simultaneously without duplication.

What about orphaned objects?

Objects are scoped to projects: when an object is sent to a Speckle server, it is stored within a project’s scope. However, without being referenced by a version, the object is effectively unreachable through normal workflows. These “orphaned” objects exist in the database and can be retrieved if you have the object ID directly, but they cannot be browsed or discovered through the UI. To make objects discoverable, they must be referenced by a version.
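
One way to picture orphaned objects is as the part of a project's object store that no version can reach. The sketch below is a toy model with hypothetical ids and a made-up `children` field; it does not reflect how the server stores or queries objects.

```python
# Toy object store for one project, keyed by object ID.
objects = {
    "root-1": {"children": ["wall-1", "wall-2"]},
    "wall-1": {"children": []},
    "wall-2": {"children": []},
    "draft-9": {"children": []},  # sent to the server, but never versioned
}

# Each version points at one Root Collection by object ID.
versions = {"v1": "root-1"}


def discoverable(objects: dict, versions: dict) -> set:
    """Collect every object ID reachable from some version's root."""
    seen = set()
    stack = list(versions.values())
    while stack:
        object_id = stack.pop()
        if object_id in seen:
            continue
        seen.add(object_id)
        stack.extend(objects[object_id]["children"])
    return seen


orphaned = set(objects) - discoverable(objects, versions)
# An orphan is still retrievable if you already know its ID.
still_retrievable = "draft-9" in objects
```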

Conceptual Capability

After reading this overview, you understand how Speckle structures model data as a tree with a Root Collection containing Collections, Proxies, and Info. You recognize that Collections create hierarchical organization, Proxies encode cross-cutting relationships, and the tree structure prevents cycles while preserving provenance. You’re ready to explore the detailed concepts that make up each component.
