Extracting Room Data from Revit using SpecklePy

This workflow shows how to extract room data from a Revit model into a Pandas DataFrame using the SpecklePy SDK.

Prerequisites

Before you begin, ensure you have the following:

A Revit model with room data.
The model sent to Speckle using the Speckle Revit Connector. (Learn how here)
SpecklePy SDK installed in your Python environment.
Pandas library for data manipulation.

Step 1: Install SpecklePy SDK and Pandas

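You can install the required libraries using pip. The code in Step 2 loads credentials from a .env file, so python-dotenv is needed as well:

pip install specklepy pandas python-dotenv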

Step 2: Connect to Speckle

First, import the required libraries and authenticate with Speckle.

import os

from dotenv import load_dotenv
from specklepy.api.client import SpeckleClient
from specklepy.api import operations
from specklepy.objects import Base
from specklepy.transports.server import ServerTransport
import pandas as pd

load_dotenv()

SPECKLE_TOKEN = os.getenv("SPECKLE_TOKEN")  # Generate a new token from the Developer Settings
SPECKLE_SERVER_URL = os.getenv("SPECKLE_SERVER_URL", "https://app.speckle.systems")

if not SPECKLE_TOKEN:
    raise EnvironmentError("SPECKLE_TOKEN is not set. Add it to your .env file.")

client = SpeckleClient(host=SPECKLE_SERVER_URL)  # Set SPECKLE_SERVER_URL to use a custom server
client.authenticate_with_token(SPECKLE_TOKEN)

Step 3: Retrieve the full object graph

Next, we will retrieve the full object graph of the Revit model that contains the room data. To do this, we need the project ID and version ID of the model we sent to Speckle. You can find these IDs in the URL of your project and version in the Speckle web interface. Learn more about the core concepts here.

# Project ID: found in the project URL after "projects/". For example, in
# "https://app.speckle.systems/projects/e23d2d5041/models/ab02b98507" the project ID is "e23d2d5041".
project_id = "your_project_id"
# Version ID: found under Share > Copy ID in the web 3D viewer.
version_id = "your_version_id"

# Get version metadata to find the referenced object ID
version = client.version.get(version_id, project_id)
referenced_object_id = version.referenced_object

# Next, create a server transport: the vehicle through which you send and receive objects
transport = ServerTransport(stream_id=project_id, client=client)

# Download the full object graph. Note that this is an expensive operation, as it retrieves all objects in the model, so it may take some time for large models.
received_base = operations.receive(referenced_object_id, transport)
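Copying the project ID out of the URL by hand is easy to get wrong, so the lookup described above can be sketched as a small parser. This is a minimal sketch, not part of specklepy: `parse_speckle_url` and its regular expression are hypothetical, and they assume the URL shape shown in the example (`/projects/<project_id>/models/<model_id>`). The version ID is still copied from the viewer as described above.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper, not part of specklepy. Assumes URLs shaped like
# https://<server>/projects/<project_id>/models/<model_id>
_URL_PATTERN = re.compile(
    r"/projects/(?P<project>[^/?#]+)(?:/models/(?P<model>[^/?#@]+))?"
)


def parse_speckle_url(url: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (project_id, model_id); either is None if absent from the URL."""
    match = _URL_PATTERN.search(url)
    if match is None:
        return None, None
    return match.group("project"), match.group("model")


project_id, model_id = parse_speckle_url(
    "https://app.speckle.systems/projects/e23d2d5041/models/ab02b98507"
)
print(project_id, model_id)
```

Returning `None` instead of raising keeps the helper usable in a loop over many URLs; the caller decides how to handle a URL that does not match.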

Step 4: Flatten the object graph to a list of objects

from typing import Iterable  # needed for the return type annotation below

# Flatten the root object and collect all objects in a list
def flatten_base(base: Base) -> Iterable[Base]:
    """Flatten a base object into an iterable of bases.
    
    This function recursively traverses the `elements` or `@elements` attribute of the 
    base object, yielding each nested base object.

    Args:
        base (Base): The base object to flatten.

    Yields:
        Base: Each nested base object in the hierarchy.
    """
    # Attempt to get the elements attribute, fallback to @elements if necessary
    elements = getattr(base, "elements", getattr(base, "@elements", None))

    if elements is not None:
        for element in elements:
            yield from flatten_base(element)
    
    yield base

all_objects = list(flatten_base(received_base))
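To see the order in which the traversal yields objects, the same logic can be run on a small stand-in tree. This is only an illustrative sketch: `flatten` mirrors `flatten_base` but is duck-typed so it runs without specklepy, and the `SimpleNamespace` tree with its names is hypothetical, not real Speckle data.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def flatten(node: Any) -> Iterable[Any]:
    """Same traversal as flatten_base, duck-typed so it runs without specklepy."""
    elements = getattr(node, "elements", getattr(node, "@elements", None))
    if elements is not None:
        for element in elements:
            yield from flatten(element)
    # The node itself is yielded after all of its children
    yield node


# Hypothetical stand-in tree: a root holding one level that holds two rooms
room_a = SimpleNamespace(name="Room A")
room_b = SimpleNamespace(name="Room B")
level = SimpleNamespace(name="Level 1", elements=[room_a, room_b])
root = SimpleNamespace(name="Root", elements=[level])

flattened_names = [node.name for node in flatten(root)]
print(flattened_names)  # ['Room A', 'Room B', 'Level 1', 'Root']
```

Children come out before their parents, and the root comes last, so container objects end up in the flattened list alongside the rooms. That is why the next step filters by category.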

Step 5: Create helper functions to filter objects by category and extract nested values

In this step, we will define helper functions to filter objects by category and to extract nested values from the properties of the objects. These functions will be used in the next step to extract the desired data from the room objects.

# Helper function to filter objects by category
def filter_objects_by_category(all_objects, filter_categories):
    """
    Filters objects by category and returns the matching objects.

    Parameters:
    all_objects (list): List of objects to filter.
    filter_categories (list): List of categories to filter by.

    Returns:
    list: The objects whose "category" attribute is in filter_categories.
    """
    filtered_objects = []

    for obj in all_objects:
        # Objects without a "category" attribute are skipped
        if hasattr(obj, "category") and obj.category in filter_categories:
            filtered_objects.append(obj)  # Keep the whole object

    return filtered_objects

# Helper function to get nested value from a dict using a path of keys
def get_nested_value(data, path, extract_value=True):
    """
    Navigate a nested dict using a list of keys.
    If the final node is a Revit parameter dict (has a 'value' key), returns its value.

    Parameters:
    data (dict): The dict to traverse (typically obj.properties).
    path (list[str]): Ordered list of keys, e.g.
                      ['Parameters', 'Instance Parameters', 'Identity Data', 'Number']
    extract_value (bool): If True and the leaf node contains a 'value' key, return that.

    Returns:
    The resolved value, or None if any key in the path is missing.
    """
    current = data
    for key in path:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
        if current is None:
            return None  # A missing key yields None, which appears as a missing value in the DataFrame
    if extract_value and isinstance(current, dict) and 'value' in current:
        return current['value']
    return current

# Helper function to extract specified fields from a list of objects using flexible nested paths
def extract_object_data(objects, field_paths, include_defaults=True):
    """
    Extract data from a list of DataObjects using flexible nested field paths.

    Returns:
    list[dict]: One dict per object with the resolved values.
    """
    results = []
    for obj in objects:
        properties = getattr(obj, 'properties', {}) or {}
        row = {}

        if include_defaults:
            # Include the object ID by default; add other attributes such as 'type' here if needed
            row['id'] = getattr(obj, 'id', None)

        for col_name, path in field_paths.items():
            row[col_name] = get_nested_value(properties, path)

        results.append(row)
    return results
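The behaviour of the path lookup is easiest to check on a small dictionary. The sketch below repeats a condensed copy of `get_nested_value` so that it runs on its own; `sample_properties` is a hypothetical stand-in for `obj.properties`, and the `name` key and the `Type Parameters` path are assumptions made for illustration.

```python
def get_nested_value(data, path, extract_value=True):
    """Condensed copy of the helper above so this snippet runs on its own."""
    current = data
    for key in path:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
        if current is None:
            return None
    if extract_value and isinstance(current, dict) and "value" in current:
        return current["value"]
    return current


# Hypothetical properties dict; real room objects carry many more parameters
sample_properties = {
    "Parameters": {
        "Instance Parameters": {
            "Identity Data": {"Number": {"name": "Number", "value": "101"}}
        }
    }
}
number_path = ["Parameters", "Instance Parameters", "Identity Data", "Number"]

number = get_nested_value(sample_properties, number_path)
raw_node = get_nested_value(sample_properties, number_path, extract_value=False)
missing = get_nested_value(sample_properties, ["Parameters", "Type Parameters", "Cost"])
print(number, raw_node, missing)
```

With `extract_value=False` the whole parameter dict is returned instead of just its `value`, which is useful when other keys of the parameter are needed. A path that does not exist resolves to `None` rather than raising a `KeyError`.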

Step 6: Filter room objects and extract desired data into a DataFrame

In this step, we will define the field paths for the desired parameters and extract the data for the room objects into a Pandas DataFrame. You can use the Speckle 3D Viewer to inspect the structure of the properties and adjust the paths as needed.

# Define the field paths for the desired parameters.
# To find a path, select a room in the Speckle 3D Viewer and inspect its properties.
# Each path is a list of keys that navigates the nested dictionaries in obj.properties.
# For example, a parameter located at
# obj.properties['Parameters']['Instance Parameters']['Identity Data']['Number']
# has the path ['Parameters', 'Instance Parameters', 'Identity Data', 'Number'].
field_paths = {
    'Number':       ['Parameters', 'Instance Parameters', 'Identity Data', 'Number'],
    'Name':         ['Parameters', 'Instance Parameters', 'Identity Data', 'Name'],
    'Occupant':     ['Parameters', 'Instance Parameters', 'Identity Data', 'Occupant'],
    'Floor Finish': ['Parameters', 'Instance Parameters', 'Identity Data', 'Floor Finish'],
    'Level':        ['Parameters', 'Instance Parameters', 'Constraints', 'Level'],
    'Area':         ['Parameters', 'Instance Parameters', 'Dimensions', 'Area'],
    'Volume':       ['Parameters', 'Instance Parameters', 'Dimensions', 'Volume'],
    'Perimeter':    ['Parameters', 'Instance Parameters', 'Dimensions', 'Perimeter'],
    'Unbounded Height':    ['Parameters', 'Instance Parameters', 'Dimensions', 'Unbounded Height'],
}

room_data = filter_objects_by_category(all_objects, ['Rooms']) # Make sure to extract only rooms
data = extract_object_data(room_data, field_paths) # Extract the data using the defined field paths
df_rooms = pd.DataFrame(data)
df_rooms.head()

Results should look like this:

id	Number	Name	Occupant	Floor Finish	Level	Area	Volume	Perimeter	Unbounded Height
ab37495bf07f976aae3c55e3cc2617f9	S1	Stair	None	Concrete	L1 - Block 35	184.598958	15891.467781	60.833333	87.666667
00b6189ecb67a54f09e47cda717295a3	101	Café	Restaurant	Polished Concrete/LVT	L1 - Block 35	642.967568	13276.052784	196.655657	24.250000
42ba7c725016f8ef7458065b7824fbc2	102	Café Kitchen	Restaurant	Polished Concrete/LVT	L1 - Block 35	433.370768	4730.964220	137.312500	10.916667
77aacf2216a7c55cdc72ef5f118f3113	103	Outdoor Covered Dining	Restaurant	Polished Concrete/LVT	L1 - Block 35	674.576823	14316.140687	114.583333	22.750000
f71f1d3a62b5457de25061e025a8f2c9	104	Pocket Park	Outdoor Public	Varies see Plan	L1 - Block 35	1775.819324	39490.443882	225.443711	22.750000
c825211befa4530a133cf44fd21084cc	S3	Stair	None	Concrete	L2	180.062500	15500.451922	59.750000	87.708333
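Once the rows are extracted, a typical next step is a summary such as total room area per level. The sketch below does this in plain Python on three rows copied from the table above, so it runs without a Speckle connection; the variable names are illustrative. With the DataFrame from this step, the pandas equivalent would be `df_rooms.groupby("Level")["Area"].sum()`.

```python
from collections import defaultdict

# Three rows copied from the results table above (other columns omitted)
rows = [
    {"Number": "S1", "Level": "L1 - Block 35", "Area": 184.598958},
    {"Number": "101", "Level": "L1 - Block 35", "Area": 642.967568},
    {"Number": "S3", "Level": "L2", "Area": 180.062500},
]

# Sum the area of all rooms on each level, skipping rooms with no area value
area_by_level = defaultdict(float)
for row in rows:
    if row["Area"] is not None:
        area_by_level[row["Level"]] += row["Area"]

for level, total in area_by_level.items():
    print(f"{level}: {total:.2f}")
```

The `None` check matters because `get_nested_value` returns `None` for any room where the parameter path is missing.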
