Containers#

CaptureGraph provides specialized List and Dict containers that enable vectorized operations across your captured data.

The List Container#

The List class broadcasts attribute access to all elements:

import capturegraph.data as cg

target = cg.CaptureTarget("/path/to/target")
sessions = target.sessions.daily

# Vectorized attribute access
photos = sessions.photo      # List of all photos
dates = sessions.date        # List of all dates
latitudes = sessions.location.latitude  # List of latitudes (chained access)

Key Features#

Vectorized Access#

Access attributes across all elements:

# Instead of:
latitudes = [s.location.latitude for s in sessions]

# Just write:
latitudes = sessions.location.latitude
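To make the mechanism concrete, here is a minimal sketch of how attribute broadcasting can be built on a plain `list` subclass. `BroadcastList` is a hypothetical stand-in written for illustration; it is not CaptureGraph's `List` and may differ from it in details such as how missing attributes are handled.

```python
from types import SimpleNamespace


class BroadcastList(list):
    """Hypothetical broadcasting list; NOT CaptureGraph's actual List class."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, so built-in list
        # methods such as append() and sort() keep working unchanged.
        if name.startswith("__"):
            # Leave dunder lookups (copy, pickle, ...) to the default machinery.
            raise AttributeError(name)
        # Fetch the attribute from every element and wrap the results so
        # that the next attribute access broadcasts as well.
        return BroadcastList(getattr(item, name) for item in self)


sessions = BroadcastList([
    SimpleNamespace(location=SimpleNamespace(latitude=42.44)),
    SimpleNamespace(location=SimpleNamespace(latitude=40.71)),
])

latitudes = sessions.location.latitude
print(latitudes)  # [42.44, 40.71]
```

Because each access returns another broadcasting list, chains such as `.location.latitude` work without an explicit loop.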

Key Projection#

Index with several keys at once to keep only those keys in each element:

# Get only date and id for each session
subset = sessions["date", "id"]
# Returns: List[Dict] with only those keys
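Projection of this kind relies on the fact that Python passes `obj["a", "b"]` to `__getitem__` as the tuple `("a", "b")`. The sketch below shows the idea with a hypothetical `ProjectingList` over plain dicts; the real container's behavior for absent keys is not documented here, so this version simply raises `KeyError`.

```python
class ProjectingList(list):
    """Hypothetical list of dicts that supports multi-key projection."""

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # rows["date", "id"] arrives here as key == ("date", "id").
            return ProjectingList({k: row[k] for k in key} for row in self)
        # Integers and slices fall back to normal list indexing.
        return super().__getitem__(key)


rows = ProjectingList([
    {"id": 1, "date": "2024-01-01", "rating": 4},
    {"id": 2, "date": "2024-01-02", "rating": 5},
])

subset = rows["date", "id"]
print(subset)
```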

Safe Chaining#

Accessing a missing attribute returns a MissingType value instead of raising an exception:

import capturegraph.data as cg

ratings = sessions.rating  # Returns MissingType if no ratings

if cg.is_missing(ratings):
    print("No ratings found")
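One common way to get this behavior is a sentinel object that absorbs any further attribute access. The sketch below is an assumption about the general pattern, not CaptureGraph's implementation: `Missing`, `MISSING`, `safe_get`, and this `is_missing` are all hypothetical names defined here for illustration.

```python
from types import SimpleNamespace


class Missing:
    """Hypothetical sentinel: any attribute access returns the sentinel itself."""

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        return self  # Missing.anything.else is still Missing

    def __bool__(self):
        return False  # lets callers write `if not value:`

    def __repr__(self):
        return "MISSING"


MISSING = Missing()


def is_missing(value):
    return isinstance(value, Missing)


def safe_get(obj, name):
    # Return the sentinel instead of raising AttributeError.
    return getattr(obj, name, MISSING)


session = SimpleNamespace(date="2024-01-01")

rating = safe_get(session, "rating")             # attribute does not exist
deep = safe_get(session, "weather").temperature  # chain continues safely

print(rating, deep)  # MISSING MISSING
```

The trade-off of this design is that a typo in an attribute name also yields the sentinel rather than an error, so an explicit `is_missing` check is worth adding wherever the value is required.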

Transforms#

.map(fn) — Element-wise Transform#

Apply a function to each element:

# Convert dates to strings
date_strings = sessions.date.map(lambda d: d.strftime("%Y-%m-%d"))

# Scale ratings by 10
scores = sessions.rating.map(lambda r: r * 10)

.map_leaves(fn) — Transform Leaf Values#

Apply a function only to leaf values (anything that is not itself a container):

# Round all float values; `data` is any nested List or Dict container
rounded = data.map_leaves(lambda x: round(x, 2) if isinstance(x, float) else x)
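The difference from `.map` is that the function is pushed down through nested containers and applied only at the bottom. A minimal recursive sketch over plain `dict`, `list`, and `tuple` values follows; the free function `map_leaves` here is hypothetical and only approximates what the container method is described as doing.

```python
def map_leaves(value, fn):
    """Apply fn to every non-container value, preserving the nesting."""
    if isinstance(value, dict):
        return {k: map_leaves(v, fn) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type around the transformed children.
        return type(value)(map_leaves(v, fn) for v in value)
    return fn(value)  # leaf: not a dict, list, or tuple


data = [
    {"temp": 22.4567, "tags": ["a", "b"]},
    {"temp": 19.0012, "tags": []},
]

rounded = map_leaves(data, lambda x: round(x, 2) if isinstance(x, float) else x)
print(rounded)
```

Strings are treated as leaves here, which is usually what is wanted even though they are technically iterable.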

NumPy Integration#

Convert to NumPy arrays:

import numpy as np

# Direct conversion
temps = np.array(sessions.temperature)

# MissingType converts to np.nan
ratings = np.array(sessions.rating)  # Missing values become nan

# Statistical operations work naturally
avg_temp = np.nanmean(temps)
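The reason to prefer `np.nanmean` over `np.mean` is that a single NaN makes an ordinary mean NaN as well. The standard-library sketch below illustrates the same idea without NumPy; `MISSING`, `to_floats`, and `nan_mean` are hypothetical helpers for this example and say nothing about how CaptureGraph performs the conversion internally.

```python
import math

MISSING = object()  # hypothetical stand-in for a missing value


def to_floats(values):
    # Missing entries become NaN so the result is a uniform list of floats.
    return [math.nan if v is MISSING else float(v) for v in values]


def nan_mean(values):
    # Mean over the entries that are not NaN; NaN if nothing is left.
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present) if present else math.nan


temps = to_floats([20.0, MISSING, 24.0])

print(sum(temps) / len(temps))  # nan: the missing value poisons a plain mean
print(nan_mean(temps))          # 22.0
```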

Broadcasting Join#

Combine multiple lists with NumPy-style broadcasting:

import capturegraph.data as cg

combined = cg.zip(
    date=sessions.date,
    temp=sessions.temperature,
    rating=sessions.rating
)

# Returns: List[Dict] with keys: date, temp, rating
for item in combined:
    print(f"{item['date']}: {item['temp']}°, rated {item['rating']}")
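As a rough illustration of what a broadcasting join involves, the sketch below zips keyword arguments into a list of dicts and repeats any scalar for every row. `broadcast_zip` is a hypothetical function written for this example; it handles only the scalar case, and the exact broadcasting rules of `cg.zip` may be broader.

```python
def broadcast_zip(**columns):
    """Zip named columns into row dicts, repeating scalars for every row."""
    lengths = {len(v) for v in columns.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError(f"column lengths differ: {sorted(lengths)}")
    n = lengths.pop() if lengths else 1  # all scalars: produce one row

    return [
        {
            name: (col[i] if isinstance(col, list) else col)
            for name, col in columns.items()
        }
        for i in range(n)
    ]


combined = broadcast_zip(
    date=["2024-01-01", "2024-01-02"],
    temp=[21.5, 19.0],
    unit="C",  # scalar, broadcast to both rows
)

for item in combined:
    print(f"{item['date']}: {item['temp']}°{item['unit']}")
```

Each keyword name becomes a key in the resulting rows, which is why the names passed to the join must match the keys read back in the loop.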

The Dict Container#

The Dict class extends Python's dict with convenient attribute access:

import capturegraph.data as cg

data = cg.Dict({
    "name": "Session 1",
    "temperature": 22.5,
    "humidity": 45
})

# Attribute access
print(data.name)        # "Session 1"
print(data.temperature) # 22.5
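Attribute access on a dict is typically implemented by falling back to key lookup in `__getattr__`. The following sketch uses a hypothetical `AttrDict` class to show the pattern and one of its caveats; it is not the source of `cg.Dict`, which may, for example, return a missing-value sentinel where this version raises.

```python
class AttrDict(dict):
    """Hypothetical dict subclass where d.key falls back to d["key"]."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so real dict
        # methods (keys, items, get, ...) always take precedence.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


data = AttrDict({"name": "Session 1", "temperature": 22.5, "items": 3})

print(data.name)         # Session 1
print(data.temperature)  # 22.5

# Caveat: a key that collides with a dict method is only reachable by index.
print(callable(data.items))  # True (the dict.items method)
print(data["items"])         # 3
```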

Complete Example#

import capturegraph.data as cg
import numpy as np

# Load data
target = cg.CaptureTarget("/path/to/WeatherStudy")
sessions = target.sessions.daily

# Extract and process
dates = sessions.date
temps = sessions.weather.temperature
humidity = sessions.weather.humidity

# Keep only sessions where both readings are present
# (this is Python's built-in zip, not cg.zip)
valid = [
    (d, t, h)
    for d, t, h in zip(dates, temps, humidity)
    if not cg.is_missing(t) and not cg.is_missing(h)
]

# Convert to arrays
temp_array = np.array([t for _, t, _ in valid])
humidity_array = np.array([h for _, _, h in valid])

# Analyze
print(f"Sessions: {len(valid)}")
print(f"Avg Temp: {np.mean(temp_array):.1f}°")
print(f"Avg Humidity: {np.mean(humidity_array):.1f}%")

# Correlation
correlation = np.corrcoef(temp_array, humidity_array)[0, 1]
print(f"Temp-Humidity Correlation: {correlation:.2f}")

See Also#