Containers#
CaptureGraph provides specialized List and Dict containers that enable vectorized operations across your captured data.
The List Container#
The List class broadcasts attribute access to all elements:
import capturegraph.data as cg
target = cg.CaptureTarget("/path/to/target")
sessions = target.sessions.daily
# Vectorized attribute access
photos = sessions.photo # List of all photos
dates = sessions.date # List of all dates
locations = sessions.location.latitude # List of latitudes
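To make the broadcasting idea concrete, here is a minimal, self-contained sketch of how a list can forward attribute access to its elements. `BroadcastList` is a hypothetical stand-in written for illustration; it is not CaptureGraph's `List` class, whose implementation may differ.

```python
from types import SimpleNamespace


class BroadcastList(list):
    """Hypothetical stand-in for a broadcasting list (not CaptureGraph's List)."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so built-in
        # list methods such as append() and sort() keep working.
        if name.startswith("__"):
            raise AttributeError(name)
        # Forward the attribute to every element and wrap the results,
        # so that chained access (a.b.c) keeps broadcasting.
        return BroadcastList(getattr(item, name) for item in self)


# Made-up records standing in for captured sessions.
sessions = BroadcastList([
    SimpleNamespace(date="2024-01-01", location=SimpleNamespace(latitude=42.44)),
    SimpleNamespace(date="2024-01-02", location=SimpleNamespace(latitude=42.45)),
])

print(sessions.date)               # ['2024-01-01', '2024-01-02']
print(sessions.location.latitude)  # [42.44, 42.45]
```

Wrapping each result in the same class is what lets `sessions.location.latitude` work in a single expression.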
Key Features#
Vectorized Access#
Access attributes across all elements:
# Instead of:
latitudes = [s.location.latitude for s in sessions]
# Just write:
latitudes = sessions.location.latitude
Key Projection#
Index with several keys to keep only those keys in each element:
# Get only date and id for each session
subset = sessions["date", "id"]
# Returns: List[Dict] with only those keys
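For comparison, the same projection can be written over plain Python dictionaries. The `project` helper below is hypothetical and only illustrates the shape of the result; how CaptureGraph treats a key that is absent from an element is not covered here (this sketch would raise `KeyError`).

```python
# Made-up records standing in for captured sessions.
sessions = [
    {"id": 1, "date": "2024-01-01", "rating": 4},
    {"id": 2, "date": "2024-01-02", "rating": 5},
]


def project(records, *keys):
    """Hypothetical helper: keep only `keys` in each record."""
    return [{key: record[key] for key in keys} for record in records]


subset = project(sessions, "date", "id")
print(subset)
# [{'date': '2024-01-01', 'id': 1}, {'date': '2024-01-02', 'id': 2}]
```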
Safe Chaining#
Accessing a missing attribute returns a MissingType value instead of raising an exception:
import capturegraph.data as cg
ratings = sessions.rating # Returns MissingType if no ratings
if cg.is_missing(ratings):
    print("No ratings found")
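The safe-chaining pattern can be sketched with a sentinel object that returns itself on any further attribute access. Everything below (`_Missing`, `MISSING`, `is_missing`, `Record`) is a hypothetical illustration of the idea, not CaptureGraph's `MissingType` or `cg.is_missing`.

```python
class _Missing:
    """Hypothetical sentinel; CaptureGraph's MissingType may differ."""

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        return self  # chaining on a missing value stays missing

    def __bool__(self):
        return False

    def __repr__(self):
        return "Missing"


MISSING = _Missing()


def is_missing(value):
    """True only for the sentinel itself."""
    return value is MISSING


class Record:
    """Returns MISSING for unknown attributes instead of raising."""

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        return MISSING


session = Record(date="2024-01-01")
print(is_missing(session.rating))        # True
print(is_missing(session.rating.stars))  # True
print(is_missing(session.date))          # False
```

Because the sentinel absorbs further attribute access, a long chain needs only one check at the end rather than one per step.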
Transforms#
.map(fn) — Element-wise Transform#
Apply a function to each element:
# Convert dates to strings
date_strings = sessions.date.map(lambda d: d.strftime("%Y-%m-%d"))
# Scale each rating by 10
scores = sessions.rating.map(lambda r: r * 10)
.map_leaves(fn) — Transform Leaf Values#
Apply a function only to leaf values (anything that is not a container):
# Round all numeric values
rounded = data.map_leaves(lambda x: round(x, 2) if isinstance(x, float) else x)
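A leaf-only transform amounts to a recursive walk: recurse into containers, apply the function everywhere else. The free function below is a hypothetical sketch over plain lists and dicts, with made-up data; it is not the library's `map_leaves` method.

```python
def map_leaves(value, fn):
    """Hypothetical sketch: recurse into lists and dicts, apply fn to the rest."""
    if isinstance(value, dict):
        return {key: map_leaves(item, fn) for key, item in value.items()}
    if isinstance(value, list):
        return [map_leaves(item, fn) for item in value]
    return fn(value)  # a leaf: anything that is not a container


# Made-up nested data.
data = [
    {"name": "Session 1", "weather": {"temperature": 22.456, "humidity": 45}},
    {"name": "Session 2", "weather": {"temperature": 19.004, "humidity": 51}},
]

rounded = map_leaves(data, lambda x: round(x, 2) if isinstance(x, float) else x)
print(rounded[0]["weather"])  # {'temperature': 22.46, 'humidity': 45}
```

The structure of the input is preserved; only the leaf values change, and non-float leaves pass through untouched because of the `isinstance` guard.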
NumPy Integration#
Convert to NumPy arrays:
import numpy as np
# Direct conversion
temps = np.array(sessions.temperature)
# MissingType converts to np.nan
ratings = np.array(sessions.rating) # Missing values become nan
# Statistical operations work naturally
avg_temp = np.nanmean(temps)
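The reason for `np.nanmean` rather than `np.mean` is that NaN propagates through arithmetic, so a single missing value turns an ordinary mean into NaN. The plain-Python sketch below, using made-up ratings, shows the difference a NaN-aware mean makes.

```python
import math

ratings = [4.0, math.nan, 5.0]  # the middle rating is missing

# An ordinary mean is poisoned by the NaN.
plain_mean = sum(ratings) / len(ratings)

# A NaN-aware mean drops NaN entries before averaging.
valid = [r for r in ratings if not math.isnan(r)]
nan_aware_mean = sum(valid) / len(valid)

print(math.isnan(plain_mean))  # True
print(nan_aware_mean)          # 4.5
```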
Broadcasting Join#
Combine multiple lists with NumPy-style broadcasting:
import capturegraph.data as cg
combined = cg.zip(
    date=sessions.date,
    temp=sessions.temperature,
    rating=sessions.rating,
)
# Returns: List[Dict] with keys: date, temp, rating
for item in combined:
    print(f"{item['date']}: {item['temp']}°, rated {item['rating']}")
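One way to picture a broadcasting join is sketched below: list arguments are aligned by position, and scalar arguments are repeated for every row. `broadcast_zip` is a hypothetical function written for this illustration; the exact broadcasting rules of `cg.zip` may differ.

```python
def broadcast_zip(**columns):
    """Hypothetical sketch of a broadcasting join; not CaptureGraph's cg.zip."""
    lengths = {len(v) for v in columns.values() if isinstance(v, list)}
    if len(lengths) > 1:
        raise ValueError(f"cannot broadcast lists of lengths {sorted(lengths)}")
    n = lengths.pop() if lengths else 1
    return [
        # Lists contribute their i-th element; scalars are repeated as-is.
        {key: (v[i] if isinstance(v, list) else v) for key, v in columns.items()}
        for i in range(n)
    ]


combined = broadcast_zip(
    date=["2024-01-01", "2024-01-02"],
    temp=[22.5, 19.0],
    unit="C",  # scalar, repeated for every row
)
for row in combined:
    print(row)
# {'date': '2024-01-01', 'temp': 22.5, 'unit': 'C'}
# {'date': '2024-01-02', 'temp': 19.0, 'unit': 'C'}
```

In this sketch, lists of different lengths are rejected with a `ValueError` rather than silently truncated, which is the behavior that distinguishes it from the built-in `zip`.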
The Dict Container#
The Dict class extends Python's dict with convenient attribute access:
import capturegraph.data as cg
data = cg.Dict({
    "name": "Session 1",
    "temperature": 22.5,
    "humidity": 45,
})
# Attribute access
print(data.name) # "Session 1"
print(data.temperature) # 22.5
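Attribute access on a dictionary is commonly built by falling back to key lookup in `__getattr__`. The `AttrDict` class below is a hypothetical sketch of that pattern; CaptureGraph's `Dict` may behave differently, for example in what it returns for an unknown key.

```python
class AttrDict(dict):
    """Hypothetical sketch; CaptureGraph's Dict may behave differently."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so dict methods such as keys() and items() are unaffected.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


data = AttrDict({"name": "Session 1", "temperature": 22.5, "humidity": 45})
print(data.name)         # Session 1
print(data.temperature)  # 22.5
print(data["humidity"])  # 45
```

A consequence of this design is that a key sharing its name with a dict method (such as `items`) is reachable only through `data["items"]`, because the method wins normal attribute lookup.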
Complete Example#
import capturegraph.data as cg
import numpy as np
# Load data
target = cg.CaptureTarget("/path/to/WeatherStudy")
sessions = target.sessions.daily
# Extract and process
dates = sessions.date
temps = sessions.weather.temperature
humidity = sessions.weather.humidity
# Filter to valid data
valid = [
    (d, t, h)
    for d, t, h in zip(dates, temps, humidity)
    if not cg.is_missing(t) and not cg.is_missing(h)
]
# Convert to arrays
temp_array = np.array([t for _, t, _ in valid])
humidity_array = np.array([h for _, _, h in valid])
# Analyze
print(f"Sessions: {len(valid)}")
print(f"Avg Temp: {np.mean(temp_array):.1f}°")
print(f"Avg Humidity: {np.mean(humidity_array):.1f}%")
# Correlation
correlation = np.corrcoef(temp_array, humidity_array)[0, 1]
print(f"Temp-Humidity Correlation: {correlation:.2f}")
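The `[0, 1]` entry of `np.corrcoef` is the Pearson correlation coefficient between the two arrays. The plain-Python sketch below spells out that calculation on made-up values, to show what the number in the example measures; `pearson` is a helper written for this illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up readings: humidity falls steadily as temperature rises.
temps = [18.0, 20.0, 22.0, 24.0]
humidity = [60.0, 55.0, 50.0, 45.0]

print(round(pearson(temps, humidity), 2))  # -1.0
```

A value near -1 or 1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.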
See Also#
- Loading Captures: Using CaptureTarget to load data
- Data Framework Overview: Introduction to the Data Framework
- Containers API: Full API reference