Advanced Operations#

Advanced utilities for working with CaptureGraph's vectorized containers.

Vectorized Functions#

Transform scalar functions into vectorized ones using the @cg.vectorize decorator:

import capturegraph.data as cg

@cg.vectorize
def calculate_score(session):
    """Scalar function that works on one session."""
    return session.rating * session.difficulty

# Now works on entire lists automatically
sessions = target.sessions
all_scores = calculate_score(sessions)
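
To make the idea concrete, here is a minimal pure-Python sketch of what a vectorizing decorator can look like. It is not CaptureGraph's implementation: the `vectorize` helper and the `double` function are hypothetical names, and the sketch assumes that "vectorized" means applying the scalar function to every element of a (possibly nested) list.

```python
from functools import wraps


def vectorize(func):
    """Hypothetical stand-in for a vectorizing decorator (not CaptureGraph's code)."""

    @wraps(func)
    def wrapper(value):
        # Lists and tuples are mapped element by element, recursively;
        # any other value is treated as a scalar and passed straight through.
        if isinstance(value, (list, tuple)):
            return [wrapper(item) for item in value]
        return func(value)

    return wrapper


@vectorize
def double(x):
    """Scalar function that works on one number."""
    return x * 2


scalar_result = double(3)
list_result = double([1, 2, 3])
nested_result = double([[1, 2], [3]])
```

The decorated function still accepts a single value, so existing scalar call sites keep working.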

Parallel Processing#

For I/O-bound operations, use parallel mapping:

# Sequential processing
processed = sessions.map(expensive_transform)

# Parallel processing (4 worker threads)
processed = sessions.pmap(expensive_transform, workers=4)

# Parallel processing of nested structures
processed = sessions.pmap_leaves(expensive_transform, workers=4)
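
As a rough illustration of thread-based parallel mapping, the sketch below uses the standard library's `ThreadPoolExecutor`. The free functions `pmap` and `pmap_leaves` and the `slow_square` workload are hypothetical stand-ins, not CaptureGraph's methods, and the sketch assumes "nested structures" means lists of lists whose non-list leaves are transformed.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def pmap(func, items, workers=4):
    """Apply func to each item on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


def pmap_leaves(func, nested, workers=4):
    """Apply func to every non-list leaf of a nested list, keeping its shape."""
    leaves = []

    def collect(node):
        # Depth-first, left-to-right walk that gathers the leaves.
        if isinstance(node, list):
            for child in node:
                collect(child)
        else:
            leaves.append(node)

    collect(nested)
    results = iter(pmap(func, leaves, workers))

    def rebuild(node):
        # Same traversal order as collect, so results line up with leaves.
        if isinstance(node, list):
            return [rebuild(child) for child in node]
        return next(results)

    return rebuild(nested)


def slow_square(x):
    time.sleep(0.01)  # simulate an I/O wait
    return x * x


flat_result = pmap(slow_square, [1, 2, 3, 4])
nested_result = pmap_leaves(slow_square, [[1, 2], [3, [4]]])
```

Threads help here because the workers spend most of their time waiting; for CPU-bound transforms in CPython, a thread pool generally gives little speedup.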

Data Reshaping#

Columnize#

Transpose rows to columns for analysis:

import capturegraph.data as cg

# List of records → Dict of columns
columns = cg.columnize(sessions)

# Access as columns
temps = columns["weather.temperature"]
dates = columns["date"]
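
The transposition can be pictured with plain dictionaries. This sketch is an assumption about the behaviour, not CaptureGraph's code: `flatten_record` and this `columnize` are hypothetical helpers, records are modelled as nested dicts, and every record is assumed to have the same fields (otherwise the columns would fall out of alignment).

```python
def flatten_record(record, prefix=""):
    """Turn a nested dict into a flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_record(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def columnize(records):
    """List of records -> dict mapping dotted field names to value lists."""
    columns = {}
    for record in records:
        for name, value in flatten_record(record).items():
            columns.setdefault(name, []).append(value)
    return columns


# Made-up sample records for illustration only.
records = [
    {"date": "2024-01-01", "weather": {"temperature": 3.5}},
    {"date": "2024-01-02", "weather": {"temperature": 5.0}},
]
columns = columnize(records)
temps = columns["weather.temperature"]
dates = columns["date"]
```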

Flatten#

Collapse nested lists:

# List[List[T]] → List[T]
all_photos = cg.flatten(sessions.photos)
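
For plain Python lists, the same one-level collapse can be written with `itertools.chain`. The `flatten` function and the file names below are illustrative only; whether `cg.flatten` collapses exactly one level is an assumption based on the `List[List[T]] → List[T]` comment.

```python
from itertools import chain


def flatten(nested):
    """Collapse exactly one level of nesting: List[List[T]] -> List[T]."""
    return list(chain.from_iterable(nested))


# Made-up sample data: photos grouped per session.
photos_per_session = [["a.jpg", "b.jpg"], [], ["c.jpg"]]
all_photos = flatten(photos_per_session)
```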

Vectorized Assignment#

Batch update attributes across all elements:

import capturegraph.data as cg

sessions = target.sessions.daily

# Broadcast a scalar to all elements
sessions.processed = True

# Element-wise assignment (same length)
sessions.score = [1.0, 2.0, 3.0, ...]
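
One way such assignment can work is by overriding `__setattr__` on a list subclass. The `AttrList` class below is a hypothetical sketch, not CaptureGraph's container; it assumes that a list or tuple on the right-hand side means element-wise assignment and that anything else is broadcast.

```python
from types import SimpleNamespace


class AttrList(list):
    """Hypothetical list that forwards attribute access to its elements."""

    def __setattr__(self, name, value):
        if isinstance(value, (list, tuple)):
            # Element-wise assignment requires matching lengths.
            if len(value) != len(self):
                raise ValueError(
                    f"expected {len(self)} values for {name!r}, got {len(value)}"
                )
            for item, item_value in zip(self, value):
                setattr(item, name, item_value)
        else:
            # Broadcast a scalar to every element.
            for item in self:
                setattr(item, name, value)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so list methods still work.
        return [getattr(item, name) for item in self]


sessions = AttrList(SimpleNamespace(id=i) for i in range(3))
sessions.processed = True
sessions.score = [1.0, 2.0, 3.0]
```

Raising on a length mismatch, rather than truncating silently, keeps a misaligned batch update from going unnoticed.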

Tree Visualization#

Inspect the structure of your data:

# Print the schema and structure
cg.tree(target)

This prints a tree like the following:

├── _path: PosixPath
├── _manifest: (ProcedureManifest)
├── instructions
│   ├── path: PosixPath
│   └── exif
│       └── Image
│           ├── XResolution: int
│           ├── YResolution: int
│           └── ResolutionUnit: str
├── _acknowledgements
│   ├── _path: PosixPath
│   ├── _manifest: (ProcedureManifest)
│   ├── D2D5D4F9-64AC-4F07-A174-48B0F1D4163D
│   │   ├── _path: PosixPath
│   │   ├── _manifest: (ProcedureManifest)
│   │   └── acknowledged: bool
│   ├── 49F960D0-FC56-4A90-A2CA-F671916715C4
│   │   ├── _path: PosixPath
│   │   ├── _manifest: (ProcedureManifest)
│   │   └── acknowledged: bool
│   ├── B2EE66E5-0E3C-46AE-8F0C-11B811EDCC41
│   │   ├── _path: PosixPath
│   │   ├── _manifest: (ProcedureManifest)
│   │   └── acknowledged: bool
│   └── 23B13786-07A9-4830-8DF5-739D61E37425
│       ├── _path: PosixPath
│       ├── _manifest: (ProcedureManifest)
│       └── acknowledged: bool
├── _metadata
│   ├── _path: PosixPath
│   ├── _manifest: (ProcedureManifest)
│   ├── _thumbnail
│   │   ├── path: PosixPath
│   │   └── exif
│   │       ├── Image
│   │       │   ├── Orientation: str
│   │       │   ├── XResolution: int
│   │       │   ├── YResolution: int
│   │       │   ├── ResolutionUnit: str
│   │       │   └── ExifOffset: int
│   │       ├── ColorSpace: str
│   │       ├── ExifImageWidth: int
│   │       └── ExifImageLength: int
│   ├── _capture_location
│   │   ├── longitude: float
│   │   ├── latitude: float
│   │   ├── altitude: int
│   │   └── heading: float
│   └── _notification: datetime
└── G3C
    └── [0...58]
        ├── _path: PosixPath
        ├── _manifest: (ProcedureManifest)
        ├── user_id
        │   ├── name: str
        │   └── identifier: str
        ├── left
        │   ├── path: PosixPath
        │   └── exif: (same as right_surrounding.sequence.exif)
        ├── right
        │   ├── path: PosixPath
        │   └── exif: (same as right_surrounding.sequence.exif)
        ├── weather
        │   ├── temperature_celsius: float
        │   ├── apparent_temperature_celsius: float
        │   ├── dew_point_celsius: float
        │   ├── humidity_ratio: float
        │   ├── pressure_hpa: float
        │   ├── pressure_trend: str
        │   ├── wind_speed_mps: float
        │   ├── wind_gust_mps: float
        │   ├── wind_direction_degrees: int
        │   ├── condition: str
        │   ├── symbol_name: str
        │   ├── cloud_cover_ratio: int
        │   ├── precipitation_intensity_mmph: int
        │   ├── visibility_meters: float
        │   ├── uv_index: int
        │   ├── is_daylight: bool
        │   └── date: datetime
        ├── _metadata
        │   ├── _path: PosixPath
        │   ├── _manifest: (ProcedureManifest)
        │   └── _thumbnail
        │       ├── path: PosixPath
        │       └── exif
        │           ├── Image
        │           │   ├── Orientation: str
        │           │   ├── XResolution: int
        │           │   ├── YResolution: int
        │           │   ├── ResolutionUnit: str
        │           │   └── ExifOffset: int
        │           ├── ExifImageWidth: int
        │           └── ExifImageLength: int
        ├── date: datetime
        ├── left_surrounding
        │   ├── _path: PosixPath
        │   ├── _manifest: (ProcedureManifest)
        │   └── sequence
        │       └── [0...3]
        │           ├── path: PosixPath
        │           └── exif: (same as right_surrounding.sequence.exif)
        └── right_surrounding
            ├── _path: PosixPath
            ├── _manifest: (ProcedureManifest)
            └── sequence
                └── [0...3]
                    ├── path: PosixPath
                    └── exif
                        ├── Image
                        │   ├── Make: str
                        │   ├── Model: str
                        │   ├── Orientation: str
                        │   ├── XResolution: int
                        │   ├── YResolution: int
                        │   ├── ResolutionUnit: str
                        │   ├── Software: float
                        │   ├── DateTime: datetime
                        │   ├── HostComputer: str
                        │   ├── ExifOffset: int
                        │   └── GPSInfo: int
                        ├── GPS
                        │   ├── GPSLatitudeRef: str
                        │   ├── GPSLatitude: str
                        │   ├── GPSLongitudeRef: str
                        │   ├── GPSLongitude: str
                        │   ├── GPSAltitudeRef: int
                        │   ├── GPSAltitude: float
                        │   ├── GPSTimeStamp: str
                        │   ├── GPSDOP: float
                        │   ├── GPSImgDirectionRef: str
                        │   ├── GPSImgDirection: float
                        │   └── GPSDate: str
                        ├── ExposureTime: float
                        ├── FNumber: float
                        ├── ExposureProgram: str
                        ├── ISOSpeedRatings: int
                        ├── ExifVersion: int
                        ├── DateTimeOriginal: datetime
                        ├── DateTimeDigitized: datetime
                        ├── OffsetTime: str
                        ├── OffsetTimeOriginal: str
                        ├── OffsetTimeDigitized: str
                        ├── ShutterSpeedValue: float
                        ├── ApertureValue: float
                        ├── BrightnessValue: float
                        ├── ExposureBiasValue: int
                        ├── MeteringMode: str
                        ├── Flash: str
                        ├── FocalLength: float
                        ├── SubjectArea: str
                        ├── SubSecTimeOriginal: int
                        ├── SubSecTimeDigitized: int
                        ├── ColorSpace: str
                        ├── ExifImageWidth: int
                        ├── ExifImageLength: int
                        ├── FocalPlaneXResolution: float
                        ├── FocalPlaneYResolution: float
                        ├── FocalPlaneResolutionUnit: int
                        ├── SensingMethod: str
                        ├── SceneType: str
                        ├── ExposureMode: str
                        ├── WhiteBalance: str
                        ├── DigitalZoomRatio: float
                        ├── FocalLengthIn35mmFilm: int
                        ├── LensSpecification: str
                        ├── LensMake: str
                        ├── LensModel: str
                        └── Tag
                            └── 0xA460: int
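
The branch-drawing convention in this output (`├──` for intermediate children, `└──` for the last child, leaves shown as `name: type`) can be reproduced for plain nested dicts with a short recursive function. This is only a sketch of the layout, not how `cg.tree` is implemented; `tree_lines` and the sample data are hypothetical.

```python
def tree_lines(node, prefix=""):
    """Render a nested dict as tree lines, showing each leaf's type name."""
    lines = []
    items = list(node.items())
    for index, (key, value) in enumerate(items):
        last = index == len(items) - 1
        branch = "└── " if last else "├── "
        if isinstance(value, dict):
            lines.append(f"{prefix}{branch}{key}")
            # Children of the last entry are indented with spaces;
            # otherwise a vertical bar continues the parent's branch.
            extension = "    " if last else "│   "
            lines.extend(tree_lines(value, prefix + extension))
        else:
            lines.append(f"{prefix}{branch}{key}: {type(value).__name__}")
    return lines


# Made-up sample record for illustration only.
sample = {
    "date": "2024-01-01",
    "weather": {"temperature_celsius": 3.5, "is_daylight": True},
}
lines = tree_lines(sample)
print("\n".join(lines))
```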

See Also#