capturegraph.data

CaptureGraph Data Module

Load and traverse CaptureGraph capture target data with vectorized operations.

Quick Start
import capturegraph.data as cg
from pathlib import Path

# Load a capture target
target = cg.CaptureTarget(Path("./MyCapture"))
sessions = target.surveys

# Vectorized attribute access
ratings = sessions.tastiness_rating  # → List of all ratings

# Transform with map
names = sessions.date.map(lambda d: f"{d:%Y%m%d}.heic")

# Build paths with zip
dst = cg.zip(
    dir=Path("./output"),
    name=names
).map(lambda r: r.dir / r.name)

# Copy files
cg.copy(src=sessions.photo, dst=dst)
Key Concepts
  • List: A list that broadcasts attribute access to all elements. When you access .foo on a List, it returns a new List containing item.foo for each item.
  • Dict: A dict that allows attribute-style access (d.foo instead of d["foo"]).
  • zip: Combines multiple Lists with NumPy-style broadcasting. Use keyword arguments to name each column.
  • MissingType: A null object for safe chaining. Accessing a missing attribute returns a MissingType instead of raising AttributeError, so chains like session.optional_field.nested work without try/except.
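
To make the broadcasting and null-object ideas above concrete, here is a minimal standalone sketch. It does not use or reproduce capturegraph's implementation: `SketchList` and `SketchMissing` are hypothetical stand-ins written only for illustration, and the real `List` and `MissingType` may differ in detail.

```python
from types import SimpleNamespace


class SketchMissing:
    """Hypothetical null object: attribute access keeps returning itself."""

    def __init__(self, reason=""):
        self.reason = reason

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so `reason` still works.
        if name.startswith("__"):
            raise AttributeError(name)
        return self

    def __bool__(self):
        # Falsy, so callers can filter missing values with a plain `if`.
        return False

    def __repr__(self):
        return f"SketchMissing({self.reason!r})"


class SketchList(list):
    """Hypothetical list that broadcasts attribute access to its items."""

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        # Items lacking the attribute yield a null object instead of raising.
        return SketchList(
            getattr(item, name, SketchMissing(f"no attribute {name!r}"))
            for item in self
        )

    def map(self, fn):
        # Apply fn to every item and keep the vectorized container type.
        return SketchList(fn(item) for item in self)


sessions = SketchList([
    SimpleNamespace(rating=4, notes=SimpleNamespace(text="crisp")),
    SimpleNamespace(rating=5),  # no `notes` attribute on this item
])

ratings = sessions.rating      # one value per session
texts = sessions.notes.text    # chaining survives the missing `notes`
doubled = ratings.map(lambda r: r * 2)
```

One limitation of this approach is that item attributes sharing a name with a built-in list method (for example `count` or `index`) resolve to the list method rather than being broadcast.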
Module Organization
  • Core types (capturegraph.data.containers):
    • List[T]: List with vectorized attribute access
    • Dict[V]: Dict with attribute access (string keys only)
    • zip(**kwargs): Broadcasting zip for combining Lists
    • MissingType: Null object for safe chaining (with reason)
  • Load submodule (capturegraph.data.load):
    • CaptureTarget: Entry point for loading a capture target directory
    • load_directory: Recursively load a directory from a manifest
  • Save submodule (capturegraph.data.save):
    • copy(src=..., dst=...): Copy files
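
The keyword-argument zip with broadcasting listed above can be pictured with the following sketch. It is an assumption-laden illustration, not the library's code: `sketch_zip` and `SketchDict` are hypothetical names, and the broadcasting rule used here (scalars and length-1 lists repeat to match the longest column) is a one-dimensional reading of NumPy's rules that the real `zip` may or may not follow exactly.

```python
from pathlib import Path


class SketchDict(dict):
    """Hypothetical dict with attribute-style access (string keys only)."""

    def __getattr__(self, name):
        # Keys that clash with dict methods (e.g. "items") are not reachable
        # this way and still need d["items"].
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def sketch_zip(**columns):
    """Combine named columns row by row, repeating scalars and length-1 lists."""
    lengths = {
        len(value)
        for value in columns.values()
        if isinstance(value, list) and len(value) != 1
    }
    if len(lengths) > 1:
        raise ValueError(f"cannot broadcast column lengths {sorted(lengths)}")
    n = lengths.pop() if lengths else 1

    def at(value, i):
        # Lists are indexed (length-1 lists repeat); anything else is a scalar.
        if isinstance(value, list):
            return value[0] if len(value) == 1 else value[i]
        return value

    return [
        SketchDict({key: at(value, i) for key, value in columns.items()})
        for i in range(n)
    ]


# The scalar `dir` is repeated for every entry in `name`.
rows = sketch_zip(dir=Path("./output"), name=["a.heic", "b.heic"])
paths = [row.dir / row.name for row in rows]
```

Naming each column with a keyword is what lets the later lambda read `r.dir / r.name` instead of unpacking positional tuples.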

Modules:

Name        Description
containers  Vector Module - Vectorized Collection Types
load        Load submodule - Data loading and traversal.
save        Save submodule - Export and persist data
typed_json  PType JSON Serialization