Data

Annotation

Data is not useful just because it has been collected. Different systems need different kinds of annotation: some labels are historical, some need to generalize, and some stay dynamic because the target and goal change. The useful layer is the one calibrated to the system that reads it.

How I think about annotation

Start from rough labels, then make the labels useful for the system that will read them.

1256 tweet annotations
  • Keep the source, timestamp, privacy, confidence, and lineage visible.
  • Annotate for the goal: search, timeline, memory, dashboard, review, or matching.
  • General tags help across data types; goal-specific labels make a system useful.
  • Tweets were the gap: they had timestamps and excerpts, but no semantic annotation layer.
170 maps · 1256 tweets · 147 charts · 1256 tweet annotations

Palimpsest gap recorded: tweet data needs calibrated annotations per target system, not one static label set for every use case.
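
One way to picture "calibrated annotations per target system" is a record that keeps the shared provenance fields visible and stores goal-specific labels next to the general tags. This is only a minimal sketch under stated assumptions: the class name `Annotation`, its field names, the `labels_for` helper, and the example values are all hypothetical and are not taken from the actual Palimpsest data model. The goal names come from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Goals named in the list above; treating them as a closed set is an assumption.
GOALS = {"search", "timeline", "memory", "dashboard", "review", "matching"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record; field names are illustrative only."""

    source: str                      # where the item came from, e.g. "tweet"
    timestamp: datetime              # when the item was created
    privacy: str                     # e.g. "public" or "private"
    confidence: float                # 0.0 to 1.0, how much to trust the labels
    lineage: tuple[str, ...]         # steps that produced this annotation
    general_tags: frozenset[str] = frozenset()                  # reusable across data types
    goal_labels: dict[str, tuple[str, ...]] = field(default_factory=dict)  # per target system

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        unknown = set(self.goal_labels) - GOALS
        if unknown:
            raise ValueError(f"unknown goals: {sorted(unknown)}")

    def labels_for(self, goal: str) -> list[str]:
        """Return the general tags plus the labels calibrated for one goal."""
        if goal not in GOALS:
            raise ValueError(f"unknown goal: {goal}")
        return sorted(self.general_tags | set(self.goal_labels.get(goal, ())))


# Example values are invented for illustration.
example = Annotation(
    source="tweet",
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
    privacy="public",
    confidence=0.8,
    lineage=("rough-label", "manual-review"),
    general_tags=frozenset({"maps"}),
    goal_labels={
        "search": ("keyword:cartography",),
        "timeline": ("event:launch",),
    },
)

print(example.labels_for("search"))
print(example.labels_for("dashboard"))
```

The same tweet yields different label sets depending on which system asks, while source, timestamp, privacy, confidence, and lineage stay attached to every view.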