03 · Fine-grained provenance

In our previous post, we explored interfaces for tracing provenance between inputs and outputs at the level of an entire file. But sometimes, it’s useful to see a more detailed view: fine-grained provenance that connects specific parts of one file with corresponding parts of another file.

Fine-grained provenance connects parts of a source artifact with corresponding parts of a build output

One example of fine-grained provenance is a feature found in some existing LaTeX editors: when a user is viewing a source file and a compiled PDF side by side, the interface can automatically scroll the PDF to show the location corresponding to the current location in the source file (and vice versa). Multiple researchers have told us they find this feature essential for keeping track of where they are as they edit.

Beyond “click to scroll”, what other interactions might use fine-grained provenance to help authors work across source and build files? And more broadly, how might fine-grained provenance grow from a one-off feature for LaTeX files to a universal primitive in a multimedia collaboration environment?

One interaction we explored was inline editing, where a user can click on text in the PDF and edit the source code underlying that text in a popup window.

We found that this workflow felt great for making small local edits while reading a built document, almost like editing a PDF directly.

We also explored mirrored comments, in which comments left on the formatted PDF are propagated to the corresponding location in the source file.

This allows a commenter to work on the PDF, while an author addressing comments can see them directly while editing source text.

Propagating comments also has a subtler advantage. Usually, comments on a PDF can’t be carried forward to future compiled versions of the paper, because comments are anchored to fixed positions while contents shift beneath them. But when comments are attached to source material, it’s possible to automatically move the comments to their new locations as the PDF evolves.
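
To make the idea concrete, here is a minimal sketch of re-anchoring, not the prototype's actual code. It assumes each comment holds a stable identifier for a span of source text (the hypothetical `anchor_id`, which an editor would keep attached to the text as it moves) and that every build yields a fresh mapping from those identifiers to PDF regions. `Box`, `Comment`, and `place_comments` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangle on a PDF page (hypothetical units: points)."""
    page: int
    x: float
    y: float
    width: float
    height: float


@dataclass
class Comment:
    """A comment anchored to source text rather than to a PDF position."""
    text: str
    anchor_id: str  # hypothetical stable ID for a span of source text


def place_comments(comments, anchor_to_box):
    """Pair each comment with its region in one particular build.

    Comments whose anchor produced no output in this build are skipped.
    """
    placed = []
    for comment in comments:
        box = anchor_to_box.get(comment.anchor_id)
        if box is not None:
            placed.append((comment, box))
    return placed


comment = Comment("Cite the original paper here", anchor_id="intro-claim")

# Mapping from the first build: the anchored text sits on page 1.
build_v1 = {"intro-claim": Box(page=1, x=72, y=300, width=450, height=12)}
# After new material is added earlier in the paper, the same text lands on page 2.
build_v2 = {"intro-claim": Box(page=2, x=72, y=90, width=450, height=12)}

for build in (build_v1, build_v2):
    for placed_comment, box in place_comments([comment], build):
        print(placed_comment.text, "-> page", box.page)
```

The comment itself never changes between builds; only the mapping does, which is what lets it follow the text from page 1 to page 2.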

Finally, we considered a more speculative idea for organizing projects based on fine-grained provenance: what if the layout of a file could serve as a visual map for navigating the project structure? For example, a Python file used to create a figure would be placed by the figure it creates; in turn, the data file that it reads in would be located near the line of code that reads the file:

Using arrows between specific parts of a Python file and a PDF to indicate fine-grained provenance

So far, our interactive prototypes above have been built specifically for the combination of a LaTeX source file and an output PDF, but we’re intrigued by the possibility of generalizing the concepts to apply to various kinds of media and build processes.

We’ve previously described the pointer concept in our Patchwork environment, which defines how to point to specific parts of a document like an essay or a drawing. Fine-grained provenance could build on this pointer mechanism, if build steps produced a mapping from pointers in the source to corresponding pointers in the output.

Such a system would support our LaTeX prototypes, which were built using SyncTeX to compute mappings between LaTeX source lines and corresponding bounding boxes in a PDF. And it could also generalize to other kinds of mappings, like source maps generated by code compilers.
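
As an illustration of what such a mapping might look like, the sketch below stores provenance as a list of links between a source line and a region of the output, and supports lookups in both directions. It is a sketch under assumptions: `Link`, `ProvenanceMap`, and their methods are hypothetical names, and this is neither SyncTeX's data format nor the Patchwork pointer API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangle on a PDF page (hypothetical units: points)."""
    page: int
    x: float
    y: float
    width: float
    height: float

    def contains(self, page, x, y):
        return (
            page == self.page
            and self.x <= x <= self.x + self.width
            and self.y <= y <= self.y + self.height
        )


@dataclass(frozen=True)
class Link:
    """One provenance record: a source line and a region it produced."""
    source_file: str
    line: int
    box: Box


class ProvenanceMap:
    def __init__(self, links):
        self.links = list(links)

    def boxes_for_line(self, source_file, line):
        """Source to output: where did this line end up? (e.g. scroll the PDF)"""
        return [
            link.box
            for link in self.links
            if link.source_file == source_file and link.line == line
        ]

    def line_at(self, page, x, y):
        """Output to source: which line produced this point? (e.g. inline editing)"""
        hits = [link for link in self.links if link.box.contains(page, x, y)]
        if not hits:
            return None
        # Prefer the smallest enclosing box, i.e. the most specific record.
        best = min(hits, key=lambda link: link.box.width * link.box.height)
        return (best.source_file, best.line)


pmap = ProvenanceMap([
    Link("paper.tex", 10, Box(page=1, x=72, y=100, width=450, height=12)),
    Link("paper.tex", 11, Box(page=1, x=72, y=112, width=450, height=12)),
])

print(pmap.boxes_for_line("paper.tex", 11))
print(pmap.line_at(1, 100, 105))  # ('paper.tex', 10)
```

Nothing in the structure is specific to LaTeX. A compiler's source map could in principle fill the same role, with positions in generated code in place of `Box`.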

One challenge we’ve encountered on the path to a general provenance primitive is that the exact granularity and precision of provenance information can significantly affect the experience. For certain interactions like clicking to scroll to a corresponding location, a rough approximation suffices. But for interactions like inline editing or propagating comments, precision matters. In our initial experiences working with SyncTeX, we’ve already hit challenges with imprecise and inaccurate provenance data; these limitations in the underlying metadata can lead to a confusing experience.
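
The difference between the two kinds of interaction can be shown with a small sketch. It assumes a hypothetical flat record format `(line, page, x0, y0, x1, y1)` and one possible policy: scrolling accepts any candidate line, while editing proceeds only when exactly one line is responsible. This is an illustration of the trade-off, not a description of how the prototypes handle ambiguity.

```python
def candidate_lines(records, page, x, y):
    """Return the sorted source lines whose recorded region contains the point.

    Each record is (line, page, x0, y0, x1, y1), a hypothetical flat format.
    """
    return sorted({
        line
        for (line, p, x0, y0, x1, y1) in records
        if p == page and x0 <= x <= x1 and y0 <= y <= y1
    })


def scroll_target(records, page, x, y):
    """Scrolling tolerates ambiguity: any nearby line is a fine place to land."""
    lines = candidate_lines(records, page, x, y)
    return lines[0] if lines else None


def edit_target(records, page, x, y):
    """Editing does not: only proceed when exactly one line is responsible."""
    lines = candidate_lines(records, page, x, y)
    return lines[0] if len(lines) == 1 else None


# Coarse metadata: three source lines all reported as one paragraph-sized box.
coarse = [
    (20, 1, 72, 200, 522, 236),
    (21, 1, 72, 200, 522, 236),
    (22, 1, 72, 200, 522, 236),
]
# Finer metadata: each source line has its own region.
fine = [
    (20, 1, 72, 200, 522, 211),
    (21, 1, 72, 212, 522, 223),
    (22, 1, 72, 224, 522, 235),
]

click = (1, 100, 215)
print(scroll_target(coarse, *click))  # 20
print(edit_target(coarse, *click))    # None
print(edit_target(fine, *click))      # 21
```

With the coarse data, scrolling still lands within a couple of lines of the right place, but there is no way to tell which line the user meant to edit.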

Overall, we think there’s great potential in interactions that use fine-grained provenance to move back and forth between source files and build products, crossing the boundaries of different media. We think fluid movements like these may be more natural in Jacquard than in the conventional world of siloed applications.
