Super simple stroke simplification

title: Super simple stroke simplification
dated: January 2025
author: Marcel Goethals

We usually think of an ink stroke as consisting of a series of points connected by straight line segments. If we just take the stream of points from the ipad pencil, we get a very dense dataset. On the one hand, this is great, because it allows us to capture all the fine details of the drawing. On the other hand, it’s a lot of data, and that’s not ideal for things like persistence or networking (it makes automerge sad).

It turns out, we can reduce the amount of points quite significantly without loss of visual fidelity.

We’re already approximating the stroke with a series of straight lines. If two subsequent line segments are roughly parallel, we can replace them with a single line segment. A well-known technique for doing this kind of simplification is the RDP algorithm. It works great, but, because it uses recursion, it requires the whole stroke to be drawn before we can simplify it. This is not ideal for our use case.

I came up with a variation on RDP that allows us to simplify strokes on the fly:

We track the points from the pencil in a buffer.
As they come in, draw a line between the first point and the last point in the buffer.
Pick a ‘threshold’ value that determines how much the intermediate points can deviate from the line. If any of the points deviate too much, we pick the point that deviates the most, and split the buffer at that point.
All points before the split are discarded, resulting in a simplified line segment. All points after the split are kept in the buffer. Repeat.

As far as I can tell, this is still guaranteed to produce a simplified stroke that is within the tolerance of the original stroke. But, it allows us to render the simplified stroke as it’s being drawn. It’s also much faster, because we only need to keep track of a few points at a time.

Here’s a stroke without any simplification. The points are so dense that it’s hard to even see the line segments.

Now, let’s simplify the stroke with a tolerance of half a pixel (low enough to be practically indistinguishable). As you can see, we’re now using only a fraction of the points.

Next to 2d x,y coordinates, we also need to record things like pressure and tilt for each point in the stroke. I just model these values as additional (weighted) dimensions in the point, so the algorithm extends naturally to these values as well.

If we really wanted, we could make this to work with curves as well, but linear segments actually seem good enough.