
Workshop: Reconstruct a Kaplan-Meier Curve

Extract a two-arm survival curve sampled at 6-month intervals. The workflow used by meta-analysts when authors don't share patient-level data — step-function precision included.


Workshop five of five. The subject is a two-arm Kaplan-Meier survival curve, the standard chart for time-to-event data in clinical trials. Meta-analysts reconstruct these regularly when authors don’t publish patient-level data. This workshop walks through the per-arm extraction and the step-function detail that AI reliably misses.

The practice chart

Overall Survival — 2 arms (Control, Treatment), 6-month intervals to 60 months

Open this chart in DataFromChart →

Synthetic two-arm survival curve resembling an oncology trial. Both arms start at survival 1.0 at time 0 and decay over 60 months; Control decays faster than Treatment. Step transitions at 6-month intervals.

Target: two series, each with 11 (time, survival probability) pairs.

The step-function consideration

A KM curve is a step function. Survival holds flat between events, then drops vertically at each event time. The vertical drops are the data; horizontal segments are visual continuity.

Place your point at the corner where the drop happens, not the middle of the segment. The corner is the data.

This is the most common KM digitization error. Mid-segment points look fine visually but are off by half an interval in x — and for survival analysis, where event timing drives the result, that’s serious.
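A minimal sketch of why the corner matters, using the first few Control values from this workshop's answer key. The helper name survival_at is hypothetical; it treats the curve as a right-continuous step function, which is the usual convention for KM estimates.

```python
from bisect import bisect_right

# Step corners (time in months, survival): first few Control points
# from this workshop's answer key.
corners = [(0, 1.000), (6, 0.569), (12, 0.324), (18, 0.184)]

def survival_at(t, corners):
    """Evaluate a right-continuous step function defined by its corners."""
    times = [time for time, _ in corners]
    idx = bisect_right(times, t) - 1  # index of the last corner at or before t
    if idx < 0:
        raise ValueError("t is before the first corner")
    return corners[idx][1]

# Every point on a flat segment has the same y as the corner at its left end,
# so a mid-segment click at x = 15 records the right survival at the wrong time.
print(survival_at(15, corners))  # 0.324, the value reached at month 12
print(survival_at(12, corners))  # 0.324
```

The y value is identical at months 12 and 15, so only the x coordinate tells a downstream tool when the drop happened.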

Step 1: open the chart, create the first series

Open the chart, advance to POINTS, create a group named “Control” (the faster-decaying arm).

Step 2: place Control arm points

Click each step corner from time 0 to 60. 11 points: 0, 6, 12, 18, 24, 30, 36, 42, 48, 54, 60.

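Click the bottom-left corner of each vertical drop: the point at the event time and the new, lower survival value, where the next flat segment begins. The time-0 point has no drop and sits at (0, 1.0). Zoom in for late-time points where curves flatten against the x-axis and 1-pixel y errors matter more.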

Step 3: place Treatment arm points

Create a “Treatment” group. 11 points at the same time intervals along the upper curve.

Step 4: calibrate the axes

Y-axis: drag calibration lines to 0 and 1.0.

X-axis: drag to 0 and 60 (months). Both endpoints visible.
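Calibration amounts to a linear map from pixels to data values on each axis. The sketch below illustrates that arithmetic only; the function name make_axis_mapper and all pixel positions are hypothetical, not values read from the practice chart or taken from the tool's internals.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value on a linear axis."""
    if pixel_a == pixel_b:
        raise ValueError("calibration lines must sit at different pixel positions")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale

# Hypothetical pixel positions of the calibration lines.
x_to_months = make_axis_mapper(80, 0.0, 680, 60.0)    # x-axis: 0 and 60 months
y_to_survival = make_axis_mapper(420, 0.0, 20, 1.0)   # y-axis: 0 and 1.0 (pixel y grows downward)

print(x_to_months(140))        # month value of a click at x = 140 px
print(y_to_survival(192.4))    # survival value of a click at y = 192.4 px
```

With this assumed 400-pixel plot height, one pixel is 0.0025 in survival. That is more than half of the Control value at month 60 (0.004), which is why zooming in on late-time points pays off.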

Step 5: export

CSV or XLSX in long format:

arm,time_months,survival
Control,0,1.000
Control,6,0.575
Control,12,0.330
...
Treatment,0,1.000
Treatment,6,0.705
...

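This long format is the input for individual-patient-data reconstruction tools. R’s survival package and Python’s lifelines fit patient-level (time, event) records, so they work on the reconstructed data rather than on these digitized points directly.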
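A minimal sketch of reading the long-format export back into one series per arm, using only the standard library. It contains only the rows shown in the excerpt above (the elided rows are left out), and the function name load_long_format is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Only the rows shown in the export excerpt; the elided rows are omitted here.
export_text = """arm,time_months,survival
Control,0,1.000
Control,6,0.575
Control,12,0.330
Treatment,0,1.000
Treatment,6,0.705
"""

def load_long_format(text):
    """Group (time, survival) pairs by arm, sorted by time."""
    curves = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        point = (float(row["time_months"]), float(row["survival"]))
        curves[row["arm"]].append(point)
    return {arm: sorted(points) for arm, points in curves.items()}

curves = load_long_format(export_text)
print(curves["Control"])  # [(0.0, 1.0), (6.0, 0.575), (12.0, 0.33)]
```

For a real export, replace io.StringIO(text) with an open file handle.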

Answer key

Chart generated from S(t) = exp(-h * t), with t in months and an arm-specific monthly hazard rate h:

Arm        Hazard rate   5-year survival
Control    0.0935        ~0.4%
Treatment  0.0584        ~3.1%

Per-point values:

Time (months)  Control  Treatment
0              1.000    1.000
6              0.569    0.706
12             0.324    0.498
18             0.184    0.351
24             0.105    0.248
30             0.060    0.175
36             0.034    0.123
42             0.019    0.087
48             0.011    0.061
54             0.006    0.043
60             0.004    0.031
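The per-point values can be regenerated from the formula as a sanity check. This is a sketch, not the generator used for the chart: computed from the hazard rates as printed, the results agree with the table to within about 0.003, and the small differences suggest the displayed rates are rounded (an assumption).

```python
import math

# Monthly hazard rates as listed in the answer key.
HAZARDS = {"Control": 0.0935, "Treatment": 0.0584}
TIMES = range(0, 61, 6)  # 0, 6, ..., 60 months: 11 points per arm

def exponential_survival(hazard, t):
    """S(t) = exp(-h * t) for a constant hazard h."""
    return math.exp(-hazard * t)

# Print one row per time point: time, Control, Treatment.
for t in TIMES:
    row = [f"{exponential_survival(h, t):.3f}" for h in HAZARDS.values()]
    print(t, *row)
```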

Compute the mean absolute error (MAE) per arm: the average of |extracted - true| survival over that arm's 11 points. Target under 0.02 (~2 percentage points). Errors of 0.05+ mean calibration drift or mid-segment points instead of step corners.
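A minimal sketch of the self-grading step. The function name mean_absolute_error is hypothetical. The extracted values are the three Control rows from the export excerpt earlier in this workshop, so the result covers only those points, not the full arm.

```python
def mean_absolute_error(extracted, truth):
    """MAE over the time points present in both dicts (time -> survival)."""
    shared = sorted(set(extracted) & set(truth))
    if not shared:
        raise ValueError("no overlapping time points")
    return sum(abs(extracted[t] - truth[t]) for t in shared) / len(shared)

# First three Control values from the answer key.
truth_control = {0: 1.000, 6: 0.569, 12: 0.324}
# Control values from the export excerpt earlier in this workshop.
extracted_control = {0: 1.000, 6: 0.575, 12: 0.330}

mae = mean_absolute_error(extracted_control, truth_control)
print(f"Control MAE: {mae:.4f}")  # Control MAE: 0.0040
print("within target" if mae < 0.02 else "check calibration and corner placement")
```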

For individual-patient-data (IPD) reconstruction, feed this extraction through Guyot et al. 2012 or Liu et al. 2021 to recover number-at-risk and event times. IPDfromKM (R) does it directly. Downstream IPD accuracy is bounded by digitization accuracy.
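To see why digitization accuracy bounds IPD accuracy, it helps to look at the identity these methods build on. The sketch below is not the Guyot et al. algorithm, which also handles censoring and the published number-at-risk table. It only inverts the product-limit step S_k = S_{k-1} * (1 - d_k / n_k) under a no-censoring assumption. The function name events_per_step and the cohort size of 200 are hypothetical.

```python
def events_per_step(survival, n_start):
    """Back out event counts from KM values, assuming no censoring.

    Inverts the product-limit identity S_k = S_{k-1} * (1 - d_k / n_k).
    """
    n_at_risk = n_start
    events = []
    for prev, curr in zip(survival, survival[1:]):
        d = round(n_at_risk * (1 - curr / prev))  # events at this step
        events.append(d)
        n_at_risk -= d  # with no censoring, only events leave the risk set
    return events

control = [1.000, 0.569, 0.324, 0.184]  # answer-key values, months 0 to 18
print(events_per_step(control, n_start=200))  # [86, 49, 28]
```

Each event count depends on the ratio of consecutive survival values, so a small y error at one corner shifts the events recovered on both sides of it.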

Common mistakes

  • Points in the middle of horizontal segments. Most common KM error. (time=15, survival=0.4) reads “survival was 0.4 from time 15” when the truth is “dropped to 0.4 at time 12 and held until 18.” Half-interval x error throughout.
  • Missing the time-0 point. Both arms start at 1.0 by construction. Forgetting it misrepresents the initial cohort in IPD reconstruction.
  • Mixing arms. Like the multi-series line workshop, arms can cross or hug each other at long follow-up. Use per-series groups and verify each arm sits on the right curve.
  • Skipping number-at-risk. The row of numbers below the x-axis enables IPD reconstruction. Extract it alongside the survival points.
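Several of these mistakes can be caught automatically before export. This is a rough sketch with assumed tolerances (half a month in x, 0.01 in y); the function name check_arm is hypothetical, and the checks are tuned to this workshop's 6-month grid rather than to real published curves.

```python
def check_arm(points, expected_times, time_tol=0.5, value_tol=0.01):
    """Return a list of problems found in one arm's (time, survival) points."""
    problems = []
    points = sorted(points)
    times = [t for t, _ in points]
    values = [s for _, s in points]
    if not points or abs(times[0]) > time_tol:
        problems.append("missing time-0 point")
    elif abs(values[0] - 1.0) > value_tol:
        problems.append("survival at time 0 is not 1.0")
    # Points far from every expected corner time are likely mid-segment clicks.
    off_grid = [t for t in times if min(abs(t - e) for e in expected_times) > time_tol]
    if off_grid:
        problems.append(f"points away from step corners: {off_grid}")
    # A KM curve never rises; an increase suggests a click on the other arm.
    if any(b > a + value_tol for a, b in zip(values, values[1:])):
        problems.append("survival increases over time (possible arm mix-up)")
    return problems

GRID = list(range(0, 61, 6))
print(check_arm([(0, 1.0), (6, 0.569), (15, 0.324)], GRID))
# ['points away from step corners: [15]']
print(check_arm([(6, 0.569), (12, 0.324)], GRID))
# ['missing time-0 point']
```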

How this compares to AI

Vision LLMs almost handle KM curves — they recognize the step shape and read approximate survival values — but consistently misplace step times. From our benchmark:

  • All three frontier models had 14-18% MAE.
  • Step transitions landed at the wrong times in roughly half the points — usually rounded to “round” months (12, 24, 36) regardless of actual drop.
  • Coverage was 90%+ — right number of points at wrong coordinates.

For survival analysis, “approximately right” doesn’t help. Hazard ratios are sensitive to event timing; misplacing events by 3-6 months moves the estimate 10-20%. This is one of the chart types where the gap between AI extraction and calibrated manual extraction shows up most clearly downstream.

When you’re doing this for real

This chart is synthetic. Real published KM curves add:

  • Censoring marks (small ticks at censoring times). Not survival drops — patients lost to follow-up. Extract as a separate series if your IPD reconstruction needs them.
  • Confidence bands. Optional; matters if your meta-analysis pools CIs.
  • Number-at-risk table below the x-axis. Extract as a separate small table.
  • Median survival annotation. Useful for cross-checking your reconstructed median.
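A minimal sketch of the median cross-check, using the common convention that the KM median is the first time the curve reaches 0.5 or below. The function name km_median is hypothetical, and the sample points are the first four Treatment values from this workshop's answer key.

```python
def km_median(points):
    """First time at which the step curve is at or below 0.5, or None if never reached."""
    for time, survival in sorted(points):
        if survival <= 0.5:
            return time
    return None

treatment = [(0, 1.000), (6, 0.706), (12, 0.498), (18, 0.351)]
print(km_median(treatment))  # 12
```

If the value read off your extracted corners disagrees with the annotated median on a published chart, recheck the corners around the 0.5 level first.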

For the full systematic-review workflow including pooled hazard ratio estimation, see our meta-analysis data extraction guide.

You’ve finished the workshops

That’s all five. If you’ve worked through them:

  • You can extract bar charts, multi-series lines, dense scatters, log axes, and step-function survival curves.
  • You know which chart types AI handles and which it doesn’t.
  • You have a self-graded baseline to compare future extractions against.

The full workshop hub collects everything with sortable difficulty. Drop a comment with chart types you’d like added.


