Technical Reference

Kinematics Pipeline & SPPB Assessment

How SilverGait approximates clinical SPPB scoring using smartphone-based 2D pose estimation.

01

System Architecture

SilverGait uses two parallel analysis systems for each SPPB test - a real-time quantitative pipeline and a post-hoc visual intelligence layer that cross-references it.

MoveNet - Frontend, Real-Time
Role: Tracks 17 body keypoints at ~15 FPS and computes biomechanical metrics per frame
Strengths: Quantitative, frame-level precision, reproducible, runs on-device
Weaknesses: Pixel-space only, 2D projection, noisy individual frames

Gemini Vision - Backend, Post-Hoc
Role: Watches the full video and scores holistically with clinical context
Strengths: Context-aware, detects qualitative issues, robust to camera angle
Weaknesses: Non-deterministic, no frame-level data, API latency

MoveNet metrics are appended to the Gemini prompt as a structured supplement with clinical reference ranges and auto-generated flags. Gemini cross-references its visual assessment with the quantitative data.
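
As a rough illustration of what such a structured supplement could look like, here is a minimal Python sketch. The function name `format_supplement`, the text layout, and the metric keys are hypothetical; the source does not specify the actual prompt format.

```python
def format_supplement(
    metrics: dict[str, float],
    reference: dict[str, str],
    flags: list[str],
) -> str:
    """Render summary metrics, reference ranges, and flags as plain text.

    The layout is an assumption made for illustration only.
    """
    lines = ["MoveNet quantitative supplement:"]
    for name, value in metrics.items():
        ref = reference.get(name)
        suffix = f" (reference: {ref})" if ref else ""
        lines.append(f"- {name}: {value:.1f}{suffix}")
    lines.append("Flags: " + ("; ".join(flags) if flags else "none"))
    return "\n".join(lines)
```

The returned string would be concatenated to the text part of the prompt; the video itself is uploaded separately.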

Data Flow

1. Camera input (15 FPS) - smartphone video capture
2. MoveNet - 17 keypoints per frame (x, y, confidence 0–1)
3. Per-frame FrameMetrics - angles, positions, and velocities are computed, then coordinates are discarded
4. aggregateMetrics() on recording stop - reduces ~300 frames to summary statistics
5. PoseMetricsSummary - ~30 scalar values (mean, min, max, SD, CV, count)
6. Summary is appended to the Gemini prompt alongside the video upload
7. Gemini returns score (0–4), issues[], confidence, recommendations[]

MoveNet Keypoint Map

17 keypoints, 2D pixel-space, confidence 0–1 per keypoint. Minimum confidence threshold: 0.3.

0 nose
1 left_eye
2 right_eye
3 left_ear
4 right_ear
5 left_shoulder
6 right_shoulder
7 left_elbow
8 right_elbow
9 left_wrist
10 right_wrist
11 left_hip
12 right_hip
13 left_knee
14 right_knee
15 left_ankle
16 right_ankle
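
A minimal Python sketch of how the keypoint map and the 0.3 confidence threshold could be applied when reading a pose. The `Keypoint` type and `get_keypoint` helper are hypothetical names, and treating a value exactly at the threshold as valid is an assumption.

```python
from typing import NamedTuple, Optional

MIN_CONFIDENCE = 0.3  # minimum confidence threshold stated above

# Index order follows the keypoint map above.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


class Keypoint(NamedTuple):
    x: float           # pixel column
    y: float           # pixel row (grows downward)
    confidence: float  # 0-1


def get_keypoint(pose: list[Keypoint], name: str) -> Optional[Keypoint]:
    """Return the named keypoint, or None if its confidence is too low."""
    kp = pose[KEYPOINT_NAMES.index(name)]
    return kp if kp.confidence >= MIN_CONFIDENCE else None
```

Returning `None` for low-confidence keypoints lets downstream metrics skip a frame rather than compute on unreliable positions.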
02

Design Rationale

Why Derived Metrics, Not Raw Coordinates

Storing all 17 keypoints (x, y, confidence) at 15 FPS over a 20-second test would produce 17 × 3 × 15 × 20 = 15,300 values. Instead, we compute ~16 clinically meaningful features per frame and discard the coordinates immediately.

Problem with raw coordinates | How derived metrics solve it
Camera distance changes pixel values - same person at different distances gets different absolute coordinates | Joint angles are view-invariant in the sagittal plane - independent of camera distance
MoveNet jitters several pixels frame-to-frame - raw data is noisy at individual frame level | Temporal aggregation (mean, CV) filters noise - individual jitter averages out over ~300 frames
15K values bloat the Gemini prompt - thousands of tokens consumed before any context can be given | ~30 scalar summary (~200 tokens) leaves context budget for video analysis and clinical reasoning

Joint Angle Computation

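All angles use the three-point dot-product formula, which returns 0–180 degrees. The result is independent of camera distance and of where the person stands in the image, is numerically stable (cosine clamped to [−1, 1]), and needs no reference frame. As with any 2D projection, it is most faithful when the joint moves in the image plane (e.g., a sagittal view).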

angle_at_B = arccos( (BA · BC) / (|BA| × |BC|) )

where A, B, C are three consecutive keypoints:
  A = proximal joint (e.g., hip)
  B = joint being measured (e.g., knee)
  C = distal joint (e.g., ankle)

Angle | Keypoints (L / R) | Clinical Relevance
Knee (hip–knee–ankle) | 11–13–15 / 12–14–16 | Sit-to-stand ability, gait phase detection, lower extremity function
Hip (shoulder–hip–knee) | 5–11–13 / 6–12–14 | Trunk-to-thigh flexion during sit-to-stand; forward lean compensation
Elbow (shoulder–elbow–wrist) | 5–7–9 / 6–8–10 | Exercise form assessment (e.g., wall push-ups); arm use during chair stand

Left and right angles are computed independently, then averaged. Computing each side separately handles partial occlusion (one side can still be measured when the other is hidden) and enables asymmetry detection.
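
The dot-product formula and the left/right averaging can be sketched in Python as follows. The function names are hypothetical, and falling back to a single side when the other is missing is one plausible reading of "handles partial occlusion", not a confirmed detail of the implementation.

```python
import math
from typing import Optional

Point = tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> Optional[float]:
    """Angle at b, in degrees (0-180), between segments b->a and b->c."""
    bax, bay = a[0] - b[0], a[1] - b[1]
    bcx, bcy = c[0] - b[0], c[1] - b[1]
    norm = math.hypot(bax, bay) * math.hypot(bcx, bcy)
    if norm == 0:
        return None  # degenerate case: two keypoints coincide
    cos_theta = (bax * bcx + bay * bcy) / norm
    cos_theta = max(-1.0, min(1.0, cos_theta))  # clamp against rounding error
    return math.degrees(math.acos(cos_theta))


def bilateral_mean(left: Optional[float], right: Optional[float]) -> Optional[float]:
    """Average both sides; fall back to the one available side (assumption)."""
    sides = [v for v in (left, right) if v is not None]
    return sum(sides) / len(sides) if sides else None
```

For example, a fully extended leg with hip, knee, and ankle on one line gives 180 degrees, and a right-angle bend gives 90 degrees.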

Key Derived Metrics

Hip Center
Midpoint of keypoints 11 and 12 (left + right hip). Serves as center-of-mass proxy for sway analysis, vertical tracking (sit-to-stand rep pattern), and horizontal tracking (lateral gait movement). Preferred over nose (head turns) or shoulder midpoint (arm movement).
Trunk Lean
Angle between the shoulder–hip line and the vertical axis, via atan2(|dx|, dy). Measures lean relative to gravity, not relative to thigh angle. Temporal SD captures instability - higher variability = less postural control.
Confidence Threshold: 0.3
Intentionally low because MoveNet Lightning is optimized for speed over accuracy, and temporal aggregation smooths noisy individual readings. Consistent with thresholds used in Ung et al. (2022) and Ali et al. (2024).
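
The hip center and trunk lean definitions above can be sketched as below. `angle_from_vertical` mirrors the `angleFromVertical(shoulderMid, hipMid)` name used later in this document, but the body is an illustration; in particular, the sign convention for `dy` (hip Y minus shoulder Y, positive when upright in screen coordinates) is an assumption.

```python
import math

Point = tuple[float, float]


def midpoint(p: Point, q: Point) -> Point:
    """Midpoint of two keypoints, e.g. left and right hip -> hip center."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def angle_from_vertical(shoulder_mid: Point, hip_mid: Point) -> float:
    """Lean of the shoulder-hip line from vertical, in degrees (0 = upright)."""
    dx = shoulder_mid[0] - hip_mid[0]
    # Screen Y grows downward, so hip Y minus shoulder Y is positive when upright.
    dy = hip_mid[1] - shoulder_mid[1]
    return math.degrees(math.atan2(abs(dx), dy))
```

Taking `abs(dx)` means the result does not distinguish left from right lean; it only measures how far the trunk departs from vertical.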
03

SPPB Test Algorithms

Test 1: Balance (12 seconds)
Goal: Can the person stand still with feet together for ~10 seconds without excessive sway?

Metrics Extracted

Metric | Computation | Reference Range
Sway velocity | Mean frame-to-frame hip center displacement | Healthy: <2.0 px/frame
Sway area | Bounding box of hip center path | Lower = better
Max sway deviation | Max distance from mean hip center | -
Trunk lean avg/max | angleFromVertical(shoulderMid, hipMid) | Healthy: <5° avg, <10° max
Trunk lean variability | SD of trunk lean across all frames | Healthy: <3.0°, At-risk: >4.0°
Stance width | distance(leftAnkle, rightAnkle) | Wider = less stable

Clinical Flags
High sway velocity (>3.0 px/frame) · High trunk lean variability (>4.0° SD) · Excessive trunk lean (>15° max)
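
A minimal sketch of the sway metrics and flag thresholds listed above, assuming the hip-center path is available as a list of pixel coordinates. The function names and the dictionary keys are hypothetical.

```python
import math

Point = tuple[float, float]


def sway_metrics(hip_path: list[Point]) -> dict[str, float]:
    """Summarise a hip-center trajectory (pixels) from the balance test."""
    if len(hip_path) < 2:
        raise ValueError("need at least two frames")
    # Frame-to-frame displacement of the hip center.
    steps = [math.dist(p, q) for p, q in zip(hip_path, hip_path[1:])]
    xs = [p[0] for p in hip_path]
    ys = [p[1] for p in hip_path]
    mean_pt = (sum(xs) / len(xs), sum(ys) / len(ys))
    return {
        "sway_velocity": sum(steps) / len(steps),                 # px/frame
        "sway_area": (max(xs) - min(xs)) * (max(ys) - min(ys)),   # bounding box
        "max_deviation": max(math.dist(p, mean_pt) for p in hip_path),
    }


def balance_flags(sway_velocity: float, lean_sd: float, lean_max: float) -> list[str]:
    """Apply the clinical flag thresholds listed above."""
    flags = []
    if sway_velocity > 3.0:
        flags.append("High sway velocity")
    if lean_sd > 4.0:
        flags.append("High trunk lean variability")
    if lean_max > 15.0:
        flags.append("Excessive trunk lean")
    return flags
```

Because all three sway values are in pixels, they are only comparable across recordings made at a similar camera distance.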
MoveNet Detects
  • Body sway magnitude and velocity
  • Trunk lean consistency over time
  • Shoulder tilt and lateral lean
  • Stance width changes between frames
Gemini Fills In
  • Stepping or foot repositioning events
  • Grabbing support or nearby surface
  • Arm recovery movements
  • Fear responses and hesitation

Scoring

Score | Criteria
4 | Steady throughout, minimal sway, trunk lean <5°
3 | Minor sway but maintains position, trunk lean variability <4°
2 | Noticeable wobble, needs adjustment, trunk lean >10° at times
1 | Unable to hold position, steps or grabs support
0 | Unable to attempt safely

Test 2: Gait / Walking (15 seconds)
Goal: Can the person walk at a normal pace with a smooth, symmetric gait pattern?

Step Detection Algorithm

1. Compute relativeAnkleX = leftAnkle.x − rightAnkle.x per frame
2. Smooth with 3-frame moving average
3. Detect zero-crossings around the mean (each crossing = one step)
4. Enforce minimum 4-frame gap between steps to prevent noise
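
The four steps above can be sketched in Python as follows. The function names are hypothetical, and the edge handling of the moving average (a shrinking window at the first and last frame) is an assumption the source does not specify.

```python
def moving_average(signal: list[float], window: int = 3) -> list[float]:
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def detect_steps(
    left_ankle_x: list[float],
    right_ankle_x: list[float],
    min_gap: int = 4,
) -> list[int]:
    """Return the frame indices at which a step is detected."""
    # 1. Relative ankle X per frame.
    relative = [l - r for l, r in zip(left_ankle_x, right_ankle_x)]
    if not relative:
        return []
    # 2. Smooth with a 3-frame moving average.
    smoothed = moving_average(relative, 3)
    # 3. Zero-crossings around the mean.
    mean = sum(smoothed) / len(smoothed)
    centered = [v - mean for v in smoothed]
    steps: list[int] = []
    for i in range(1, len(centered)):
        crossed = (centered[i - 1] < 0 <= centered[i]) or (centered[i - 1] >= 0 > centered[i])
        # 4. Enforce the minimum gap between accepted steps.
        if crossed and (not steps or i - steps[-1] >= min_gap):
            steps.append(i)
    return steps
```

The step count is then `len(detect_steps(...))`, and the differences between consecutive indices give the inter-step intervals in frames.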

Metrics Extracted

Metric | Computation | Reference Range
Step count | Zero-crossings of smoothed ankle separation | -
Cadence | (stepCount / durationSec) × 60 | Healthy: 100–120 steps/min
Step symmetry index | abs(avgLeft − avgRight) / avgBoth × 100% | <20% = normal
Double support ratio | Fraction of frames where both ankles move <2px | Healthy: <0.3
Gait rhythm variability | CV of inter-step intervals | Healthy: <8%, At-risk: >10%
Arm swing symmetry | Range of wrist Y positions, min/max ratio | Near 1.0 = normal
Knee angle range | max − min across all frames | Healthy: 60–70°

Clinical Flags
Asymmetric gait (>20%) · Irregular rhythm (CV >10%) · Low cadence (<80 steps/min) · High double support (>40%)
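
A hedged sketch of three of the formulas above plus the flag thresholds. The function names are hypothetical. Two details are assumptions: `avgBoth` is taken to be the mean of the left and right averages, and the CV uses the population standard deviation.

```python
import statistics


def cadence(step_count: int, duration_sec: float) -> float:
    """Steps per minute."""
    return step_count / duration_sec * 60


def symmetry_index(avg_left: float, avg_right: float) -> float:
    """Left/right difference as a percentage of the combined average."""
    avg_both = (avg_left + avg_right) / 2  # assumption: mean of both sides
    return abs(avg_left - avg_right) / avg_both * 100


def coefficient_of_variation(values: list[float]) -> float:
    """SD divided by mean, as a percentage (population SD assumed)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return statistics.pstdev(values) / mean * 100


def gait_flags(symmetry_pct: float, rhythm_cv_pct: float,
               cadence_spm: float, double_support_ratio: float) -> list[str]:
    """Apply the clinical flag thresholds listed above."""
    flags = []
    if symmetry_pct > 20:
        flags.append("Asymmetric gait")
    if rhythm_cv_pct > 10:
        flags.append("Irregular rhythm")
    if cadence_spm < 80:
        flags.append("Low cadence")
    if double_support_ratio > 0.40:
        flags.append("High double support")
    return flags
```

For instance, 25 detected steps in a 15-second recording correspond to a cadence of about 100 steps/min, at the lower edge of the healthy range.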
Key Limitation

Step detection requires a lateral camera view. A person walking toward or away from the camera produces near-zero ankle X separation, so steps are undetectable by MoveNet. Gemini still assesses the gait visually.

MoveNet Detects
  • Ankle X-separation oscillation (step count)
  • Gait rhythm variability (CV of intervals)
  • Arm swing from wrist Y-position range
  • Knee angle range during gait cycle
Gemini Fills In
  • Walking direction and turn-around
  • Use of assistive device
  • Shuffling gait pattern
  • Freezing of gait episodes

Scoring

Score | Criteria
4 | Smooth confident walking, cadence 100–120 steps/min, symmetric arm swing
3 | Minor issues - slight asymmetry or reduced arm swing
2 | Noticeable difficulties - irregular rhythm, wide base, cadence <80 steps/min
1 | Significant issues - shuffling, high double support, very slow
0 | Unable to complete or severe impairment observed

Test 3: Chair Stand (20 seconds)
Goal: Can the person stand up and sit down 5 times without hands, and how consistent are they?

Rep Detection Algorithm

1. Record hipCenter.y per frame (screen coords: lower Y = standing)
2. Smooth with 3-frame moving average
3. Compute prominence threshold = 15% of total Y range
4. Find valleys (standing positions) with sufficient prominence
5. Rep count = number of valleys − 1

Note: In screen coordinates Y increases downward.
Standing moves hip UP on screen = lower Y = valley in the signal.
Prominence-based (not threshold-based) = self-calibrating across
different camera distances, chair heights, and user heights.
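
The rep detection steps above can be sketched without external libraries as follows. The function names are hypothetical; the valley prominence is computed as the mirror image of the usual peak-prominence definition, and taking the Y range from the smoothed signal rather than the raw one is an assumption.

```python
def moving_average(signal: list[float], window: int = 3) -> list[float]:
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def valley_prominence(signal: list[float], i: int) -> float:
    """How far the valley at i sits below the lower of its two bounding peaks."""
    value = signal[i]
    left_max = value
    for j in range(i - 1, -1, -1):
        if signal[j] < value:  # stop at the first deeper point
            break
        left_max = max(left_max, signal[j])
    right_max = value
    for j in range(i + 1, len(signal)):
        if signal[j] < value:
            break
        right_max = max(right_max, signal[j])
    return min(left_max, right_max) - value


def count_reps(hip_y: list[float], prominence_ratio: float = 0.15) -> int:
    """Count sit-to-stand reps from the hip-center Y signal (screen coords)."""
    if len(hip_y) < 3:
        return 0
    smoothed = moving_average(hip_y, 3)
    threshold = prominence_ratio * (max(smoothed) - min(smoothed))
    if threshold == 0:
        return 0  # flat signal: no movement detected
    valleys = [
        i for i in range(1, len(smoothed) - 1)
        if smoothed[i - 1] > smoothed[i] <= smoothed[i + 1]
        and valley_prominence(smoothed, i) >= threshold
    ]
    # Rep count = number of valleys - 1, as stated in step 5.
    return max(len(valleys) - 1, 0)
```

Because the threshold scales with the observed Y range, the same code works whether the person fills the frame or stands far from the camera.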

Metrics Extracted

Metric | Computation | Reference Range
Rep count | Valley detection in hip Y signal | Target: 5
Avg rep time | Mean valley-to-valley interval | SPPB: <2240ms=4, <2740ms=3
Total duration | Last − first frame timestamp | <11.2s=4, <13.7s=3, <16.7s=2
Peak trunk lean during rise | Max trunk lean between valley and next peak | Healthy: <15°, Compensatory: >25°
Rep consistency | CV of rep durations | Healthy: <15%, Fatiguing: >25%
Transition speed | Mean knee angle velocity during rising phase | Higher = stronger lower limbs

Clinical Flags
Excessive forward lean (>25°) · Inconsistent rep timing (CV >25%) · Low rep count (<5 completed)
MoveNet Detects
  • Hip Y-trajectory (valley = standing)
  • Rep timing and consistency (CV)
  • Trunk forward lean during rise
  • Knee angle velocity during push phase
Gemini Fills In
  • Hands pushing off armrests or thighs
  • Momentum-based cheating technique
  • Partial completion or giving up
  • Safety concerns mid-test

Scoring

Score | Time for 5 Reps | Criteria
4 | <11.2s | Smooth, without hands, consistent rep timing
3 | 11.2–13.69s | Minor slowness, low trunk lean, no hand use
2 | 13.7–16.69s | Needs arms for momentum, or inconsistent reps
1 | ≥16.7s | Very slow, incomplete, excessive forward lean
0 | - | Unable to stand without assistance
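
The time cut-offs for five reps (<11.2 s, <13.7 s, <16.7 s, as given in the Total duration row) can be expressed as a small lookup. This is a sketch of the timing component only; the function name is hypothetical, and the qualitative criteria (hand use, incomplete attempts, inability to stand) are not modelled here.

```python
def chair_stand_time_score(total_seconds: float) -> int:
    """Map the time for 5 completed reps to a 1-4 timing score."""
    if total_seconds < 0:
        raise ValueError("duration must be non-negative")
    if total_seconds < 11.2:
        return 4
    if total_seconds < 13.7:
        return 3
    if total_seconds < 16.7:
        return 2
    return 1
```

A score of 0 is deliberately not returned: per the table, it depends on whether the person can stand without assistance, which the timing alone cannot establish.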

Combined SPPB Score

Total Score | Category | Interpretation
10–12 | Robust | Good mobility, low fall risk. Continue preventive exercise to maintain function.
7–9 | Pre-Frail | Some decline detected. Early intervention is beneficial - focus on balance and strength.
0–6 | Frail | Significant impairment, high fall risk. Intensive intervention and caregiver involvement needed.
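
Combining the three sub-scores into a category is simple arithmetic; a minimal sketch follows, assuming each test contributes an integer from 0 to 4. The function name is hypothetical.

```python
def sppb_category(balance: int, gait: int, chair_stand: int) -> tuple[int, str]:
    """Sum the three 0-4 sub-scores and map the total to a category."""
    for score in (balance, gait, chair_stand):
        if not 0 <= score <= 4:
            raise ValueError("each sub-score must be between 0 and 4")
    total = balance + gait + chair_stand
    if total >= 10:
        return total, "Robust"
    if total >= 7:
        return total, "Pre-Frail"
    return total, "Frail"
```

For example, sub-scores of 3, 3, and 2 sum to 8, which falls in the Pre-Frail band.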
04

Aggregation Strategy

The final PoseMetricsSummary contains ~30 scalar values. We chose statistical summaries over full time series because compact summaries are complementary (not redundant) to Gemini's visual analysis, and are interpretable by clinicians.

Gemini already sees the video - metrics provide quantified features hard to estimate visually (exact angles, variability coefficients, symmetry indices). Clinicians understand "avg knee angle: 142, trunk lean CV: 8.3" better than a 300-element array.

Statistic | Used For | Rationale
Mean | Angles, sway velocity, trunk lean | Central tendency - typical behavior across the full test
Min / Max | Angles, stance width | Extremes of range of motion across the test
Range | Knee/hip angle | ROM in a single number - peak-to-peak excursion
SD | Trunk lean variability | Instability measure - high SD = inconsistent posture
CV | Rep duration, gait rhythm | Consistency normalized for speed differences between individuals
Count | Steps, reps completed | Direct SPPB sub-score mapping - most clinically meaningful
Total | Sway displacement | Cumulative postural excursion - integrates all instability
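
A minimal sketch of how one per-frame metric series could be reduced to the statistics in the table above. This is not the actual `aggregateMetrics()` implementation; the `summarize` name, the use of `None` for frames where a metric could not be computed, and the population standard deviation are all assumptions.

```python
import math
import statistics
from typing import Optional


def summarize(values: list[Optional[float]]) -> Optional[dict[str, float]]:
    """Reduce one per-frame metric to summary statistics.

    Frames where the metric was unavailable (None) are skipped.
    """
    valid = [v for v in values if v is not None]
    if not valid:
        return None
    mean = statistics.fmean(valid)
    sd = statistics.pstdev(valid)  # population SD (assumption)
    return {
        "mean": mean,
        "min": min(valid),
        "max": max(valid),
        "range": max(valid) - min(valid),
        "sd": sd,
        "cv": sd / mean * 100 if mean != 0 else math.nan,  # percent
        "total": sum(valid),
        "count": len(valid),
    }
```

Running this once per metric over ~300 frames yields a handful of scalars each, which is how a full recording collapses to roughly 30 values.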
05

Known Limitations

Scenario | MoveNet Impact | Gemini Impact
Person walks toward camera | Step detection fails - near-zero ankle X separation | Can still assess walking quality visually
Poor lighting | Low keypoint confidence scores, mostly null metrics | May struggle to see body clearly
Loose / baggy clothing | Noisy keypoint positions - body outline obscured | Generally unaffected by clothing
Multiple people in frame | Tracks strongest-signal person (may be wrong) | May be confused about which person to score
Camera shake | False sway motion added to all metrics | Negligible - holistic assessment compensates
Very fast movement | 15 FPS may alias rapid transitions (chair stand) | Video preserves full motion at native framerate

Fundamental 2D Limitations

No real-world units (pixels only) · No depth perception · No foot/toe keypoints · No hand/finger detail · Single person only · 17 keypoints (no spine segments, no facial landmarks for orientation). These constraints are inherent to 2D monocular pose estimation and cannot be resolved without depth sensors or multi-camera setups.

Despite these limitations, the 2D pipeline provides clinically relevant relative metrics that, combined with Gemini's visual analysis, produce a useful SPPB approximation suitable for screening and longitudinal tracking of functional decline. For clinical diagnosis, a trained physiotherapist should administer the full SPPB.

For the full clinical evidence and peer-reviewed validation studies supporting each metric choice, see research.html.