Kinematics Pipeline & SPPB Assessment
How SilverGait approximates clinical SPPB scoring using smartphone-based 2D pose estimation.
System Architecture
SilverGait runs two parallel analysis systems for each SPPB test: a real-time quantitative pipeline (MoveNet pose estimation) and a post-hoc visual intelligence layer (Gemini) that cross-references it.
MoveNet metrics are appended to the Gemini prompt as a structured supplement with clinical reference ranges and auto-generated flags. Gemini cross-references its visual assessment with the quantitative data.
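As a rough illustration of what such a supplement could look like, the sketch below formats a few metrics with their reference ranges and a flag. The `MetricEntry` shape, the `formatSupplement` name, and the text layout are all hypothetical; the source does not specify the actual prompt format.

```typescript
// Hypothetical entry shape: the real PoseMetricsSummary fields are not documented here.
interface MetricEntry {
  label: string;
  value: number;
  unit: string;
  healthyBelow?: number; // reference range: values under this are considered healthy
  atRiskAbove?: number;  // values over this produce an auto-generated flag
}

// Builds a plain-text block that could be appended to the Gemini prompt.
function formatSupplement(entries: MetricEntry[]): string {
  const rows = entries.map((e) => {
    const parts = [`${e.label}: ${e.value.toFixed(1)} ${e.unit}`];
    if (e.healthyBelow !== undefined) parts.push(`(healthy: <${e.healthyBelow})`);
    if (e.atRiskAbove !== undefined && e.value > e.atRiskAbove) parts.push("[FLAG: at-risk]");
    return "- " + parts.join(" ");
  });
  return ["POSE METRICS (MoveNet, 2D pixel space)"].concat(rows).join("\n");
}

// Example values are made up; the thresholds come from the balance reference ranges.
const supplement = formatSupplement([
  { label: "Trunk lean variability", value: 4.6, unit: "deg", healthyBelow: 3.0, atRiskAbove: 4.0 },
  { label: "Sway velocity", value: 1.2, unit: "px/frame", healthyBelow: 2.0 },
]);
```

Keeping the supplement as short labelled lines keeps the token cost low and lets the model match each number to its reference range.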
Data Flow
MoveNet Keypoint Map
17 keypoints, 2D pixel-space, confidence 0–1 per keypoint. Minimum confidence threshold: 0.3.
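For reference, the sketch below lists the 17 keypoints in the standard MoveNet (COCO) output order, which is the indexing the angle tables on this page rely on (for example 11–13–15 for left hip, knee, ankle). The `Keypoint` field names and the `usableKeypoint` helper are assumptions for illustration, and whether the real pipeline treats exactly 0.3 as passing is not stated in the source.

```typescript
// Standard MoveNet / COCO keypoint order: array index = keypoint index.
const KEYPOINT_NAMES: string[] = [
  "nose", "left_eye", "right_eye", "left_ear", "right_ear",
  "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
  "left_wrist", "right_wrist", "left_hip", "right_hip",
  "left_knee", "right_knee", "left_ankle", "right_ankle",
];

// Assumed shape: pixel coordinates plus a 0-1 confidence score.
interface Keypoint {
  x: number;
  y: number;
  score: number;
}

const MIN_CONFIDENCE = 0.3;

// Returns the keypoint when it is confident enough to use, otherwise null,
// so downstream metrics can skip the frame instead of using a noisy value.
function usableKeypoint(kp: Keypoint | undefined): Keypoint | null {
  return kp !== undefined && kp.score >= MIN_CONFIDENCE ? kp : null;
}
```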
Design Rationale
Why Derived Metrics, Not Raw Coordinates
Storing all 17 keypoints (x, y, confidence) at 15 FPS over 20 seconds would produce 17 × 3 × 15 × 20 = 15,300 values. Instead, we compute ~16 clinically meaningful features per frame and discard the coordinates immediately.
| Problem with raw coordinates | How derived metrics solve it |
|---|---|
| Camera distance changes pixel values: the same person at different distances gets different absolute coordinates | Joint angles are scale-invariant: the angle between limb segments is independent of camera distance, provided the movement is filmed roughly in the sagittal plane |
| MoveNet jitters several pixels frame-to-frame, so raw data is noisy at the individual-frame level | Temporal aggregation (mean, CV) filters noise: individual jitter averages out over ~300 frames |
| 15K values bloat the Gemini prompt, consuming thousands of tokens before any context can be given | A summary of ~30 scalars (~200 tokens) leaves context budget for video analysis and clinical reasoning |
Joint Angle Computation
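All angles use the three-point dot-product formula, which returns 0–180 degrees. The result is independent of camera distance and of where the person appears in the image, is numerically stable (cosine clamped to [−1, 1]), and needs no reference frame. Because the input is a 2D projection, the angle is only accurate when the limb moves roughly parallel to the image plane.
angle_at_B = arccos( (BA · BC) / (|BA| × |BC|) )
where BA and BC are the vectors from B to A and from B to C, and A, B, C are three consecutive keypoints:
- A = proximal joint (e.g., hip)
- B = joint being measured (e.g., knee)
- C = distal joint (e.g., ankle)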
| Angle | Keypoints (L / R) | Clinical Relevance |
|---|---|---|
| Knee (hip–knee–ankle) | 11–13–15 / 12–14–16 | Sit-to-stand ability, gait phase detection, lower extremity function |
| Hip (shoulder–hip–knee) | 5–11–13 / 6–12–14 | Trunk-to-thigh flexion during sit-to-stand; forward lean compensation |
| Elbow (shoulder–elbow–wrist) | 5–7–9 / 6–8–10 | Exercise form assessment (e.g., wall push-ups); arm use during chair stand |
Left and right angles are computed independently, then averaged. Computing each side separately means one side can still be used when the other is partially occluded, and the two sides can be compared to detect asymmetry.
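A minimal sketch of the three-point angle computation and the left/right averaging follows. The names `jointAngleDeg` and `bilateralAngle` are illustrative, and the fallback to the single visible side is an assumption about how occlusion is handled.

```typescript
interface Point {
  x: number;
  y: number;
}

// Angle at vertex b, in degrees (0-180), between segments b->a and b->c.
// Returns null when either segment has zero length (angle undefined).
function jointAngleDeg(a: Point, b: Point, c: Point): number | null {
  const bax = a.x - b.x;
  const bay = a.y - b.y;
  const bcx = c.x - b.x;
  const bcy = c.y - b.y;
  const magBA = Math.sqrt(bax * bax + bay * bay);
  const magBC = Math.sqrt(bcx * bcx + bcy * bcy);
  if (magBA === 0 || magBC === 0) return null;
  const cosine = (bax * bcx + bay * bcy) / (magBA * magBC);
  // Clamp to [-1, 1] so rounding error never pushes acos out of its domain.
  const clamped = Math.max(-1, Math.min(1, cosine));
  return (Math.acos(clamped) * 180) / Math.PI;
}

// Average of both sides; if only one side is available, use that side alone.
function bilateralAngle(left: number | null, right: number | null): number | null {
  if (left !== null && right !== null) return (left + right) / 2;
  return left !== null ? left : right;
}
```

For a knee angle, `a`, `b`, `c` would be the hip, knee, and ankle keypoints of one side; a fully extended leg gives a value near 180.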
Key Derived Metrics
Trunk lean: angleFromVertical(shoulderMid, hipMid) = atan2(|dx|, dy). It measures lean relative to gravity, not relative to thigh angle. The temporal SD captures instability: higher variability means less postural control.
SPPB Test Algorithms
Metrics Extracted
| Metric | Computation | Reference Range |
|---|---|---|
| Sway velocity | Mean frame-to-frame hip center displacement | Healthy: <2.0 px/frame |
| Sway area | Bounding box of hip center path | Lower = better |
| Max sway deviation | Max distance from mean hip center | - |
| Trunk lean avg/max | angleFromVertical(shoulderMid, hipMid) | Healthy: <5° avg, <10° max |
| Trunk lean variability | SD of trunk lean across all frames | Healthy: <3.0°, At-risk: >4.0° |
| Stance width | distance(leftAnkle, rightAnkle) | Wider = less stable |
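The balance metrics in this table could be computed along the following lines. This is a sketch under assumptions: `angleFromVertical` is the name used in the table, but its sign convention, the `swayMetrics` helper, and the reading of "bounding box" as width × height are all guesses at the real implementation.

```typescript
interface Pt {
  x: number;
  y: number;
}

// Trunk lean in degrees, 0 = upright. Screen Y grows downward, so
// dy = hip.y - shoulder.y is positive for an upright person (assumed convention).
function angleFromVertical(shoulderMid: Pt, hipMid: Pt): number {
  const dx = shoulderMid.x - hipMid.x;
  const dy = hipMid.y - shoulderMid.y;
  return (Math.atan2(Math.abs(dx), dy) * 180) / Math.PI;
}

interface SwayMetrics {
  velocity: number;     // mean frame-to-frame hip displacement, px/frame
  area: number;         // bounding-box area of the hip path, px^2
  maxDeviation: number; // largest distance from the mean hip position, px
}

function swayMetrics(hipPath: Pt[]): SwayMetrics | null {
  if (hipPath.length < 2) return null;
  const dist = (p: Pt, q: Pt) => Math.sqrt((p.x - q.x) * (p.x - q.x) + (p.y - q.y) * (p.y - q.y));

  let pathLength = 0;
  for (let i = 1; i < hipPath.length; i++) pathLength += dist(hipPath[i], hipPath[i - 1]);

  const xs = hipPath.map((p) => p.x);
  const ys = hipPath.map((p) => p.y);
  const area = (Math.max(...xs) - Math.min(...xs)) * (Math.max(...ys) - Math.min(...ys));

  const mean: Pt = {
    x: xs.reduce((s, v) => s + v, 0) / xs.length,
    y: ys.reduce((s, v) => s + v, 0) / ys.length,
  };
  let maxDeviation = 0;
  for (const p of hipPath) maxDeviation = Math.max(maxDeviation, dist(p, mean));

  return { velocity: pathLength / (hipPath.length - 1), area, maxDeviation };
}
```

All three sway values are in pixels, so they are only comparable between recordings taken at a similar camera distance.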
- Body sway magnitude and velocity
- Trunk lean consistency over time
- Shoulder tilt and lateral lean
- Stance width changes between frames
- Stepping or foot repositioning events
- Grabbing support or nearby surface
- Arm recovery movements
- Fear responses and hesitation
Scoring
| Score | Criteria |
|---|---|
| 4 | Steady throughout, minimal sway, trunk lean <5° |
| 3 | Minor sway but maintains position, trunk lean variability <4° |
| 2 | Noticeable wobble, needs adjustment, trunk lean >10° at times |
| 1 | Unable to hold position, steps or grabs support |
| 0 | Unable to attempt safely |
Step Detection Algorithm
1. Compute relativeAnkleX = leftAnkle.x − rightAnkle.x per frame
2. Smooth with a 3-frame moving average
3. Detect zero-crossings around the mean (each crossing = one step)
4. Enforce a minimum 4-frame gap between steps to reject noise-induced double counts
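The four steps above can be sketched as follows. The function names are illustrative, and the centered moving-average window (truncated at the signal edges) is an assumption, since the source does not say how the 3-frame window is aligned.

```typescript
// Centered 3-frame moving average; the window shrinks to 2 frames at the edges.
function movingAverage3(signal: number[]): number[] {
  return signal.map((_, i) => {
    const lo = Math.max(0, i - 1);
    const hi = Math.min(signal.length - 1, i + 1);
    let sum = 0;
    for (let j = lo; j <= hi; j++) sum += signal[j];
    return sum / (hi - lo + 1);
  });
}

// Returns the frame indices at which a step is counted.
// relativeAnkleX[i] = leftAnkle.x - rightAnkle.x for frame i.
function detectSteps(relativeAnkleX: number[], minGapFrames = 4): number[] {
  if (relativeAnkleX.length < 2) return [];
  const smoothed = movingAverage3(relativeAnkleX);
  const mean = smoothed.reduce((s, v) => s + v, 0) / smoothed.length;

  const steps: number[] = [];
  for (let i = 1; i < smoothed.length; i++) {
    const prev = smoothed[i - 1] - mean;
    const curr = smoothed[i] - mean;
    // A sign change around the mean means one ankle has passed the other.
    const crossed = (prev < 0 && curr >= 0) || (prev >= 0 && curr < 0);
    const lastStep = steps.length > 0 ? steps[steps.length - 1] : -Infinity;
    if (crossed && i - lastStep >= minGapFrames) steps.push(i);
  }
  return steps;
}
```

At 15 FPS a 4-frame gap is about 0.27 s, so crossings closer together than that are treated as jitter rather than separate steps.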
Metrics Extracted
| Metric | Computation | Reference Range |
|---|---|---|
| Step count | Zero-crossings of smoothed ankle separation | - |
| Cadence | (stepCount / durationSec) × 60 | Healthy: 100–120 steps/min |
| Step symmetry index | abs(avgLeft − avgRight) / avgBoth × 100% | <20% = normal |
| Double support ratio | Fraction of frames where both ankles move <2px | Healthy: <0.3 |
| Gait rhythm variability | CV of inter-step intervals | Healthy: <8%, At-risk: >10% |
| Arm swing symmetry | Range of each wrist's Y positions; ratio of the smaller range to the larger (min/max) | Near 1.0 = normal |
| Knee angle range | max − min across all frames | Healthy: 60–70° range |
Step detection requires a lateral camera view. A person walking toward or away from the camera produces near-zero ankle X separation, so steps are undetectable from the MoveNet signal. Gemini still assesses the walk visually.
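The formulas in the metrics table for cadence, step symmetry index, and double support ratio translate fairly directly into code. In the sketch below the function names are illustrative, `avgBoth` is assumed to be the mean of the left and right averages, and ankle movement is assumed to be Euclidean displacement between consecutive frames; none of these details are confirmed by the source.

```typescript
interface AnklePoint {
  x: number;
  y: number;
}

// Steps per minute from a step count and the walk duration in seconds.
function cadenceStepsPerMin(stepCount: number, durationSec: number): number | null {
  return durationSec > 0 ? (stepCount / durationSec) * 60 : null;
}

// abs(avgLeft - avgRight) / avgBoth * 100, with avgBoth taken as the mean of both sides.
function stepSymmetryIndex(avgLeft: number, avgRight: number): number | null {
  const avgBoth = (avgLeft + avgRight) / 2;
  return avgBoth > 0 ? (Math.abs(avgLeft - avgRight) / avgBoth) * 100 : null;
}

// Fraction of frame-to-frame transitions in which both ankles move less than maxMovePx.
function doubleSupportRatio(
  left: AnklePoint[],
  right: AnklePoint[],
  maxMovePx = 2
): number | null {
  const n = Math.min(left.length, right.length);
  if (n < 2) return null;
  const moved = (p: AnklePoint, q: AnklePoint) =>
    Math.sqrt((p.x - q.x) * (p.x - q.x) + (p.y - q.y) * (p.y - q.y));
  let stillFrames = 0;
  for (let i = 1; i < n; i++) {
    const leftStill = moved(left[i], left[i - 1]) < maxMovePx;
    const rightStill = moved(right[i], right[i - 1]) < maxMovePx;
    if (leftStill && rightStill) stillFrames++;
  }
  return stillFrames / (n - 1);
}
```

For example, 30 steps in 18 seconds gives a cadence of 100 steps/min, the lower edge of the healthy range.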
- Ankle X-separation oscillation (step count)
- Gait rhythm variability (CV of intervals)
- Arm swing from wrist Y-position range
- Knee angle range during gait cycle
- Walking direction and turn-around
- Use of assistive device
- Shuffling gait pattern
- Freezing of gait episodes
Scoring
| Score | Criteria |
|---|---|
| 4 | Smooth confident walking, cadence 100–120 steps/min, symmetric arm swing |
| 3 | Minor issues - slight asymmetry or reduced arm swing |
| 2 | Noticeable difficulties - irregular rhythm, wide base, cadence <80 steps/min |
| 1 | Significant issues - shuffling, high double support, very slow |
| 0 | Unable to complete or severe impairment observed |
Rep Detection Algorithm
1. Record hipCenter.y per frame (screen coords: lower Y = standing)
2. Smooth with a 3-frame moving average
3. Compute prominence threshold = 15% of total Y range
4. Find valleys (standing positions) with sufficient prominence
5. Rep count = number of valleys − 1

Note: In screen coordinates Y increases downward. Standing moves the hip up on screen, which means a lower Y and therefore a valley in the signal. Prominence-based (not threshold-based) detection is self-calibrating across different camera distances, chair heights, and user heights.
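One way to implement the valley search is sketched below. It uses the usual topographic definition of prominence (how far the signal rises on each side of a valley before dropping below it again, taking the smaller of the two rises); the source does not say which prominence definition SilverGait uses, and all function names here are illustrative.

```typescript
// Centered 3-frame moving average; the window shrinks to 2 frames at the edges.
function smooth3(signal: number[]): number[] {
  return signal.map((_, i) => {
    const lo = Math.max(0, i - 1);
    const hi = Math.min(signal.length - 1, i + 1);
    let sum = 0;
    for (let j = lo; j <= hi; j++) sum += signal[j];
    return sum / (hi - lo + 1);
  });
}

// Prominence of the valley at index i: walk outward on each side until the
// signal dips below signal[i] (or ends), tracking the highest value seen.
function valleyProminence(signal: number[], i: number): number {
  let leftMax = signal[i];
  for (let j = i - 1; j >= 0 && signal[j] >= signal[i]; j--) leftMax = Math.max(leftMax, signal[j]);
  let rightMax = signal[i];
  for (let j = i + 1; j < signal.length && signal[j] >= signal[i]; j++) rightMax = Math.max(rightMax, signal[j]);
  return Math.min(leftMax, rightMax) - signal[i];
}

// Frame indices of standing positions (valleys in hip Y).
function detectStandingValleys(hipY: number[], prominenceFraction = 0.15): number[] {
  if (hipY.length < 3) return [];
  const s = smooth3(hipY);
  const range = Math.max(...s) - Math.min(...s);
  if (range === 0) return [];
  const threshold = prominenceFraction * range;
  const valleys: number[] = [];
  for (let i = 1; i < s.length - 1; i++) {
    const isLocalMin = s[i] < s[i - 1] && s[i] <= s[i + 1];
    if (isLocalMin && valleyProminence(s, i) >= threshold) valleys.push(i);
  }
  return valleys;
}

// Rep count = number of valleys - 1, as in step 5.
function countReps(hipY: number[]): number {
  return Math.max(0, detectStandingValleys(hipY).length - 1);
}
```

Because the threshold is a fraction of the recording's own Y range, the same code works whether the hip travels 40 px or 400 px between sitting and standing.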
Metrics Extracted
| Metric | Computation | Reference Range |
|---|---|---|
| Rep count | Valley detection in hip Y signal | Target: 5 |
| Avg rep time | Mean valley-to-valley interval | SPPB: <2240 ms = 4, <2740 ms = 3 (total-duration cut-offs divided by 5 reps) |
| Total duration | Last − first frame timestamp | <11.2s=4, <13.7s=3, <16.7s=2 |
| Peak trunk lean during rise | Max trunk lean between valley and next peak | Healthy: <15°, Compensatory: >25° |
| Rep consistency | CV of rep durations | Healthy: <15%, Fatiguing: >25% |
| Transition speed | Mean knee angle velocity during rising phase | Higher = stronger lower limbs |
- Hip Y-trajectory (valley = standing)
- Rep timing and consistency (CV)
- Trunk forward lean during rise
- Knee angle velocity during push phase
- Hands pushing off armrests or thighs
- Momentum-based cheating technique
- Partial completion or giving up
- Safety concerns mid-test
Scoring
| Score | Time for 5 Reps | Criteria |
|---|---|---|
| 4 | <11.2s | Smooth, without hands, consistent rep timing |
| 3 | 11.2–13.69s | Minor slowness, low trunk lean, no hand use |
| 2 | 13.7–16.69s | Needs arms for momentum, or inconsistent reps |
| 1 | ≥16.7s | Very slow, incomplete, excessive forward lean |
| 0 | - | Unable to stand without assistance |
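The time component of this scoring reduces to a threshold lookup, sketched below using the total-duration cut-offs from the Metrics Extracted table (<11.2 s, <13.7 s, <16.7 s). The function name is illustrative. A score of 0 is deliberately left out: it depends on whether the person could stand without assistance, which is a qualitative observation rather than a timing.

```typescript
// Maps the time for 5 reps (seconds) to the timing band of the chair stand score.
// Qualitative criteria (hand use, trunk lean, safety) are not considered here.
function chairStandTimeBand(totalSec: number): 1 | 2 | 3 | 4 {
  if (totalSec < 11.2) return 4;
  if (totalSec < 13.7) return 3;
  if (totalSec < 16.7) return 2;
  return 1;
}
```

Using strict less-than comparisons on the lower bound of each band avoids gaps between bands, so every duration maps to exactly one score.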
Combined SPPB Score
Aggregation Strategy
The final PoseMetricsSummary contains ~30 scalar values. We chose statistical summaries over full time series because compact summaries are complementary (not redundant) to Gemini's visual analysis, and are interpretable by clinicians.
Gemini already sees the video; the metrics provide quantified features that are hard to estimate visually (exact angles, variability coefficients, symmetry indices). Clinicians can interpret "avg knee angle: 142°, trunk lean CV: 8.3%" more readily than a 300-element array.
| Statistic | Used For | Rationale |
|---|---|---|
| Mean | Angles, sway velocity, trunk lean | Central tendency - typical behavior across the full test |
| Min / Max | Angles, stance width | Extremes of range of motion across the test |
| Range | Knee/hip angle | ROM in a single number - peak-to-peak excursion |
| SD | Trunk lean variability | Instability measure - high SD = inconsistent posture |
| CV | Rep duration, gait rhythm | Consistency normalized for speed differences between individuals |
| Count | Steps, reps completed | Direct SPPB sub-score mapping - most clinically meaningful |
| Total | Sway displacement | Cumulative postural excursion - integrates all instability |
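The statistics in this table can all be produced by one pass over a per-frame series, as in the sketch below. The `SummaryStats` shape and `summarize` name are hypothetical, frames with missing values (for example, keypoints below the confidence threshold) are assumed to be stored as `null` and skipped, and the SD is the population SD (divide by n), which may differ from what the real pipeline uses.

```typescript
interface SummaryStats {
  count: number;
  mean: number;
  min: number;
  max: number;
  range: number;     // max - min, e.g. knee ROM
  sd: number;        // population standard deviation
  cv: number | null; // sd / mean * 100, null when the mean is 0
  total: number;     // sum of values, e.g. cumulative sway displacement
}

// Collapses a per-frame series into scalar summaries, ignoring missing frames.
function summarize(values: Array<number | null>): SummaryStats | null {
  const valid: number[] = [];
  for (const v of values) {
    if (v !== null && !isNaN(v)) valid.push(v);
  }
  if (valid.length === 0) return null;

  const total = valid.reduce((s, v) => s + v, 0);
  const mean = total / valid.length;
  const min = Math.min(...valid);
  const max = Math.max(...valid);
  const variance = valid.reduce((s, v) => s + (v - mean) * (v - mean), 0) / valid.length;
  const sd = Math.sqrt(variance);

  return {
    count: valid.length,
    mean,
    min,
    max,
    range: max - min,
    sd,
    cv: mean !== 0 ? (sd / mean) * 100 : null,
    total,
  };
}
```

Only the resulting scalars need to be kept, which matches the approach of discarding per-frame coordinates once features are computed.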
Known Limitations
| Scenario | MoveNet Impact | Gemini Impact |
|---|---|---|
| Person walks toward camera | Step detection fails - near-zero ankle X separation | Can still assess walking quality visually |
| Poor lighting | Low keypoint confidence scores, mostly null metrics | May struggle to see body clearly |
| Loose / baggy clothing | Noisy keypoint positions - body outline obscured | Generally unaffected by clothing |
| Multiple people in frame | Tracks strongest-signal person (may be wrong) | May be confused about which person to score |
| Camera shake | False sway motion added to all metrics | Negligible - holistic assessment compensates |
| Very fast movement | 15 FPS may alias rapid transitions (chair stand) | Video preserves full motion at native framerate |
No real-world units (pixels only) · No depth perception · No foot/toe keypoints · No hand/finger detail · Single person only · 17 keypoints (no spine segments, no facial landmarks for orientation). These constraints are inherent to 2D monocular pose estimation and cannot be resolved without depth sensors or multi-camera setups.
Despite these limitations, the 2D pipeline provides clinically relevant relative metrics that, combined with Gemini's visual analysis, produce a useful SPPB approximation suitable for screening and longitudinal tracking of functional decline. For clinical diagnosis, a trained physiotherapist should administer the full SPPB.
For the full clinical evidence and peer-reviewed validation studies supporting each metric choice, see research.html.