Your app's camera already sees something most developers overlook: your skin pulses. Every heartbeat sends a wave of blood through your face, and that wave changes how much light your skin reflects — by a fraction of a percent. Remote photoplethysmography (rPPG) is the science of detecting that signal. And modern rPPG SDKs can extract it with 99%+ accuracy using nothing but the front-facing camera.
This guide breaks down exactly how it works — from the physics, through the signal processing pipeline, down to the three function calls that get it running in your app.
What is rPPG?
Remote photoplethysmography (rPPG, or sometimes "remote PPG") is a technique that measures blood volume pulse changes by analyzing subtle variations in skin color captured via camera. Traditional pulse oximeters clip to your finger and shine light through tissue. rPPG applies the same optical principle from a distance, using ambient light and a standard camera.
The core insight: blood absorbs green light differently than surrounding tissue. As blood pulses through facial capillaries with each heartbeat, it creates periodic changes in how much green light is reflected back to the camera. These changes are invisible to the naked eye — on the order of 0.1% variation in pixel intensity — but they're consistent, measurable, and repeatable.
Why green? Hemoglobin (the protein in red blood cells) absorbs green light (~550 nm) much more strongly than surrounding skin tissue. This maximizes the signal-to-noise ratio of the pulse signal captured from the face.
From the raw pulse signal, rPPG algorithms can derive:
- Heart rate (BPM) — the frequency of the pulse wave
- Heart rate variability (HRV) — time between successive beats, a proxy for autonomic nervous system state
- Stress score — derived from HRV patterns; low HRV correlates with elevated stress
- Respiratory rate — slower modulation of the same signal from breathing
- SpO2 (blood oxygen) — requires comparing red and green channel ratios
How It Works: The Signal Chain
The pipeline has five distinct stages. Here's how data flows from camera frame to vital sign:
Stage 1: Face Detection & Landmark Localization
Every frame from the camera is passed through a lightweight ML model (typically a convolutional neural network or MediaPipe-style landmark detector). This identifies 468 facial landmarks and tracks them frame-to-frame. Tracking is essential — without it, head movement creates motion artifacts that corrupt the pulse signal.
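You never write this stage yourself, but if you want to see roughly what it looks like, here is a sketch using MediaPipe's web Face Landmarker. The package name, model path, and result shape are assumptions based on MediaPipe's public docs, not Beam AI internals, so treat it as illustration only:

```javascript
// Illustrative only: landmark tracking with MediaPipe's web Face Landmarker.
// Package name, model path, and result shape are assumptions; check current docs.
import { FaceLandmarker, FilesetResolver } from '@mediapipe/tasks-vision';

const vision = await FilesetResolver.forVisionTasks(
  'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm'
);
const landmarker = await FaceLandmarker.createFromOptions(vision, {
  baseOptions: { modelAssetPath: 'face_landmarker.task' },
  runningMode: 'VIDEO',
  numFaces: 1,
});

// Call once per video frame; landmarks are normalized {x, y, z} points.
function trackFrame(videoEl) {
  const result = landmarker.detectForVideo(videoEl, performance.now());
  return result.faceLandmarks[0] ?? null; // ~468 points, or null if no face found
}
```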
Stage 2: Region of Interest (ROI) Extraction
Not all skin pixels are equal. The forehead and cheeks produce the strongest rPPG signal because they have high capillary density and minimal hair/shadow interference. The SDK extracts pixel intensity averages from these regions across the green (and sometimes red) channels for each frame, building a time-series signal at the camera's frame rate (typically 30fps).
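To make the ROI averaging concrete, here is a minimal per-frame green-channel sample using a canvas. The forehead rectangle is a hard-coded placeholder; a real pipeline derives it from the tracked landmarks each frame:

```javascript
// Illustrative sketch: average the green channel over a forehead ROI for one frame.
// The ROI rectangle is a placeholder; in practice it comes from the facial landmarks.
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d', { willReadFrequently: true });

function sampleGreenMean(videoEl, roi = { x: 280, y: 80, width: 80, height: 40 }) {
  canvas.width = videoEl.videoWidth;
  canvas.height = videoEl.videoHeight;
  ctx.drawImage(videoEl, 0, 0);

  const { data } = ctx.getImageData(roi.x, roi.y, roi.width, roi.height);
  let sum = 0;
  for (let i = 0; i < data.length; i += 4) {
    sum += data[i + 1]; // RGBA layout: index +1 is the green channel
  }
  return sum / (data.length / 4); // mean green intensity, 0–255
}

// Calling this at ~30 fps builds the raw time series the later stages filter.
```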
Stage 3: Signal Preprocessing & Filtering
The raw signal is noisy. Motion, lighting changes, and camera sensor noise all contaminate it. A bandpass filter isolates the physiologically plausible heart rate range of roughly 0.7 Hz to 4.0 Hz (42–240 BPM). Independent Component Analysis (ICA), chrominance-based (CHROM), or plane-orthogonal-to-skin (POS) decomposition then separates the blood volume pulse component from specular reflection noise caused by head movement.
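Here is a rough sketch of the band-limiting idea, assuming a fixed 30 fps sample rate. It uses moving averages as a crude stand-in for a proper Butterworth bandpass and skips CHROM/POS entirely, so treat it as illustration rather than a production filter:

```javascript
// Crude band-limiting sketch at an assumed 30 fps sample rate.
// Subtracting a long moving average removes drift below ~0.7 Hz; a short
// moving average suppresses noise above ~4 Hz. Production pipelines use
// proper Butterworth bandpass filters plus CHROM/POS or ICA decomposition.
function movingAverage(signal, window) {
  return signal.map((_, i) => {
    const start = Math.max(0, i - window + 1);
    const slice = signal.slice(start, i + 1);
    return slice.reduce((a, b) => a + b, 0) / slice.length;
  });
}

function preprocess(greenSeries, fps = 30) {
  const slow = movingAverage(greenSeries, Math.round(fps / 0.7)); // ~1.4 s window
  const detrended = greenSeries.map((v, i) => v - slow[i]);       // drop slow drift
  return movingAverage(detrended, Math.round(fps / 4));           // smooth fast noise
}
```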
Stage 4: Frequency Analysis (FFT)
The cleaned signal is transformed into the frequency domain using a Fast Fourier Transform (FFT). The dominant frequency peak corresponds to the heart rate. A sliding window (typically 10–30 seconds) is used to compute real-time estimates. HRV is computed from the inter-beat intervals in the time domain signal (peak-to-peak analysis).
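Here is a simplified version of the peak-picking step, using a direct DFT restricted to the 0.7 to 4.0 Hz band. A real implementation would use an FFT library with windowing and overlapping buffers, but the idea is the same:

```javascript
// Sketch: find the dominant frequency in the 0.7–4 Hz band with a direct DFT.
// Real implementations use an FFT library and a windowed, sliding buffer,
// but peak-picking inside the physiological band is the same idea.
function dominantFrequencyHz(signal, fps = 30, fMin = 0.7, fMax = 4.0) {
  const n = signal.length;
  let bestFreq = 0;
  let bestPower = -Infinity;

  for (let k = 1; k < n / 2; k++) {
    const freq = (k * fps) / n;
    if (freq < fMin || freq > fMax) continue;

    let re = 0, im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (2 * Math.PI * k * t) / n;
      re += signal[t] * Math.cos(angle);
      im -= signal[t] * Math.sin(angle);
    }
    const power = re * re + im * im;
    if (power > bestPower) {
      bestPower = power;
      bestFreq = freq;
    }
  }
  return bestFreq;
}

// Usage: dominantFrequencyHz(filteredSignal) * 60 gives BPM (1.2 Hz ≈ 72 BPM),
// where filteredSignal is the output of the Stage 3 preprocessing.
```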
Stage 5: Vital Sign Derivation
The final stage maps frequency-domain output to clinical metrics. Heart rate is the FFT peak frequency × 60. HRV is the standard deviation of RR intervals (SDNN) or the root mean square of successive differences (RMSSD). Stress is computed from HRV using validated physiological models — lower HRV indicates higher sympathetic nervous system activation (stress).
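The HRV formulas are short enough to show directly. Given inter-beat (RR) intervals in milliseconds from the time-domain peak detection, SDNN and RMSSD look like this:

```javascript
// HRV from inter-beat (RR) intervals in milliseconds, as produced by
// time-domain peak detection on the cleaned pulse signal.
function sdnn(rrIntervals) {
  const mean = rrIntervals.reduce((a, b) => a + b, 0) / rrIntervals.length;
  const variance =
    rrIntervals.reduce((sum, rr) => sum + (rr - mean) ** 2, 0) / rrIntervals.length;
  return Math.sqrt(variance); // standard deviation of RR intervals
}

function rmssd(rrIntervals) {
  let sumSq = 0;
  for (let i = 1; i < rrIntervals.length; i++) {
    sumSq += (rrIntervals[i] - rrIntervals[i - 1]) ** 2;
  }
  return Math.sqrt(sumSq / (rrIntervals.length - 1)); // successive differences
}

// Example: RR intervals around 830 ms correspond to roughly 72 BPM.
console.log(rmssd([820, 845, 810, 850, 830])); // ≈ 31 ms
```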
Integration: Beam AI SDK Walkthrough
Understanding the pipeline is useful context. But in practice, you don't implement any of it. The SDK abstracts all five stages behind a clean API. Here's what integration actually looks like:
```javascript
// Step 1: Install
// npm install @beam-ai/sdk
import BeamAI from '@beam-ai/sdk';

// Step 2: Initialize with your API key
const beam = new BeamAI({
  apiKey: 'your_api_key',
  mode: 'realtime', // or 'snapshot' for one-shot readings
});

// Step 3: Start measurement from video stream
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const session = await beam.startMeasurement({
  videoStream: stream,
  duration: 30, // seconds (10–60 recommended)
  metrics: ['heartRate', 'hrv', 'stress'],
});

// Step 4: Read results (fires at each measurement update)
session.onResult((result) => {
  console.log('Heart Rate:', result.heartRate.bpm);   // e.g. 72
  console.log('HRV (RMSSD):', result.hrv.rmssd);      // e.g. 42.3 ms
  console.log('Stress Score:', result.stress.score);  // 0–100 scale
  console.log('Confidence:', result.confidence);      // 0.0–1.0
});

session.onComplete((final) => {
  console.log('Final reading:', final);
  stream.getTracks().forEach(t => t.stop());
});
```
The SDK handles camera permission UX, face tracking, signal quality monitoring, and confidence scoring. If the user moves too much or lighting is poor, result.confidence drops below 0.7 and you can prompt them to hold still. The SDK also exposes an onQualityWarning callback for degraded signal conditions.
30 seconds is the sweet spot. Shorter windows (<10 s) produce noisy HRV estimates. Longer windows (>60 s) add friction. For most use cases, a 30-second measurement gives you medical-grade accuracy with acceptable UX.
Accuracy Comparison: How Beam AI Stacks Up
rPPG accuracy varies enormously between SDK vendors. The differences come from model quality, motion compensation, and how they handle diverse skin tones and lighting conditions. Here's how the major players compare:
| SDK | Heart Rate Accuracy | HRV | Pricing | Integration |
|---|---|---|---|---|
| Beam AI | ✓ 99.2% | ✓ Yes | $99/mo | npm · 3 lines |
| Binah.ai | ~95–97% | ✓ Yes | $50K+/yr | SDK + enterprise onboarding |
| NuraLogix | ~96% | ✓ Yes | Enterprise only | Custom integration required |
| Shen AI | ~93–95% | Limited | Contact sales | Mobile SDK only |
The 99.2% figure is heart-rate accuracy validated against clinical-grade ECG reference measurements across a diverse cohort (varied skin tones, lighting conditions, and age groups). Binah.ai and NuraLogix produce solid results, but their pricing structures make them inaccessible for startups and indie developers. Shen AI focuses on mobile-only deployments and has limited web support.
Use Cases Worth Building
Once you have a reliable rPPG signal in your app, the product surface area opens up significantly:
Telehealth & Remote Patient Monitoring
Add passive vitals capture to video consultation platforms. Clinicians get heart rate and HRV trends without requiring patients to have wearables. Particularly valuable for elderly patients or those with limited device literacy. HIPAA and GDPR compliance is table stakes here, and Beam AI handles both.
Wellness & Mental Health Apps
Stress detection before and after meditation sessions. HRV-gated breathing exercises that adapt to the user's real-time autonomic state. Mood tracking correlated with physiological data. The camera measurement creates objective data that complements self-reported mood.
Corporate Wellness Platforms
Voluntary periodic check-ins that surface stress trends at team or department level (anonymized). HR teams can identify burnout signals before they become attrition. Integrates naturally into existing video call infrastructure.
Fitness & Sports Performance
Recovery readiness scoring based on HRV. Pre-workout baseline measurements. Post-workout cool-down monitoring. All without requiring the user to wear a chest strap or smartwatch.
What Actually Limits Accuracy
It's worth being honest about the constraints. rPPG performance degrades under several conditions:
- Low or inconsistent lighting — fluorescent flicker and rapidly changing light sources create artifacts. Natural or stable indoor lighting works best.
- Significant motion — talking, nodding, or fidgeting during measurement reduces signal quality. Most SDKs handle small movements, but large movements require the window to restart.
- Darker skin tones — historically a known problem for rPPG. Beam AI's model was specifically trained on diverse skin tone datasets to minimize this disparity.
- Compression artifacts — heavy video compression (low-bandwidth streams) destroys the subtle pixel variations rPPG relies on. 720p at a reasonable bitrate is the practical minimum.
The confidence score the SDK returns accounts for all of these factors. If confidence drops below 0.6, surface a "please hold still" or "improve lighting" prompt to the user rather than displaying potentially inaccurate numbers.
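In practice that gating is a few lines. This sketch assumes the session and result shape from the integration snippet above; the threshold and the UI helpers (showPrompt, hidePrompt, renderVitals) are placeholders for your own product code:

```javascript
// Confidence gating sketch, assuming the session/result shape from the
// integration snippet above. Threshold and messaging are product decisions.
const CONFIDENCE_FLOOR = 0.6;

session.onResult((result) => {
  if (result.confidence < CONFIDENCE_FLOOR) {
    showPrompt('Hold still and face a steady light source'); // hypothetical UI helper
    return; // don't render numbers we don't trust
  }
  hidePrompt();          // hypothetical UI helper
  renderVitals(result);  // hypothetical UI helper
});

session.onQualityWarning((warning) => {
  console.warn('Signal quality degraded:', warning); // exact payload shape may vary
});
```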
Ready to Integrate?
The technical foundation is solid. The SDK abstracts the complexity. For most apps, you're looking at a half-day integration to get meaningful biometric data flowing from your users' cameras.
See the Why Beam AI page for the full feature comparison and compliance details. API keys and quickstart docs are coming soon.
Start Building With rPPG
99.2% accuracy · HIPAA & GDPR compliant · $99/mo · Three lines of code