
Hand Tracking

HandProcessor estimates 3D hand keypoints. It locates the hand region of interest (ROI) and predicts the pose keypoints, providing smooth, stable, and accurate pose estimation for the hand.

2D pixel hand keypoints - points in the screen coordinate space. The X and Y coordinates are normalized screen coordinates (scaled by the width and height of the input image), while the Z coordinate is depth within orthographic projection space and has the same scale as the X coordinate (normalized by image width). 2D points can be used for 2D overlays, math analysis, or with an orthographic camera.

3D metric points - points within the 3D space of a perspective camera located at the space origin and pointed in the negative direction of the Z-axis. These points can be used for 3D model overlays or virtual try-on. Rigged and skinned models can be rendered on top of the pose by aligning skeleton/armature joints with the 3D keypoints.

3D and 2D points are perfectly aligned: projections of the 3D points coincide with the 2D pixel coordinates within the perspective camera.
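As a rough illustration of the normalization described above, the sketch below converts one normalized 2D keypoint to pixel units. The `Keypoint` tuple layout and the `toPixels` helper are hypothetical names introduced for this example; they are not part of the SDK, whose actual result types may differ.

```typescript
// Hypothetical keypoint layout [x, y, z]; the SDK's real type may differ.
type Keypoint = [number, number, number];

// Scale a normalized keypoint to pixel units.
// X is scaled by image width, Y by image height, and Z by image width,
// because Z shares the scale of the X coordinate.
function toPixels(point: Keypoint, width: number, height: number): Keypoint {
    const [x, y, z] = point;
    return [x * width, y * height, z * width];
}

// Example with a made-up keypoint on a 1920x1080 input image.
const sample: Keypoint = [0.5, 0.25, 0.125];
console.log(toPixels(sample, 1920, 1080)); // [ 960, 270, 240 ]
```

Scaling Z by the image width keeps depth in the same units as X, which is what an orthographic overlay needs to preserve proportions.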

Additionally, the hand processor detects the 2D position and direction of the wrist. Wrist detection provides three lines in the screen coordinate space. The middle line defines the 2D wrist base (center) point and the unit direction vector of the wrist. Two more lines define the wrist edges: each is given by a 2D screen point at the end of the wrist along the transversal section through the base point, together with an associated direction vector. Wrist detection enables virtual try-on of accessories such as watches and bands.
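One possible way to use the three wrist lines for placing a 2D accessory overlay is sketched below. The `WristLine` shape and the `wristPlacement` helper are assumptions made for illustration and do not reflect the SDK's actual result layout. The sketch also assumes the points have already been converted from normalized to pixel coordinates, so that distances are isotropic.

```typescript
type Vec2 = [number, number];

// Hypothetical shape of one wrist line: a screen point and a direction vector.
interface WristLine {
    point: Vec2;      // pixel coordinates (assumed already de-normalized)
    direction: Vec2;  // unit direction vector in screen space
}

// Derive overlay placement: center, rotation angle, and apparent wrist width.
function wristPlacement(middle: WristLine, edgeA: WristLine, edgeB: WristLine) {
    // Angle of the wrist direction relative to the screen X axis, in radians.
    const angle = Math.atan2(middle.direction[1], middle.direction[0]);
    // Distance between the two edge points along the transversal section.
    const width = Math.hypot(
        edgeB.point[0] - edgeA.point[0],
        edgeB.point[1] - edgeA.point[1]);
    return { center: middle.point, angle, width };
}

// Example with made-up values: wrist pointing straight down the screen.
const placement = wristPlacement(
    { point: [500, 500], direction: [0, 1] },
    { point: [400, 500], direction: [0, 1] },
    { point: [600, 500], direction: [0, 1] });
console.log(placement.width); // 200
```

The returned angle and width could then drive the rotation and scale of a watch or band image drawn at the center point.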

HandProcessor emits HandResult objects storing the results of hand tracking, which are passed to the Renderer. HandEngine is a straightforward specialization of Engine for HandProcessor.

Below is a simple application utilizing HandProcessor. It adds runtime switching between the front and rear cameras. Note how the button is disabled during the switch to prevent concurrent pipeline state changes.

import { HandEngine } from "@geenee/bodyprocessors";
import { CustomRenderer } from "./customrenderer";
import "./index.css";

// Start with the front camera
let rear = false;
const engine = new HandEngine();
const token = location.hostname === "localhost" ?
    "localhost_sdk_token" : "prod.url_sdk_token";

async function main() {
    const container = document.getElementById("root");
    if (!container)
        return;
    // Mirror the output while the front camera is active
    const renderer = new CustomRenderer(
        container, "crop", !rear, "model.glb");

    const cameraSwitch = document.getElementById(
        "camera-switch") as HTMLButtonElement | null;
    if (cameraSwitch) {
        cameraSwitch.onclick = async () => {
            // Disable the button to prevent concurrent pipeline state changes
            cameraSwitch.disabled = true;
            rear = !rear;
            // Reconfigure and restart the pipeline with the other camera
            await engine.setup({ size: { width: 1920, height: 1080 }, rear });
            await engine.start();
            renderer.setMirror(!rear);
            cameraSwitch.disabled = false;
        }
    }

    // Attach the renderer and initialize the engine in parallel
    await Promise.all([
        engine.addRenderer(renderer),
        engine.init({ token: token })]);
    await engine.setup({ size: { width: 1920, height: 1080 }, rear });
    await engine.start();
}
main();
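The button-disabling guard in the click handler above could be factored into a small reusable helper. The sketch below is one possible variant, not part of the SDK: `withBusyGuard` is a hypothetical name, and it additionally uses `try`/`finally` so that the control is re-enabled even if the guarded action throws (for example, if camera access is denied).

```typescript
// Anything with a boolean `disabled` flag, e.g. an HTMLButtonElement.
interface Disableable {
    disabled: boolean;
}

// Run `action` while `target` is disabled, so a second click cannot
// start a concurrent state change. Resolves to false if already busy.
async function withBusyGuard(
    target: Disableable, action: () => Promise<void>): Promise<boolean> {
    if (target.disabled)
        return false;
    target.disabled = true;
    try {
        await action();
    } finally {
        // Re-enable the control even when the action fails
        target.disabled = false;
    }
    return true;
}
```

In the application above, the handler body could then be wrapped as `withBusyGuard(cameraSwitch, async () => { ... })`, with the camera switch logic placed inside the callback.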

Documentation of the following packages provides details on how to build more extensive applications with custom logic: