Release v0.7.5

Hair Segmentation

The most important new feature in this release is Hair Segmentation. This new type of tracking enables virtual try-on effects such as hair recoloring, hair patching in headwear virtual try-on, and many more. Hair Segmentation is GPGPU-powered and real-time, supporting smooth visual effects even on mobile devices.

The standard set of core tools is enough to enable hair segmentation in an app: the already available plugins that process segmentation masks are compatible with the hair segmentation mask by default. Developers can use familiar building blocks to assemble pipelines of visually rich AR effects, and ShaderPlugin can be used as the starting point of a custom visual effect on top of the hair segmentation mask.

Mask Post-Processing

MaskBinaryPlugin applies a binarization operation to a segmentation mask. This plugin can be used within a mask processing pipeline to filter pixels based on their probability, separating foreground and background pixels by a threshold.

MaskStepPlugin is a similar but more advanced plugin applying a smoothstep operation between two thresholds. Compared to the more aggressive Heaviside step function used in simple binarization, smoothstep provides a smooth transition of the mask value between foreground and background regions while still separating them. We recommend using MaskStepPlugin for smoother visuals.
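The difference between the two operations can be sketched in plain TypeScript (illustrative code, not the plugins' shaders), with `p` being a mask probability in [0, 1]:

```typescript
// Heaviside-style binarization (MaskBinaryPlugin-like behaviour):
// everything above the threshold becomes foreground (1), the rest background (0).
function binarize(p: number, threshold: number): number {
  return p >= threshold ? 1 : 0;
}

// GLSL-style smoothstep between two thresholds (MaskStepPlugin-like behaviour):
// values below e0 map to 0, above e1 to 1, with a smooth Hermite
// transition in between instead of a hard jump.
function smoothstep(e0: number, e1: number, p: number): number {
  const t = Math.min(Math.max((p - e0) / (e1 - e0), 0), 1);
  return t * t * (3 - 2 * t);
}
```

The smooth transition is what avoids the hard, aliased edge that plain binarization produces around the foreground boundary.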

MaskFilterPlugin performs temporal filtering of a segmentation mask based on minimization of entropy in probability space. It can be integrated into a mask post-processing pipeline to reduce temporal noise.

One notable new plugin designed specifically for the hair segmentation mask is MaskSharpPlugin. When used in the post-processing pipeline of a segmentation mask, it significantly improves accuracy. In the case of hair segmentation, for example, it can even recover separate locks of hair that were missed by the neural network. We recommend applying slight spatial smoothing (about 0.5) of the mask before MaskSharpPlugin. If not provided by the Processor, temporal smoothing of the segmentation mask can also improve final results. Despite the significant improvement in mask quality, MaskSharpPlugin is really fast and can be used in mobile browsers without sacrificing frames per second.

Body Patching

In this release we reimplemented body patching and the corresponding plugins. Partial body patching is no longer limited by the search radius for the closest body part defined by the “patch” and “keep” meshes. This means that no matter how large the distance between a pixel of the segmentation mask and the closest mesh, the pixel will be correctly classified as a “patch” or “keep” pixel. Moreover, this processing stage is now blazingly fast, and its processing time doesn’t depend on the search radius, which makes body patching more convenient across all virtual try-on types. Additionally, the same algorithm is applied within the patching procedure itself, improving its performance, quality, and accuracy. The same improvements and changes apply to full body patching.

On top of that, to improve the quality of image processing and overcome floating-point precision limits in shaders on mobile devices, we implemented multi-stage refinement of the results of intermediate processing stages. Additionally, the memory access pattern has been optimized for all GPGPU kernels across the Engeenee SDK.

Wrist Tracking

Hand pose detection was optimized to provide more precise and smoother tracking. We’ve implemented a totally new approach to wrist detection that provides much more robust and stable results and overcomes the major disadvantages of the previous algorithm. The new wrist estimation borrows many principles from the phalanx refinement procedure introduced in the previous release, with promising results. The wrist is approximated by an elliptical cylindroid with subpixel accuracy. The new detector also performs much better: it’s at least twice as fast, so virtual try-on of watches and bracelets has gained a significant boost in FPS. On top of that, the GPGPU kernels of the wrist detector were significantly optimized for real-time usage on mobile devices. Wrist detection can be enabled by setting the corresponding parameter, and WristTrackPlugin may be utilized to align a scene node with the wrist estimation.

HandAlignPlugin has been considerably reworked. We now minimize the back-projection error when fitting the hand armature to detected keypoints, iteratively converging the skeleton from its approximated pose to the keypoints while preserving relative lengths and scales. We use a similar approach when aligning a body armature. HandAlignPlugin still requires some work to be 100% ready for virtual try-on of gloves; we plan to complete the implementation within the next release cycle.

Pose Alignment

This release introduces a number of improvements in the alignment of an armature with a detected 3D pose, including more natural kinematics, both forward and inverse. We improved the fitting of a skeleton to detected points by relaxing some hard requirements that, when enforced, led to less natural deformations of the mesh. The kinematics of the shoulder bones has been reworked to reduce deformations in the shoulder and underarm areas. We also fixed a rescaling issue near the shoulders using the same approach we implemented for limbs in the previous version of the SDK.

The spine curve parameter of ClothAlignPlugin has been fixed: when not specified or set to null | undefined, the default value of 1.0 is used. At the same time, a value of 0.0 is now treated properly and disables curvature recovery. The behaviour of the spine bones has also been improved to reduce unnatural rescaling.
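A minimal sketch of the corrected default handling (the function and parameter names here are hypothetical, not the SDK’s internals): the key point is that an explicit 0.0 must survive, while only null and undefined fall back to the 1.0 default.

```typescript
// Hypothetical helper illustrating the fix: nullish coalescing (`??`)
// falls back to 1.0 only for null/undefined, whereas a truthiness
// check (`||`) would incorrectly replace a valid 0.0, which must
// disable curvature recovery.
function resolveSpineCurve(spine?: number | null): number {
  return spine ?? 1.0;
}
```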

Image Texture

ImageTexture now supports generation of mipmap levels via a new convenient API. Mipmaps can considerably improve the performance of certain image processing algorithms and visual effect shaders. The number of mipmap levels required to reach 1x1 resolution is evaluated internally, and mipmaps are generated automatically on texture update or upload. ShaderProgram provides an option to opt in to an output texture with auto-generated mipmaps, which can be useful when chaining image processing stages.
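For reference, the level count mentioned above follows from repeated halving of the base resolution; a sketch of the arithmetic (not the SDK’s actual code):

```typescript
// Each mip level halves the dimensions (rounding down), so reaching
// 1x1 from a WxH base takes floor(log2(max(w, h))) + 1 levels,
// counting the base level itself.
function mipLevelCount(width: number, height: number): number {
  return Math.floor(Math.log2(Math.max(width, height))) + 1;
}
```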

The new read() method of ShaderProgram copies GPU texture data to host memory, providing access to raw pixel values. The helper TextureReader class provides functionality to get texture data on the client side and use it within image processing pipelines. An important utility provided by TextureReader is non-blocking asynchronous reading of texture data via readAsync(). In scenarios where data is not needed immediately or can be accessed with a delay, readAsync() can considerably improve performance: a blocking read stalls the whole GPU pipeline and waits until all operations that have the texture as a dependency are finished, which significantly decreases performance and GPU utilization. The implementation of readAsync() is thread-safe and correctly handles WebGL state recovery.

Generation of mipmaps and asynchronous texture reads allowed us to reimplement BrightnessPlugin and the derived LightsPlugin and make them very fast. They can now be used in rendering pipelines without sacrificing frames per second. Additionally, BrightnessPlugin was fine-tuned to react to illumination changes much faster.

API Improvements

We’ve improved the API of video capture setup and made it more flexible. Previously, two sets of capture options were available. The first option is the standard MediaStreamConstraints, defining the set of device capabilities and allowing complete control over the selection of a video device. The second option is our simplified VideoParams, providing basic selection of a video device. The main downside was that advanced features like rotation or cropping of the input video stream were available only through VideoParams. To resolve this limitation, we’ve merged the VideoParams and MediaStreamConstraints options. Fine-grained specification of the device is now available via the opts field of VideoParams. This allows, for example, requesting a specific deviceId or even focus or white balance modes. The provided MediaStreamConstraints have higher priority than the rest of the VideoParams parameters.
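A hypothetical example of the merged options: the `opts` field carrying standard MediaStreamConstraints is the new API described above, while the other field names in this object are illustrative assumptions rather than the exact VideoParams shape.

```typescript
// Illustrative shape only: `opts` holds standard MediaStreamConstraints
// (higher priority than the simplified parameters around it); the
// `rotation` field and the device id are placeholder examples.
const videoParams = {
  rotation: 0, // hypothetical simplified VideoParams option
  opts: {
    video: {
      deviceId: { exact: "my-camera-id" }, // placeholder device id
      whiteBalanceMode: "continuous",
    },
  },
};
```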

SDK users can provide custom options to the underlying babylon.js or three.js renderer when initializing BabylonUniRenderer or ThreeRenderer. This can be useful when more fine-grained control over the rendering pipeline or canvas settings is required.

Other Changes

  • The documentation page of the Engeenee SDK is now generated by Starlight/Astro.
  • Several fixes were made to saving the state of the rendering pipeline when a WebGL context is shared between several clients.
  • Fine-tuning of the temporal filter for 2D and 3D pose keypoints.
  • Optimized memory access in GPGPU kernels, improving performance.

Release v0.7.3

Hand Detection

This release makes a big step in hand detection and tracking, adding many tools and plugins. Hand tracking can now be used to the full extent, including alignment of a hand armature, a more advanced and precise hand model for rings virtual try-on, improved wrist tracking for watches and bracelets VTO, and more.

HandAlignPlugin aligns a node’s armature with the hand pose estimated by HandProcessor. Basically, it evaluates positions, rotations, and scales of armature bones based on detected keypoints, then iteratively applies these transforms to bones following the skeletal hierarchy. The plugin supports rigs compatible with Clo3D and Marvelous Designer avatars, the most common rig standard in apparel modeling software. The controlled scene node must contain an armature among its children, and the bones of the armature must follow the Clo3D naming convention and hierarchy. Development of the plugin is not completely finished; in particular, thumb alignment is not as good as we would like it to be. But it can already be used in cases where precise thumb alignment is not a hard requirement.

Detection of handedness is now more robust and less noisy; it won’t flip between frames of a video stream. All alignment algorithms and plugins account for handedness to evaluate proper rotation quaternions for left and right hands.

Fingers Detection

Within hand detection we introduce more precise detection of fingers, approximating every phalanx by a cylinder. Morphological and statistical analysis of phalanxes is performed as a post-processing stage of hand pose detection. During this stage, phalanx edges are detected with sub-pixel accuracy taking their occlusions into account, and the initial statistical approximation is iteratively refined to minimize an error measure. HandProcessor outputs both pixel and metric parametrizations of the phalanx cylinders. Output data is filtered using the same approach we apply to reduce temporal noise in detected keypoints; hand detection and phalanx approximation are synchronized. The phalanx estimation can be used as an approximation of the detected hand in a scene by rendering cylinders in place of phalanxes and spheres in place of their joints. Precise detection of phalanxes is crucial for virtual try-on of rings and gloves.

Approximation of phalanxes is enabled by default. In the future, we will make it optional to improve performance in cases where precise approximation isn’t needed. Phalanx detections are output by HandProcessor within the HandResult object as the phalanxes field. For every phalanx, PhalanxDetection defines metric 3D coordinates of its center and of two edge points located in the same cross-section as the center.
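As a sketch of how these points relate to the cylinder model (the types below are illustrative, not the exact PhalanxDetection shape): the two edge points of a cross-section sit on opposite sides of the cylinder, a diameter apart.

```typescript
// Illustrative 3D point type; PhalanxDetection's actual fields may differ.
type Vec3 = [number, number, number];

function distance(a: Vec3, b: Vec3): number {
  return Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);
}

// The two edge points lie on opposite sides of the same cross-section,
// so the cylinder radius is half the distance between them.
function phalanxRadius(edgeA: Vec3, edgeB: Vec3): number {
  return distance(edgeA, edgeB) / 2;
}
```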

Hand Model

A 3D model approximating the hand’s fingers by cylinders and spheres can be built using the buildGeometry() static method of HandFitPlugin. HandFitPlugin in turn fits HandGeometry to the hand pose and phalanxes estimated by HandProcessor. It evaluates positions, rotations, and scales of phalanxes based on detected keypoints and then applies these transforms to the sub-components of the hand geometry.

HandFitPlugin can be combined with HandAlignPlugin in scenarios where the former approximates the fingers with higher precision and the latter is used for a coarser approximation of the palm (the rest of the hand). This combination is an efficient way to build and control a hand occluder in rings virtual try-on applications, where high-precision approximation of fingers is required.

RingFitPlugin is the final building block of a rings virtual try-on application. This plugin fits a ring object to the pose estimated by HandProcessor. Using the tracking data, RingFitPlugin estimates the transformation of the ring 3D object, fitting it on the selected finger. The plugin supports rings having unit inner diameter and lying in the xz plane, with the y axis being the center of the ring’s inner circle and the x axis pointing in the direction of the ring’s head. The offset of the ring from the world origin along the y axis defines how far it will be from the phalanx start. RingFitPlugin is compatible with, and intended to be used in combination with, HandFitPlugin.
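Under the convention above, fitting a unit-inner-diameter ring reduces to simple arithmetic. The helper names below are hypothetical illustrations, not the RingFitPlugin API:

```typescript
type Vec3 = [number, number, number];

// A ring modeled with unit inner diameter means the uniform scale
// factor equals the target phalanx diameter (e.g. in meters).
function ringScale(phalanxDiameter: number): number {
  return phalanxDiameter / 1.0;
}

// Place the ring along the finger: start at the phalanx base and move
// by the ring's own y-offset (scaled) along the finger axis, mirroring
// how the world-origin offset maps to distance from the phalanx start.
function ringPosition(base: Vec3, axis: Vec3, offset: number, scale: number): Vec3 {
  return [
    base[0] + axis[0] * offset * scale,
    base[1] + axis[1] * offset * scale,
    base[2] + axis[2] * offset * scale,
  ];
}
```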

Wrist Detection

We completely reworked the wrist detection algorithms. The new approach is significantly faster and provides more stable and accurate results. GPGPU shaders were significantly refactored to reduce the usage of vector registers and to optimize the computational flow and memory access pattern.

Wrist detection is now an optional feature of hand tracking, to improve performance in cases where wrist tracking is not required. It must be explicitly enabled by setting the wrist parameter to true when initializing a HandProcessor.

New Plugins

ClothTwinPlugin is an equivalent of PoseTwinPlugin for the newer Clo3D / Marvelous Designer armature. Engeenee SDK supports the Clo3D armature as the main rig for virtual try-on applications, and improvements of fitting algorithms are developed for this skeletal structure first. For a long time, twin functionality was available only for legacy armatures compatible with Mixamo / Ready Player Me. Now visual effects utilising user twins are available with models rigged against the more advanced armature types. This will simplify the design of 3D models and allow use of all recent improvements that are primarily developed for the newer armature types. The relative pose of a twin can be adjusted at runtime using the corresponding methods of the plugin, allowing smooth transitional animations.

DelayPlugin is a factory function adding a delay to any other plugin. In the delayed plugin, the update() method is called with detection results from one of the previous iterations of the tracking engine. This way we achieve the effect of an object following its controller with some delay. The delay may be used in combination with various twin plugins to achieve less artificial behaviour of twins: they repeat user movements without absolute synchronization in time and thus look more natural.
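Conceptually, the delay can be modeled as a fixed-size queue of detection results. This standalone sketch is an assumption about the mechanism, not the SDK’s implementation:

```typescript
// Fixed-size queue that returns the detection result from `delay`
// engine iterations ago, so an object controlled by the popped result
// lags behind its controller. Before the buffer warms up, the oldest
// available result is returned.
class DelayBuffer<T> {
  private queue: T[] = [];
  constructor(private delay: number) {}

  push(result: T): T {
    this.queue.push(result);
    if (this.queue.length > this.delay) return this.queue.shift()!;
    return this.queue[0];
  }
}
```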

VideoMergePlugin is a simple ShaderPlugin that merges the input video stream with a background texture. The merge is done by linear interpolation, and the interpolation weight (alpha) is a tunable input parameter that can be changed at runtime for a smooth transition between real and virtual backgrounds.
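The per-channel math is a standard linear interpolation, equivalent to GLSL mix(); a plain TypeScript sketch (not the plugin’s shader):

```typescript
// alpha = 0 shows the real video frame, alpha = 1 the virtual
// background; intermediate values blend the two per channel.
function mix(video: number, background: number, alpha: number): number {
  return video * (1 - alpha) + background * alpha;
}
```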

BgReplacePlugin and VideoMergePlugin are extended with a setMirror(boolean) method that mirrors the virtual background, for example when the Renderer is in mirror mode. One needs to manually pass the proper mirror parameter from the renderer to the attached plugins.

Three.js Renderer

This release addresses the feature gap between the babylon.js and three.js renderers. The list of updates and features ported to the three.js renderer:

  • OccluderMaterial similar to the one used in babylon.js renderers.
  • Utility classes and generics for HandResult.
  • Armature utilities to make the code base cleaner.
  • HandAlignPlugin.
  • HandFitPlugin.
  • Code refactored to use typed imports, reducing bundle size.
  • three.js updated to the latest version at the moment.

Refraction Material

RefractionMaterial utilizes real-time ray tracing to calculate refractions of light inside the volume of a mesh. It can be used for realistic rendering of diamonds and other gems. Material parameters allow fine-tuning of the ray tracing mechanics and optimizing the quality-performance trade-off. The environment texture is used as a spherical source of light refracting inside the mesh; one can use a dynamic texture to reflect surrounding objects of the scene. Other post-processing techniques can be applied on top of the initial render to achieve even better quality of gem visualization. Currently RefractionMaterial is available only in the three.js renderer; we are planning to implement a babylon.js version in the future.

Body and Cloth Fitting

We’ve disabled rescaling of limbs by default. Previously, to perfectly fit arms and legs to detected keypoints, we rescaled them so that the end of a bone matched the detected joint coordinate. While this approach indeed provides perfect alignment of the limb armature, as a side effect arms and legs may scale unnaturally: for example, the upper arm may become thinner than the lower arm, which is very noticeable near the elbow. Disabling limb rescaling makes the relative proportions of body parts more natural. The previous rescaling behaviour may be enabled back by setting the scaleLimbs property of PoseTuneParams.

Additionally, a drifting scale error within the hand alignment algorithms has been resolved, making alignment of the hand rig more precise.

Performance Optimizations

  • We’ve added in-place operations to the internal linear algebra library. Using them in certain algorithms improves performance and reduces memory usage.
  • Trackers use more advanced image pre-processing to increase detection quality.
  • We’ve compressed the binaries of all neural networks, which improves the loading and initialization times of virtual try-on applications.
  • The internal dependency on the numeric package has been replaced with svd-js.
  • Optimization of imports to reduce bundle size.
  • @geenee/bodyrenderers-three uses the latest version of Three.js.
  • The code of all SDK examples has been updated; deprecated plugins are replaced.
  • The UI of SDK examples is updated according to the new default style.

API Improvements

  • ShaderProgram supports more types of uniforms; namely, integer uniforms are now supported.
  • ResponsiveCanvas is extended with a setter and getter for mirror mode, allowing it to be changed at runtime.
  • The new ImageCapture video grabber allows using static images as video sources. It can be used for debugging or fine-tuning, applying a virtual try-on or AR effect to a still target.
  • We are replacing Webpack with Vite, which provides much faster bundling times and is much easier to configure. All SDK examples have been ported to Vite.

Release v0.7.1

Alignment for Hands

The new HandAlignPlugin is a universal plugin aligning a node’s armature with the hand pose estimated by HandProcessor. The plugin supports rigs compatible with Clo3D and Marvelous Designer avatars, the most common rig standard in cloth/apparel modeling software. At the moment the plugin is a work in progress; we are optimizing the alignment algorithms to provide more natural fitting of 3D models.

Background Replacement

We introduce BgReplacePlugin, a new shader plugin replacing the background region of an image. The evaluated segmentation mask defines the image foreground, which stays untouched. Foreground-background classification is based on two thresholds defining an uncertainty interval: a probability above the foreground threshold classifies a pixel as foreground, and below the background threshold as background. FG pixels are kept untouched, while BG pixels are replaced with the corresponding pixels of the background texture. For pixels within the uncertainty region, weighted interpolation between the image and background textures takes place; the weight is evaluated by rescaling the probability within the uncertainty interval to [0..1]. BgReplacePlugin has a universal API: it allocates and provides access to a texture whose content is used as the background substitution, and this texture can be loaded from any source like a canvas, image, or video.
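The classification rule above can be sketched as a single weight function (illustrative code, not the plugin’s shader):

```typescript
// Given a mask probability p and the two thresholds, compute the
// foreground weight in [0..1] used to blend the image over the
// background texture.
function foregroundWeight(p: number, bgThreshold: number, fgThreshold: number): number {
  if (p <= bgThreshold) return 0; // background: fully replaced
  if (p >= fgThreshold) return 1; // foreground: kept untouched
  // uncertainty interval rescaled to [0..1] for weighted interpolation
  return (p - bgThreshold) / (fgThreshold - bgThreshold);
}
```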

Deprecated Plugins

OccluderPlugin had been deprecated for a while and has been removed in this release. Please use OccluderMaterial or OccluderMaskPlugin directly to turn scene meshes into occluders; this is a more flexible and universal approach to control the rendering of 3D objects in a scene.

The related PoseOutfitPlugin has also been removed. It made child meshes of the attached scene node occluders based on their names. The same logic can be implemented externally: one can use PoseAlignPlugin to align the node’s rig with the detected pose and manually make the required child meshes occluders using OccluderMaterial or the OccluderMaskMaterial provided by OccluderMaskPlugin.
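The name-based part of that logic is a simple recursive traversal. The minimal node interface below is illustrative; in a real app you would walk the renderer’s scene graph and assign OccluderMaterial (or OccluderMaskMaterial) to each matched mesh yourself:

```typescript
// Minimal illustrative node shape; real scene-graph nodes carry
// materials, transforms, etc.
interface SceneNode {
  name: string;
  children: SceneNode[];
}

// Collect all descendants (including the root) whose names match the
// pattern; the caller then assigns an occluder material to each match.
function findByName(root: SceneNode, pattern: RegExp, out: SceneNode[] = []): SceneNode[] {
  if (pattern.test(root.name)) out.push(root);
  for (const child of root.children) findByName(child, pattern, out);
  return out;
}
```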

Other Changes

Common utility methods used within armature alignment algorithms have been moved to the PoseUtils namespace within the @geenee/bodyrenderers-babylon package, making them available to all plugins fitting rigs to detected poses. This shares the implementation and reuses the same API across plugins, providing a more robust code base.

Other notable minor changes are:

  • Performance optimization of partial body patching.
  • Changes in body pose alignment are adopted in PoseTwinPlugin.
  • @geenee/bodyrenderers-babylon uses the latest version of Babylon.js.