The research team at Ulsan National Institute of Science and Technology (UNIST) uses the BIGS technique to reconstruct hand-object interactions from various perspectives. (Photo: UNIST)

A South Korean research team has developed artificial intelligence (AI) technology that reconstructs 2D footage of both hands manipulating an unfamiliar object into 3D. The advance makes it possible to faithfully reproduce, in an augmented reality (AR) display, scenes such as a simulated surgery in which both hands and medical instruments are intertwined.

Professor Baek Seung-ryeol and his research team at the Ulsan National Institute of Science and Technology (UNIST) announced on the 9th that they have developed an AI model called "BIGS (Bimanual Interaction 3D Gaussian Splatting)," which can visualize the complex interactions between both hands and unfamiliar objects in 3D in real time from a single RGB image. RGB refers to the color model that represents images using red, green, and blue channels.

Because AI receives only the 2D data captured by a camera, it must reconstruct that data in 3D to determine the actual positions and three-dimensional shapes of the hands and objects. Existing technologies struggle to reproduce realistic interaction scenes in AR or virtual reality (VR) because they can recognize only one hand or handle only pre-scanned objects.
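
To see why a single image is ambiguous, the minimal Python sketch below (not taken from the UNIST paper; the camera parameters and fingertip coordinates are hypothetical) shows that with a simple pinhole camera, many different 3D points project to the same pixel, so 3D positions must be inferred rather than read off directly.

```python
# Minimal sketch (not from the UNIST paper) of the 2D-to-3D ambiguity:
# with a pinhole camera, many 3D points project to the same pixel, so the
# 3D positions of hands and objects must be inferred from learned priors.
import numpy as np

def project(point_3d, focal=500.0, cx=320.0, cy=240.0):
    """Project a 3D point (camera coordinates, z > 0) to pixel coordinates."""
    x, y, z = point_3d
    return np.array([focal * x / z + cx, focal * y / z + cy])

# Two fingertip candidates at different depths along the same viewing ray...
near_fingertip = np.array([0.10, 0.05, 0.50])   # 0.5 m from the camera
far_fingertip = near_fingertip * 2.0            # same ray, 1.0 m away

# ...land on exactly the same pixel: the ambiguity a reconstruction model
# must resolve using what it has learned about hand and object shape.
print(project(near_fingertip))   # [420. 290.]
print(project(far_fingertip))    # identical pixel coordinates
```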

The BIGS model developed by the research team reliably predicts the full shape of a hand even when it is occluded or only partially visible, and it naturally fills in unseen parts of unfamiliar objects using learned visual information. In addition, a scene can be reconstructed from a single RGB image taken with one camera, without depth sensors or multi-view camera rigs, making the technology easy to apply in the field.

This AI model is based on 3D Gaussian splatting, a method that represents the shape of an object as a cloud of small Gaussian points (splats), which makes it possible to naturally reconstruct the contact surface where a hand and an object meet. The approach can struggle when the hands overlap or are partially occluded, but the team addressed this by aligning both hands to the structure of a single reference hand.
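
For readers unfamiliar with the representation, the following minimal Python sketch illustrates the general Gaussian splatting idea described above, not the BIGS model itself: every class name, camera parameter, and splat value is hypothetical, and the real technique uses anisotropic Gaussians and differentiable rendering.

```python
# Illustrative sketch of Gaussian splatting (NOT the BIGS implementation):
# a scene is stored as a cloud of Gaussian "splats", each with a position,
# size, color, and opacity; an image is formed by projecting the splats and
# alpha-compositing them from front to back.
import numpy as np

class GaussianSplat:
    def __init__(self, mean, scale, color, opacity):
        self.mean = np.asarray(mean, dtype=float)    # 3D center of the splat
        self.scale = float(scale)                    # isotropic radius (simplified)
        self.color = np.asarray(color, dtype=float)  # RGB in [0, 1]
        self.opacity = float(opacity)                # peak alpha in [0, 1]

def render(splats, height=64, width=64, focal=60.0):
    """Project splats with a pinhole camera at the origin and composite them."""
    image = np.zeros((height, width, 3))
    transmittance = np.ones((height, width))         # remaining light per pixel
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = width / 2.0, height / 2.0

    for g in sorted(splats, key=lambda s: s.mean[2]):  # near-to-far ordering
        if g.mean[2] <= 0:
            continue
        # Perspective projection of the splat center and its screen-space radius.
        u = focal * g.mean[0] / g.mean[2] + cx
        v = focal * g.mean[1] / g.mean[2] + cy
        radius = focal * g.scale / g.mean[2]

        # 2D Gaussian footprint of the splat on the image plane.
        alpha = g.opacity * np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2 * radius ** 2))

        # Front-to-back alpha compositing; overlapping splats (e.g. a hand
        # touching an object) blend smoothly at the contact region.
        image += (transmittance * alpha)[..., None] * g.color
        transmittance *= 1.0 - alpha

    return np.clip(image, 0.0, 1.0)

# Example: two "hand" splats in contact with one "object" splat.
scene = [
    GaussianSplat([-0.2, 0.0, 3.0], 0.15, [0.9, 0.7, 0.6], 0.8),
    GaussianSplat([0.0, 0.0, 3.2], 0.25, [0.2, 0.4, 0.9], 0.9),
    GaussianSplat([0.2, 0.1, 3.0], 0.15, [0.9, 0.7, 0.6], 0.8),
]
frame = render(scene)  # (64, 64, 3) image; an optimizer would fit splat
                       # parameters so renders like this match real photos
```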

In experiments on real-world international datasets, BIGS outperformed existing technologies not only in reconstructing hand poses, object shapes, and hand-object contact information, but also in rendering quality.

Professor Baek Seung-ryeol said, "This research is expected to be used as a real-time interaction reconstruction technology in fields such as VR, AR, robot control, and remote surgery simulation."

The research findings were accepted to the 2025 Conference on Computer Vision and Pattern Recognition (CVPR), to be held in the United States for five days beginning June 11.