Diffusion-enabled 3D Human Pose Tracking, Data Augmentation, Completion, and Acceleration
In recent years, 3D human activity recognition and tracking have become important topics in human-computer interaction. To preserve user privacy, there is considerable interest in sensing techniques that do not rely on video cameras. In this talk, we first present RFID-Pose, a vision-assisted 3D human pose estimation system based on deep learning (DL). The performance of DL models depends on the availability of sufficient high-quality radio frequency (RF) data, which is more difficult and expensive to collect than other types of data. To overcome this obstacle, in the second part of the talk we present generative AI approaches for producing labeled synthetic RF data for multiple wireless sensing platforms, such as WiFi, RFID, and mmWave radar, including a conditional Recurrent Generative Adversarial Network (R-GAN) approach and diffusion/latent-diffusion-based approaches. Next, we propose a novel framework that leverages latent diffusion transformers to synthesize high-quality RF data, as well as a latent diffusion transformer with cross-attention conditioning that accurately infers missing joints in skeletal poses, completing full 25-joint configurations from partial (i.e., 12-joint) inputs using received RF sensing data. Finally, we present our recent work TF-Diff, a training-free diffusion framework for cross-domain RF-based human activity recognition (HAR) that enables effective adaptation with minimal target-domain data.

Co-sponsored by: Instituto Tecnológico de Cuautla (TecNM-Cuautla)
Speaker(s): Shiwen Mao
Room: Emiliano Zapata, Bldg: H, Libramiento Cuautla-Oaxaca S/N, Colonia Juan Morales, Cuautla, Morelos, Mexico, 62745
Virtual: https://events.vtools.ieee.org/m/542574