Publications

Six publications spanning reinforcement learning (RL), procedural content generation (PCG), multimodal representation learning, and human sensing.

Preprint 2025 · arXiv

Shared Representation for 3D Pose Estimation, Action Classification, and Progress Prediction from Tactile Signals

Isaac Han, Seoyoung Lee, Sangyeon Park, Ecehan Akan, Yiyue Luo, Jeffrey DelPreto, Kyung-Joong Kim

SCOTTI (Shared COnvolutional Transformer for Tactile Inference) addresses three tasks simultaneously from foot tactile signals: 3D human pose estimation, action classification, and action progress prediction. This is the first work to explore action progress prediction from foot tactile signals. Multi-task learning lets the tasks reinforce one another, outperforming task-specific models. A new dataset with 15 participants, 8 actions, and 200,000+ synchronized tactile and visual frames is introduced.

human pose estimation · tactile sensing · multi-task learning · transformer · action recognition
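
The multi-task design above can be pictured with a small PyTorch-style sketch: one shared encoder feeding three task heads. The architecture, dimensions, and equal loss weighting below are illustrative assumptions, not SCOTTI's actual convolutional-transformer configuration.

```python
# Illustrative multi-task model: one shared encoder, three task heads.
# All dimensions, layers, and loss weights are assumptions, not SCOTTI's.
import torch
import torch.nn as nn

class SharedTactileModel(nn.Module):
    def __init__(self, in_dim=1200, hidden=256, n_joints=17, n_actions=8):
        super().__init__()
        # Stand-in for the shared convolutional-transformer backbone.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.pose_head = nn.Linear(hidden, n_joints * 3)   # 3D pose regression
        self.action_head = nn.Linear(hidden, n_actions)    # action classification
        self.progress_head = nn.Linear(hidden, 1)          # progress in [0, 1]

    def forward(self, x):
        z = self.encoder(x)
        return (self.pose_head(z),
                self.action_head(z),
                torch.sigmoid(self.progress_head(z)))

def multitask_loss(model, x, pose_gt, action_gt, progress_gt):
    pose, logits, progress = model(x)
    # Equal weighting is an assumption; such terms are typically tuned.
    return (nn.functional.mse_loss(pose, pose_gt)
            + nn.functional.cross_entropy(logits, action_gt)
            + nn.functional.mse_loss(progress.squeeze(-1), progress_gt))
```
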
Under Review 2025 · IEEE Conference on Games (CoG 2026)

Multi-Objective Instruction-Aware Representation Learning in PCGRL

Sung-Hyun Kim, Geumhwan Hwang, In-Chang Baek, Seo-Young Lee, Kyung-Joong Kim

MIPCGRL proposes a multi-objective representation learning method for language-instructed PCGRL. The existing IPCGRL struggles with complex multi-objective instructions because of its limited expressive capacity. MIPCGRL introduces a task-specific encoder trained with multi-label classification and multi-head regression to disentangle task representations. Experimental results show up to a 13.8% improvement in controllability over IPCGRL on multi-objective instructions while maintaining single-task performance.

procedural content generation · reinforcement learning · multi-objective optimization · representation learning
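
A minimal sketch of the encoder objective described above, assuming a sentence-embedding input: a multi-label classification head identifies which objectives an instruction mentions, and per-objective regression heads predict their target values. Module names, dimensions, and the unweighted loss sum are assumptions, not MIPCGRL's exact design.

```python
# Illustrative encoder objective: multi-label task classification plus
# one regression head per objective. All shapes and names are assumptions.
import torch
import torch.nn as nn

class InstructionEncoder(nn.Module):
    def __init__(self, emb_dim=384, hidden=256, n_tasks=4):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.task_cls = nn.Linear(hidden, n_tasks)  # which objectives are mentioned
        self.task_reg = nn.ModuleList(
            [nn.Linear(hidden, 1) for _ in range(n_tasks)])  # target per objective

    def forward(self, sent_emb):
        h = self.trunk(sent_emb)
        logits = self.task_cls(h)                                  # multi-label logits
        values = torch.cat([head(h) for head in self.task_reg], dim=-1)
        return h, logits, values

def encoder_loss(logits, values, task_mask, task_targets):
    # Multi-label BCE over objectives + regression only where an objective is present.
    cls = nn.functional.binary_cross_entropy_with_logits(logits, task_mask)
    reg = (task_mask * (values - task_targets) ** 2).sum() / task_mask.sum().clamp(min=1)
    return cls + reg
```
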
Under Review 2025 · IEEE Transactions on Games (ToG)

Human-Aligned Procedural Level Generation RL via Text-Level-Sketch Shared Representation

In-Chang Baek*, Seo-Young Lee*, Sung-Hyun Kim, Geumhwan Hwang, Kyung-Joong Kim

Human-aligned AI is a critical component of co-creativity. This paper proposes VIPCGRL (Vision-Instruction PCGRL), a novel deep RL framework that incorporates three modalities (text, level, and sketch) to broaden the available control modalities and enhance human-likeness in procedural content generation. A shared embedding space is trained via quadruple contrastive learning across modalities and human-AI styles, and the policy is aligned through an auxiliary reward based on embedding similarity. Experimental results show that VIPCGRL outperforms existing baselines in human-likeness, on both quantitative metrics and human evaluations, and demonstrates zero-shot cross-modal generalization.

procedural content generation · reinforcement learning · multimodal representation · human-AI alignment · contrastive learning
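
The embedding-similarity auxiliary reward can be sketched as a cosine-similarity bonus in the shared embedding space; the blending weight `alpha` and the function name below are illustrative assumptions, not VIPCGRL's published formulation.

```python
# Illustrative auxiliary reward: cosine similarity between the embedding of
# the generated level and the conditioning instruction/sketch embedding.
# The function name and blending weight alpha are assumptions.
import torch
import torch.nn.functional as F

def auxiliary_reward(level_emb: torch.Tensor,
                     target_emb: torch.Tensor,
                     env_reward: float,
                     alpha: float = 0.5) -> float:
    # Similarity in the shared text/level/sketch embedding space.
    sim = F.cosine_similarity(level_emb, target_emb, dim=-1).item()
    # Blend the task reward with the alignment signal.
    return env_reward + alpha * sim
```
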
Published 2025 · IEEE Conference on Games (CoG 2025)

IPCGRL: Language-Instructed RL for Procedural Level Generation

In-Chang Baek, Sung-Hyun Kim, Seo-Young Lee, Dong-Hyeon Kim, Kyung-Joong Kim

IPCGRL introduces a language-instructed PCGRL framework that conditions a deep RL agent for procedural level generation on sentence embeddings. IPCGRL fine-tunes task-specific embedding representations to compress game-level conditions expressed in natural language. Evaluated on a 2D level generation task, IPCGRL achieves up to a 21.4% improvement in controllability and a 17.2% improvement in generalizability on unseen instructions with varied condition expressions.

procedural content generation · reinforcement learning · natural language processing · instruction following
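
Conditioning a policy on a sentence embedding can be sketched as simple concatenation with the level observation before the policy network; the dimensions and architecture below are assumptions, not IPCGRL's exact network.

```python
# Illustrative conditioning: concatenate a sentence embedding of the
# instruction with the (flattened) level observation. Dimensions are assumed.
import torch
import torch.nn as nn

class ConditionedPolicy(nn.Module):
    def __init__(self, obs_dim=256, instr_dim=384, hidden=256, n_actions=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + instr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs, instr_emb):
        # Returns action logits for the level-editing action space.
        return self.net(torch.cat([obs, instr_emb], dim=-1))
```
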
Published 2025 · IEEE Access

Automatic Curriculum Design for Zero-Shot Human-AI Coordination

Won-Sang You, Tae-Gwan Ha, Seo-Young Lee, Kyung-Joong Kim

Zero-shot human-AI coordination trains an ego-agent to coordinate with humans without using human data. Most prior work focuses on improving coordination with unseen co-players but ignores generalization to unseen environments. This paper extends multi-agent unsupervised environment design (UED) to zero-shot human-AI coordination by proposing a return-based utility function and prioritized co-player sampling. Evaluated in the Overcooked-AI environment with real humans (N=20), the method outperforms baselines on all evaluation layouts, achieving higher collaborativeness ratings and human preference scores.

human-AI coordination · reinforcement learning · curriculum learning · zero-shot generalization · multi-agent
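
The return-based utility and prioritized co-player sampling might look roughly like the following; the utility form, softmax temperature, and function names are assumptions rather than the paper's exact definitions.

```python
# Illustrative UED-style scoring and sampling. The exact utility and
# prioritization scheme in the paper may differ.
import numpy as np

def utility(returns, max_return):
    # Return-based utility: prefer layouts where the team underperforms,
    # i.e. where there is still room to improve coordination.
    return max_return - np.mean(returns)

def sample_coplayer(scores, temperature=1.0, rng=np.random.default_rng()):
    # Prioritized sampling: co-players (or layouts) with higher scores,
    # e.g. the utilities above, are drawn more often via a softmax.
    logits = np.asarray(scores, dtype=float) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)
```
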
Published 2024 · NeurIPS Workshop on Touch Processing: From Data to Knowledge

Smart Insole: Predicting 3D Human Pose from Foot Pressure

Isaac Han, Seoyoung Lee, Sangyeon Park, Ecehan Akan, Yiyue Luo, Kyung-Joong Kim

This study introduces a novel method for 3D human pose estimation from foot pressure data captured by a low-cost, high-resolution smart insole with over 600 pressure sensors per foot. Unlike prior carpet-type sensors, the wireless smart insole enables pose estimation regardless of location. Synchronized tactile and visual data (105,000+ frames, 5 participants, 7 actions) are collected, and a deep neural network predicts 3D human poses from foot pressure alone, achieving 7.43 cm average localization error and 96.88% action classification accuracy.

human pose estimation · tactile sensing · foot pressure · wearable computing · deep learning
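
One plausible reading of the 7.43 cm figure is a mean per-joint Euclidean error between predicted and ground-truth 3D poses; the sketch below computes such a metric, with array shapes and the metres-to-centimetres conversion as assumptions about the evaluation protocol.

```python
# Illustrative evaluation metric: mean per-joint 3D localization error.
# Array shapes and units are assumptions, not the paper's stated protocol.
import numpy as np

def mean_localization_error_cm(pred, gt):
    # pred, gt: (frames, joints, 3) coordinates in metres.
    per_joint = np.linalg.norm(pred - gt, axis=-1)  # Euclidean error per joint
    return 100.0 * per_joint.mean()                 # metres -> centimetres
```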

* Equal contribution