DMKD Lab.
    Research
    VLM/LLM-based Domain-Specific AI Applications
    We develop an intelligent data pipeline that automatically collects experimental data from large-scale scientific literature (e.g., Google Scholar, arXiv) and converts it into structured knowledge using AI-based document layout parsing and multimodal content extraction. The pipeline employs multimodal encoding and fusion-layer-based representation learning to integrate text, tables, images, and charts, ensuring semantic consistency across heterogeneous data. Our lab further leverages state-of-the-art large language models (LLMs) and vision-language models (VLMs), such as Qwen, LLaMA, and DeepSeek, to build a SHAP-based explainable AI (XAI) framework that enables both quantitative and qualitative interpretation of complex predictions through automated explanation generation. Building on this, we develop an AI-based quality prediction and decision-support system that takes material, process, and environmental inputs. The system extends beyond prediction to support optimal condition search (Optimize) and novel combination discovery (Discover), forming a unified Predict–Optimize–Discover decision-making platform. Its core strength lies in knowledge-grounded reasoning through multimodal data and LLM integration, along with automated explainable reporting, enabling reliable AI-driven decision-making across domains such as food engineering, energy, biotechnology, and manufacturing.
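    As a minimal sketch of the Predict step, the snippet below trains a simple quality-prediction model and derives SHAP attributions that could seed an automated explanation report; the feature names and data are hypothetical placeholders, not the lab's actual schema.

    # Hedged sketch: SHAP attributions for a quality-prediction model.
    import numpy as np
    import shap
    from sklearn.ensemble import GradientBoostingRegressor

    # Hypothetical material/process/environment features and a quality target.
    feature_names = ["moisture", "temperature", "pressure", "additive_ratio"]
    rng = np.random.default_rng(0)
    X = rng.random((200, len(feature_names)))
    y = X @ np.array([0.5, 1.2, -0.8, 0.3]) + rng.normal(0, 0.05, 200)

    model = GradientBoostingRegressor().fit(X, y)

    # TreeExplainer yields per-feature attributions for each prediction.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X[:1])

    # Rank features by attribution strength; in the full framework this
    # ranking would be passed to an LLM prompt to generate the qualitative
    # part of the explanation.
    ranked = sorted(zip(feature_names, shap_values[0]),
                    key=lambda kv: abs(kv[1]), reverse=True)
    for name, value in ranked:
        print(f"{name}: {value:+.3f}")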
    Advancing Domain-Specific AI through Document Understanding Techniques
    We investigate advanced document understanding techniques that integrate structural parsing, semantic representation, and multimodal reasoning. Our approach focuses on extracting fine-grained layout elements such as subtitles, text blocks, tables, figures, and captions by assigning region-level roles and spatial metadata (e.g., bounding boxes and page indices). We model documents as structured graphs, where hierarchical and relational dependencies (e.g., contains, describes, has_heading) are explicitly encoded to capture the logical organization of content. Building upon this representation, we construct evidence subgraphs that connect semantically related regions across text, tables, and figures, enabling coherent cross-modal reasoning. In parallel, we develop region-aware chunking strategies that complement conventional vector-based text chunking by preserving layout and contextual boundaries. This allows more accurate retrieval and reasoning in downstream tasks such as retrieval-augmented generation (RAG) and multimodal question answering. Furthermore, we incorporate large language models to interpret structured document graphs and generate human-readable explanations, while multi-agent architectures decompose complex document reasoning into modular subtasks. Through this framework, we aim to advance document intelligence systems capable of understanding not only textual content but also structural and visual semantics, supporting reliable knowledge extraction from complex scientific and technical documents.
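    The sketch below illustrates, under assumed schema details, how such a document graph might be represented with networkx: regions carry role, page, and bounding-box metadata, and typed edges encode the contains / describes / has_heading relations named above. The exact node and edge schema is an illustrative assumption, not the lab's internal format.

    # Hedged sketch of a structured document graph.
    from dataclasses import dataclass
    import networkx as nx

    @dataclass
    class Region:
        region_id: str
        role: str          # e.g., "subtitle", "text", "table", "figure", "caption"
        page: int
        bbox: tuple        # (x0, y0, x1, y1) in page coordinates

    g = nx.MultiDiGraph()
    regions = [
        Region("r1", "subtitle", 1, (50, 60, 540, 90)),
        Region("r2", "text",     1, (50, 100, 540, 380)),
        Region("r3", "figure",   1, (60, 400, 300, 620)),
        Region("r4", "caption",  1, (60, 630, 300, 660)),
    ]
    g.add_node("sec1", role="section", page=1, bbox=None)
    for r in regions:
        g.add_node(r.region_id, role=r.role, page=r.page, bbox=r.bbox)

    # Typed edges encode the logical organization of the content.
    g.add_edge("sec1", "r1", relation="has_heading")
    g.add_edge("sec1", "r2", relation="contains")
    g.add_edge("sec1", "r3", relation="contains")
    g.add_edge("r4", "r3", relation="describes")

    # An evidence subgraph is then a connected set of semantically related
    # regions, e.g., a figure together with the caption describing it.
    evidence = g.subgraph(["r3", "r4"])
    print(evidence.edges(data=True))

    Region-aware chunking would then split text along these node boundaries rather than at fixed token windows, so retrieved chunks keep their layout context.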
    Explainable AI Agents for Mobility Applications
    We focus on developing trustworthy artificial intelligence for intelligent mobility and energy systems, with an emphasis on explainable diagnostics and predictive analytics. Building upon recent work on multi-agent LLM-based interpretation for electric vehicle drive motor fault diagnosis, we extend this approach to a broader range of automotive applications. We investigate engine fault detection using sensor-driven machine learning models integrated with XAI techniques such as SHAP and LIME to provide transparent and interpretable decision-making processes. In parallel, we study lithium-ion battery analytics, including state-of-health (SOH) estimation and remaining useful life (RUL) prediction, by combining interpretable models with LLM-based explanation frameworks. We further leverage multimodal data, including time-series signals, spectrogram representations, and structured sensor features, and utilize large language models to convert model outputs into structured, human-readable diagnostic reports. Additionally, we design multi-agent LLM architectures that decompose complex reasoning into modular subtasks, improving interpretability and practical usability. We aim to integrate classical machine learning, explainable AI, and large language models to deliver robust, transparent, and human-centered solutions for next-generation automotive and energy systems.
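    As an illustrative sketch of that multi-agent decomposition: the agent roles, threshold, and fault labels below are hypothetical, and call_llm is a stub standing in for whichever LLM backend is used.

    # Hedged sketch: detector / explainer / reporter agents for fault diagnosis.
    from dataclasses import dataclass

    def call_llm(prompt: str) -> str:
        # Placeholder for a real LLM call (hosted or local model).
        return f"[LLM response to: {prompt[:40]}...]"

    @dataclass
    class Diagnosis:
        fault: str
        confidence: float
        evidence: dict

    def detector_agent(sensor_features: dict) -> Diagnosis:
        # In practice an ML classifier over time-series or spectrogram
        # features; a simple threshold rule stands in for it here.
        fault = "bearing_wear" if sensor_features["vibration_rms"] > 0.8 else "healthy"
        return Diagnosis(fault, 0.9, {"vibration_rms": sensor_features["vibration_rms"]})

    def explainer_agent(d: Diagnosis) -> str:
        # Would attach SHAP/LIME attributions; here it forwards the evidence.
        return call_llm(f"Explain fault '{d.fault}' given evidence {d.evidence}")

    def reporter_agent(d: Diagnosis, explanation: str) -> dict:
        # Assembles the structured, human-readable diagnostic report.
        return {"fault": d.fault, "confidence": d.confidence, "explanation": explanation}

    diagnosis = detector_agent({"vibration_rms": 0.93})
    report = reporter_agent(diagnosis, explainer_agent(diagnosis))
    print(report)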
    AI-based 3D Facial Mesh Reconstruction Model for Post-surgical Soft Tissue Prediction
    We develop a deep learning-based 3D reconstruction model that predicts post-surgical soft tissue changes using preoperative facial mesh data and surgical planning features as inputs. The model employs a Local–Global feature encoding architecture with Set Abstraction-based point cloud learning to effectively capture both the global structural and local geometric characteristics of the face. We integrate surgical parameters with learned geometric features to enable personalized prediction of postoperative facial outcomes. The reconstructed postoperative facial mesh is further evaluated through quantitative distance-based error analysis, supporting accurate clinical decision-making and improving the reliability of surgical outcome prediction.
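    A minimal sketch of one such distance-based error metric, assuming a symmetric Chamfer distance between predicted and ground-truth vertex sets; the synthetic vertices stand in for real mesh data.

    # Hedged sketch: Chamfer distance between two facial mesh vertex sets.
    import numpy as np

    def chamfer_distance(pred: np.ndarray, gt: np.ndarray) -> float:
        """Mean nearest-neighbor distance, averaged over both directions."""
        # Pairwise Euclidean distances between the two vertex sets.
        d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
        return float(d.min(axis=1).mean() + d.min(axis=0).mean()) / 2.0

    rng = np.random.default_rng(0)
    gt_vertices = rng.random((1000, 3))                            # ground-truth postoperative vertices
    pred_vertices = gt_vertices + rng.normal(0, 0.01, (1000, 3))   # predicted vertices

    print(f"Chamfer distance: {chamfer_distance(pred_vertices, gt_vertices):.4f} (mesh units)")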