Use Cases

Title: KOMPSAT-3/3A Image-text Dataset for Training Large Multimodal Models
Domestic/International: Domestic
Date written: 2025-05-02


This study aims to improve the accuracy and interpretability of large multimodal models (LMMs) specialized in satellite image analysis. It constructs an image-text dataset based on KOMPSAT-3/3A imagery and presents the results of training with that dataset. Conventional LMMs are trained primarily on general images, which limits their ability to interpret the specific characteristics of satellite imagery, such as spectral bands, spatial resolution, and viewing angles. To address this limitation, we developed an image-text dataset, divided into pretraining and fine-tuning stages, based on the existing KOMPSAT object detection dataset. The pretraining dataset consists of captions summarizing the overall theme and key information of each image. The fine-tuning dataset integrates metadata (including acquisition time, sensor type, and coordinates) with detailed object detection labels to generate six types of question-answer pairs: detailed descriptions, conversations with varying answer lengths, bounding box identification, multiple-choice questions, and complex reasoning. This structured dataset enables the model to learn not only the general context of satellite images but also fine-grained details such as object quantity, location, and geographic attributes. Training with the new KOMPSAT-based dataset significantly improved the model’s accuracy in recognizing regional information and object characteristics in satellite imagery. Fine-tuned models achieved substantially higher accuracy than previous models, surpassing even GPT-4o, which demonstrates the effectiveness of a domain-specific dataset. The findings of this study are expected to contribute to various remote sensing applications, including automated satellite image analysis, change detection, and object detection.
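The abstract describes combining image metadata with object detection labels to generate question-answer pairs. A minimal sketch of that idea follows, covering only bounding box identification and a multiple-choice counting question. The record layout, field names, question templates, and the `build_qa_pairs` helper are all hypothetical illustrations, not the authors' actual pipeline or the dataset's real schema.

```python
from collections import Counter

# Hypothetical record layout: the field names and values are illustrative
# assumptions, not the schema of the actual KOMPSAT dataset.
EXAMPLE_RECORD = {
    "metadata": {
        "sensor": "KOMPSAT-3A",
        "acquired": "2020-01-01T00:00:00Z",  # placeholder value
    },
    "objects": [
        {"label": "ship", "bbox": [10, 20, 50, 60]},
        {"label": "ship", "bbox": [100, 120, 140, 170]},
        {"label": "airplane", "bbox": [200, 210, 260, 280]},
    ],
}


def build_qa_pairs(record):
    """Build template-based question-answer pairs from one annotated image."""
    meta = record["metadata"]
    counts = Counter(obj["label"] for obj in record["objects"])
    pairs = []

    # Metadata question: sensor type and acquisition time.
    pairs.append({
        "type": "metadata",
        "question": "Which sensor captured this image, and when?",
        "answer": f"{meta['sensor']}, acquired at {meta['acquired']}.",
    })

    for label in sorted(counts):
        n = counts[label]

        # Bounding box identification: list every box for this class.
        boxes = [o["bbox"] for o in record["objects"] if o["label"] == label]
        pairs.append({
            "type": "bounding_box",
            "question": f"Where are the {label} objects located?",
            "answer": boxes,
        })

        # Multiple choice: the true count plus three nearby distractors.
        # n is at least 1 here, so the four options are always distinct.
        options = sorted({max(n - 1, 0), n, n + 1, n + 2})
        letters = "ABCD"
        pairs.append({
            "type": "multiple_choice",
            "question": f"How many {label} objects are in the image?",
            "options": {letters[i]: v for i, v in enumerate(options)},
            "answer": letters[options.index(n)],
        })

    return pairs


if __name__ == "__main__":
    for pair in build_qa_pairs(EXAMPLE_RECORD):
        print(pair)
```

Deriving the answers directly from the labels keeps every generated pair consistent with the annotation. The free-form types named in the abstract, such as detailed descriptions and complex reasoning, would need a different generation method that this sketch does not attempt.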



Keywords: Large multimodal model, Satellite imagery, KOMPSAT, Image-text dataset, Finetuning

Source: https://geodata.kr/

Nepal: Earthquake (2015-05-05)

Image Information
Category: Disaster
Satellite: KOMPSAT-3
Created: 2015-03-24

Image Details
ProductID: K3_20150505073608_15817_06161210
Country: Nepal (네팔)
Region: Pokhara
Level: 1R