7 Object Detection

7 Object Detection에서 직접 따라간 구현 흐름과 코드 증거를 다시 볼 수 있게 정리한 DL 학습 기록입니다. 본문은 데이터 탐색 및 불러오기, Object Detection 시각화, 데이터셋 준비 순서로 핵심 장면을 먼저 훑고, Object Detection 시각화, 데이터셋 준비, 모델 학습 및 평가 같은 코드로 실제 구현을 이어서 확인할 수 있습니다. md 원본과 27개 코드 블록, 17개 실행 셀을 함께 남겨 구현 흐름을 다시 따라갈 수 있게 정리했습니다. 주요 스택은 kagglehub, torch, pandas, os입니다.

빠르게 볼 수 있는 포인트: 데이터 탐색 및 불러오기, Object Detection 시각화, 데이터셋 준비.

남겨둔 자료: md 원본과 27개 코드 블록, 17개 실행 셀을 함께 남겨 구현 흐름을 다시 따라갈 수 있게 정리했습니다. 주요 스택은 kagglehub, torch, pandas, os입니다.

주요 스택: kagglehub, torch, pandas, os, cv2

Snapshot

Item	Value
Track	DL
Type	Archive Note
Source Files	`md`
Code Blocks	27
Execution Cells	17
Libraries	`kagglehub`, `torch`, `pandas`, `os`, `cv2`, `matplotlib`, `xml`, `torchvision`
Source Note	`[스프린트_미션]7_Object_Detection`

What This Note Covers

데이터 탐색 및 불러오기

데이터 탐색 및 불러오기 코드를 직접 따라가며 데이터 탐색 및 불러오기 흐름을 확인했습니다.

읽을 포인트: 데이터 탐색 및 불러오기 아래 코드와 함께 읽으면 구현 포인트가 더 또렷해지는 구간입니다.

Object Detection 시각화

Object Detection 시각화 코드를 직접 따라가며 Object Detection 시각화 흐름을 확인했습니다.

읽을 포인트: 비전 모델이 객체나 픽셀 단위를 어떻게 예측하는지 구현으로 따라가는 구간입니다.

데이터셋 준비

데이터셋 준비 코드를 직접 따라가며 데이터셋 준비 흐름을 확인했습니다.

읽을 포인트: 데이터셋 준비 아래 코드와 함께 읽으면 구현 포인트가 더 또렷해지는 구간입니다.

SSD 모델 준비

SSD 모델 준비 코드를 직접 따라가며 SSD 모델 준비 흐름을 확인했습니다.

읽을 포인트: 모델 정의, 손실, 최적화 흐름을 코드로 연결해 보는 구간입니다.

모델 학습 및 평가

모델 학습 및 평가 코드를 직접 따라가며 모델 학습 및 평가 흐름을 확인했습니다.

읽을 포인트: 모델 정의, 손실, 최적화 흐름을 코드로 연결해 보는 구간입니다.

모델 추론 및 시각화

모델 추론 및 시각화 코드를 직접 따라가며 모델 추론 및 시각화 흐름을 확인했습니다.

읽을 포인트: 모델 정의, 손실, 최적화 흐름을 코드로 연결해 보는 구간입니다.

Why This Matters

데이터 파이프라인

왜 필요한가: 모델 성능 이전에 입력이 일정한 형식으로 잘 들어가야 학습과 평가가 안정적으로 반복됩니다.
왜 이 방식을 쓰는가: Dataset/DataLoader 구조는 데이터 읽기, 변환, 배치 처리를 분리해 코드 재사용성과 실험 반복성을 높여줍니다.
원리: 각 샘플을 Dataset이 제공하고, DataLoader가 이를 배치로 묶어 셔플·병렬 로딩·collate를 담당합니다.

Implementation Flow

데이터 탐색 및 불러오기: 데이터 탐색 및 불러오기 코드를 직접 따라가며 데이터 탐색 및 불러오기 흐름을 확인했습니다.
Object Detection 시각화: Object Detection 시각화 코드를 직접 따라가며 Object Detection 시각화 흐름을 확인했습니다.
데이터셋 준비: 데이터셋 준비 코드를 직접 따라가며 데이터셋 준비 흐름을 확인했습니다.
SSD 모델 준비: SSD 모델 준비 코드를 직접 따라가며 SSD 모델 준비 흐름을 확인했습니다.
모델 학습 및 평가: 모델 학습 및 평가 코드를 직접 따라가며 모델 학습 및 평가 흐름을 확인했습니다.
모델 추론 및 시각화: 모델 추론 및 시각화 코드를 직접 따라가며 모델 추론 및 시각화 흐름을 확인했습니다.

Code Highlights

Object Detection 시각화

Object Detection 시각화는 이 노트에서 핵심 구현을 보여주는 코드 블록입니다. 코드 안에서는 Train 데이터에서 예제 이미지 불러오기, 이미지 읽기, 해당 이미지의 어노테이션 가져오기 흐름이 주석과 함께 드러납니다.

# Train 데이터에서 예제 이미지 불러오기
train_example_image_name = df_trainval["Image"].iloc[0]
train_image_path = os.path.join(image_dir, f"{train_example_image_name}.jpg")

# 이미지 읽기
image = cv2.imread(train_image_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# 해당 이미지의 어노테이션 가져오기
annotations = [anno for anno in annotations if anno["image"] == f"{train_example_image_name}.jpg"]

# Bounding Box 그리기
for anno in annotations:
    x_min, y_min, x_max, y_max = anno["bbox"]
    cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (255, 0, 0), 2)  # 빨간색 박스
    cv2.putText(image, anno["class"], (x_min, y_min - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)

# 시각화
plt.imshow(image)
plt.axis("off")
plt.show()

데이터셋 준비

데이터셋 준비는 이 노트에서 핵심 구현을 보여주는 코드 블록입니다. 코드 안에서는 이미지 및 XML 파일 경로 설정, 이미지 로드, 어노테이션 로드 흐름이 주석과 함께 드러납니다.

from PIL import Image
from torch.utils.data import Dataset

class VOCDataset(Dataset):
    def __init__(self, image_dir, annotation_dir, classes, image_list, transforms=None):
        self.image_dir = image_dir
        self.annotation_dir = annotation_dir
        self.classes = classes
        self.transforms = transforms
        self.image_files = image_list # 미리 필터링된 유효한 이미지 파일 리스트

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        # 이미지 및 XML 파일 경로 설정
        image_file = self.image_files[idx] + ".jpg"
        annotation_file = self.image_files[idx] + ".xml"
        image_path = os.path.join(self.image_dir, image_file)
        annotation_path = os.path.join(self.annotation_dir, annotation_file)

        # 이미지 로드
        image = Image.open(image_path).convert("RGB")

        # 어노테이션 로드
        boxes = []
        labels = []
        tree = ET.parse(annotation_path)
# ... trimmed ...

모델 학습 및 평가

모델 학습 및 평가는 이 노트에서 핵심 구현을 보여주는 코드 블록입니다. 코드 안에서는 Training + Validation Loop, Training Phase, Forward pass 흐름이 주석과 함께 드러납니다.

from tqdm import tqdm # 진행 상황 시각화
import torch

# Training + Validation Loop
num_epochs = 5

for epoch in range(num_epochs):
    print(f"Epoch {epoch + 1}/{num_epochs} 시작")

    # Training Phase
    model.train()
    total_train_loss = 0

    for images, targets in tqdm(train_loader, desc="Training"):
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        # Forward pass
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())
        total_train_loss += losses.item()

        # Backward pass and optimization
        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

    avg_train_loss = total_train_loss / len(train_loader)
# ... trimmed ...

모델 추론 및 시각화

모델 추론 및 시각화는 이 노트에서 핵심 구현을 보여주는 코드 블록입니다. 학습된 모델로 추론을 수행하고 예측 결과를 점검하는 코드입니다.

model.eval()
max_visualizations = 10  # 최대 시각화 개수 설정
count = 0

with torch.no_grad():
    for images, image_files in tqdm(test_loader, desc="Test Inference"):
        images = [img.to(device) for img in images]
        predictions = model(images)

        for img, pred, file_name in zip(images, predictions, image_files):
            if count >= max_visualizations:
                break  # 최대 개수 초과 시 중단

            print(f"Processing: {file_name}")
            visualize_prediction(img.cpu(), pred, classes)
            count += 1

        if count >= max_visualizations:
            break  # 루프 종료

Source Bundle

Source path: 12_Deep_Learning/Code_Snippets/[스프린트_미션]7_Object_Detection.md
Source formats: md
Companion files: [스프린트_미션]7_Object_Detection.md
Note type: code-note
Last updated in the source vault: 2026-03-08T03:33:14
Related notes: x_min, y_min, x_max, y_max
External references: localhost

Note Preview

원본 노트에 별도 설명 문단이 많지 않아 코드 중심으로 보존했습니다.