[Paper Review] Learning Placeholders for Open-Set Recognition

Paper Overview

CVPR'21

https://arxiv.org/abs/2103.15086

Abstract

Open-set recognition was proposed to maintain classification performance on known classes while rejecting unknown ones.

Calibration and thresholding have therefore become essential issues.

The authors thus propose PlaceholdeRs for Open-SEt Recognition (PROSER).

It prepares for the appearance of unknown classes by allocating placeholders for both data and classifiers.

Learning data placeholders means anticipating open-set class data, which turns closed-set training into open-set training.

To learn the invariant information between target and non-target classes, the authors reserve classifier placeholders as the class-specific boundary between known and unknown.

The proposed PROSER efficiently generates novel classes via manifold mixup and adaptively sets the value of the reserved open-set classifier during training.

Keywords

Open-Set Recognition, Learning Placeholders

Introduction

Deep learning methods generally tend to overfit the training instances and produce overconfident predictions.

As a result, the model outputs high probabilities even for unknown-class instances, so it is difficult to separate known from unknown by setting a threshold.

Moreover, class compositions vary widely, as a figure in the paper illustrates.

The authors' goals for improving open-set recognition can be summarized as follows.

1. To prepare a closed-set model for unknown classes, data placeholders for novel classes should be augmented into the existing model.

2. To better separate known and unknown instances, overconfident predictions should be calibrated by reserving classifier placeholders for novel classes.

Learning Placeholders for Open-Set Recognition

1. Learning Classifier Placeholders

To handle the diverse compositions of known and unknown classes, the authors need to extract invariant information from target and non-target classes.

Reserving classifier placeholders aims to set up an extra dummy classifier, which is optimized to represent the threshold between known and unknown.

The authors first augment the output layer with an extra dummy classifier.

The dummy classifier $\hat{w}$ shares the same embedding $\phi(\cdot )$ with the closed-set classifier.

The extra linear layer is $\hat{w} \in \mathbb{R}^{d \times 1}$.

The augmented logits pass through a softmax layer to produce posterior probabilities.
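The augmented output described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the names `augmented_posterior`, `W`, `w_hat`, and the toy numbers are hypothetical, and the embedding $\phi(x)$ is taken as a given vector rather than computed by a network.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def augmented_posterior(embedding, W, w_hat):
    """W holds K closed-set weight vectors of length d; w_hat is the dummy one."""
    logits = [dot(w_k, embedding) for w_k in W]  # K closed-set logits
    logits.append(dot(w_hat, embedding))         # dummy logit at index K
    return softmax(logits)                       # K + 1 posterior probabilities

# Toy example (hypothetical values): d = 2, K = 3.
phi_x = [1.0, 0.5]
W = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
w_hat = [0.5, 0.5]
probs = augmented_posterior(phi_x, W, w_hat)
```

The last entry of `probs` is the probability assigned to the reserved class $K + 1$; both the closed-set weights and the dummy weight read the same embedding.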

The dummy classifier is meant to separate known from unknown well, acting as a dynamic threshold that depends on the input $x$.

To this end, the authors fine-tune the model so that the dummy classifier outputs the second-largest probability for known instances.

Through this process, the invariant information among known classes is learned, and the dummy classifier can transfer it to the detection process.

The classifier loss is as follows.
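Written out from the two terms this section describes, the loss can be sketched roughly as below. The name $\mathcal{L}_{1}$ and the trade-off weight $\beta$ are assumptions that this review does not state; $l$, $\hat{f}(x)$, $\hat{f}(x) \backslash y$, $K + 1$, and $D_{tr}$ follow the notation used in the review.

```latex
\mathcal{L}_{1} = \sum_{(x, y) \in D_{tr}} l\big(\hat{f}(x),\, y\big) + \beta \, l\big(\hat{f}(x) \backslash y,\; K + 1\big)
```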

$l$ can be cross-entropy or any other loss.

The first term fits the augmented output to the ground truth and thus maintains closed-set performance.

In the second term, $\hat{f}(x) \backslash y$ denotes removing the probability of the ground-truth label, i.e. setting it to 0.

The second term matches the masked probability to class $K$ + 1, so that the dummy classifier outputs the second-largest probability.

(This does not mean that the dummy classifier outputs the second-largest probability among the known classes; rather, the dummy classifier's own output becomes the second largest among all outputs.)

Thanks to Eq. (5), the model learns to classify known instances correctly while placing the dummy classifier between the target and non-target classes.
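The two terms can be tried out on toy logits with the sketch below. It is an illustration, not the authors' implementation: `placeholder_loss` and `beta` are hypothetical names, cross-entropy is used for $l$, and the ground-truth entry is masked by setting its logit to negative infinity, which is one way to make its probability 0.

```python
import math

def cross_entropy(logits, target):
    # Negative log-softmax probability of `target`; exp(-inf) evaluates to 0.0.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]

def placeholder_loss(logits, y, beta=1.0):
    """logits has K + 1 entries; the last one belongs to the dummy classifier."""
    dummy_index = len(logits) - 1
    # First term: fit the augmented output to the ground-truth label y.
    fit_term = cross_entropy(logits, y)
    # Second term: remove the ground-truth entry, then match the rest to K + 1.
    masked = list(logits)
    masked[y] = float("-inf")
    dummy_term = cross_entropy(masked, dummy_index)
    return fit_term + beta * dummy_term

# Ground truth is class 0 in both cases (K = 3, dummy logit last).
dummy_is_second = [5.0, 0.0, 0.0, 3.0]
dummy_is_last = [5.0, 3.0, 0.0, 0.0]
loss_good = placeholder_loss(dummy_is_second, y=0)
loss_bad = placeholder_loss(dummy_is_last, y=0)
```

With these toy values the first term is identical in both cases, so the difference comes entirely from the second term, which is smaller when the dummy logit ranks right behind the ground-truth logit.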

Effects of dummy classifiers

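As shown in Figure 2(a) of the paper, the first term of Eq. (5) pushes an instance toward its corresponding cluster to preserve correct classification, while the second term tries to associate the instance with the dummy classifier located in the center space.

The second term also controls the distance to the dummy classifier so that it is the second smallest among all class centers.

As a result, this is a trade-off between correctly classifying closed-set instances and reserving probability for novel classes.

When a novel-class instance is fed in, the dummy classifier's prediction will be relatively high, because the instance is non-target for every known class.

The dummy classifier therefore acts as an instance-dependent threshold.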

Learning multiple dummy classifiers

The authors can also learn more dummy classifiers, $\hat{W} \in \mathbb{R}^{d \times C}$.

Eq. (4) can then be modified accordingly.

Thanks to multiple dummy classifiers, the model can adaptively select the nearest dummy classifier.

As a result, this yields a smoother decision boundary.
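One way to read this modification is sketched below, assuming the single dummy logit is replaced by the maximum over the $C$ dummy classifiers, which matches the idea of selecting the nearest one. The function name `augmented_logits_multi` and the toy values are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def augmented_logits_multi(embedding, W, W_hat):
    """W: K closed-set weight vectors; W_hat: C dummy weight vectors."""
    known = [dot(w_k, embedding) for w_k in W]
    # Assumption: the best-matching dummy classifier supplies the dummy logit.
    dummy = max(dot(w_c, embedding) for w_c in W_hat)
    return known + [dummy]

# Toy example: d = 2, K = 2, C = 3.
logits = augmented_logits_multi(
    [1.0, 0.0],
    W=[[2.0, 0.0], [0.0, 2.0]],
    W_hat=[[0.5, 0.0], [1.5, 0.0], [-1.0, 0.0]],
)
```

The output still has $K + 1$ entries, so the softmax and the loss terms described earlier apply unchanged.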

2. Learning Data Placeholders

Traditional generative open-set models use powerful generative models to mimic novel patterns.

However, generating a natural-looking distribution is not easy.

The authors propose a simple yet effective way to anticipate novel-class instances without extra time complexity, and adopt manifold mixup for this purpose.

First, assume that the model's embedding module $\phi (\cdot )$ can be split at a middle hidden layer: $\phi (\mathbf{x}) = \phi_{post}(\phi_{pre}(\mathbf{x}))$.

The authors pick two instances from different classes and mix them up at the middle layer.

Here $\lambda \in \left[ 0, 1 \right]$.

The mixed $\tilde{x}_{pre}$ is then fed into the later layers to produce $\phi_{post}(\tilde{x}_{pre})$.

The authors regard this $\phi_{post}(\tilde{x}_{pre})$ as belonging to open-set classes and train accordingly.

The mixup is carried out within each mini-batch.
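The mixing step can be sketched as follows, operating on hidden features $\phi_{pre}(x)$ that are assumed to be already computed. The helper name `make_data_placeholders`, the explicit `partner` pairing, and the Beta(2, 2) draw for $\lambda$ are assumptions for illustration; the review only states that $\lambda \in [0, 1]$ and that pairs come from different classes within a mini-batch.

```python
import random

def make_data_placeholders(pre_feats, labels, partner, lam, num_known):
    """Mix hidden features of instance i with those of instance partner[i]."""
    placeholders = []
    for i, j in enumerate(partner):
        if labels[i] == labels[j]:
            continue  # only instances from different classes are mixed
        mixed = [lam * a + (1.0 - lam) * b
                 for a, b in zip(pre_feats[i], pre_feats[j])]
        # Each mixed feature is labelled with the reserved class
        # (index K when classes are counted from 0).
        placeholders.append((mixed, num_known))
    return placeholders

# Toy mini-batch: three hidden feature vectors, two known classes.
feats = [[0.0, 0.0], [4.0, 4.0], [8.0, 8.0]]
labels = [0, 0, 1]
fixed = make_data_placeholders(feats, labels, partner=[1, 2, 0],
                               lam=0.25, num_known=2)

# In training, the pairing and lambda would be drawn at random per mini-batch.
rng = random.Random(0)
perm = list(range(len(labels)))
rng.shuffle(perm)
sampled = make_data_placeholders(feats, labels, perm,
                                 rng.betavariate(2.0, 2.0), num_known=2)
```

Each returned feature would then be passed through $\phi_{post}$ and the augmented classifier, with the reserved class as its target.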

Effects of data placeholders

Eq. (7) generates novel instances without requiring extra time complexity.

As Figure 2(b) of the paper shows, the mixed instances push the decision boundaries in the embedding space toward the component classes $y_{i}$ and $y_{j}$.

Discussion about vanilla mixup

Vanilla mixup mixes the inputs themselves.

$\tilde{x} = \lambda x_{i} + (1 - \lambda )x_{j}$

However, this tends to produce samples that lie close to one specific class, as the paper illustrates.

3. Calibration and Guideline for Implementation

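The authors then apply an additional calibration trick to improve the classifier's performance.

Using $D_{val}$, they compute the difference between the closed-set classifier's logits and the dummy classifier's logit.

This range is divided into equal intervals to obtain candidate biases.

The calibrated logits are as follows.

The authors then choose as the best bias the one at which 95% of the $D_{val}$ data is recognized as known.

As a result, the dummy logit is calibrated to have the same magnitude as the closed-set classifier's logits.

$D_{val}$ is data drawn from the same distribution as $D_{tr}$; the unknown dataset is never accessed.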
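The bias search can be sketched as below. Several details are assumptions rather than facts from the paper: the bias is added to the dummy logit, an instance counts as known when its largest closed-set logit is at least the biased dummy logit, and the largest candidate that still keeps 95% of $D_{val}$ is chosen. The names `pick_bias`, `keep_ratio`, and `steps` are hypothetical.

```python
def pick_bias(known_max, dummy, keep_ratio=0.95, steps=100):
    """known_max[i]: largest closed-set logit of validation sample i.
    dummy[i]: dummy logit of the same sample."""
    diffs = [k - d for k, d in zip(known_max, dummy)]
    lo, hi = min(diffs), max(diffs)
    # Candidate biases: the observed range split into equal intervals.
    candidates = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    best = candidates[0]  # the smallest candidate keeps every sample
    for bias in candidates:
        # A sample is recognized as known if known_max >= dummy + bias.
        kept = sum(1 for diff in diffs if diff >= bias) / len(diffs)
        if kept >= keep_ratio:
            best = bias
    return best

# Toy validation set: margins 1, 2, ..., 100 over a zero dummy logit.
val_known_max = [float(i) for i in range(1, 101)]
val_dummy = [0.0] * 100
bias = pick_bias(val_known_max, val_dummy, keep_ratio=0.95, steps=99)
```

In this toy case the candidates are the integers 1 to 100, and the largest bias that still keeps 95 of the 100 samples as known is 6.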

The training algorithm of PROSER is as follows.

Experiments

1. Unknown Detection

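Unknown detection is the task of identifying only the unknown samples within a dataset.

The authors use the ROC curve as the metric, reported as the area under it (AUROC), and report the average over five runs.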
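For reference, the area under the ROC curve can be computed directly from its pairwise definition, as in the small sketch below. Treating the unknown class as the positive class and using something like the dummy-class probability as the score are assumptions made for illustration; `auroc` is a hypothetical helper, not a library call.

```python
def auroc(scores, is_unknown):
    """Probability that a random unknown sample scores higher than a random
    known sample, counting ties as one half."""
    pos = [s for s, u in zip(scores, is_unknown) if u]
    neg = [s for s, u in zip(scores, is_unknown) if not u]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy scores: a higher score means "more likely unknown".
perfect = auroc([0.9, 0.8, 0.3, 0.1], [True, True, False, False])
partial = auroc([0.9, 0.2, 0.3, 0.1], [True, True, False, False])
```

The metric is threshold-free: it depends only on how the scores rank unknown samples relative to known ones.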

Openness is defined as follows.
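A commonly used definition of openness in this line of work is $1 - \sqrt{N_{train} / N_{test}}$, where $N_{train}$ is the number of known classes seen in training and $N_{test}$ is the number of classes appearing at test time. The sketch below assumes that definition; the function and argument names are hypothetical.

```python
import math

def openness(num_train_classes, num_test_classes):
    # Assumed definition: 1 - sqrt(N_train / N_test).
    return 1.0 - math.sqrt(num_train_classes / num_test_classes)

# Example: 6 known classes out of 10 classes present at test time.
example = openness(6, 10)
```

Under this definition, openness is 0 when every test class was seen in training and grows as more unseen classes appear at test time.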

The datasets are as follows.

The unknown detection results are as follows.

The following results also show that PROSER does not sacrifice closed-set classification performance.

2. Open-Set Recognition

Open-set recognition is the task of detecting unknown samples while simultaneously classifying closed-set data.

The authors use the macro-averaged F1 score as the evaluation metric.
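Macro-averaged F1 gives every class the same weight regardless of its size. The sketch below assumes the unknown samples are treated as one extra class next to the known ones; `macro_f1` is a hypothetical helper written out for clarity.

```python
def macro_f1(y_true, y_pred, classes):
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Per-class F1 = 2TP / (2TP + FP + FN); 0 if the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Two known classes (0, 1) plus the unknown class (2).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_f1(y_true, y_pred, classes=[0, 1, 2])
```

Here the per-class F1 scores are 0.5, 0.8, and 2/3, and the macro average is their unweighted mean.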

3. Ablation study

4. Visualization of Decision Boundaries

Conclusion

In real-world applications, instances from unseen novel classes may be fed to closed-set classifiers and be misclassified as known ones. Open-set recognition aims to simultaneously classify known classes and detect unknown ones. However, there are two main challenges in open-set recognition, i.e., how to anticipate novel patterns and how to compensate for the overconfidence phenomenon. In this paper, we propose PROSER to calibrate closed-set classifiers in two aspects. On the one hand, PROSER efficiently mimics the distribution of novel classes as data placeholders, and transforms closed-set training into open-set training. On the other hand, we augment the closed-set classifier with classifier placeholders, which adaptively separate the known from the unknown and stand for the class-specific threshold. The proposed PROSER efficiently generates novel classes by manifold mixup, and adaptively sets the value of the reserved open-set classifier. How to extend open-set recognition to streaming data scenarios and how to utilize the detected novel patterns are interesting directions for future work.
