Challenges for Toxic Comment Classification: An In-Depth Error Analysis (Betty van Aken et al., 2018)


김아다만티움 2021. 8. 25. 20:11

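● Focuses on an error analysis of how different architectures perform classification on multi-label datasets

● The detailed analysis of classification results is carried out only for the ensemble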

● Datasets and tasks

Wikipedia talk pages (Kaggle Toxic Comment Classification): 6 labels

Twitter dataset: 3 labels

● Architectures used: Logistic regression, bi-RNN (LSTM, GRU), CNN

+ embeddings (GloVe, fastText) that can compensate for the idiosyncratic words and misspelled words a classifier may encounter
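One reason subword-aware embeddings help with misspellings is that a misspelled word still shares most of its character n-grams with the correct spelling. The sketch below illustrates that idea in the spirit of fastText's subword n-grams. The function names and the 3 to 5 n-gram range are assumptions made for illustration, not settings taken from the paper.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Return the set of character n-grams of `word`.

    '<' and '>' mark the word boundaries. The n-gram range is an
    illustrative assumption, not a value taken from the paper.
    """
    padded = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


def ngram_overlap(a, b):
    """Jaccard similarity between the character n-gram sets of two words."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    return len(ga & gb) / len(ga | gb)


if __name__ == "__main__":
    # A misspelling keeps many n-grams of the original word,
    # while an unrelated word shares none.
    print(ngram_overlap("idiot", "idiott"))
    print(ngram_overlap("idiot", "horse"))
```

A purely word-level vocabulary would treat "idiott" as an unknown token, whereas the shared n-grams give a subword model something to work with.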

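● Proposed ensemble architecture (+Attention)

Bi-LSTM, GRU Con) struggle to capture long-range dependencies (50+ words) → attention mechanism applied

Pro) good at capturing phrase-level and contextual information

Result: bi-GRU+Attention (fastText) outperforms the other architectures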
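The role of attention on top of a recurrent encoder can be sketched as a weighted pooling over the per-token hidden states, so that a relevant token far from the end of the comment can still dominate the final representation. This is a minimal sketch that assumes dot-product scoring against a single query vector. These notes do not specify the paper's exact attention formulation, and the names `attention_pool` and `query` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool a sequence of hidden states into one context vector.

    hidden_states: list of T vectors (one per token), each of length D.
    query: vector of length D (in a real model this would be learned;
           here it is a fixed, hypothetical value).
    Returns (context, weights).
    """
    # Dot-product score of every token's hidden state against the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the hidden states, one dimension at a time.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


if __name__ == "__main__":
    states = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    context, weights = attention_pool(states, query=[2.0, 0.0])
    print(weights)  # the first token receives the largest weight
    print(context)
```

Because every token contributes directly to the context vector, the distance between a token and the end of the sequence no longer limits its influence.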

● Error analysis

Regardless of the assigned labels (6 or 3), only whether a comment is toxic or non-toxic is considered (binary task)

For the Wiki dataset the class ‘toxic’ label is selected; for Twitter, the ‘hate’ label
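Reducing the multi-label annotations to the binary task described above amounts to keeping a single target label per dataset. The sketch below assumes each comment's annotations are stored as a dict from label name to 0/1. The `to_binary` helper and the example rows are hypothetical.

```python
def to_binary(example_labels, target):
    """Collapse a comment's labels to 1 (toxic) or 0 (non-toxic).

    example_labels: dict mapping label name -> 0/1 for one comment.
    target: the single label kept for the binary task
            ('toxic' for the Wiki data, 'hate' for the Twitter data).
    """
    return int(bool(example_labels.get(target, 0)))


if __name__ == "__main__":
    # Hypothetical annotations, for illustration only.
    wiki_rows = [
        {"toxic": 1, "insult": 1},
        {"toxic": 0, "insult": 0},
    ]
    twitter_rows = [{"hate": 1}, {"hate": 0}]

    print([to_binary(r, "toxic") for r in wiki_rows])
    print([to_binary(r, "hate") for r in twitter_rows])
```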

(1) False negatives in the confusion matrix (toxic comments classified as non-toxic)

1) Labeling errors (doubtful labels): the annotators' subjectivity comes into play

2) Toxicity without swear words: the comment is abusive but contains no profanity e.g. “she looks like a horse”

3) Rhetorical questions e.g. have you no brain?!?!

4) Metaphors and comparisons: cases that require linguistic or common-sense knowledge

5) Idiosyncratic and rare words: typos, deliberately distorted words, and so on e.g. fucc nicca yu …

6) Sarcasm and irony

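(2) False positives in the confusion matrix (non-toxic comments classified as toxic): categories similar to those for FN

1) Labeling errors (doubtful labels): the annotators' subjectivity comes into play

2) Swear words typical of toxic comments appear in a non-toxic comment e.g. feel like such an idiot, sorry bud.

3) Quotations and references: when someone else's words are being quoted

4) Idiosyncratic and rare words: mistaken for toxic because of typos, deliberately distorted words, and so on

e.g. WTF man. Dan Whyte is Scottish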
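Collecting the false negatives and false positives for this kind of manual inspection only requires comparing the binary gold labels with the binary predictions. A minimal sketch, assuming labels are encoded as 1 (toxic) and 0 (non-toxic). The `split_errors` helper is hypothetical, and the predictions in the usage example are made up for illustration rather than taken from the paper.

```python
def split_errors(texts, gold, pred):
    """Return (false_negatives, false_positives) as lists of texts.

    gold, pred: sequences of 0/1 labels aligned with `texts`,
    where 1 means toxic and 0 means non-toxic.
    """
    false_negatives = [t for t, g, p in zip(texts, gold, pred) if g == 1 and p == 0]
    false_positives = [t for t, g, p in zip(texts, gold, pred) if g == 0 and p == 1]
    return false_negatives, false_positives


if __name__ == "__main__":
    texts = [
        "she looks like a horse",               # toxic without swear words
        "feel like such an idiot, sorry bud.",  # swear word, but not toxic
        "thanks for the edit",
    ]
    gold = [1, 0, 0]
    pred = [0, 1, 0]  # made-up model output

    fn, fp = split_errors(texts, gold, pred)
    print("FN:", fn)
    print("FP:", fp)
```

Each returned list can then be sorted by hand into the error categories listed above.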

● Conclusion

The ensemble delivers strong performance

Semantic embeddings that incorporate world knowledge are needed: they should reflect paradigmatic contexts
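On the ensemble point, one simple way to combine several classifiers is soft voting, that is, averaging their predicted toxic probabilities per comment. This is only an illustrative sketch: these notes do not describe how the paper's ensemble combines its models, so the averaging scheme, the `soft_vote` helper, the 0.5 threshold, and the probabilities below are all assumptions.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-model toxic probabilities and threshold the result.

    prob_lists: one list of probabilities per model, aligned by comment.
    Returns (labels, averaged_probabilities).
    """
    n_models = len(prob_lists)
    # zip(*prob_lists) groups the probabilities of all models per comment.
    averaged = [sum(ps) / n_models for ps in zip(*prob_lists)]
    labels = [int(p >= threshold) for p in averaged]
    return labels, averaged


if __name__ == "__main__":
    # Made-up outputs of three models for two comments.
    model_a = [0.9, 0.2]
    model_b = [0.7, 0.4]
    model_c = [0.8, 0.1]

    labels, averaged = soft_vote([model_a, model_b, model_c])
    print(labels)
    print(averaged)
```

Averaging lets a model that is confident on a particular kind of comment offset another that is unsure, which is one intuition for why combining architectures with different error profiles can help.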