  • The Impact of Word Representations on Sequential Neural MWE Identification(Nicolas Zampieri, Carlos Ramisch, Geraldine Damnati, 2019)
    Paper Review/MultiWordExpression 2021. 8. 7. 15:48

    Nicolas Zampieri, Carlos Ramisch, Geraldine Damnati. The Impact of Word Representations on Sequential Neural MWE Identification. Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019), Aug 2019, Florence, Italy. pp. 169-175, doi:10.18653/v1/W19-5121

     

    <Prior work>

    1. Finding MWEs in running text (Constant et al., 2017)

    2. PARSEME 1.1 (Ramisch et al., 2018)

    3. FastText(character n-gram, Bojanowski et al. 2017)

    4. ‘Character-based embeddings have been shown useful to predict MWE compositionality out of text(Hakimi Parizi and Cook, 2018)’
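A minimal sketch of the character n-gram idea behind FastText (item 3 above). The function name `char_ngrams` is hypothetical, and the 3 to 6 n-gram range is assumed here as a default; this is not the FastText library itself, which additionally hashes n-grams and sums their vectors to build the word vector.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word, FastText-style (illustrative only)."""
    # Boundary markers let the model tell prefixes and suffixes apart
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


print(char_ngrams("where", 3, 3))
# → ['<wh', 'whe', 'her', 'ere', 're>']
```

Because inflected forms of the same lemma share most of their n-grams, they end up with related vectors even when a given form is rare or unseen, which is why subword representations are of interest for morphologically rich languages.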

     

    <Research target>

    verbal MWE(VMWE) identification

        -lemmas vs surface forms

        -traditional word embedding vs subword representation

    Target languages: French, Polish, Basque (morphologically rich: Basque)

     

    <Experimental method>

    1. Corpora used

    ● PARSEME Shared Task 1.1 VMWE-annotated corpora

        -Basque: 117000 tokens, morphological richness(2.32)

        -French: 420000 tokens, high proportion of discontinuous VMWEs(42.12%)

        -Polish: 220000 tokens

     

    2. Experimental architecture

    Veyn: sequence tagging using an RNN

        -concatenates the embeddings of each word's features (lemmas, POS)....

        -output: CRF layers

        -tagging in BIOG+cat format

        -trained by using shared task training corpora

        -validation: uses the dev corpus
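A rough sketch of what BIOG+cat tagging could look like. It assumes B = first token of a VMWE, I = later tokens of the same VMWE, G = tokens in the gap of a discontinuous VMWE, O = everything else, with the category appended to B and I. The function name `to_biog_cat`, the exact tag strings, and the example sentence are illustrative assumptions, not taken from Veyn's code.

```python
def to_biog_cat(n_tokens, vmwes):
    """Convert VMWE annotations to one BIOG+cat tag per token.

    vmwes: list of (category, token_indices) pairs, 0-based indices.
    Overlapping or nested VMWEs are not handled in this sketch.
    """
    tags = ["O"] * n_tokens
    for category, indices in vmwes:
        members = sorted(indices)
        first, last = members[0], members[-1]
        for i in range(first, last + 1):
            if i == first:
                tags[i] = f"B-{category}"
            elif i in members:
                tags[i] = f"I-{category}"
            else:
                tags[i] = "G"  # token inside the gap of a discontinuous VMWE
    return tags


tokens = ["He", "made", "an", "important", "decision", "."]
print(to_biog_cat(len(tokens), [("LVC.full", [1, 4])]))
# → ['O', 'B-LVC.full', 'G', 'G', 'I-LVC.full', 'O']
```

The G tag is what lets a sequence tagger represent discontinuous expressions, which matters for corpora such as the French one with its high share of discontinuous VMWEs.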

    Embeddings: two input types, surface forms and lemmas

        -uses word2vec and FastText
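A toy sketch of concatenating per-feature embeddings into one input vector per token, here for the form+lemma+POS combination. The helper names `make_table` and `token_vector`, the dimensions, and the random vectors are all placeholders; in the actual setup the form and lemma tables would come from pretrained word2vec or FastText models.

```python
import random


def make_table(vocab, dim, seed):
    """Build a toy embedding table with random vectors (placeholder for pretrained ones)."""
    rng = random.Random(seed)
    return {item: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for item in vocab}


def token_vector(token, tables, order=("form", "lemma", "pos")):
    """Concatenate the embeddings of a token's features in a fixed order."""
    vec = []
    for feature in order:
        vec.extend(tables[feature][token[feature]])
    return vec


tables = {
    "form": make_table(["made"], dim=4, seed=0),
    "lemma": make_table(["make"], dim=4, seed=1),
    "pos": make_table(["VERB"], dim=2, seed=2),
}
token = {"form": "made", "lemma": "make", "pos": "VERB"}
print(len(token_vector(token, tables)))
# → 10
```

Dropping "form" or "lemma" from `order` gives the lemma-only or form-only input variants.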

    Contextual embeddings (ELMo, BERT) not used: the track entered was different

    Evaluation metrics

        -MWE-based measure: F1 score for fully predicted VMWEs

        -token-based measure: F1 score for tokens belonging to a VMWE.
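A simplified sketch of the two measures, to show how they differ. Each VMWE is represented as a set of token indices within one sentence; the function names are hypothetical, and the official PARSEME evaluation script is more involved (the token-based variant here just pools all VMWE tokens, which is an assumption of this sketch).

```python
def f1(n_correct, n_pred, n_gold):
    """F1 from counts; returns 0.0 when nothing is correct."""
    if n_correct == 0:
        return 0.0
    precision = n_correct / n_pred
    recall = n_correct / n_gold
    return 2 * precision * recall / (precision + recall)


def mwe_based_f1(gold, pred):
    """A predicted VMWE counts only if all of its tokens match a gold VMWE exactly."""
    gold_set = {frozenset(m) for m in gold}
    pred_set = {frozenset(m) for m in pred}
    return f1(len(gold_set & pred_set), len(pred_set), len(gold_set))


def token_based_f1(gold, pred):
    """Partial credit: compare the tokens that belong to any VMWE."""
    gold_tokens = {t for m in gold for t in m}
    pred_tokens = {t for m in pred for t in m}
    return f1(len(gold_tokens & pred_tokens), len(pred_tokens), len(gold_tokens))


gold = [{1, 4}, {7, 8}]
pred = [{1, 4}, {7}]  # second expression only partly found
print(mwe_based_f1(gold, pred))               # → 0.5
print(round(token_based_f1(gold, pred), 3))   # → 0.857
```

A system that finds parts of expressions but misses their boundaries scores well on the token-based measure and poorly on the MWE-based one. For a whole corpus, the indices would need to be (sentence id, token index) pairs.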

     

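    <Conclusion>

    word2vec: poor at finding MWE boundaries but good at finding their parts; excels at finding single tokens

    In terms of metric scores, FastText gives better results and is good at finding the expression itself

    The more morphologically rich the language, the more lemmas help; for morphologically rich languages, form+lemmas performs best

    In conclusion, subword representations help in finding MWEs; the more morphologically rich the language, the better lemmas+forms works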
