Instructions to use InstaDeepAI/nucleotide-transformer-2.5b-multi-species with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use InstaDeepAI/nucleotide-transformer-2.5b-multi-species with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="InstaDeepAI/nucleotide-transformer-2.5b-multi-species")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("InstaDeepAI/nucleotide-transformer-2.5b-multi-species")
model = AutoModelForMaskedLM.from_pretrained("InstaDeepAI/nucleotide-transformer-2.5b-multi-species")
```
- Notebooks
- Google Colab
- Kaggle
[CLS] for downstream tasks.
#4
by jinyuan22
I noticed a [CLS] token was added to the sequence. Was it used during training? Can I use its embedding as a feature for downstream tasks?
Hi jinyuan22,
In the paper, we use the mean embedding over all token embeddings, [CLS] excluded, as the feature for downstream tasks. You can use the [CLS] token embedding instead and will probably obtain good performance too, but you might not reproduce the exact results from the paper with this approach.
Hope this helps!
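As a minimal sketch of the pooling described above: average the last-layer token embeddings while skipping position 0 (the [CLS] token) and any padding positions. The function name and the toy arrays here are illustrative, not from the paper; in practice `hidden_states` would come from the model's last hidden layer and `attention_mask` from the tokenizer.

```python
import numpy as np

def mean_embedding_excluding_cls(hidden_states, attention_mask):
    """Mean-pool token embeddings, dropping [CLS] and padding.

    hidden_states: (seq_len, dim) array of last-layer token embeddings
    attention_mask: (seq_len,) array, 1 for real tokens, 0 for padding
    """
    # Drop position 0 ([CLS]), then keep only non-padded tokens.
    mask = attention_mask[1:].astype(bool)
    return hidden_states[1:][mask].mean(axis=0)

# Toy example: 4 tokens ([CLS] + 3 sequence tokens), embedding dim 2.
hs = np.array([[9.0, 9.0],   # [CLS] embedding, excluded from the mean
               [1.0, 2.0],
               [3.0, 4.0],
               [5.0, 6.0]])
am = np.array([1, 1, 1, 1])
print(mean_embedding_excluding_cls(hs, am))  # [3. 4.]
```

With a real model you would pass `output_hidden_states=True` to the forward call and apply this pooling to the final hidden-state tensor, per sequence in the batch.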