How to use LLukas22/all-MiniLM-L12-v2-embedding-all with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("LLukas22/all-MiniLM-L12-v2-embedding-all")
sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
How to use LLukas22/all-MiniLM-L12-v2-embedding-all with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("LLukas22/all-MiniLM-L12-v2-embedding-all")
model = AutoModel.from_pretrained("LLukas22/all-MiniLM-L12-v2-embedding-all")
This model is a fine-tuned version of all-MiniLM-L12-v2 on the following datasets: squad, newsqa, LLukas22/cqadupstack, LLukas22/fiqa, LLukas22/scidocs, deepset/germanquad, LLukas22/nq.
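When the checkpoint is loaded directly with Transformers, the model returns one vector per token rather than one vector per sentence, so a pooling step is needed. The sketch below shows attention-masked mean pooling on plain Python lists. It assumes this model keeps the mean pooling used by its all-MiniLM-L12-v2 base, which the card does not state; `mean_pool` and the toy inputs are hypothetical names chosen for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding tokens."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        # No real tokens: return the zero vector instead of dividing by zero.
        return total
    return [value / count for value in total]


# Toy input (assumption, not model output): three token vectors, the last is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

With a real forward pass, the same averaging would be applied to the model output's `last_hidden_state`, using the `attention_mask` returned by the tokenizer.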
Using this model becomes easy when you have sentence-transformers installed:
pip install -U sentence-transformers
Then you can use the model like this:
from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('LLukas22/all-MiniLM-L12-v2-embedding-all')
embeddings = model.encode(sentences)
print(embeddings)
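Since the model was fine-tuned on question-answering datasets, a typical next step is to rank passages against a query by the similarity of their embeddings. The following is a minimal sketch of cosine-similarity ranking in plain Python. The vectors are toy values rather than real model output, and `cosine_similarity` and `rank_passages` are hypothetical helpers, not part of sentence-transformers.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: Sequence[float],
                  passage_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, most similar passage first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): in practice these would come from model.encode(...).
query = [1.0, 0.0]
passages = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]

ranking = rank_passages(query, passages)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Cosine similarity ignores vector length, which is why the passage `[2.0, 0.0]` scores a perfect 1.0 against the query `[1.0, 0.0]`.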
Training results (loss per epoch):
| Epoch | Train Loss | Validation Loss |
|---|---|---|
| 0 | 0.0708 | 0.0619 |
| 1 | 0.0609 | 0.0567 |
| 2 | 0.0531 | 0.0542 |
| 3 | 0.0475 | 0.0528 |
| 4 | 0.0428 | 0.0521 |
| 5 | 0.0389 | 0.0513 |
| 6 | 0.0352 | 0.0508 |
| 7 | 0.0322 | 0.0494 |
| 8 | 0.0289 | 0.0485 |
| 9 | 0.0264 | 0.0483 |
| 10 | 0.0242 | 0.0466 |
| 11 | 0.0221 | 0.0459 |
| 12 | 0.0204 | 0.0469 |
| 13 | 0.0189 | 0.0459 |

Top-k evaluation results per epoch:
| Epoch | top_1 | top_3 | top_5 | top_10 | top_25 |
|---|---|---|---|---|---|
| 0 | 0.507 | 0.665 | 0.721 | 0.784 | 0.847 |
| 1 | 0.501 | 0.661 | 0.719 | 0.783 | 0.846 |
| 2 | 0.508 | 0.669 | 0.726 | 0.789 | 0.851 |
| 3 | 0.507 | 0.665 | 0.722 | 0.785 | 0.85 |
| 4 | 0.506 | 0.667 | 0.724 | 0.788 | 0.851 |
| 5 | 0.511 | 0.673 | 0.731 | 0.795 | 0.857 |
| 6 | 0.51 | 0.674 | 0.732 | 0.794 | 0.856 |
| 7 | 0.512 | 0.674 | 0.732 | 0.796 | 0.859 |
| 8 | 0.515 | 0.678 | 0.736 | 0.799 | 0.861 |
| 9 | 0.514 | 0.679 | 0.737 | 0.8 | 0.862 |
| 10 | 0.52 | 0.683 | 0.741 | 0.803 | 0.864 |
| 11 | 0.522 | 0.686 | 0.744 | 0.806 | 0.866 |
| 12 | 0.519 | 0.683 | 0.741 | 0.804 | 0.864 |
| 13 | 0.522 | 0.685 | 0.743 | 0.806 | 0.865 |
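The card does not say how the top_k columns were computed. One common reading, assumed here, is a hit rate: the fraction of queries whose correct passage appears among the first k retrieved results. The sketch below illustrates that reading on made-up data; `top_k_hit_rate` and the passage IDs are hypothetical and do not reproduce the author's evaluation code.

```python
from typing import List, Sequence


def top_k_hit_rate(ranked_ids: Sequence[Sequence[str]],
                   relevant_ids: Sequence[str],
                   k: int) -> float:
    """Fraction of queries whose relevant passage is within the first k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids, relevant_ids)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_ids)


# Made-up retrieval results for three queries, best match first (assumption).
ranked: List[List[str]] = [
    ["p1", "p7", "p3"],  # relevant passage p1 is at rank 1
    ["p4", "p2", "p9"],  # relevant passage p2 is at rank 2
    ["p5", "p6", "p8"],  # relevant passage p0 was not retrieved
]
gold = ["p1", "p2", "p0"]

print(top_k_hit_rate(ranked, gold, k=1))  # one of three queries hits
print(top_k_hit_rate(ranked, gold, k=3))  # two of three queries hit
```

Under this definition the score can only stay equal or grow as k increases, which matches the left-to-right increase in every row of the table.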
This model was trained as part of my Master's Thesis 'Evaluation of transformer based language models for use in service information systems'. The source code is available on GitHub.