Updated model card.
README.md CHANGED
@@ -1,12 +1,13 @@
 ---
 library_name: transformers
-
+base_model:
+- google-t5/t5-base
 ---
 
 # Model Card for STEP
 
 <!-- Provide a quick summary of what the model is/does. -->
-This model is pre-trained to perform (random) syntactic transformations of English sentences. The prefix given to the model decides
+This model is pre-trained to perform (random) syntactic transformations of English sentences. The prefix given to the model decides which syntactic transformation to apply.
 
 See [Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations](https://arxiv.org/abs/2407.04543) for full details.
 
@@ -34,7 +35,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
 ## Uses
 
-Syntax-sensitive sequence-to-sequence for English such as passivization, semantic parsing, question formation, ...
+Syntax-sensitive sequence-to-sequence tasks for English, such as passivization, semantic parsing, chunking, question formation, ...
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
 ### Direct Use
 
@@ -92,6 +93,8 @@ We identified the following interpretable transformation look-up heads (see pape
 
 - **Hardware Type:** Nvidia 2080 TI
 - **Hours used:** 30
+- **Compute Region:** Scotland
+- **Carbon Emitted:** 0.2 kg CO2eq
 
 ## Technical Specifications
 
@@ -105,13 +108,17 @@ T5-Base, 12 layers, hidden dimensionality of 768.
 
 **BibTeX:**
 ```
-@
-
-
-
-
-
-
-
+@inproceedings{lindemann-etal-2024-strengthening,
+    title = "Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations",
+    author = "Lindemann, Matthias and
+      Koller, Alexander and
+      Titov, Ivan",
+    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
+    month = nov,
+    year = "2024",
+    address = "Miami, Florida, USA",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.emnlp-main.645/",
+    doi = "10.18653/v1/2024.emnlp-main.645",
 }
-```
+```