Instructions to use Lauther/d4-embeddings-v3.0-tl with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Lauther/d4-embeddings-v3.0-tl with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Lauther/d4-embeddings-v3.0-tl")

sentences = [
    "flow computer tags",
    "What is an Uncertainty Curve Point?\nAn Uncertainty Curve Point represents a data point used to construct the uncertainty curve of a measurement system. These curves help analyze how measurement uncertainty behaves under different flow rate conditions, ensuring accuracy and reliability in uncertainty assessments.\n\nKey Aspects of an Uncertainty Curve Point:\n- Uncertainty File ID: Links the point to the specific uncertainty dataset, ensuring traceability.\nEquipment Tag ID: Identifies the equipment associated with the uncertainty measurement, crucial for system validation.\n- Uncertainty Points: Represent a list uncertainty values recorded at specific conditions, forming part of the overall uncertainty curve. Do not confuse this uncertainty points with the calculated uncertainty. \n- Flow Rate Points: Corresponding flow rate values at which the uncertainty was measured, essential for evaluating performance under varying operational conditions.\nThese points are fundamental for generating uncertainty curves, which are used in calibration, validation, and compliance assessments to ensure measurement reliability in industrial processes.\"\n\n**IMPORTANT**: Do not confuse the two types of **Points**:\n    - **Uncertainty Curve Point**: Specific to a measurement system uncertainty or uncertainty simulation or uncertainty curve.\n    - **Calibration Point**: Specific to the calibration.\n    - **Uncertainty values**: Do not confuse these uncertainty points with the single calculated uncertainty.",
    "What is a flow computer?\nA flow computer is a device used in measurement engineering. It collects analog and digital data from flow meters and other sensors.\n\nKey features of a flow computer:\n- It has a unique name, firmware version, and manufacturer information.\n- It is designed to record and process data such as temperature, pressure, and fluid volume (for gases or oils).",
    "What is a Measured Magnitude Value?\nA Measured Magnitude Value represents a **DAILY** recorded physical measurement of a variable within a monitored fluid. These values are essential for tracking system performance, analyzing trends, and ensuring accurate monitoring of fluid properties.\n\nKey Aspects of a Measured Magnitude Value:\n- Measurement Date: The timestamp indicating when the measurement was recorded.\n- Measured Value: The daily numeric result of the recorded physical magnitude.\n- Measurement System Association: Links the measured value to a specific measurement system responsible for capturing the data.\n- Variable Association: Identifies the specific variable (e.g., temperature, pressure, flow rate) corresponding to the recorded value.\nMeasured magnitude values are crucial for real-time monitoring, historical analysis, and calibration processes within measurement systems.\n\nDatabase advices:\nThis values also are in **historics of a flow computer report**. Although, to go directly instead querying the flow computer report you can do it by going to the table of variables data in the database."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])
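
In the snippet above, the first entry ("flow computer tags") reads like a search query and the remaining entries read like candidate documents. A minimal sketch of ranking the documents against the query follows, assuming cosine similarity is the intended metric; the helper names `cosine` and `rank_by_cosine` are hypothetical and not part of sentence-transformers.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, best match first.

    Hypothetical helper: it only assumes each vector is an iterable
    of numbers, so NumPy rows or plain lists both work.
    """
    scores = [cosine(query_vec, doc) for doc in doc_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]
```

With the variables from the snippet above, `rank_by_cosine(embeddings[0], embeddings[1:])` would rank the three definitions against the query. The returned indices are relative to `sentences[1:]`.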

d4-embeddings-v3.0-tl / tokenizer_config.json

Lauther

Add new SentenceTransformer model

92c3f76 verified 11 months ago

1.41 kB

{
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "250001": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
  "cls_token": "<s>",
  "eos_token": "</s>",
  "extra_special_tokens": {},
  "mask_token": "<mask>",
  "max_length": 512,
  "model_max_length": 512,
  "pad_to_multiple_of": null,
  "pad_token": "<pad>",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "</s>",
  "stride": 0,
  "tokenizer_class": "XLMRobertaTokenizerFast",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "<unk>"
}
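
The config above names each special token twice: once by role (`bos_token`, `cls_token`, `sep_token`, and so on) and once by id under `added_tokens_decoder`. A minimal sketch of joining the two views with the standard `json` module follows. `CONFIG_SNIPPET` is an abbreviated copy of the file made for illustration, and `special_token_ids` is a hypothetical helper, not a transformers API.

```python
import json

# Abbreviated copy of tokenizer_config.json (illustration only; the
# per-token flags such as lstrip/rstrip are omitted here).
CONFIG_SNIPPET = """
{
  "added_tokens_decoder": {
    "0": {"content": "<s>", "special": true},
    "1": {"content": "<pad>", "special": true},
    "2": {"content": "</s>", "special": true},
    "3": {"content": "<unk>", "special": true},
    "250001": {"content": "<mask>", "special": true}
  },
  "bos_token": "<s>",
  "cls_token": "<s>",
  "eos_token": "</s>",
  "mask_token": "<mask>",
  "model_max_length": 512,
  "pad_token": "<pad>",
  "sep_token": "</s>",
  "unk_token": "<unk>"
}
"""


def special_token_ids(config):
    """Map each '*_token' role in the config to its vocabulary id."""
    # Invert added_tokens_decoder: token string -> integer id.
    token_to_id = {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
    }
    # Keys ending in "_token" hold the role assignments (bos, cls, ...).
    roles = [key for key in config if key.endswith("_token")]
    return {role: token_to_id[config[role]] for role in roles}


config = json.loads(CONFIG_SNIPPET)
ids = special_token_ids(config)
for role, token_id in sorted(ids.items()):
    print(f"{role}: {config[role]!r} -> {token_id}")
```

Two things stand out when the roles are resolved this way: `bos_token` and `cls_token` share `<s>`, and `eos_token` and `sep_token` share `</s>`. The `model_max_length` of 512 is the configured token limit for inputs.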