---
language:
- ar
license: cc-by-nc-nd-4.0
base_model: Qwen/Qwen2.5-14B
tags:
- arabic
- legal
- islamic-law
- hermeneutics
- text-generation
- qwen2
- arabic-nlp
pipeline_tag: text-generation
library_name: transformers
---

# Bayan-15B

A specialized Arabic Large Language Model for legal reasoning, interpretive methodologies, and classical Arabic text analysis.

## Model Description

Bayan-15B is a domain-adapted language model built on Qwen2.5-14B, fine-tuned on a comprehensive corpus of classical Arabic legal and interpretive texts. The model excels at understanding complex argumentative structures, legal reasoning patterns, and hermeneutical methodologies in Arabic.

## Key Capabilities

- Legal Text Analysis: Understanding and generating classical Arabic legal discourse
- Interpretive Reasoning: Analyzing methodological frameworks and interpretive principles
- Classical Arabic: Deep comprehension of traditional scholarly Arabic writing styles
- Argumentation: Following complex chains of reasoning and evidence-based arguments

## Training Data

- Corpus Size: Approximately 190 million tokens
- Sources: Over 900 classical Arabic texts covering legal theory, interpretive methodology, and jurisprudential reasoning
- Language: Classical and Modern Standard Arabic

## Technical Specifications

| Parameter | Value |
|-----------|-------|
| Base Model | Qwen/Qwen2.5-14B |
| Parameters | 14.7B |
| Training Method | Continued Pre-Training (CPT) |
| Context Length | 2048 tokens |
| Precision | bfloat16 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "MohJaf/Bayan-15B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("MohJaf/Bayan-15B")

prompt = "Your Arabic text here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Use Cases

- Academic research in Arabic legal traditions
- Analysis of classical interpretive methodologies
- Arabic NLP applications requiring domain expertise
- Educational tools for Arabic legal studies
- Compliance and advisory systems for Islamic finance

## Limitations

- Specialized in classical Arabic legal discourse
- Not a substitute for qualified legal or religious experts
- Should be used as a research and analysis tool
- May require domain expertise to evaluate outputs

## License

This model is released under CC BY-NC-ND 4.0.

Academic and research use is permitted. Commercial use requires separate licensing. Modifications and redistribution are not permitted without prior authorization.

For commercial licensing inquiries, please contact the developer.

## Developer

Bayan AI, LLC

Building AI solutions for Arabic language understanding and specialized domains.

## Citation

```bibtex
@misc{usuli-ai-2025,
  author = {Bayan AI},
  title = {Bayan-15B: Arabic Legal Reasoning Language Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/MohJaf/Bayan-15B}
}
```

## Contact

Hugging Face: @MohJaf

Organization: Bayan AI, LLC

---
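
## Example: Prompt Token Budget

The Technical Specifications list a context length of 2048 tokens, and the Usage snippet generates up to 256 new tokens. The sketch below works out how many prompt tokens would fit under those two numbers. It assumes the 2048-token figure is treated as a shared budget for prompt plus generated tokens, which this card does not state explicitly. The helper names `max_prompt_tokens` and `truncate_ids` are hypothetical and not part of any library, and keeping the most recent tokens is one possible truncation choice.

```python
# Assumption: 2048 (from the Technical Specifications table) is used as a
# combined budget for prompt tokens and generated tokens.
CONTEXT_LENGTH = 2048


def max_prompt_tokens(max_new_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many prompt tokens fit next to the requested generation length."""
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be non-negative")
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return context_length - max_new_tokens


def truncate_ids(token_ids: list[int], budget: int) -> list[int]:
    """Keep at most `budget` token ids, dropping the oldest ones first."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Slicing from the end keeps the most recent tokens; shorter inputs
    # are returned unchanged.
    return token_ids[-budget:]


if __name__ == "__main__":
    budget = max_prompt_tokens(256)  # 256 matches max_new_tokens in the Usage snippet
    print(f"Prompt budget: {budget} tokens")
```

With the real tokenizer, a similar effect can be had by passing `truncation=True, max_length=max_prompt_tokens(256)` in the `tokenizer(...)` call from the Usage snippet. Note that by default the tokenizer truncates from the end of the text, not the start.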