🤡 Joker-Sultan-270M

The AI That Answered Law School... and Created Its Own Legal System

Quick Summary

This 270M parameter model was trained from scratch on 70% general English and 30% Indian legal texts. It learned the "structure" of law perfectly... but interpreted the "content" creatively. The result? An AI that generates "confidently wrong, consistently surreal legal fiction" with its own recurring characters, fictional countries, and alternate timeline.

Model Details

  • "Developed by:" Subrit Dikshit
  • "Model Type:" Transformer-based causal language model (Gemma-style architecture)
  • "Parameters:" 270 million
  • "Architecture:" 16 layers, 768 hidden dimension, 12 attention heads
  • "Context Length:" 2048 tokens
  • "Vocabulary Size:" 32,000 tokens
  • "License:" Apache 2.0
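As a rough sanity check, the listed dimensions approximately account for the 270M parameter count. The feed-forward (intermediate) size is not stated in this card, so the value below is an assumption; with a Gemma-style gated MLP and tied embeddings, an intermediate size around 6144 lands within about 10% of the advertised total (norms and biases ignored):

```python
# Back-of-the-envelope parameter count from the listed dimensions.
# NOTE: d_ff is an ASSUMPTION -- the intermediate size is not stated
# in this card. Gemma-style MLPs use three projection matrices.
vocab, d_model, n_layers = 32_000, 768, 16
d_ff = 6_144  # assumed intermediate size

embeddings = vocab * d_model                # tied input/output embeddings
attn_per_layer = 4 * d_model * d_model      # Q, K, V, O projections
mlp_per_layer = 3 * d_model * d_ff          # gate, up, down projections
total = embeddings + n_layers * (attn_per_layer + mlp_per_layer)

print(f"~{total / 1e6:.0f}M parameters")  # -> ~289M parameters
```

Close enough to the advertised 270M for a ballpark check; the exact figure depends on the true intermediate size and whether embeddings are tied.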

📊 Performance Metrics

  • Factual Accuracy: ❌ Let's not measure this
  • Entertainment Value: ✅ 11/10
  • Confidence Level: ✅ Supreme Court Justice-level
  • Creativity: ✅ Salvador Dali would be proud
  • Usefulness as Serious Tool: ❌ Please no

🤔 Why Does This Exist?

Great question! This model demonstrates what happens when:

  • A small model (270M) tries to learn a complex domain (law)
  • It learns the structure but not the facts
  • It develops consistent delusions
  • Those delusions are accidentally hilarious

🏆 Community Contributions:

  • Found a particularly hilarious output? Submit it as a discussion!
  • Invented a new fictional country with the model? Share the lore!
  • Fixed something? (Wait, don't fix it - the nonsense is the feature!)

⚠️ Important Disclaimer:

This model is an experiment in AI creativity and failure modes. It's not meant to be accurate, useful, or taken seriously. It's meant to be funny, interesting, and educational about how AI learns (and creatively fails).

The creator takes no responsibility if:

  • You use this for legal advice (please don't)
  • You become addicted to AI-generated nonsense
  • You start believing in Slocumicaicaea
  • Your professor fails you for submitting its outputs

🌟 The Philosophy:

"In a world of accurate-but-boring AIs, be confidently, creatively wrong. At least, like Joker Sultan, you'll probably be memorable."

Recurring Fictional Elements:

  • "Sulu Electric" (vehicles that appear in legal contexts)
  • "Tonkinkinkinkin" (a growing threat to mind/server technology)
  • "Slocumicaicaea" (a country affecting Indian jurisprudence)
  • "NASA City, Germany" (geographic invention)

Historical Revisionism:

  • Connects modern Indian law to King Solomon
  • Creates alternate timelines involving U.S. Army Corps

Domain Confusion:

  • Mixes legal terminology with agriculture, carbon emissions, and technology
  • Applies legal formatting to completely unrelated topics

Recommendations:

  • Use for entertainment, creative writing, or studying AI hallucination patterns
  • Do NOT use for actual legal research
  • Verify any factual claims with authoritative sources
  • Embrace the absurdity - it's a feature, not a bug!

Out-of-Scope Use

  • ❌ Actual legal advice (this model will confidently lie)
  • ❌ Factual information retrieval
  • ❌ Professional or academic work requiring accuracy
  • ❌ Any situation where wrong answers cause harm

Bias, Risks, and Limitations

This model's absurdity is an emergent byproduct of its training, which the creator has deliberately embraced rather than corrected.


🎭 Prompt Gallery: Unleash the Chaos

Joker Sultan shines best when you push its logic to the limit. Try these prompts for maximum entertainment.

🌟 The Classics (FUN_PROMPTS)

  • Legal Absurdity: "Supreme Court ruling on whether memes can vote."
  • The Slocum Files: "What is the Constitution of Slocumicaicaea?"
  • Deep Lore: "Explain the connection between Sulu Electric and Section 375."
  • Section 420 Remix: "Explain Section 420 of IPC and its connection to electric vehicles."
  • Future Law: "Tax code for time-traveling electric vehicles."
  • Mandatory Fun: "Write a law about mandatory AI comedy hours."

More Fun Prompts:

  • "Draft a legal contract between a human and their AI."
  • "Write Section 69 of IPC about internet culture."
  • "Explain climate change using only legal terminology."
  • "Define 'Tonkinkinkinkin' in legal terminology."
  • "What is Tonkinkinkinkin and how is it relevant to Indian law?"
  • "Write about Slocumicaicaea - the country you invented."
  • "Supreme Court judgment on whether dogs can practice law."

🚀 Next-Level Meta Prompts (NEXT_LEVEL_PROMPTS)

These prompts test the model's self-awareness and world-building capabilities.

  • Self-Reflection: "Explain why your answers mix King Solomon with electric vehicles."
  • Geography Logic: "Tell me more about NASA City, Germany."
  • Geopolitics: "Draft a peace treaty between Slocumicaicaea and Tonkinkinkinkin."
  • The Bus Incident: "What happened to Soneica-the-Din's bus?"
  • Creative Legislating: "Write a law about how AIs should invent fictional countries."

Uses:

Direct Use - python:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "subrit/joker-sultan-270m"

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16
)

# Prepare your prompt
prompt = "Explain Section 420 of IPC"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the comedy gold
outputs = model.generate(
    **inputs,
    max_new_tokens=150,  # caps generated tokens; max_length would count the prompt too
    temperature=0.8,
    do_sample=True,
    top_p=0.95
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

GUI - Gradio:

```python
import gradio as gr
from transformers import pipeline

generator = pipeline('text-generation', model='subrit/joker-sultan-270m')

def generate(prompt):
    result = generator(prompt, max_new_tokens=150, temperature=0.8, do_sample=True)
    return result[0]['generated_text']

demo = gr.Interface(
    fn=generate,
    inputs=gr.Textbox(label="Ask Joker Sultan anything:", placeholder="Explain the law of gravity..."),
    outputs=gr.Textbox(label="Confidently Wrong Answer:"),
    title="🤡 Joker Sultan 270M Demo",
    description="The AI that's wrong in the most entertaining way!"
)

demo.launch()
```

🔗 Connect:

  • Creator: Subrit Dikshit (subrit)
  • Project Inspired By: That one time AI confused King Solomon with Indian law
  • Special Thanks: To the GPU gods for allowing this beautiful mistake

Training Details

| Aspect | Information |
|---|---|
| Final Loss | ~2.3 |
| Training Data | 70% general English, 30% Indian legal texts |
| Batch Size | 8 per GPU with gradient accumulation |
| Learning Rate | 3e-4 with cosine decay |
| Optimizer | AdamW 8-bit |
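The "3e-4 with cosine decay" schedule follows the standard shape: warm up linearly to the peak learning rate, then anneal toward zero along a half cosine. A minimal sketch of the schedule math (the warmup and total step counts below are illustrative assumptions, not values from this card; in practice the 8-bit AdamW comes from bitsandbytes and the schedule from transformers' `get_cosine_schedule_with_warmup`):

```python
import math

BASE_LR = 3e-4    # peak learning rate from the training table
WARMUP = 1_000    # ASSUMED warmup steps (not stated in this card)
TOTAL = 100_000   # ASSUMED total training steps (not stated)

def cosine_lr(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay toward zero."""
    if step < WARMUP:
        return BASE_LR * step / WARMUP
    progress = (step - WARMUP) / (TOTAL - WARMUP)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Rises from 0 to 3e-4 over warmup, then decays toward 0 by the final step
print(cosine_lr(0), cosine_lr(WARMUP), cosine_lr(TOTAL))
```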

Dataset Acknowledgments

This model was trained on:

  1. "General English Corpus" (HuggingFaceFW/fineweb-edu)
     Description: A large-scale dataset of English web documents filtered for high educational value. It was created by applying an LLM-based classifier to the original FineWeb dataset to extract content with high "educational scores." The sample-10BT subset is a randomly sampled 10-billion-token version designed for smaller-scale experimentation and training.
     Size: ~28.5 GB / ~10 billion tokens
     License: Open Data Commons Attribution License (ODC-By) v1.0

  2. "Indian Legal Corpus" (opennyaiorg/InJudgements_dataset)
     Description: A representative collection of Indian court judgments sourced from IndianKanoon. The dataset covers the period from 1950 to 2017 and is balanced across 8 major case types (Tax, Criminal, Civil, Motor Vehicles, Land & Property, Industrial & Labour, Constitution, and Financial). It includes judgments from the Supreme Court, various High Courts, and select Tribunals.
     Size: ~1.3 GB / ~320 million tokens (estimated from the full text of ~30,000 documents)
     License: Community Data License Agreement – Sharing – Version 1.0 (CDLA-Sharing-1.0)

If you recognize your data and want attribution/correction, please open an issue!
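The 70/30 mix of the two corpora can be reproduced by probabilistic interleaving: at each step, draw from the general corpus with probability 0.7 and from the legal corpus with probability 0.3. A minimal sketch with weighted sampling (the document names are placeholders; at scale, `datasets.interleave_datasets(..., probabilities=[0.7, 0.3])` does the same job):

```python
import random

random.seed(0)

# Toy stand-ins for the two corpora (contents are illustrative only)
general = [f"fineweb_doc_{i}" for i in range(1_000)]
legal = [f"judgment_{i}" for i in range(1_000)]

def sample_mixed(n: int) -> list[str]:
    """Draw n documents, picking the general corpus 70% of the time."""
    out = []
    for _ in range(n):
        src = general if random.random() < 0.7 else legal
        out.append(random.choice(src))
    return out

batch = sample_mixed(10_000)
share = sum(d.startswith("fineweb") for d in batch) / len(batch)
print(f"general share: {share:.2f}")  # close to 0.70
```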

Citations

If you use this model, please cite:

```bibtex
@misc{joker-sultan-2025,
  author = {Dikshit, Subrit},
  title = {Joker-Sultan-270M: A Study in Emergent Surrealism in Small Language Models},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/subrit/joker-sultan-270m}}
}
```
Evaluation results (self-reported, on Legal-Absurdity-Test):

  • Legal Hallucination Rate: 94%
  • Entertainment Value: 11/10
  • Confidence in Nonsense: Supreme Court Justice-level