openai-gpt-oss-grants (OpenAI gpt-oss Grants)

posted an update 4 days ago

Post

137

Reinforcement learning can sometimes produce emergent behavior with much simpler training setups than large-scale pre-training.

I explored this idea by running a small GRPO experiment on Qwen3.5 4B, and the results were pretty exciting.

Hypothesis: improving visual mathematical reasoning may also improve the model’s ability to transcribe LaTeX from images.

I wrote a short breakdown of the experiment here:
https://hanzlajavaid.github.io/blog/grpo-experiment-exploring-emergent-properties/
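
For readers unfamiliar with GRPO, here is a minimal sketch of its group-relative advantage step as it is commonly described: several completions are sampled for the same prompt, and each reward is normalized against the mean and standard deviation of its own group. This is not the code from the experiment. The reward values, the function name `group_relative_advantages`, the `eps` constant, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own group's mean and std deviation.

    `rewards` holds one scalar reward per completion sampled from the SAME
    prompt. `eps` (an assumed constant) avoids division by zero when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of one prompt
# (e.g. 1.0 = correct final answer, 0.0 = incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and those below get a negative one, so no separate value model is needed to provide a baseline.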

mindhunter23

authored 2 papers 3 months ago

TANDEM: Temporal-Aware Neural Detection for Multimodal Hate Speech

Paper • 2601.11178 • Published Jan 16

Cyberbullying Detection via Aggression-Enhanced Prompting

Paper • 2508.06360 • Published Aug 8, 2025

mindhunter23

authored a paper 6 months ago

The Mind's Eye: A Multi-Faceted Reward Framework for Guiding Visual Metaphor Generation

Paper • 2508.18569 • Published Aug 26, 2025

b1l4lx1

authored a paper 8 months ago

Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

Paper • 2508.16745 • Published Aug 22, 2025 • 29

romainhuet

updated a Space 9 months ago

README

📈

romainhuet

published a Space 9 months ago

README

📈

mindhunter23

authored a paper 12 months ago

CAMU: Context Augmentation for Meme Understanding

Paper • 2504.17902 • Published Apr 24, 2025

hanzla

posted an update about 1 year ago

Post

2184

Hi community,

A few days ago, I posted about my ongoing research on building reasoning Mamba models, and I received great insights from the community.

Today, I am announcing an update to the model weights. With the newer checkpoints, the Falcon3 Mamba R1 model now outperforms very large transformer-based LLMs (including Gemini) on the Formal Logic subset of MMLU. It scores 60% on Formal Logic, which is considered a tough subset of MMLU questions.

I would highly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner

hanzla

posted an update about 1 year ago

Post

4115

Hello community,

I want to share my work on creating a reasoning Mamba model.

I applied GRPO to Falcon3 Mamba Instruct to build this model. It generates responses very quickly while building sound logic to answer challenging questions.

Give it a try:

Model repo: hanzla/Falcon3-Mamba-R1-v0

Space: hanzla/Falcon3MambaReasoner

Looking forward to community feedback.


hanzla

posted an update about 1 year ago

Post

1275

Gemma 3 is a game changer for on-device multimodal applications.

See for yourself how good a 4-billion-parameter model can be.

hanzla/PlaygroundGemma3