embedl/Cosmos-Reason2-2B-W4A16-Edge2 Image-Text-to-Text • 2B • Updated about 6 hours ago • 795 • 12
Cosmos-Reason2 Collection nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 13 items • Updated 1 day ago • 4
Post: I made a visualizer for Hugging Face models: https://hfviewer.com Simply paste a Hugging Face URL to get an interactive visualization of the architecture! The recent Qwen3.6-27B model as an example: https://hfviewer.com/Qwen/Qwen3.6-27B Feel free to try it out and give me feedback on how it can be improved!
Post: Hugging Face Visualizer, now as a Chrome extension! https://hfviewer.com After installing, Hugging Face model pages will show an architecture visualization on the model page itself! Link: https://chromewebstore.google.com/detail/hugging-face-viewer/mmadlggmpkpiockpjfepaohcllbnakej Thanks for all the nice feedback so far!
embedl/Qwen3.5-0.8B-FlashHead Image-Text-to-Text • 0.9B • Updated about 6 hours ago • 408 • 1
Post: Qwen3.5, up to 1.4× faster. Same quality. Less latency. We applied FlashHead to the Qwen3.5 family: a novel drop-in replacement for the LM head with measurably lower latency on edge hardware. Benchmarks and models below. embedl/Edge-Inference-Benchmarks https://huggingface.co/collections/embedl/qwen35
NVIDIA Jetson AGX Orin Collection Models optimized and benchmarked for NVIDIA Jetson AGX Orin. Memory-efficient and latency-optimized variants designed for real-time edge inference. • 8 items • Updated 20 days ago • 3
NVIDIA Jetson AGX Thor Collection Models validated and performance-optimized for NVIDIA Jetson AGX Thor. Tailored for high-performance edge AI workloads. • 7 items • Updated 20 days ago • 1
FlashHead Collection Efficient Drop-In Replacement for the Classification Head in Language Model Inference. https://github.com/embedl/flash-head • 24 items • Updated 20 days ago • 2