
Gan Feng

vonjack



Recent Activity




commented on The Optimal Architecture for Small Language Models about 3 hours ago

Tested on MPS with different block sizes and step counts:

Benchmarking Dhara-70M | Device: mps:0 | Target: 50 tokens

Config | Time | Speed(TPS) | Accel | Output Preview

Original AR | 2.087s | 24.0 | 1.00x | The future of artificial intelligence is a big challenge. This world has the pot...
Diff (B1/S50) | 2.392s | 20.9 | 0.87x | The future of artificial intelligence is the most-first thing. This article was ...
Diff (B2/S25) | 1.162s | 43.0 | 1.80x | The future of artificial intelligence is the What. What1: The Future Future? ...
Diff (B5/S10) | 0.382s | 131.0 | 5.47x | The future of artificial intelligence is the This.,!!."."!!!!!!!!!!!!!!!!!!!!...
Diff (B10/S5) | 0.192s | 260.5 | 10.87x | The future of artificial intelligence is the !!!!!!!!!!!!!!!!!!!!!!!!!!!!!...

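It seems that quality is best with AR decoding or with diffusion configured to behave like AR (B1/S50), but both are slow.
Larger blocks with fewer steps run faster, but the output degrades: B2/S25 is already incoherent, and B5/S10 and B10/S5 collapse into repeated punctuation.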
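For anyone who wants to check the Speed and Accel columns, here is a minimal sketch that recomputes them from the reported times. It assumes TPS is simply target tokens divided by wall-clock time, that Accel is the AR time divided by the run's time, and that B and S stand for block size and number of steps (so B × S equals the 50-token target). The `summarize` helper and the `runs` list are illustrative names, not part of the original benchmark script, and results can differ from the table in the last digit because the times above are already rounded.

```python
# Times (seconds) copied from the table above; B = block size, S = steps (assumed meaning).
TARGET_TOKENS = 50

runs = [
    # (label, block_size, steps, seconds)
    ("Original AR", None, None, 2.087),
    ("Diff (B1/S50)", 1, 50, 2.392),
    ("Diff (B2/S25)", 2, 25, 1.162),
    ("Diff (B5/S10)", 5, 10, 0.382),
    ("Diff (B10/S5)", 10, 5, 0.192),
]


def summarize(runs, target_tokens=TARGET_TOKENS):
    """Recompute tokens/second and speedup relative to the first (AR) run."""
    baseline_seconds = runs[0][3]
    rows = []
    for label, block_size, steps, seconds in runs:
        rows.append(
            {
                "label": label,
                "tps": target_tokens / seconds,
                "speedup": baseline_seconds / seconds,
                # Tokens produced per step; None for the AR baseline.
                "tokens_per_step": None if steps is None else block_size * steps / steps,
                # Sanity check that block size times steps covers the target length.
                "covers_target": steps is None or block_size * steps == target_tokens,
            }
        )
    return rows


for row in summarize(runs):
    print(f"{row['label']:<14} {row['tps']:7.1f} TPS  {row['speedup']:5.2f}x")
```

Under these assumptions the speedup tracks the block size closely (about 10.9x at B10), which suggests the cost is dominated by the number of steps rather than by the tokens produced per step.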

New activity in codelion/dhara-70m 2 days ago

Training/Finetuning

#4 opened 17 days ago by

liked a model 5 days ago

codelion/dhara-70m

Text Generation • 71.3M • Updated 20 days ago • 3.68k • 41

liked a model 3 months ago

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated Dec 11, 2025 • 12.4k • 1.49k

liked a model 4 months ago

vonjack/granite-docling-258M-gguf

0.2B • Updated Sep 19, 2025 • 28 • 1

updated a model 4 months ago

vonjack/granite-docling-258M-gguf

0.2B • Updated Sep 19, 2025 • 28 • 1

published a model 4 months ago

vonjack/granite-docling-258M-gguf

0.2B • Updated Sep 19, 2025 • 28 • 1

New activity in mradermacher/model_requests 4 months ago

https://huggingface.co/ibm-granite/granite-docling-258M

#1392 opened 4 months ago by

liked a Space 4 months ago

granite-docling-258M demo

Convert images to structured text and answer questions

liked a model 4 months ago

ibm-granite/granite-docling-258M

Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 205k • 1.09k

liked 4 models 5 months ago

docling-project/docling-layout-heron-101

76.7M • Updated 14 days ago • 2.4k • 5

unsloth/gemma-3-270m-it-GGUF

Text Generation • 0.3B • Updated Aug 15, 2025 • 16.9k • 145

docling-project/docling-layout-heron

42.9M • Updated 14 days ago • 766k • 26

docling-project/docling-layout-egret-medium

19.5M • Updated 14 days ago • 661 • 2

liked a model 6 months ago

karpathy/tinyllamas

Updated Aug 15, 2023 • 178

liked a model 8 months ago

codys12/bitnet-r1-qwen-32b

Text Generation • Updated May 12, 2025 • 7 • 10

liked a model 9 months ago

microsoft/bitnet-b1.58-2B-4T-gguf

Text Generation • 2B • Updated Dec 17, 2025 • 4.55k • 222

liked a model 10 months ago

docling-project/SmolDocling-256M-preview

Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 18.9k • 1.61k

liked 2 models 11 months ago

topdu/OpenOCR

Updated Nov 25, 2024 • 4

Qwen/QwQ-32B

Text Generation • 33B • Updated Mar 11, 2025 • 89.4k • 2.88k