From 4M comments to a style-controlled comment generator
tl;dr summary
I cleaned and deduped 4M scraped comments, bootstrapped style labels with a DeBERTaV3 classifier + pseudo-labeling, then fine-tuned SmolLM3 with LoRA to generate comments in controllable styles.
I ended up with 4M+ comments, each paired with a username and a short description of the content being commented on.
I didn’t want a chatbot. I wanted a generator that can produce believable comments in a requested style.
The whole pipeline
- Clean + dedup: normalize text, drop very short comments (< 12 chars), exact dedup by hash, near-dedup with MinHash/LSH.
- Label: hand-label ~1k, fine-tune DeBERTaV3 classifier, then pseudo-label with a human in the loop.
- Filter hard: keep only high-confidence labels (>= 70%), treat noise as a first-class bucket and exclude it from training.
- Train: fine-tune SmolLM3 (3B) with LoRA (SFT-only) to condition on the target style.
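To make the clean + dedup step concrete, here is a minimal stdlib-only sketch, not the code I actually ran. The normalization rules (NFKC, lowercasing, whitespace collapsing), applying the 12-character cutoff after normalization, the 5-character shingles, and the 64 hash functions are all assumptions for illustration. It shows exact dedup by hash plus a MinHash similarity estimate; the LSH bucketing that makes near-dedup tractable at 4M rows is left out.

```python
import hashlib
import re
import unicodedata

MIN_CHARS = 12  # from the post: drop comments shorter than 12 characters


def normalize(text):
    """Unicode-normalize, collapse whitespace, lowercase (assumed rules)."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def exact_dedup(comments):
    """Keep the first occurrence of each normalized comment, keyed by SHA-1."""
    seen, kept = set(), []
    for raw in comments:
        norm = normalize(raw)
        if len(norm) < MIN_CHARS:
            continue  # too short to carry a usable style signal
        digest = hashlib.sha1(norm.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(norm)
    return kept


def shingles(text, k=5):
    """Character k-grams: the set representation that MinHash compares."""
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def minhash(text, num_perm=64):
    """Signature with one minimum per salted hash function."""
    grams = shingles(text)
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "little")
        signature.append(min(
            hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest()
            for g in grams
        ))
    return signature


def estimated_jaccard(a, b):
    """Fraction of matching signature slots, which approximates Jaccard."""
    sig_a, sig_b = minhash(a), minhash(b)
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

In a real run you would band the signatures into LSH buckets and only compare comments that collide, instead of calling `estimated_jaccard` on every pair.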
Styles:
happy, toxic, sarcastic, cringe, wholesome (+ noise)
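The filter and the style conditioning can be sketched as below. The five styles and the 0.70 threshold come from the pipeline above; the field names (`label`, `confidence`, `description`, `comment`) and the prompt template are hypothetical, chosen only to show the shape of an SFT example. The username field is ignored here.

```python
STYLES = {"happy", "toxic", "sarcastic", "cringe", "wholesome"}
MIN_CONFIDENCE = 0.70  # threshold from the post


def keep_for_training(example):
    """True if the pseudo-label is a real style (not noise) and confident."""
    return example["label"] in STYLES and example["confidence"] >= MIN_CONFIDENCE


def to_sft_example(example):
    """Build a prompt/completion pair conditioned on the target style.

    The template is an assumption, not the exact format used for training.
    """
    prompt = (
        f"Style: {example['label']}\n"
        f"Content: {example['description']}\n"
        "Comment:"
    )
    return {"prompt": prompt, "completion": " " + example["comment"]}


def build_training_set(examples):
    """Drop noise and low-confidence rows, then format what is left."""
    return [to_sft_example(e) for e in examples if keep_for_training(e)]
```

Because `noise` is a label the classifier can predict, a confidently classified noise row is dropped by the style check even though it passes the confidence threshold.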
Quick sanity check
I ran a simple arena-style “spot the model” test (n = 500) and only got it right about 57% of the time.
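For a rough sense of what 57% on 500 trials means, here is a small back-of-the-envelope check. It assumes each trial is a two-way choice with a 50% chance baseline and that trials are independent; neither is guaranteed by the setup above, so treat the numbers as indicative only. It uses a normal approximation to the binomial.

```python
import math


def accuracy_summary(n, accuracy, chance=0.5):
    """Normal-approximation summary of a guessing test.

    `chance` is the assumed baseline for a random guesser.
    """
    correct = round(n * accuracy)
    p_hat = correct / n
    # z-score of the observed accuracy against the chance baseline
    z = (p_hat - chance) / math.sqrt(chance * (1 - chance) / n)
    # 95% Wald interval around the observed accuracy
    half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return {
        "correct": correct,
        "z": z,
        "ci95": (p_hat - half_width, p_hat + half_width),
    }


summary = accuracy_summary(n=500, accuracy=0.57)
```

Under those assumptions, 57% is about 285 correct calls, with an interval of roughly 53% to 61%: better than a coin flip, but not by a wide margin.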
The main lesson: data cleaning + strict filtering beats raw scale when you want controllable style.