Project LLMao
Lightweight Language Models for Anti-sarcasm Output
Sarcasm style transfer with 14 small language models — T5, BART, and LLaMA variants fine-tuned to rewrite sarcastic news headlines as neutral, factual equivalents while preserving meaning. Our best model (T5-Joint) achieves 43.6% strict success on human evaluation.
01 Data Pipeline
How 28,619 NHDSD headlines became 89,688 strategy-annotated training pairs through LLM generation and cross-validation.
02 Model Training
Exact hyperparameters and loss formulations across four recipes — SFT seq2seq, REINFORCE + KL, LoRA instruction tuning, and the 6-way ablation.
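The REINFORCE + KL recipe can be sketched as a policy-gradient surrogate shaped by a KL penalty toward a frozen reference model. This is a minimal toy version in plain Python; the function name, the per-token KL estimate, and the default `beta` are illustrative assumptions, not the project's exact formulation (see the training page for the real hyperparameters).

```python
def reinforce_kl_loss(logprobs, ref_logprobs, reward, beta=0.1):
    """Toy REINFORCE objective with a KL penalty.

    logprobs / ref_logprobs: per-token log-probabilities of the sampled
    rewrite under the tuned policy and the frozen reference model.
    reward: scalar score of the rewrite from the task scorer.
    Hypothetical sketch -- the repo's loss may shape rewards differently.
    """
    # Monte-Carlo KL estimate for the sampled sequence:
    # sum_t [log pi(a_t) - log pi_ref(a_t)]
    kl = sum(lp - rlp for lp, rlp in zip(logprobs, ref_logprobs))
    # Shaped return: reward minus the KL penalty.
    shaped = reward - beta * kl
    # REINFORCE surrogate: minimise -(shaped return) * log-prob of sequence.
    return -shaped * sum(logprobs)
```

Minimising this pushes the policy toward high-reward rewrites while the KL term keeps it from drifting too far from the reference model's fluent distribution.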
03 Evaluation
What each of the 7 metrics measures, why we use it, where it breaks down. Read this before the dashboard.
04 Dashboard
Compare 14 models across 7 evaluation metrics with interactive charts and strategy breakdowns.
05 Sample Explorer
Browse 2,857 test samples with filtering, search, and side-by-side model comparison.
06 Playground
Type a sarcastic headline and watch our models rewrite it in real time via LMStudio.
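LMStudio exposes an OpenAI-compatible HTTP API (by default on localhost port 1234), so the playground's request can be sketched as an ordinary chat-completion payload. The model identifier and system prompt below are placeholders, not the project's actual configuration:

```python
# Default LMStudio OpenAI-compatible endpoint; adjust if your server
# runs on a different host/port.
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"

def build_rewrite_request(headline, model="t5-joint"):
    """Build the JSON body for a de-sarcasm rewrite request.

    `model` is whatever identifier LMStudio shows for the loaded model;
    the system prompt here is a hypothetical stand-in.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Rewrite the sarcastic headline as a neutral, factual one."},
            {"role": "user", "content": headline},
        ],
        "temperature": 0.0,  # deterministic rewrites for the playground
    }
```

POST this body to `LMSTUDIO_URL` with any HTTP client and read the rewrite from the first choice's message content.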
07 Human Evaluation
140 samples × 3 models × 2 annotators (inter-annotator κ > 0.8). Three sarcasm classifiers all disagree with humans (κ = −0.11 to +0.18) — receipts inside.
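The κ figures above are Cohen's kappa: agreement between two raters corrected for the agreement expected by chance. A minimal reference implementation, for readers who want to reproduce the numbers from the released annotations:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two raters labelling the same items.

    a, b: equal-length sequences of labels (any hashable type).
    Returns 1.0 for perfect agreement, 0.0 for chance-level
    agreement, and negative values for below-chance agreement.
    """
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items where raters match.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each rater's label marginals.
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[label] * cb[label] for label in set(ca) | set(cb)) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Values near −0.11 to +0.18 mean the classifiers agree with the human judgements barely better (or worse) than random guessing would.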