Evals
This site, evaluated honestly.
I like models with one job, and evals that prove it. It would be hypocritical to ship a website without one. Two tables: real measurements, then an LLM-as-judge pass with a judge who is, in fairness, me.
The numbers (real)
| Metric | Value | Verdict | Notes |
|---|---|---|---|
| Lighthouse · Performance | 99 / 100 | PASS | static site, optimized images, zero framework JS |
| Lighthouse · Accessibility | 96 / 100 | PASS | two findings under appeal |
| Lighthouse · Best practices | 100 / 100 | PASS | |
| Lighthouse · SEO | 100 / 100 | PASS | |
| First contentful paint | 1.2 s | PASS | |
| Largest contentful paint | 2.0 s | PASS | |
| Cumulative layout shift | 0.000 | PASS | was 0.206 at the first audit. the judge flagged it. fixed the same day. |
| Total transfer (home) | ~214 KB | PASS | most of it is fonts and my face |
| Client-side frameworks | 0 | PASS | hand-written vanilla scripts only |
| Em dashes shipped | 0 | PASS | hard ban, enforced at the source |
| Pages built | 11 in ~1 s | PASS | Astro, static output |
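The page does not say what earns a PASS, so here is one hedged way to make the verdicts reproducible. This is a minimal sketch that assumes Google's published "good" cutoffs (Lighthouse category score of 90 or more, FCP at most 1.8 s, LCP at most 2.5 s, CLS at most 0.1). The `Budget` type, the metric keys, and the `verdict` function are hypothetical names for illustration, not the site's actual tooling.

```typescript
// Hypothetical verdict helper. The thresholds are Google's published "good"
// cutoffs, assumed here; the page does not state its own pass criteria.
type Verdict = "PASS" | "FAIL";

interface Budget {
  metric: string;
  min?: number; // value must be >= min
  max?: number; // value must be <= max
}

const budgets: Budget[] = [
  { metric: "lighthouse-score", min: 90 },
  { metric: "fcp-seconds", max: 1.8 },
  { metric: "lcp-seconds", max: 2.5 },
  { metric: "cls", max: 0.1 },
];

function verdict(metric: string, value: number): Verdict {
  const budget = budgets.find((b) => b.metric === metric);
  if (!budget) throw new Error(`no budget defined for ${metric}`);
  const aboveMin = budget.min === undefined || value >= budget.min;
  const belowMax = budget.max === undefined || value <= budget.max;
  return aboveMin && belowMax ? "PASS" : "FAIL";
}

console.log(verdict("cls", 0.0));   // PASS
console.log(verdict("cls", 0.206)); // FAIL (the first-audit value)
```

Under these assumed cutoffs, every Lighthouse and timing row in the table passes, and the first-audit layout shift of 0.206 would not have.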
The judge (less real)
| Check | Verdict | Notes |
|---|---|---|
| Hero buzzword density | PASS | no 'passionate', no 'leverage', no 'journey'. clean. |
| Footer jokes | PASS | barely |
| Arsenal references | FLAGGED | 11 across 6 pages. flagged excessive. appeal denied. |
| Side-project count | WONTFIX | acknowledged. no remediation planned. |
| Easter-egg discoverability | PASS | keep clicking things |
| Humility | FAIL | this page exists |
Methodology: top table is a real Lighthouse run on the production build (headless Chrome, measured June 2026), plus counts from the build output. Bottom table is LLM-as-judge without human anchoring, which every eval person will tell you is malpractice. Appeals: shubhamgoel27@gmail.com.
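For the counted rows, something like the following could produce the em dash number. It is a sketch only: `findEmDashes` and `Hit` are hypothetical names, and it assumes the check runs over source text that has already been read into a string, which may not be how the real ban is enforced.

```typescript
// Hypothetical source check; names are illustrative, not the site's real script.
interface Hit {
  line: number;   // 1-based line number
  column: number; // 1-based column number
}

// Written as an escape so this file would pass its own check.
const EM_DASH = "\u2014";

function findEmDashes(text: string): Hit[] {
  const hits: Hit[] = [];
  text.split("\n").forEach((content, index) => {
    let col = content.indexOf(EM_DASH);
    while (col !== -1) {
      hits.push({ line: index + 1, column: col + 1 });
      col = content.indexOf(EM_DASH, col + 1);
    }
  });
  return hits;
}

const sample = "clean line\nbad \u2014 line";
console.log(findEmDashes(sample).length); // 1 (line 2, column 5)
```

A build step could call this on each source file and fail when the total is above zero, which would keep the "0" in the table honest.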