Every AI shop has a tool they’re trying to sell you. We don’t have one.
Most AI shops have a house technique, and it’s usually what they recommend: fine-tuning shops fine-tune, RAG consultancies do RAG, and the advice follows the product, not your task. Baseweight doesn’t have one to sell. We pick whatever wins on your data and prove it before you build, and with nothing of our own to push, the recommendation can follow the evidence instead of the sales sheet. That independence is the whole point of the company.
What We Believe
Proof before build
We benchmark whether an owned model wins on your task before anyone commits to building it. The evidence comes first.
Ownership over dependency
You should own your model, your weights, and the test that proves it works. Full stop.
Patterns over bespoke
Every engagement adds to a library of task and failure patterns we carry forward, so the work gets faster and better grounded each time. You’re not paying to reinvent it.
Honesty over revenue
If an owned model won’t beat what you have, we’ll tell you, and you keep the test that says so. We’d rather earn trust than manufacture scope.
Founded by Philip Stevens
15 years in applied ML. Production work at Agoda building personalization and recommendation systems at scale, and at Quantcast managing the end-to-end ML lifecycle for core targeting models: feature engineering, model architecture, and domain drift monitoring.
Baseweight exists to supply the judgment most teams don’t have in-house and can’t justify hiring for: deciding what “correct” means for your task, choosing the technique the evidence supports, and verifying it holds, so you get an owned model that wins without standing up an ML team.
- Fine-tuning (LoRA, QLoRA, full)
- Eval design & regression harnesses
- RAG pipeline hardening
- DPO alignment
- Agent workflow design
- Inference optimization
- MSc Computer Science, Univ. of Auckland
We publish open-model-vs-API benchmarks. Get the next one when it drops.
The public benchmark is live, with more tasks and models underway. Leave your email and we’ll send the next one: methodology, failure analysis, and per-task numbers. Technical content only, no pitch sequence.