Fine-Tuning as a Service

We tune. On your hardware. On your data.

Small models tuned on real workflows can outperform much larger general-purpose models on those workflows. We train the adapters. You keep the data. PHI never leaves your building.

Why we do the tuning

Tuning is not self-serve.

Self-serve tuning in a hospital is a liability minefield. Bad training data produces a degraded model, and there’s no one accountable for what went wrong. Compliance teams need a named human on the hook for what went into the model.

GofarAI engineers perform every tuning engagement. Controlled quality. Named accountability. And every engagement makes us better at the next one.

Technical Approach

LoRA and QLoRA adapters, on your GPUs.

Freeze the base. Train an adapter.

The base model weights stay frozen. We train a small adapter (typically less than 1% of the model’s parameter count) that carries all the workflow-specific behavior. Adapters are portable, auditable, and cheap to iterate.
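As a rough illustration of the "less than 1%" figure, the sketch below counts LoRA parameters for a hypothetical 7B-parameter model. The layer count, hidden size, rank, and choice of adapted projections are assumptions made for the arithmetic, not a description of any specific engagement.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical 7B-class model (assumed values, for illustration only).
BASE_PARAMS = 7_000_000_000
NUM_LAYERS = 32
HIDDEN = 4096
RANK = 16
ADAPTED_PER_LAYER = 2  # assume two HIDDEN x HIDDEN projections per layer get an adapter

adapter_params = NUM_LAYERS * ADAPTED_PER_LAYER * lora_params(HIDDEN, HIDDEN, RANK)
adapter_fraction = adapter_params / BASE_PARAMS

print(f"adapter parameters: {adapter_params:,}")
print(f"share of base model: {adapter_fraction:.2%}")
```

Under these assumptions the adapter has about 8.4 million trainable parameters, roughly 0.12% of the base model. A higher rank or more adapted layers raises that share.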

Fits the hardware you already have.

QLoRA fine-tunes a 7–8B model on a single 24GB GPU. A 13–14B model fits on one 48GB card or two 24GB cards. Training reuses the hardware you already run inference on, overnight or on weekends.
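A back-of-envelope sketch of why these sizes fit: with 4-bit quantization the frozen base weights take about half a byte per parameter. The adapter size and the 12 bytes per trainable parameter are assumptions. Activation memory, quantization constants, and framework overhead are not modeled, so the "headroom" figure is what remains for them, not a guarantee that a given run fits.

```python
def quantized_weights_gb(n_params: float, bits: int = 4) -> float:
    """Memory for frozen base weights at `bits` bits per parameter, in GB (1e9 bytes)."""
    return n_params * bits / 8 / 1e9


def adapter_state_gb(n_adapter_params: float, bytes_per_param: int = 12) -> float:
    """Adapter weights, gradients, and optimizer state.

    Assumes 2 bytes (weights) + 2 bytes (gradients) + 8 bytes (two fp32 Adam moments).
    """
    return n_adapter_params * bytes_per_param / 1e9


def headroom_gb(gpu_gb: float, n_params: float, n_adapter_params: float) -> float:
    """GPU memory left for activations and overhead after weights and adapter state."""
    return gpu_gb - quantized_weights_gb(n_params) - adapter_state_gb(n_adapter_params)


ASSUMED_ADAPTER_PARAMS = 8_388_608  # hypothetical adapter size

for label, gpu_gb, n_params in [("8B on 24GB", 24, 8e9), ("14B on 48GB", 48, 14e9)]:
    print(f"{label}: weights {quantized_weights_gb(n_params):.1f} GB, "
          f"headroom {headroom_gb(gpu_gb, n_params, ASSUMED_ADAPTER_PARAMS):.1f} GB")
```

Activation memory grows with batch size and sequence length and can dominate the budget, so these numbers are a starting point for sizing.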

PHI never leaves the building.

Because training runs on your GPUs on your premises, no data transfer agreements are needed. Our engineers work inside your environment. Our lab-side GPUs only touch synthetic and public data.

Base weights: frozen · Adapter: <1% of model · Training run: hours to days

Engagement Playbook

Every engagement runs the same way.

Standard steps mean predictable timelines, clean documentation for auditors, and no surprises.

Step 01

Data intake checklist

What documents exist, where they live, what format they’re in, and who owns access.

Step 02

PHI handling protocol

Signed documentation of exactly how data moves inside your environment during training. Nothing external.

Step 03

Dataset preparation

The hard part. We work with your team to turn messy real-world documents into clean training examples. Around 70% of the engagement.
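A minimal sketch of one slice of that cleanup, assuming records arrive as dictionaries with hypothetical `source_text` and `expected_output` fields: normalize whitespace, drop incomplete records, remove duplicates, and emit JSONL. Real preparation involves far more than this (format conversion, review with your team), and the sample records are synthetic.

```python
import json


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return " ".join(text.split())


def to_training_examples(records):
    """Turn raw records into prompt/completion pairs, dropping empties and duplicates."""
    seen = set()
    examples = []
    for record in records:
        prompt = normalize(record.get("source_text") or "")
        completion = normalize(record.get("expected_output") or "")
        if not prompt or not completion:
            continue  # skip records with a missing input or label
        key = (prompt, completion)
        if key in seen:
            continue  # skip exact duplicates after normalization
        seen.add(key)
        examples.append({"prompt": prompt, "completion": completion})
    return examples


def to_jsonl(examples) -> str:
    """Serialize examples as one JSON object per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)


# Synthetic records, for illustration only.
raw_records = [
    {"source_text": "Summarize:\n  visit note   text", "expected_output": "Short summary."},
    {"source_text": "Summarize: visit note text", "expected_output": "Short summary."},
    {"source_text": "Missing label", "expected_output": ""},
]
examples = to_training_examples(raw_records)
print(to_jsonl(examples))
```

Here the second record duplicates the first once whitespace is normalized, and the third has no label, so a single example survives.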

Step 04

LoRA training on client hardware

Adapter training runs on your GPUs, overnight or on weekends. Hours to a couple of days of compute.

Step 05

Validation

The adapter runs against our evaluation suite. Improvement is measured. Regressions are caught before anything reaches production.
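One way such a check could be expressed, as a sketch: compare per-task scores for the base model and the adapter, require improvement on the target task, and flag any task that drops by more than a tolerance. The task names, scores, and tolerance below are invented placeholders, not results from any evaluation suite.

```python
def find_regressions(baseline, candidate, tolerance=0.01):
    """Return tasks where the candidate is missing or scores below baseline by more than tolerance."""
    regressions = {}
    for task, base_score in baseline.items():
        new_score = candidate.get(task)
        if new_score is None or new_score < base_score - tolerance:
            regressions[task] = (base_score, new_score)
    return regressions


def passes_gate(baseline, candidate, target_task, tolerance=0.01):
    """Pass only if the target task improved and no other task regressed."""
    improved = candidate.get(target_task, float("-inf")) > baseline[target_task]
    return improved and not find_regressions(baseline, candidate, tolerance)


# Placeholder scores (higher is better), for illustration only.
baseline = {"discharge_summary": 0.62, "general_qa": 0.80}
good_adapter = {"discharge_summary": 0.81, "general_qa": 0.80}
bad_adapter = {"discharge_summary": 0.81, "general_qa": 0.70}

print(passes_gate(baseline, good_adapter, "discharge_summary"))
print(find_regressions(baseline, bad_adapter))
```

The second adapter improves the target task but loses ground on a general task, so the gate rejects it.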

Step 06

Handover

Signed-off adapter, documentation for auditors, and a named engineer accountable for the artifact. The adapter belongs to you and stays inside your walls.
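A handover record could look something like the sketch below: a small manifest that ties the adapter file to its base model and accountable engineer through a SHA-256 checksum, so an auditor can confirm the deployed artifact is the one that was signed off. The field names and values are hypothetical, and the adapter bytes are a stand-in for a real file.

```python
import hashlib
import json


def build_manifest(adapter_bytes: bytes, base_model: str, engineer: str) -> dict:
    """Describe an adapter artifact with a checksum and its accountable owner."""
    return {
        "base_model": base_model,
        "accountable_engineer": engineer,
        "adapter_sha256": hashlib.sha256(adapter_bytes).hexdigest(),
        "adapter_size_bytes": len(adapter_bytes),
    }


def verify_adapter(adapter_bytes: bytes, manifest: dict) -> bool:
    """Check that the bytes on disk still match the signed-off checksum."""
    return hashlib.sha256(adapter_bytes).hexdigest() == manifest["adapter_sha256"]


# Stand-in bytes and placeholder names, for illustration only.
adapter_bytes = b"example adapter weights"
manifest = build_manifest(adapter_bytes, "example-base-8b", "Engineer Name")

print(json.dumps(manifest, indent=2))
print(verify_adapter(adapter_bytes, manifest))
```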

Your data. Your GPUs. Your adapter.