Deep-tech research lab

From data, to intelligence.

We build Thin Language Models from the ground up: data, distillation, and weights that ship.

01 / Models

Thin Language Models, a category we coined

A Thin Language Model is small, specialized, and runs locally: frontier-grade judgment for one domain, distilled into weights that fit on a laptop. These are the models we research and release.

FR-Forge · v1 · 1.7B
FR-Forge-1.7B

A Thin Language Model for the factory floor. Grounded reasoning across operations, quality, and supply chain, running locally on Apple Silicon.

TLM · Manufacturing
FR-Lex · v0.3 · 1.7B
FR-Lex-1.7B

A Thin Language Model for law. Runs standalone, or as a Lens in front of a frontier model, scoping jurisdiction, citations, and routing before the expensive call.

TLM · Law
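
The Lens pattern described for FR-Lex can be sketched in a few lines. This is a minimal illustration only, not FR-Lex's actual implementation: the keyword table, the citation regex, the routing rule, and the names `scope`, `answer`, `local_model`, and `frontier_model` are all assumptions made for the example. A real lens would use the model itself to scope the query.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical hints standing in for model-based jurisdiction scoping.
JURISDICTION_HINTS = {
    "gdpr": "EU",
    "u.s.c.": "US",
    "companies act": "UK",
}

# Hypothetical pattern: matches US Code citations such as "17 U.S.C. § 107".
CITATION_PATTERN = re.compile(r"\d+\s+U\.S\.C\.\s+§\s*\d+")


@dataclass
class LensResult:
    jurisdiction: Optional[str]
    citations: list = field(default_factory=list)
    route: str = "frontier"


def scope(query: str) -> LensResult:
    """Scope a query before deciding whether the expensive call is needed."""
    lowered = query.lower()
    jurisdiction = next(
        (region for hint, region in JURISDICTION_HINTS.items() if hint in lowered),
        None,
    )
    citations = CITATION_PATTERN.findall(query)
    # Assumed routing rule: stay local only when the query is well scoped.
    route = "local" if jurisdiction and citations else "frontier"
    return LensResult(jurisdiction, citations, route)


def answer(
    query: str,
    local_model: Callable[[str], str],
    frontier_model: Callable[[str], str],
) -> str:
    """Answer locally when possible; otherwise forward a scoped prompt."""
    scoped = scope(query)
    if scoped.route == "local":
        return local_model(query)
    prompt = f"[jurisdiction: {scoped.jurisdiction or 'unknown'}] {query}"
    return frontier_model(prompt)
```

In this sketch the frontier model is only called when the lens cannot resolve both a jurisdiction and a citation, and even then it receives the scoping the lens managed to produce.
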
New · live on Hugging Face
FR-Blaze-9B

A Thin Language Model for growth. A Gemma 2 9B lens for SEO and paid-platform marketing: audits, Search Console and Google Ads framing, and on-page and backlink analysis. It is RAG-grounded, with deterministic guardrails that refuse black-hat tactics by design.

TLM · Growth & SEO · 9B · Gemma 2 base · Gemma license
Status: Shipped
Parameters: 9B · 4-bit
Base: Gemma 2 9B
Eval: 83.6% (lens)

Need a small language model instead? We build bespoke SLMs as a service: your domain, your data, your hardware. The TLMs above are what the lab ships under its own name.

Commission an SLM →
02 / The Thesis

Where a Thin Language Model fits

Three tiers, one trade-off. Generality is broad, hosted, and costly. Specialization is narrow, local, and cheap. We engineer the right tail: the narrow, local, cheap end of that spectrum.

[Chart: relative footprint, compared on specialization, cost efficiency, speed (low latency), and privacy (on-device).]

03 / Research Agenda

The hardest problems in deep tech AI

Our agenda runs from data infrastructure to emergent model behavior. Every thread feeds the models we ship.

001

Deep Data Architecture

Proprietary pipelines that surface signal from noise at scale, semantically indexing billions of data points.

Data Infrastructure · Semantic Indexing · Pipeline Engineering
002 · Our category

Thin Language Models

The category we coined: frontier-grade reasoning distilled into sub-2B models that run locally and outperform far larger baselines on their domain.

Distillation · LoRA · MLX · Edge Inference
003

Autonomous Reasoning

Multi-step inference engines that decompose novel problems, form hypotheses, and self-correct in real time.

Chain-of-Thought · Self-Correction · Hypothesis Formation
004

Synthetic Data Generation

Pipelines producing training data indistinguishable from real-world distributions, closing the data scarcity gap.

Synthetic Corpora · Distribution Matching · Domain Adaptation
005

Multi-Agent Emergence

Mapping the boundary between programmed behavior and spontaneous cooperation to exploit emergent dynamics.

Multi-Agent RL · Emergent Coordination · Swarm Intelligence
006

Compressed Intelligence

A 10x size reduction with under 3% accuracy degradation, bringing large-scale capability to constrained hardware.

Quantization · Pruning · 4-bit Runtime
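
For a sense of scale, the arithmetic behind a 4-bit runtime can be worked through directly. The sketch below is a back-of-the-envelope estimate, not the lab's measurement: it assumes weights dominate the footprint and ignores quantization scales, activations, and KV cache. The function name `weight_footprint_gb` is hypothetical.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Assumption: only raw weights are counted; per-group quantization
    scales, activations, and KV cache are ignored.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts taken from the model cards on this page.
for name, params in [("1.7B model", 1.7), ("9B model", 9.0)]:
    fp16 = weight_footprint_gb(params, 16)
    int4 = weight_footprint_gb(params, 4)
    print(f"{name}: {fp16:.2f} GB at 16-bit, {int4:.2f} GB at 4-bit")
```

Under these assumptions a 1.7B model needs roughly 0.85 GB at 4-bit and a 9B model roughly 4.5 GB. Moving from 16-bit to 4-bit weights accounts for a 4x reduction on its own, so a 10x figure would have to combine quantization with further techniques such as pruning or distillation.
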
04 / Portfolio

Research that becomes product

AI-powered prospecting, pipeline, and relationship intelligence. Turn signals into pipeline, automate outreach at scale, and close smarter deals faster.

Visit Website ↗

Venture intelligence for smarter fundraising and introductions. Data-driven dealflow, the right investors, and networks that compound.

Visit Website ↗
05 / Operating Principles

One philosophy, drawn in two colors: what we refuse, and what we own.

What we refuse
We do not license our breakthroughs.
We do not outsource our thinking.
We do not rent our intelligence.
What we own
Every model we specialize.
Every dataset we build.
Every product we ship.

Open at the foundation. Ours at the edge.

06 / Deep Data

Data is the moat

Not the volume. The method. We build training data the way we build models: in-house, synthetic where it must be, and reviewed by people who know the domain.

Proprietary
Pipelines built in-house to surface signal from noise, never blind-scraped.
Synthetic
Generation that closes the data-scarcity gap where real corpora run thin.
Reviewed
Human-in-the-loop annotation, checked by people fluent in the domain.
Specialized
Corpora tuned to one problem at a time, not one corpus stretched across everything.
07 / Contact

Work with the lab.

We collaborate with researchers, enterprises, and builders who are serious about AI. If you are building something that matters, we want to hear from you.

Start a conversation →