ArRENCE

The AI that doesn't need a data center

A whitepaper from Webb Local AI, LLC — May 2026

ArRENCE is a routing operating system for a network of small, specialized models. It's powered by an in-house indie AI lab, with no data centers — now or ever. Meet Ren, the conversational face of the system.

1. Executive summary

The AI industry has spent the last three years scaling a single answer: bigger models, more data, more electricity, more centralized infrastructure. ArRENCE was built around a different bet.

Instead of one giant model trying to be good at everything, ArRENCE runs a constellation of around forty small, fine-tuned domain experts — each one specialized for a particular kind of work — coordinated by a patent-pending multi-layer routing architecture. A psychology specialist handles the psychology questions. A code-audit specialist reviews your code. A wilderness survival specialist talks bushcraft. Ren, the conversational personality at the front of the system, decides which specialist should answer and hands off invisibly, so you never have to pick a model.

The result is a system that delivers capability comparable to large frontier models on the conversational surface, but does it with dramatically lower energy cost per answer, no hyperscale data center dependency, and a privacy story that's actually provable: end-to-end encrypted long-term memory where even the lab that runs the service can't read what you've told it.

ArRENCE is operated by Webb Local AI, LLC — a three-person indie AI lab in McAlester, Oklahoma. The product is live at arrenceai.com.

This document explains what ArRENCE is, what makes the architecture different, what users can actually do with it, and why the company believes the future of useful AI is not in the hyperscale data center.

2. The problem with the current AI landscape

Modern conversational AI products share a common shape: one enormous model, hosted in a hyperscale data center, serving every kind of query through one set of generic weights. That shape has produced impressive results — but it has built-in costs the industry has stopped questioning.

Energy. Frontier-class models consume electricity at a scale that requires utility-grade infrastructure. Every casual question is answered by the same massive computation, regardless of whether the question warranted it.

Centralization. A handful of providers control the actual inference capacity. Their pricing, policy decisions, content rules, and uptime define what the rest of the ecosystem can build.

Privacy exposure. Conversations with a centralized assistant are processed in infrastructure the user cannot inspect. The provider's promise that "we don't train on your data" is a policy, not a technical guarantee. If their posture changes — or their database is subpoenaed — the user's history is on the table.

Generalist compromise. A single model trained to be acceptable at everything is rarely excellent at anything specific. A general-purpose model giving medical drug-interaction advice, debugging C, and reading tarot cards is, by definition, splitting its attention.

Vendor lock-in. Building on a closed frontier API means your product can be rate-limited, repriced, deprecated, or policy-restricted overnight. Many AI startups have already learned this the expensive way.

ArRENCE was built to make a different set of tradeoffs.

3. The ArRENCE approach

ArRENCE is best understood as an operating system for AI rather than a model. The thing that makes it work is not any individual model — it's the routing layer that decides which model should answer your question.

Many small experts, not one large generalist. Around forty small fine-tuned models (typically 7B to 20B parameters) cover the topics users actually ask about: general conversation, web research, math, code, code auditing, PC repair, chemistry and physics, law, finance, astronomy, biology, farm and garden, wilderness survival, business, doctor and medication info, mental-health-aware psychology, tarot and dreams, and many more. Each one is small enough to load fast and run efficiently, and specialized enough to be excellent at its lane.

Smart routing as the unique core. ArRENCE's most differentiated piece is the multi-layer routing architecture (patent-pending). It analyzes each incoming message, scores it across the domain categories, and dispatches the request to the right expert — with a dedicated tiny router model as the tiebreaker when scoring is ambiguous. You don't pick a model; you just ask. The expert that answers is shown on a small card under each reply.

Ren as the conversational face. Ren is the personality layer — the warm, opinionated, conversational presence users actually talk to. When you ask a question, Ren is who you're addressing. Under the hood, Ren routes the technical work to the right specialist and delivers the answer back through the same conversational voice. The result feels like talking to one capable assistant, not negotiating with a menu of bots.

Encrypted long-term memory. ArRENCE remembers what you tell it across sessions, in an end-to-end encrypted store called the Soul File. Your password is the key. The lab that runs ArRENCE literally cannot read your stored memories. Section 7 covers how this works in plain English.

Emotion-aware conversation. A 10-emotion lexicon scans every message you send. The resulting emotional signal decays naturally over time and resets after a few hours of inactivity. The system uses that signal to shade how Ren responds: more empathy when sadness or fear is high, more celebration when gratitude is high, grounded and safe when anger spikes. It also influences which memories surface: the system pulls memories encoded under an emotional state similar to your current one, a property called mood-congruency.
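As a rough illustration of the decay-and-reset behavior described above, the sketch below tracks emotion levels with an exponential half-life. It is not ArRENCE's implementation: the word list, the 30-minute half-life, the 3-hour reset window, and every name in it (`EmotionState`, `observe`, `dominant`) are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed parameters; the whitepaper only says "a few hours" for the reset.
HALF_LIFE_S = 30 * 60         # assumed: a signal halves every 30 minutes
RESET_AFTER_S = 3 * 60 * 60   # assumed: full reset after 3 hours of silence

# Tiny illustrative lexicon (the real one is described as covering ten emotions).
LEXICON = {
    "sad": "sadness", "lonely": "sadness",
    "scared": "fear", "worried": "fear",
    "thanks": "gratitude", "grateful": "gratitude",
    "furious": "anger", "hate": "anger",
}


@dataclass
class EmotionState:
    levels: dict = field(default_factory=dict)  # emotion name -> current level
    last_seen: float = 0.0                      # timestamp of the last message

    def observe(self, message: str, now: float) -> None:
        """Decay existing levels up to `now`, then add hits from this message."""
        elapsed = now - self.last_seen
        if elapsed >= RESET_AFTER_S:
            self.levels.clear()                 # long silence: start fresh
        else:
            factor = 0.5 ** (elapsed / HALF_LIFE_S)
            self.levels = {e: v * factor for e, v in self.levels.items()}
        for word in message.lower().split():
            emotion = LEXICON.get(word.strip(".,!?"))
            if emotion:
                self.levels[emotion] = self.levels.get(emotion, 0.0) + 1.0
        self.last_seen = now

    def dominant(self):
        """Return the strongest emotion, or None when nothing is active."""
        return max(self.levels, key=self.levels.get) if self.levels else None
```

A response layer could read `dominant()` before each reply to choose its tone. A half-life keeps the update a single multiplication no matter how much time has passed.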

This is the architecture in one paragraph: small specialists, smart routing, encrypted memory, emotion-aware delivery — wrapped in a single conversational face. The rest of this document expands each piece.

4. The ArRENCE routing architecture (patent-pending)

ArRENCE makes three distinct routing decisions for every conversation, each in a different layer of the system. Together they form what makes the OS unique.

Layer 1: Browser-side routing. When you send a message, the first routing decision happens before the request leaves your browser. A fast, deterministic scoring engine looks at your message, weighs it against around three dozen domain categories, and either picks a confident specialist or marks the request as ambiguous and asks the routing brain to decide. This layer handles the majority of traffic without ever touching a model — pure pattern logic, instant.
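To make the "confident pick or ambiguous" decision concrete, here is a minimal sketch of a deterministic keyword scorer. It is written in Python for consistency with the other examples in this document, although the real layer runs in the browser. The categories, keyword weights, thresholds, and function names are all hypothetical; the actual rule set is not published here.

```python
# Hypothetical keyword weights per category (illustrative subset only).
RULES = {
    "code": {"python": 2.0, "bug": 1.5, "function": 1.0, "compile": 1.5},
    "math": {"integral": 2.0, "equation": 1.5, "solve": 1.0},
    "survival": {"bushcraft": 2.0, "shelter": 1.5, "campfire": 1.5},
}
MIN_SCORE = 1.5    # assumed: the winner needs at least this much evidence
MIN_MARGIN = 1.0   # assumed: and must beat the runner-up by this much


def score(message: str) -> dict:
    """Sum keyword weights for every category."""
    words = [w.strip(".,!?") for w in message.lower().split()]
    return {cat: sum(kw.get(w, 0.0) for w in words) for cat, kw in RULES.items()}


def route_locally(message: str):
    """Return (category, scores); category is None when the pick is ambiguous."""
    scores = score(message)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_score), (_, second_score) = ranked[0], ranked[1]
    if best_score >= MIN_SCORE and best_score - second_score >= MIN_MARGIN:
        return best, scores
    return None, scores   # hand off to the next routing layer
```

Requiring both a minimum score and a margin over the runner-up is one simple way to define "confident": a message that weakly matches two categories falls through to the next layer instead of being guessed.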

Layer 2: The routing brain. When the browser layer can't make a confident pick, the request reaches ArRENCE's central routing logic. It re-scores the message with a deeper rule set and, if still ambiguous, asks a tiny dedicated router model (under two billion parameters) to classify the question into one of the specialist categories. This is the safety net — fast, cheap, accurate enough to handle the long tail.
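The fallback order described above (deeper rules first, small model only if needed) can be sketched as a two-step cascade. Everything below is an assumption for illustration: the phrase table, the category names, the `classify` callable standing in for the router model, and the choice to fall back to a "general" category when the model returns an unknown label.

```python
from typing import Callable, Optional

CATEGORIES = ("code", "math", "survival", "general")  # illustrative subset


def deep_rescore(message: str) -> Optional[str]:
    """Stand-in for the deeper rule set: phrase-level patterns."""
    text = message.lower()
    phrases = {
        "stack trace": "code",
        "square root": "math",
        "start a fire": "survival",
    }
    hits = {cat for phrase, cat in phrases.items() if phrase in text}
    return hits.pop() if len(hits) == 1 else None   # confident only if unique


def route_centrally(message: str, classify: Callable[[str], str]) -> tuple:
    """Return (category, decided_by). `classify` wraps the small router model."""
    category = deep_rescore(message)
    if category is not None:
        return category, "rules"          # the model is never called
    predicted = classify(message)
    if predicted not in CATEGORIES:       # assumed guard against stray labels
        predicted = "general"
    return predicted, "router-model"
```

Passing the model in as a callable keeps the cascade testable without loading any weights, and the `decided_by` field records which step made each decision, which supports auditing.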

Layer 3: The expert selection. Once the category is known, the routing OS picks which expert server should run the request, with affinity to whichever server already has that model warm in memory. The result: minimal model swaps, predictable latency, and the right specialist on the case.
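Warm-model affinity can be illustrated with a small scheduler sketch. This is one plausible policy, not the lab's actual one: the `Server` fields, the least-loaded tie-break, and the cold-start fallback are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    warm: set = field(default_factory=set)   # models currently loaded in memory
    active: int = 0                          # requests in flight


def pick_server(servers: list, model: str) -> Server:
    """Prefer servers that already have `model` warm; break ties by lowest load."""
    warm = [s for s in servers if model in s.warm]
    pool = warm or servers        # nobody has it loaded: a cold start is needed
    chosen = min(pool, key=lambda s: s.active)
    chosen.warm.add(model)        # after this request the model is resident
    chosen.active += 1
    return chosen
```

Restricting the candidate pool to warm servers whenever one exists is what keeps model swaps rare; load only matters among servers that can answer without a reload.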

Three small, focused decisions. Each one is fast. Each one is auditable. The combination is what makes a network of small models feel as smooth as a single large one — and ArRENCE believes it's a defensible architectural moat over the "one giant model" approach.

5. Meet Ren

Ren is the personality at the front of ArRENCE. Warm, opinionated, conversational, allergic to corporate hedge-speak. When you talk to ArRENCE, you talk to Ren — and Ren routes the technical work to whichever specialist makes sense without making you think about it.

The 31 specialists Ren routes to:

When you ask Ren a question that's clearly in one specialist's lane, that specialist answers and a small expert card appears under the reply so you can see who answered. When you want a different specialist to weigh in on the same question — say, you got an answer from Ren but you'd like Math to take a pass — there's a Second Opinion control that resends your last question to whichever specialist you pick.

For most users, the specialist machinery is invisible. You ask. Ren answers. The right expert is on the case behind the scenes.

6. What you can do with Ren

ArRENCE is a system, not a feature list — but here's what users can actually do, organized by what they're trying to accomplish.

Everyday conversation. Ask anything. Ren answers, with a specialist on the case for topics that warrant one. Voice input and read-aloud are built in. Chats are kept in a sidebar so you can switch between threads. Math notation renders cleanly. Twelve color themes. A dyslexia-friendly font option.

Research and live information. Web search, deep multi-source research, weather lookups, shopping comparisons, YouTube video search and YouTube link analysis (paste any YouTube URL and Ren pulls the transcript and answers from it).

Creative work. Generate images from a description ("draw a cat wearing a hat," "make an image of a mountain lake at sunset"). Ren's in-house image generators produce the result; paid users can also push prompts through Enhanced Image with RiverFlow V2 Fast, a production-grade 1024-pixel upgrade with notably better prompt accuracy. Three Enhanced Image flows are available: fresh text-to-image, attach a photo and describe a change (image-to-image), or push any in-house image through an enhancement pass after Ren generates it.

Vision. Attach or paste an image (screenshot, photo, error message on your screen) and ask about it. Ren describes it, answers questions, or helps with what you're showing.

Documents. Upload a PDF, Word document, text file, or markdown file and ask questions about it. The Document Analyzer specialist reads the file and answers from its contents.

Big builds. From the Tools menu, Ren can take on multi-step work that would otherwise take you hours:

Reflective and personal tools. Three-card tarot readings, dream interpretation, Bible verse meditation — each with its own structured flow and the Tarot & Dreams specialist handling interpretation.

Productivity. Notes and reminders by chat ("remind me to call the dentist tomorrow"), a My Tasks panel for managing the list visually, a full calendar page for appointments.

Deals panel. Daily community-vetted deal cards on the side of the chat. Scratch to reveal, vote, mark used when you redeem, watch your savings counter.

Memory. The Soul File — described next — that remembers what you tell Ren across sessions, encrypted in a way the lab itself can't read.

The full feature set is wider than this; the public user guide covers every flow in detail.

7. Soul File: privacy you can prove

Most assistants promise privacy. ArRENCE proves it.

The Soul File is ArRENCE's persistent long-term memory layer. When you have it turned on, Ren remembers what you've told her across sessions — preferences, ongoing projects, details about your life that you want her to know — and uses that context to make future conversations more useful.

What makes it different is the cryptography. Your memories are encrypted on the way out of your conversation, with a key derived from your password. The encrypted blob is stored on ArRENCE's infrastructure. The lab does not hold a copy of the key. When you log in, the key is briefly held in memory long enough to decrypt your blob, and then it's gone again. Restart the servers, and zero plaintext memory exists anywhere outside your own session.

In practical terms:

The cryptographic specifics — Argon2id for key derivation, AES-256-GCM for the memory blob, mood-congruency-weighted retrieval, and a salience-and-half-life model that mirrors how human long-term memory actually decays — are documented in the technical reference. For the user, the experience is simple: turn memory on, talk to Ren, and trust that what you say stays yours.
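The salience-and-half-life idea and mood-congruent retrieval can be combined into a single ranking score, sketched below under stated assumptions. The formula, the `mood_weight` parameter, the cosine similarity measure, and all names are hypothetical; the technical reference, not this example, defines the real model. In a scheme like this, scoring would run only on memories already decrypted inside the user's session.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    salience: float        # importance at encoding time (assumed 0..1 scale)
    half_life_days: float  # how quickly this memory fades
    age_days: float
    mood: dict             # emotion levels when the memory was stored


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse emotion vectors."""
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieval_score(m: Memory, current_mood: dict, mood_weight: float = 0.5) -> float:
    """Decayed salience, boosted when the stored mood matches the current one."""
    decayed = m.salience * 0.5 ** (m.age_days / m.half_life_days)
    return decayed * (1.0 + mood_weight * cosine(m.mood, current_mood))


def recall(memories: list, current_mood: dict, k: int = 3) -> list:
    """Return the k highest-scoring memories for the current mood."""
    return sorted(memories, key=lambda m: retrieval_score(m, current_mood),
                  reverse=True)[:k]
```

With these assumptions, a memory of salience 0.8 and a 30-day half-life scores 0.4 after 30 days with no mood match, and 0.6 when the current mood matches the stored one exactly.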

8. Why no data centers is the point

ArRENCE doesn't run on Amazon, Google, Azure, or any hyperscale provider. The lab built its own infrastructure: a small cluster of purpose-built servers hosted by the lab in McAlester, Oklahoma — a setup that is currently powerful enough to serve every active user on the platform.

This is not a stopgap. It's the architectural thesis.

Small models + smart routing = big capability with a small footprint. A 7-billion-parameter fine-tuned specialist running on dedicated hardware can answer most domain questions as well as — or better than — a frontier model trying to do the same thing across all domains simultaneously. When you only need to load the right specialist for the request at hand, you don't need a data center to do it.

Scaling without data centers. Because each expert is small and the routing layer does the heavy lifting, adding capacity means adding another box, not another floor of cooling. The math works at every scale we project. ArRENCE believes — and is building on the bet — that it can grow to a meaningful national user base without ever needing hyperscale infrastructure.

Energy as a feature, not a footnote. Routing a casual question to a small specialist costs a fraction of the electricity the same question would cost on a frontier model. Over millions of queries, the difference is the difference between "AI is an environmental cost we pay for capability" and "AI is something we can run sustainably." ArRENCE chose the second.

No vendor mid-stack. Because the lab owns the inference layer, there is no third party that can rate-limit, reprice, deprecate, or change the policy on ArRENCE's users without warning. The system is operated by the people who built it.

This is what "indie AI lab" means in practice: full ownership of the stack, no dependency on hyperscale infrastructure, and the freedom to make different tradeoffs than the rest of the industry has settled on.

9. Plans & access

ArRENCE is available at four levels.

Plan | Who it's for | Highlights
Guest | Trying it out | Limited daily conversations, no account needed
Beginner · $5/mo | Light regular use | Unlimited daily chats*, notes & reminders, Daily Briefing, some Enhanced Image
Full · $15/mo | Power users | All Tools-menu features at production limits, full vision & documents
VIP | Professional / team use | Highest limits across the board

*Within a weekly token allowance. The exact usage you have left is shown as a battery-style indicator in the app. All plans are managed through standard subscription billing; cancel anytime from the account menu.

10. The lab behind ArRENCE

ArRENCE is operated by Webb Local AI, LLC — a small indie AI lab based in McAlester, Oklahoma, founded in December 2025. The lab consists of three co-owners who have known each other for over thirty years and have been working together in various forms across that time.

Jeremy Webb — CEO & Lead Engineer. Founder of Webb Local AI. AI hobbyist since the age of six, cutting his teeth on an early Alice chatbot fork in the 1990s, and a working cryptography, opsec, and computer-hardware practitioner long before transformer-based language models existed. Holds a BS in Psychology with minors in Neurology and Communications from Western Washington University, where he took formal coursework in symbolic logic instead of calculus — the grounding behind ArRENCE's logic-flow approach to routing. Released a privacy-focused cryptocurrency in 2019, leading a development team through the cryptographic and product work. Owns and operates an independent PC-repair business and built every server in the ArRENCE stack personally. The combination — psychology, neurology, symbolic logic, hands-on hardware, and decades of pre-transformer-era AI thinking — is why ArRENCE looks the way it does: emotion-aware conversation, encrypted memory, small models routed intelligently, privacy-first by design rather than by retrofit.

Chris Heskin — Chief Technology Officer (CTO). Bio to follow.

Bryan Doss — Chief Financial Officer (CFO). Bio to follow.

The lab can be reached at contact@webblocalai.com or +1-800-240-0024.

11. Roadmap

ArRENCE is built. The product is live. What follows is the direction the lab is investing in next:

More specialists. The expert roster grows whenever a domain becomes worth specializing in. We add specialists when the routing data tells us users keep asking a kind of question that doesn't have a great existing lane.

Deeper creative tools. The Enhanced Image pipeline is the first of a generation of upgraded creative capabilities. Expect more in the same shape — focused, paid-tier features that deliver clearly better output than the in-house baseline.

Richer voice experience. Voice input and read-aloud already work today. The roadmap is to make voice a first-class interaction mode rather than an accessibility option — natural conversation rhythm, faster turn-taking, more expressive output.

Expanded memory features. The Soul File foundation is in place. The roadmap explores richer recall surfaces — searching your own memories, exporting them, optionally sharing curated subsets with specific people.

Continued infrastructure investment. As demand grows, the lab adds boxes, not data centers. The architecture is designed to scale this way for the foreseeable future.

We're conservative about dates. The product ships when it's worth shipping. We'd rather miss a quarter than ship something we'd be embarrassed about.

Roadmap items are directional and subject to change as the lab's priorities and users' needs evolve.

12. Try ArRENCE

The fastest way to understand ArRENCE is to use it.

arrenceai.com — sign in, register, or continue as a guest. The guest experience is generous enough to give you a real sense of the product before you decide whether to subscribe.

Contact: contact@webblocalai.com · +1-800-240-0024
Lab: Webb Local AI, LLC · McAlester, Oklahoma
Press: arrenceai.com/press_releases.html