San Antonio, TX · Est. 2010 · Now booking new engagements

Custom software and AI systems, built by engineers who've shipped them.

Viper Softworks designs, ships, and operates production-grade applications and AI systems, using frontier models in the cloud or running on your own infrastructure when the data can't leave.


Founded: 2010
Industries: Gov · Med · Logistics · Wholesale · Distribution · Retail
Today's status: Accepting work

Selected engagements over the years

FBLA
Lonestar Pet / Animal Supply Co.
Rocky & Maggie's (Bill Klein · TLC's The Little Couple)
Psigen PsiCapture v7
Qore Analytics
Government
Medical

Built for regulated industries

HIPAA
SOC 2
PCI-DSS
NIST 800-53 / 800-171
FERPA
CJIS
GDPR / CCPA

01 · What we build

Six practices. One team that ships them together.

Most projects span more than one of these. We staff for the whole arc — discovery to production to operations.

Featured / Flagship practice

AI Engineering

Agentic systems, RAG, evals, and on-prem deployments. We integrate frontier models when they're the right call and run open-weights on your own metal when they aren't.

Claude Agent SDK / MCP servers
Retrieval with pgvector / Qdrant
Evals: Inspect, Braintrust, Promptfoo
On-prem: vLLM, Ollama, llama.cpp
Fine-tuning: LoRA, QLoRA, DPO
Voice agents & computer-use

Product & Web Apps

TypeScript, React, Next.js, Astro, Tailwind. Design systems built to outlive the redesign cycle.

Backends & APIs

Go, Rust, Python (FastAPI), .NET. gRPC, OpenAPI, event-driven systems on NATS or Kafka.

Cloud & Platform

AWS, Azure, GCP, Cloudflare. Kubernetes, Terraform/Pulumi, OpenTelemetry, GitHub Actions.

Data & Analytics

Postgres + pgvector, DuckDB, ClickHouse, dbt. Dashboards in Metabase / Grafana that people actually open.

Security & Compliance

SAST/DAST with Veracode & SonarQube. Runtime APM via Dynatrace, SIEM & log analytics with Splunk. Built for regulated industries from day one.

HIPAA · SOC 2 · PCI-DSS · NIST · FERPA

02 · AI engineering

Frontier in the cloud. Open-weights on your own metal.

We don't have a model preference — we have an outcome preference. The pick depends on your data, your latency budget, and your compliance perimeter.

Track 01 · Managed APIs

Frontier models in the cloud

When you want the best available reasoning and the lowest time-to-ship, and don't need to keep data on-prem. We're hands-on with the current generation:

Anthropic Claude
Opus · Sonnet · Haiku
Reasoning · Agents · Code

OpenAI
GPT · o-series · Realtime
Voice · Tool-use

Google Gemini
Pro · long-context · multimodal
Long context · Vision

Specialty providers
Voyage · Cohere · ElevenLabs · Cartesia · Deepgram
Embed · TTS · STT

Best fit: customer-facing assistants, agentic workflows, document understanding, prototype-to-prod in weeks.

Track 02 · On-prem · Self-hosted

Open-weights on your own infrastructure

When data can't leave the perimeter, latency must be deterministic, or unit economics demand owned inference. Same engineering rigor, your hardware.

Models
Llama · Qwen · DeepSeek · Mistral · Gemma · Phi
Open weights

Serving
vLLM · SGLang · TensorRT-LLM · llama.cpp · Ollama · MLX
GPU · Apple Silicon

Tuning & alignment
LoRA · QLoRA · DPO · ORPO · synthetic data pipelines
Domain adaptation

Orchestration & guardrails
MCP servers · LangGraph · DSPy · Llama Guard · NeMo Guardrails
Agent infra

Best fit: healthcare, finance, defense, legal, gov — anywhere the data perimeter is the product.

Capabilities

Retrieval & RAG: pgvector, Qdrant, Weaviate, LanceDB, hybrid search
Agents & tool-use: Claude Agent SDK, MCP, LangGraph, computer-use
Evals & observability: Inspect, Braintrust, Langfuse, Promptfoo
Voice & multimodal: Whisper, ElevenLabs, Cartesia, vision OCR pipelines

Our product

Google TV & Android TV · New · LLM-controlled

WatchWall™. Watch everything. All at once.

Turn your TV into a true multi-view wall — tile live sports, news, streaming, your console and the open web side by side, then take any panel full-screen. Now drivable by any MCP-capable assistant.

Up to 6 panels at once
Overlay PiP — even without native PiP
Web, HDMI, antenna/OTA & LAN sources
Built-in MCP server for LLM control

03 · Tools of the trade

A working stack, not a slide.

We pick technology that's load-bearing — boring where it should be, novel where it earns the right.

Languages

TypeScript
Python
Go
Rust
C# / .NET
SQL

Frontend

React
Next.js
Astro
Tailwind
Svelte
WordPress / Sage

Backend & data

FastAPI
Go (chi, sqlc)
Postgres + pgvector
Redis
DuckDB
ClickHouse

Cloud & AI

AWS · Azure · GCP
Cloudflare Workers
Kubernetes · Terraform
Claude / GPT / Gemini
vLLM · Ollama · MLX
OpenTelemetry · Grafana

Security & compliance

Veracode
SonarQube
Snyk
Dynatrace
Splunk
OWASP ZAP

04 · How we work

Small team. Tight loop. No theatre.

STEP 01

Discovery

Two-week paid sprint. We document the system, the goal, the constraints, and what to build first.

STEP 02

Prototype

A thin, real version in production-shaped code, not a demo deck. Stakeholders touch it.

STEP 03

Build & ship

Weekly demos, trunk-based, observability from day one. We don't disappear into a fog.

STEP 04

Operate or hand off

We stay on retainer or train your team. Every artifact ships with a runbook.

05 · About

Engineers first, since 2010.

Viper Softworks was founded in San Antonio in 2010. We've shipped software for kiosks, hospitals, state agencies, e-commerce, and finance — and these days, increasingly, the AI systems wrapped around them.

Same principal engineers from your first call to your last deploy. No bench-warming, no offshore hand-off.

Do you work with companies that need their AI on-prem?

Yes — this is most of our current pipeline. We deploy vLLM or Ollama on customer hardware (single A100 boxes through small clusters), tune open-weight models on your data, and integrate them behind your existing auth and audit infrastructure.

What's the smallest engagement you take?

A two-week paid discovery. You can book a single discovery without committing to a build. Sometimes the answer is "you don't need us yet."

Are you remote or local?

Both. HQ is in San Antonio; we work on-site within Texas regularly and remote for clients elsewhere in the US.

Do you still do the .NET / WPF / SharePoint work you used to?

Yes, when it's the right tool. We've maintained .NET fluency through its latest releases and still operate several long-running enterprise stacks on it.

Let's build something

Ready when you are. No deck required.

Send us a few sentences about what you're trying to build. You'll hear back from a principal engineer, not a sales rep.

customersupport@vipersoftworks.com
