Viper Softworks
San Antonio, TX · Est. 2010 · Now taking Q3 engagements

Custom software and AI systems, built by engineers who've shipped them.

Viper Softworks designs, ships, and operates production-grade applications and AI systems, using frontier models in the cloud or open-weight models on your own infrastructure when the data can't leave.

Founded: 2010
Projects shipped: 120+
Industries: Gov · Med · Logistics · Wholesale · Distribution · Retail
Today's status: Accepting work

Selected engagements over the past decade

  • FBLA
  • Lonestar Pet / Animal Supply Co.
  • Rocky & Maggie's (Bill Klein · TLC's The Little Couple)
  • Psigen PsiCapture v7
  • Qore Analytics
  • Government
  • Medical

Built for regulated industries

  • HIPAA
  • SOC 2
  • PCI-DSS
  • NIST 800-53 / 800-171
  • FERPA
  • CJIS
  • GDPR / CCPA

01 · What we build

Six practices. One team that ships them together.

Most projects span more than one of these. We staff for the whole arc — discovery to production to operations.

Featured / 2026 practice

AI Engineering

Agentic systems, RAG, evals, and on-prem deployments. We integrate frontier models when they're the right call and run open-weight models on your own metal when they aren't.

  • Claude Agent SDK / MCP servers
  • Retrieval with pgvector / Qdrant
  • Evals: Inspect, Braintrust, Promptfoo
  • On-prem: vLLM, Ollama, llama.cpp
  • Fine-tuning: LoRA, QLoRA, DPO
  • Voice agents & computer-use

Product & Web Apps

TypeScript, React, Next.js, Astro, Tailwind. Design systems built to outlive the redesign cycle.

Backends & APIs

Go, Rust, Python (FastAPI), .NET 9. gRPC, OpenAPI, event-driven systems on NATS or Kafka.

Cloud & Platform

AWS, Azure, GCP, Cloudflare. Kubernetes, Terraform/Pulumi, OpenTelemetry, GitHub Actions.

Data & Analytics

Postgres + pgvector, DuckDB, ClickHouse, dbt. Dashboards in Metabase / Grafana that people actually open.

Security & Compliance

SAST/DAST with Veracode & SonarQube. Runtime APM via Dynatrace; SIEM and log analytics with Splunk. Built for regulated industries from day one.

HIPAA · SOC 2 · PCI-DSS · NIST · FERPA

02 · AI engineering

Frontier in the cloud. Open-weights on your own metal.

We don't have a model preference — we have an outcome preference. The pick depends on your data, your latency budget, and your compliance perimeter.

Track 01 Managed APIs

Frontier models in the cloud

For when you want the best available reasoning and the shortest time-to-ship, and don't need to keep data on-prem. We're hands-on with the current generation:

  • Anthropic Claude
    Opus 4.7 · Sonnet 4.6 · Haiku 4.5
    Reasoning · Agents · Code
  • OpenAI
    GPT-5 · o-series · Realtime
    Voice · Tool-use
  • Google Gemini
    2.x Pro · long-context · multimodal
    Long context · Vision
  • Specialty providers
    Voyage · Cohere · ElevenLabs · Cartesia · Deepgram
    Embed · TTS · STT
Best fit: customer-facing assistants, agentic workflows, document understanding, prototype-to-prod in weeks.

Track 02 On-prem · Self-hosted

Open-weights on your own infrastructure

When data can't leave the perimeter, latency must be deterministic, or unit economics demand owned inference. Same engineering rigor, your hardware.

  • Models
    Llama 3.3 / 4 · Qwen 3 · DeepSeek V3 / R1 · Mistral · Gemma 3 · Phi-4
    Open weights
  • Serving
    vLLM · SGLang · TensorRT-LLM · llama.cpp · Ollama · MLX
    GPU · Apple Silicon
  • Tuning & alignment
    LoRA · QLoRA · DPO · ORPO · synthetic data pipelines
    Domain adaptation
  • Orchestration & guardrails
    MCP servers · LangGraph · DSPy · Llama Guard · NeMo Guardrails
    Agent infra
Best fit: healthcare, finance, defense, legal, gov, or anywhere the data perimeter is the product.

Capabilities

  • Retrieval & RAG
    pgvector, Qdrant, Weaviate, LanceDB, hybrid search
  • Agents & tool-use
    Claude Agent SDK, MCP, LangGraph, computer-use
  • Evals & observability
    Inspect, Braintrust, Langfuse, Promptfoo
  • Voice & multimodal
    Whisper, ElevenLabs, Cartesia, vision OCR pipelines

03 · Tools of the trade

A working stack, not a slide.

We pick technology that's load-bearing — boring where it should be, novel where it earns the right.

Languages
  • TypeScript
  • Python
  • Go
  • Rust
  • C# / .NET 9
  • SQL
Frontend
  • React 19
  • Next.js 15
  • Astro 5
  • Tailwind v4
  • Svelte 5
  • WordPress / Sage
Backend & data
  • FastAPI
  • Go (chi, sqlc)
  • Postgres + pgvector
  • Redis
  • DuckDB
  • ClickHouse
Cloud & AI
  • AWS · Azure · GCP
  • Cloudflare Workers
  • Kubernetes · Terraform
  • Claude / GPT-5 / Gemini
  • vLLM · Ollama · MLX
  • OpenTelemetry · Grafana
Security & compliance
  • Veracode
  • SonarQube
  • Snyk
  • Dynatrace
  • Splunk
  • OWASP ZAP

04 · How we work

Small team. Tight loop. No theatre.

  1. Discovery

    Two-week paid sprint. We document the system, the goal, the constraints, and what to build first.

  2. Prototype

    A thin, real version in production-shaped code, not a demo deck. Stakeholders touch it.

  3. Build & ship

    Weekly demos, trunk-based development, observability from day one. We don't disappear into a fog.

  4. Operate or hand off

    We stay on retainer or train your team. Every artifact ships with a runbook.

05 · About

Engineers first, since 2010.

Viper Softworks was founded in San Antonio in 2010. We've shipped software for kiosks, hospitals, state agencies, e-commerce, and finance — and these days, increasingly, the AI systems wrapped around them.

Same principal engineers from your first call to your last deploy. No bench-warming, no offshore hand-off.

Do you work with companies that need their AI on-prem?

Yes. This is most of our 2025–2026 pipeline. We deploy vLLM or Ollama on customer hardware (from single A100 boxes to small clusters), tune open-weight models on your data, and integrate them behind your existing auth and audit infrastructure.

What's the smallest engagement you take?

A two-week paid discovery. You can book a single discovery without committing to a build; sometimes the answer is "you don't need us yet."

Are you remote or local?

Both. HQ is in San Antonio; we work on-site within Texas regularly and remote for clients elsewhere in the US.

Do you still do the .NET / WPF / SharePoint work you used to?

Yes, when it's the right tool. We've maintained .NET fluency through .NET 9 and still operate several long-running enterprise stacks on it.

Let's build something

Ready when you are. No deck required.

Send us a few sentences about what you're trying to build. You'll hear back from a principal engineer, not a sales rep.