building https://wheneva.ai since 2025

wheneva.ai

Webhook delivery for LLMs. An event pipeline that understands context.

What it is

A webhook delivery service built for the way AI systems actually work.

Traditional webhook infrastructure assumes synchronous request/response semantics: something happens, you get notified, you handle it, you’re done. LLM workloads don’t fit that shape. Inference is slow and unpredictable. Tool calls fan out. Streaming complicates retry logic. Context from the originating event matters downstream.

wheneva.ai is a webhook pipeline designed for that world — async-first, context-aware, built for I/O-bound delivery workloads.

Technical decisions

The key architectural target: Falcon over Puma for concurrency.

Puma uses a thread-per-request model. For webhook delivery, which is mostly waiting on outbound HTTP responses from subscriber endpoints, threads spend most of their time blocked. At 1,000 concurrent webhooks in flight, that’s 1,000 blocked threads, each reserving its own stack. Memory cost grows with every open connection, so the model scales poorly for this workload.
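A minimal sketch of the thread-per-delivery shape described above, using only core Ruby. This is not Puma's code or the wheneva.ai delivery loop: the names `in_flight` and `responses` are made up for illustration, and a blocking queue read stands in for waiting on a subscriber's HTTP response.

```ruby
# Toy model: one thread per in-flight delivery, each blocked while it "waits".
# `responses` is a hypothetical stand-in for the network; nothing is sent.
in_flight = 20
responses = Queue.new

workers = Array.new(in_flight) do
  Thread.new do
    responses.pop      # blocks here, like a thread waiting on a subscriber
    Thread.current     # return the thread that handled this delivery
  end
end

# Wait until every worker is parked on the queue.
sleep 0.01 until workers.all? { |t| t.status == "sleep" }
blocked = workers.count { |t| t.status == "sleep" }

# "Responses arrive": unblock each worker and collect which thread ran it.
in_flight.times { responses << :ok }
distinct_threads = workers.map(&:value).uniq.size

puts "blocked while waiting: #{blocked}, threads used: #{distinct_threads}"
```

Every in-flight delivery holds a whole thread for the full duration of the wait, so the thread count tracks the number of open deliveries one for one.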

Falcon uses Ruby Fibers (via the async gem). A fiber yields while waiting for I/O and resumes when the response arrives. One OS thread can manage thousands of concurrent webhook deliveries. For an I/O-bound workload, this is the right primitive.
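A toy illustration of the yield-and-resume behaviour, using core `Fiber` only. It is an assumption-heavy sketch, not how Falcon or the async gem is implemented: there is no real I/O and no scheduler, `Fiber.yield` marks the point where a real fiber would wait on the network, and the hand-rolled loop plays the role an event loop would play.

```ruby
# Toy model: many deliveries interleaved on a single thread with core Fiber.
# The :waiting / :done symbols and the loop below are illustrative assumptions.
log = []
threads_seen = []

deliveries = (1..3).map do |id|
  Fiber.new do
    threads_seen << Thread.current
    log << "sent #{id}"
    Fiber.yield :waiting   # stand-in for waiting on the subscriber's response
    log << "acked #{id}"
    :done                  # final value returned by the last resume
  end
end

# Hand-rolled round-robin "event loop": resume each fiber, drop finished ones.
pending = deliveries
until pending.empty?
  pending = pending.reject { |fiber| fiber.resume == :done }
end

puts log.inspect
puts "threads used: #{threads_seen.uniq.size}"
```

All three deliveries are started before any of them completes, and every step runs on the same thread. In a real setup the async gem's scheduler would decide when to resume each fiber based on socket readiness rather than a fixed loop.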

Stack: Rails 8, SQLite, Solid Queue, Solid Cache, Solid Cable, Puma, Kamal. (Falcon migration is the planned next infrastructure step — the rationale is right, the implementation is not there yet.)

The SQLite-for-everything stance was deliberate: fewer infrastructure dependencies, simpler deployment, and, at the current scale (zero paying customers), the right tradeoff. Postgres can come later.

What’s working

  • Solid Queue for background delivery, SQLite for everything, Kamal deployment scaffolded
  • Core data model and webhook delivery loop exist

What’s not working

  • Nothing is shipping to users — the core loop exists but there’s no onboarding, no pricing, no signup flow
  • Stripe integration not started
  • The async design is sound on paper, but Falcon is not yet in place and delivery is untested at any real volume

Open questions

  • Is the pain point real? Webhook delivery for LLMs is a genuine engineering problem, but it’s unclear whether teams with that problem are willing to pay for a hosted solution vs. rolling their own.
  • What’s the right pricing anchor? Per-delivery? Per-subscriber? Flat monthly?
  • Does “understands context” become a meaningful differentiator, or is it a feature that only matters after basic delivery is nailed?

Solved

  • SQLite-for-everything: fewer infrastructure dependencies, correct tradeoff at zero-paying-customer scale.
  • Deployment scaffolded via Kamal on Hetzner.
  • Architecture rationale: Falcon over Puma is the right call for I/O-bound webhook delivery (fiber model; not yet implemented — running Puma currently).

What’s deferred

Smart filtering, Liquid template transformations for payload reshaping, content analysis hooks, streaming-aware delivery — all deferred until basic delivery is working reliably at scale.
