I’ve been working on LLM apps (agents, RAG, etc.) and keep running into the same issue:

something breaks… and it’s really hard to figure out why

most tools show logs and metrics, but you still have to manually dig through everything

I started experimenting with a different approach where each request is analyzed automatically to surface a likely root cause

for example, catching things like:
“latency spike caused by prompt token overflow”
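to make that concrete, here's a minimal sketch of what a per-request analyzer could look like — all the names (`RequestTrace`, the thresholds, the finding strings) are my own assumptions, not a real tool's API:

```python
from dataclasses import dataclass

@dataclass
class RequestTrace:
    # hypothetical per-request trace; field names are assumptions
    prompt_tokens: int
    latency_ms: float

def analyze(trace: RequestTrace,
            token_limit: int = 8000,
            latency_budget_ms: float = 2000.0) -> list[str]:
    """Return human-readable findings for one request."""
    findings = []
    slow = trace.latency_ms > latency_budget_ms
    overflow = trace.prompt_tokens > token_limit
    if slow and overflow:
        # correlate the two signals into one diagnosis
        findings.append("latency spike caused by prompt token overflow")
    elif slow:
        findings.append("latency spike (cause unknown)")
    return findings

# example: an oversized prompt that also blew the latency budget
print(analyze(RequestTrace(prompt_tokens=9500, latency_ms=3400.0)))
```

the interesting part isn't the rule itself, it's attaching the diagnosis to the individual request instead of making you correlate logs and metrics by hand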

I’m curious: how are you currently debugging your pipelines when things go wrong?