I just finished a project that talks to Anthropic, OpenAI, and Google's APIs simultaneously — a debate platform where AI agents powered by different providers argue with each other in real time. The codebase touches all three SDKs (@anthropic-ai/sdk, openai, @google/genai), and each provider has completely different patterns for things like streaming, structured output, and tool use.
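To give a flavor of how divergent the streaming patterns are, here's a rough sketch of the adapter shape that keeps the rest of an app provider-agnostic. The SDK-specific accessor chains in the comments are paraphrased from memory and may not match current SDK versions; the runnable part uses a mock provider so it doesn't need API keys.

```typescript
// Each SDK streams text differently (paraphrased, may be out of date):
//   openai:            for await (const chunk of stream) -> chunk.choices[0]?.delta?.content
//   @anthropic-ai/sdk: content_block_delta events -> delta.text
//   @google/genai:     for await (const chunk of stream) -> chunk.text
// Normalizing them behind one async-generator interface isolates the differences.

type TextStream = AsyncGenerator<string, void, void>;

interface Provider {
  name: string;
  stream(prompt: string): TextStream;
}

// Mock provider standing in for a real SDK call, so the adapter is runnable here.
function mockProvider(name: string, chunks: string[]): Provider {
  return {
    name,
    async *stream(_prompt: string): TextStream {
      for (const c of chunks) yield c;
    },
  };
}

// Drain a normalized stream into a full string.
async function collect(stream: TextStream): Promise<string> {
  let out = "";
  for await (const piece of stream) out += piece;
  return out;
}

async function main() {
  const providers: Provider[] = [
    mockProvider("anthropic", ["Hel", "lo"]),
    mockProvider("openai", ["Hi", "!"]),
  ];
  for (const p of providers) {
    console.log(p.name, await collect(p.stream("opening argument")));
  }
}

main();
```

Each real SDK then only needs one small wrapper that yields plain strings, and the debate loop never touches provider-specific chunk shapes.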

I used AI coding tools heavily throughout (Cursor + Codex for different parts), and the experience taught me a lot about where these tools shine and where they’ll confidently lead you off a cliff.

Where AI coding tools were reliable:

Where they hallucinated or broke things:

My takeaway: AI coding tools are genuinely 3-5x multipliers for a solo developer, but the multiplier only holds if you verify every external integration point manually. The tools are great at code structure and terrible at API specifics. If your project talks to external services, budget time for verification that the AI won’t do for you.
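One cheap way to do that verification is to assert the response shape your code actually reads before wiring it in, instead of trusting an AI-generated accessor chain. A minimal sketch — the field paths below are illustrative, not taken from any real SDK:

```typescript
// Check that a dotted field path actually exists on a response object.
// Catches hallucinated accessor chains before they become runtime undefineds.
function hasPath(obj: unknown, path: string): boolean {
  let cur: any = obj;
  for (const key of path.split(".")) {
    if (cur == null || !(key in Object(cur))) return false;
    cur = cur[key];
  }
  return true;
}

// Illustrative response shape (not a real SDK payload).
const sampleResponse = { choices: [{ delta: { content: "hi" } }] };

console.log(hasPath(sampleResponse, "choices.0.delta.content"));   // true
console.log(hasPath(sampleResponse, "choices.0.message.content")); // false
```

Running a check like this against one real response per provider, per code path, is a few minutes of work and catches most of the "confidently wrong" API usage.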

Curious if others have found good strategies for keeping AI coding tools accurate when working across multiple external APIs.