Your tasks
We are looking for a Senior Automated QA Engineer to lead our testing efforts. You won't just be testing standard web interfaces; you'll be figuring out how to reliably automate testing for non-deterministic AI features, multi-step AI agents, and complex LLM pipelines. If you know Playwright inside and out and have scars from trying to test LLM hallucinations in production, we want to talk to you.
What you’ll be doing
- Own the E2E framework: Build, maintain, and scale our automated testing framework using Playwright (TypeScript/Python).
- Test the unpredictable: Design strategies to test non-deterministic LLM outputs, AI agents, and RAG pipelines where standard assertions don't always work.
- Tackle LLM-specific challenges: Build guardrails and automated checks for prompt drift, hallucinations, latency, and context window limits.
- Evaluate agent behavior: Create scenarios to test how our AI agents handle edge cases, multi-step reasoning, and error recovery in real-world document processing workflows.
- Integrate and collaborate: Wire your tests into our CI/CD pipelines to ensure we can ship quickly without breaking the core AI logic. Work closely with AI researchers, backend engineers, and product managers to define what "quality" means for an AI agent.