Description
The problem
Businesses overpay vendors – on every batch of invoices, every month – because the data that would catch the overpayment lives in different systems. We are building an AI agent that processes invoices end-to-end, reasons across all the relevant sources, flags genuine discrepancies, and acts – without a human having to investigate each one.
What you will own
Everything engineering. Schema design to deployment to the 2am fix when something breaks in production. There is no tech lead above you. There is no platform team. There is the architecture, you, and the founders. Concretely, this means building:
● A multi-stage agentic pipeline that takes a vendor invoice and produces a structured decision – fully autonomous for clear cases, escalating to human review for genuinely ambiguous ones. We use LangGraph, but if you’ve built equivalent systems with Temporal, Prefect, or custom state machines with LLM orchestration, that works
● An LLM-powered extraction layer that handles real invoices – scanned PDFs, stamped documents, inconsistent layouts – and returns structured output
● A graph data model that connects invoices to various sources and can traverse those relationships to detect discrepancies
● ERP connectors, GST validation logic, and a write-back layer that closes the loop
What we need
● Strong Python. Async FastAPI, clean service boundaries, tests that actually catch bugs. You have shipped Python backends that handled real production load
● Solid Postgres. Complex queries, schema design, migrations without downtime, row-level security for multi-tenant data. pgvector is a plus – if not, you pick it up fast
● LLM API experience in production. You have called an LLM API for something that real users depended on. You know about structured output, retry logic, cost management, prompt versioning. A side project counts if it was genuinely deployed
● Comfort with graph data models. You understand when a graph is the right structure and when it is not. You do not need deep Neo4j production experience – you need to understand graph relationships conceptually and be willing to learn Cypher. It is a 2-day ramp for the right person
● Working knowledge of deployment. You have deployed and operated production workloads on GCP. Cloud Run, Cloud SQL, Cloud Storage, Redis – you’re comfortable across the stack. If you’ve done it on AWS, the translation isn’t hard, but GCP is where we are
● You own things. Not “I contributed to” – you designed it, shipped it, and fixed it when it broke. That pattern needs to be visible in your history
Good to have, not mandatory
● Built an agentic pipeline with multiple stages
● Any fintech or P2P domain experience – even tangential
● Worked at a startup with under 20 people
● Has a GitHub, blog, or writeup that shows how you think about a hard technical problem
What you get
● The hardest engineering problem you will have worked on. This is not CRUD with an LLM bolted on
● Real ownership. First engineering hire. Your architectural decisions will be in this product five years from now
● Equity that matters. ESOP – Open to discussion. We are pre-seed – this is a bet, not a guarantee. We will not pretend otherwise
● No meetings tax. You work directly with the founders. The product is specified clearly. You know what you are building and why

