Letta is a company founded around the MemGPT project (12k+ GitHub stars). The founding team comes from the same Berkeley research lab and PhD advisors that produced Spark (→ Databricks) and Ray (→ Anyscale). We have deep expertise in both AI and systems, and we are currently hiring a founding team of exceptional engineers to join us in building the next generation of LLM agent technology.
Our goal as a company is to empower developers to build state-of-the-art LLM agents that power their own applications. We are looking for someone with expertise in systems and infrastructure (and familiarity with or interest in LLMs) to help us build a resilient, scalable, and high-performance platform for deploying and running agents. This role offers the opportunity to help define what the agent stack will look like from a systems perspective.
We are building the developer stack for agents that can run in production applications, not just demo notebooks. As a Software Engineer, you will bridge the gap between bleeding-edge agent systems and production-grade software for running agents in real applications. You will help define developer APIs for agents and lead development of our company’s OSS stack as well as the hosted service.
Responsibilities:
Design and develop an open-source Agents API standard (an alternative to the OpenAI Assistants API) through the MemGPT OSS project
Lead development, deployment, and monitoring of the MemGPT cloud hosted service
Ensure high quality code standards through rigorous testing, documentation, and following best practices
Maintain clear, up-to-date documentation on both external facing MemGPT developer APIs and internal code
Required skills:
At least 4 years of software engineering experience
Strong proficiency with Python
Strong understanding of how to architect services for security, reliability, and performance
Ability to design clean, robust REST APIs
Ability to architect robust, production-grade services
Familiarity with IaC (Terraform) and cloud infrastructure
Familiarity with Docker and Kubernetes
Familiarity with tooling across the AI stack, such as inference engines (e.g. vLLM, Ollama), vector DBs (e.g. Chroma, pgvector), and RAG frameworks (e.g. llama-index, langchain)
Bonus: proficient with TypeScript, React, Tailwind, etc. (the modern stack for web applications)
We are hiring a small, tight-knit team of exceptionally talented founding engineers. Every hire matters, so we take the hiring process very seriously.
Initial phone interview (30m video call): We want to learn more about your background, your skills, your opinions on open source AI, and why you want to work at an early stage AI startup.
Technical take-home (<1hr assessment): To get a better sense of your skillset, we’ll give you an example problem to work on that’s as targeted to your potential day-to-day work as possible.
Paid workday (in-person recommended): As the final step in the interview process, we’ll simulate working together as closely as possible by giving you a real (or as close to real as possible) task to work on for a day - and paying you for your time, of course. If you live in the Bay Area, we highly recommend visiting our offices in person! We’re an in-person company, so working at our office will give you a great idea of what it will be like to join as a full-time member of the team.