Staff Site Reliability Engineer at Lattice
San Francisco, CA, US

Job overview

As part of Lattice's SRE team, you will be responsible for the performance and reliability of Lattice's infrastructure. You will collaborate with our product engineers to improve their development experience and the resiliency of our application code. You will own our AWS and Kubernetes systems to ensure consistent deployments for our product engineering team and stability of service for our customers.

Who we are

We’re a small and impactful team of software engineers continuously working to improve our product and our craft. We use a modern tech stack and love experimenting with new technologies to create our products. We are excited to find a person who has the experience and ability to own our infrastructure and improve our development experience.

What you'll do

Because our SRE team is small, you will be one of the primary owners of the whole Lattice infrastructure. You will have prime opportunities to shape the future of our systems and DevOps culture. Among other things, you will be:

  • Ensuring our Kubernetes cluster is reliable, scalable, performant, and can be extended to support new requirements.
  • Implementing our infrastructure as code (Lattice currently uses Terraform) and implementing change controls.
  • Implementing monitoring, observability and alerting tools such as dashboards and logging systems to understand the health and availability of our infrastructure and applications.
  • Crafting plans and procedures for disaster recovery.
  • Making infrastructure so seamless that product engineers rarely think about it.
  • Educating product engineers on best practices for using the infrastructure services and supporting a DevOps culture.
  • Maintaining and enhancing build and deployment pipelines for CI and CD.
  • Collaborating with product engineering teams to improve their development experience by creating tools, systems and processes.
  • Influencing our engineering practices to continuously improve our time-to-resolution for site incidents.
  • Optimizing for stability rather than focusing on scale.

Who you are

There’s no such thing as a perfect candidate. We expect you to possess some combination of the following:

  • 5+ years of professional experience in infrastructure engineering.
  • Experience with Kubernetes in production workloads.
  • Experience with AWS and distributed systems in production workloads.
  • Experience with describing infrastructure as code (IaC) in production workloads.
  • Eagerness to own and develop the infrastructure of a rapidly growing company.
  • Proficiency in leveraging CI/CD tools to automate testing and deployment.
  • Enjoyment of collaborating with other team members; you will be working closely with our product engineers.
  • A proactive ownership mentality: if you see a problem that no one else has identified, you fix it.

Why Lattice

  • The opportunity to join a smart and energetic team that is passionate about solving customers’ needs and loves coming to work every day
  • A culture that encourages and promotes professional growth and development
  • Competitive salary, equity, and benefits
  • Centrally located FiDi office
  • Flexible vacation/time-off policy
  • A team that values collaboration, drive, ownership, and honesty
  • Other fun perks like continuous learning reimbursements, office snacks, and team trips

About Lattice

Lattice is on a mission to build cultures where employees and their companies thrive. In an age where innovation is happening all around us, there is one commonality: people are driving these changes. We offer a solution that helps companies put employees first. Lattice is a people management tool that offers performance reviews, employee engagement surveys, real-time feedback, weekly check-ins, and goal setting in a way that allows companies to focus on employee development, growth, and engagement. Since launching in 2016, we have grown to over 1,100 customers globally, including brands like Slack, Reddit, Glossier, and Asana.