Company description
We are a digital product engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale, across all devices and digital mediums, and our people are everywhere in the world (19,000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!
Job description
* Experienced L3 SRE engineer supporting a business-critical SaaS application.
* Capacity to provide L3 support across the full stack, including infrastructure, backend, and front end, before escalating to the engineering business unit.
* Capacity to automate SRE tools to provide proactive L3 support, in line with our tech monitoring strategy.
* Capacity to work under business pressure on business-critical applications.
* Capacity to communicate appropriately with L1, L2, engineering, product managers, leadership, and end users during troubleshooting.
* Experience with incident and problem management.
* Experience with multitenant applications.
* Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, and TLS/SSL.
* Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
* Python, React/Next.js.
* Monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues, using Grafana, Prometheus, Loki, or ELK.
* Experience with AWS, particularly EKS, serverless, queues, and various databases.
* Solid knowledge of Kubernetes.
Qualifications
Must-have skills: EKS, GitHub Actions, Python (strong), Kubernetes (expert), Prometheus.
Good-to-have skills:
* Previous experience building a user-facing GenAI/LLM software application.
* Security best practices in cloud environments - AWS managed services (RDS, Batch, Lambda, Fargate, Step Functions, SQS/SNS, etc.).
* FastAPI and Next.js experience (if we're still using the latter).
* WebSockets, server-sent events, pub/sub (RabbitMQ, Kafka, etc.).
* Cloud security concepts (IAM, access control).
* Terraform experience.
Seniority level
Mid-senior level
Employment type
Full-time
Job function
IT services and IT consulting