Lead Site Reliability Engineer (SRE)
As a Lead Site Reliability Engineer (SRE) at NA-KD, you will help squads solve the challenges of building and extending an international e-commerce platform by establishing operational and DevOps best practices. Our platform is based on Microsoft technologies and hosted on Azure. There will also be opportunities to create new e-commerce platforms, build tooling that improves new and existing workflows, and innovate further within the Episerver Commerce platform. You set your goals high and aim to become the best in your field. You have a genuine interest in technology and keep up with the latest developments. You appreciate the value of process but can be flexible in a fast-paced start-up environment when conditions call for it. You find joy in being the hero: solving hard problems and bringing solutions to market when it seems there are none.
Responsibilities (What you will do)
- Define and own the Incident Management process for NA-KD and ensure remediations are acted on in a timely manner
- Ensure that systems have the infrastructure in place for scalability and resilience
- Ensure that systems have the infrastructure in place for monitoring and alerting
- Help structure the SRE team for future growth
- Establish the culture and practice of SLIs, SLOs and Error Budgets
- Identify and deploy cybersecurity measures through continuous vulnerability assessment and risk management
- Work with squads to automate and improve development and release processes
- Automate the provisioning of tools and required infrastructure
- Encourage and build automated processes wherever possible by adopting DevOps best practices
Your Past & Your Skills
- Experience working with Azure as a cloud provider
- Experience in setting up and/or optimizing on-call procedures, including post-incident reviews and remediations
- Experience in establishing the culture and processes of SLIs/SLOs and Error Budgets
- Solid experience running and monitoring both monolithic and distributed architectures
- Experience designing, implementing and provisioning Kubernetes clusters for use at scale
- Experience supporting security architecture and defining or updating naming conventions, patterns and practices
- Experience designing and implementing a basic application routing path incorporating CDN and load balancers
- Ability to follow code and interpret stack traces from applications written in C# and Node.js
- Experience working with fast-growing global tech products/companies (e-commerce)
- Experience working autonomously and helping guide the direction of product builds
- Nice to have: experience with Elastic/Kibana, Prometheus, Telegraf
Location: Fully remote within Europe, or based at NA-KD's offices in Gothenburg or Stockholm.
JOIN THE DREAM TEAM
Working at NA-KD is unlike any other gig. We’re a young company with a startup mentality and a hunger to be the best. In less than four years, NA-KD has become one of the fastest-growing fashion e-commerce brands in the world, and we have a three-million-strong community to prove it. How? We see no limitations, only possibilities. No failures, only learning opportunities. We’re problem-solvers, disruptors and early adopters. We’re doers. And if you dream of going to work every day to build the next big thing, then welcome home.