Atera
  • Published: December 12, 2023
Level of education
High school
Spoken Language needed
Hebrew, English
Level of Hebrew
Fluent
Location of job
Tel Aviv / Ramat Gan
Relevant years of experience required for the role:
More than 3 years

Description

About Atera

Atera is inventing a new way of managing IT end-to-end for IT professionals and teams worldwide.

Atera's AI-powered, all-in-one IT platform combines Remote Monitoring and Management (RMM), Helpdesk, Ticketing, and Reporting, helping more than 23,000 IT pros achieve 10X operational efficiency, cut time-to-resolution, and deliver better outcomes faster. Located in the heart of Tel Aviv, our team of passionate, like-minded individuals is driven by a shared mission to unleash everyone's potential and constantly innovate. We create an open, transparent, and supportive environment that gives our teams the autonomy, resources, and freedom to thrive.

This is a full-time, onsite (hybrid-remote) role at our Tel Aviv office.

Atera is looking for a motivated Senior Site Reliability Engineer to join us and build the framework that lets our engineering operations scale.

Responsibilities:

- Build tools and automation to monitor system health, performance, and reliability, ensuring quick detection and resolution of any anomalies or issues.
- Write high-quality infrastructure-as-code that automates provisioning, deployment, and scaling, and implement effective monitoring, alerting, and logging solutions.
- Work with other engineers to ensure that new services are well-designed, properly monitored, and have well-defined SLIs and achievable SLOs.
- Maintain runbooks for manual tasks and replace those runbooks with automation whenever possible.
- Proactively track our capacity, quotas, and other performance limits to plan for growth.
- Participate in a 24x7 on-call rotation to handle product availability issues as well as urgent customer support escalations.
- Investigate and resolve incidents and outages, performing root cause analysis to identify systemic issues and implement preventive measures.
- Develop and maintain disaster recovery plans and perform regular testing to ensure data integrity and business continuity.

Requirements:

- 3+ years of experience as an SRE in large-scale production environments.
- Strong experience in designing, implementing, and managing monitoring processes.
- Experience in at least one scripting language (Python, Ruby, Perl, Bash) and infrastructure-as-code technologies (e.g., Terraform, CloudFormation).
- Strong ability to lead, design, and execute cross-organization projects.
- Experience in managing container and infrastructure orchestration tools (e.g., Kubernetes, Terraform).
- Hands-on experience administering public clouds.
- Experience with CI/CD pipelines for applications and microservices.
- Excellent English communication skills.

Advantages:

- Knowledge of advanced monitoring and observability tools beyond basic logging and alerting.
- Experience with tools like Prometheus, Grafana, ELK stack, or similar.
- Previous experience as a DevOps Engineer is a big plus.

A bit about our benefits

Atera is highly collaborative and, yes, fun! To support you at work (and play), we offer some fantastic perks: ample time to learn from your teammates and contemporaries, time off to relax and recharge, community volunteer days, an annual budget to support your learning & growth, a company-paid trip, and lots more.
