Site Reliability Engineer
Site reliability engineering is a discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As part of our SRE team you will:
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Integrate with other teams throughout the company to help develop solutions.
- Develop internal tools used by our engineers to improve productivity and code quality.
Requirements:
- A minimum of 2 years of full-time industry experience developing or maintaining software systems in production
- Proficient in Linux administration
- Excellent knowledge of at least one configuration management tool such as Ansible, Salt, or Puppet.
- Expert knowledge of container technologies. We use Docker.
- Proficient with scripting, e.g. shell or Python.
- Experience with continuous deployment, live monitoring, dynamic load balancing, and security.
- Knowledge of cloud computing platforms (preferably AWS).
Nice to have:
- Continuous integration skills.
- Understanding of cryptography and security best practices.
- Previous DevOps/SRE experience
- Any big data experience is a plus.
Benefits:
- State-of-the-art workstation setup
- Great office location in the center of Athens
- Competitive salary
- Company-sponsored breakfast and lunch, as well as snacks, coffee and fruit throughout the day
- Annual educational budget for training, certifications and conference attendance