Job Description:
- Maintain and enhance the SLA of 99.999% for offered services and managed platforms.
- Participate in 24x7 on-call for mission-critical services on a rotation basis.
- Design systems architecture for projects using Linux and Linux application stacks (LAMP, Ruby, MySQL, Redis, Aerospike, Java, Python, etc.).
- Proficiency in capacity planning.
- Design, implement and enhance CI and CD platforms.
- Design, implement, enhance and manage internal cloud offerings
- Understand issues and automate permanent fixes to prevent outages/downtime.
- Responsible for architecting deployments for high availability, scalability, and reliability.
- Design and implement platforms for monitoring, log processing, metrics collection, and data visualization.
- Script and code automation tools (in shell/Perl/Ruby/Python, etc.) for efficient management of sites/products.
- Infrastructure and platform security.
- Puppet configuration management.
- Lead and mentor a team of Operations Engineers.
- Liaise with application development teams to drive operational best practices
Skills:
- You take pride in calling yourself an expert/master in most of the technologies/skills below.
- Minimum 3 years of relevant experience in most of the following
- A proven track record of managing high-traffic internet applications, especially in the e-commerce domain.
- Linux: In-depth Linux/Unix fundamentals, good understanding of the various Linux kernel subsystems (memory, storage, network, etc.), understanding of the nuances of various distributions (Ubuntu/Fedora/CentOS, etc.), package management, etc.
- Fundamentals: DNS & Networking Fundamentals, TCP/UDP, IP Routing, HA & Load Balancing Concepts.
- Application Stacks: LAMP, OpenResty/Nginx/HAProxy/ATS, Wackamole, Email Platforms, Tomcat.
- Cloud Infrastructure: OpenStack
- Edge Cache: Redis and Aerospike
- Databases: SQL/RDBMS, MySQL/NDB, MongoDB, Cassandra.
- Configuration management: Ansible/Chef/Puppet
- Tools/Utilities: Nagios, Zabbix, Cacti, Ganglia, Kickstart/Cobbler, MCollective, Yum, RPM, Git/SVN
- Scripting/Programming: Extensive work in two or more of these scripting/programming languages: Bash/Perl/Ruby/Python/PHP.
- Others: Regular expressions, excellent troubleshooting skills.