Key Responsibilities:
Ensure the reliability, availability, and performance of infrastructure and applications.
Implement monitoring and alerting solutions to identify and resolve issues proactively.
Automation & Scalability:
Develop automation frameworks for infrastructure provisioning, configuration management, and deployment processes.
Create and maintain CI/CD pipelines to streamline software delivery.
Incident Management:
Troubleshoot production incidents, conduct post-incident reviews, and implement preventive measures.
Develop and maintain runbooks for handling recurring issues.
Infrastructure as Code (IaC):
Design, build, and maintain cloud infrastructure using IaC tools (excellent knowledge of CloudFormation is required, along with other popular IaC tools such as Terraform and Ansible).
Optimize resource utilization and reduce operational costs.
Collaboration & Communication:
Work closely with development teams to integrate reliability into the software lifecycle.
Provide technical guidance and promote best practices in DevOps and SRE methodologies.
Security & Compliance:
Ensure infrastructure complies with security best practices and relevant regulations.
Monitor and address vulnerabilities in collaboration with the security team.
Required Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
3+ years of experience in DevOps and SRE.
Strong experience with AWS cloud infrastructure (knowledge of Azure or GCP is a plus).
Proficiency in containerization and orchestration tools (Docker, Kubernetes, ECS). Solid experience with all three is required.
Hands-on experience with CI/CD tools, primarily GitHub Actions (experience with Jenkins, GitLab CI, or CircleCI is a plus).
Solid understanding of Linux/Unix systems and networking concepts.
Proficiency in scripting/programming languages (e.g., Python, Bash, Go).
Experience with monitoring tools like Prometheus, Grafana, or Datadog.
Preferred Qualifications:
Cloud platform certifications, primarily AWS Certified DevOps Engineer (Google Cloud Professional DevOps Engineer is also a plus).
Familiarity with chaos engineering principles.
Experience with database management and performance tuning (SQL and NoSQL).
Knowledge of compliance standards (e.g., SOC 2, ISO 27001).
Apply via:
www.linkedin.com