FreshJobs

Senior Engineer, Platform Engineering at Ezra

October 4, 2024

Job Description

Architecture and Design

Design and implement scalable, resilient, and secure platform solutions
Develop and maintain infrastructure-as-code using tools like Terraform, CloudFormation, and Ansible
Create and optimize CI/CD pipelines for efficient software delivery
Architect cloud-native solutions leveraging containerization and microservices
Implement disaster recovery and business continuity strategies

Infrastructure Management

Manage and optimize our public cloud infrastructure (AWS, Azure, or GCP)
Manage and optimize private cloud infrastructure on partner premises
Implement best practices for cloud security, compliance, and cost optimization
Design and implement multi-region and multi-cloud strategies
Design and maintain containerized application environments using Docker
Architect, deploy, and manage Kubernetes clusters for container orchestration

Automation and DevOps

Develop automation scripts and tools to streamline operations and reduce manual tasks
Integrate monitoring, alerting, and logging systems
Ensure standardized QA and production environments by implementing proper branching strategies
Configure and manage load balancers (e.g., NGINX, HAProxy, cloud-native solutions)
Implement and manage service mesh technologies (e.g., Istio, Linkerd) for microservices architectures

Performance Optimization

Analyze and optimize system performance, identifying and resolving bottlenecks
Conduct capacity planning and implement auto-scaling solutions
Optimize container resource allocation and performance

Team Leadership and Collaboration

Mentor junior engineers and provide technical guidance to the team
Collaborate with cross-functional teams to align platform capabilities with business needs
Contribute to technical decision-making and architectural reviews

Documentation and Knowledge Sharing

Maintain comprehensive technical documentation for platform components and processes
Contribute to internal knowledge bases and conduct knowledge-sharing sessions

L2 Support and Escalation Management

Provide expert-level troubleshooting and resolution for critical platform and infrastructure problems
Analyze recurring issues and implement long-term solutions to prevent future occurrences
Collaborate with the operations team to improve support processes and knowledge transfer
Conduct post-incident reviews and implement lessons learned to enhance system reliability

Required Qualifications

Bachelor’s degree in Computer Science, Engineering, or a related field
5+ years of experience in platform engineering, DevOps, or similar roles
Strong proficiency in at least one cloud platform (AWS, Azure, or GCP)
Expert-level knowledge of containerization technologies (Docker, Kubernetes)
Extensive experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation, Pulumi)
Proficiency in scripting languages (e.g., Bash)
Strong understanding of networking concepts, load balancing, and CDNs
Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
Excellent problem-solving skills and ability to troubleshoot complex systems