Location | Kuala Lumpur, Malaysia |
Employment type | Permanent, full-time |
Salary | Negotiable, based on experience |
COMPANY OVERVIEW
A well-established client of ours in Kuala Lumpur is seeking a Site Reliability Engineering Lead.
JOB RESPONSIBILITIES
○ Lead and mentor a team of SREs, fostering a culture of ownership, collaboration, and continuous improvement.
○ Define clear goals, performance metrics, and development plans for the team.
○ Design and implement strategies to improve system reliability, scalability, and performance.
○ Conduct root cause analysis of production incidents and develop preventive solutions.
○ Oversee the deployment, monitoring, and management of production environments.
○ Collaborate with development teams to design cloud-native infrastructure and architecture.
○ Drive automation of operational processes, reducing manual intervention and response times.
○ Optimize CI/CD pipelines to ensure smooth and rapid deployments.
○ Establish incident response protocols and lead efforts during major incidents.
○ Ensure robust monitoring and alerting systems are in place to proactively detect issues.
○ Act as a liaison between engineering, operations, and other teams to align objectives.
○ Share insights and best practices with internal stakeholders to enhance overall system resilience.
JOB REQUIREMENTS
○ Strong experience with cloud platforms (AWS, Azure, Google Cloud) and infrastructure-as-code tools (Terraform, Ansible, etc.).
○ Proficiency in programming/scripting languages (Python, Go, Shell, etc.).
○ Deep knowledge of Kubernetes, containerization, and distributed systems.
○ Proven track record of leading SRE or DevOps teams and managing large-scale production environments.
○ Strong decision-making, prioritization, and problem-solving capabilities.
○ Expertise in implementing and using monitoring tools (Prometheus, Grafana, Datadog, etc.) and logging systems.
○ Familiarity with service-level objectives (SLOs), service-level agreements (SLAs), and error budgets.
○ Excellent communication and collaboration skills to work across cross-functional teams.
○ Ability to mentor and upskill team members, fostering a learning-oriented culture.
○ At least 8 years of experience in SRE, DevOps, or related roles with a focus on reliability engineering.
Notice: By submitting an application for this position, you acknowledge and consent to the disclosure of your personal information, in accordance with the Privacy Policy and Terms and Conditions, for the purpose of recruitment and candidate evaluation.
Privacy Policy Link: https://www.jac-recruitment.my/privacy-policy
Terms and Conditions Link: https://www.jac-recruitment.my/terms-of-use
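Work experience | 3 years or more |
Career level | Mid-career (experienced) |
English level | Business conversation level |
Japanese level | Business conversation level |
Education | Junior college: associate degree |
Current visa | Work authorization for Japan is not required |
Employment type | Permanent, full-time |
Salary | Negotiable, based on experience |
Industry | IT consulting |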