Company: Mastercard
Location: Nairobi, Nairobi County, Kenya
Career Level: Associate
Industries: Banking, Insurance, Financial Services

Description

Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Lead DevOps Engineer, Foundry R&D
Lead DevOps Engineer
We are seeking a Lead DevOps Engineer to join the Mastercard Foundry R&D team. You will help build and scale AI/ML infrastructure to support our innovation efforts, with a focus on automation, observability, and developer experience. The ideal candidate is hands-on, curious, motivated, and comfortable working in fast-moving R&D environments.
What You'll Do
• Drive Platform Infrastructure: Own DevOps and infrastructure for MLOps and agentic AI systems, establishing reusable patterns for CI/CD, scalable inference, orchestration, observability, and cost control. Design secure, scalable, repeatable systems using Infrastructure as Code (IaC) to support R&D workloads.
• Build Secure CI/CD & Automation Systems: Enable secure tool access, workload isolation, and infrastructure for LLM-backed APIs and MCP servers, while partnering with security and compliance on access control, infrastructure governance, and auditability.
• Ensure Reliability & Observability: Implement monitoring, logging, and alerting. Tune observability for ML-specific workloads to ensure performance, reliability, and operational insight.
• Provide Technical Leadership: Offer hands-on leadership across DevOps and platform initiatives. Review code, enforce best practices, improve tooling, and promote clean, well-tested infrastructure.
• Cross-Functional Collaboration: Partner with ML, software, and platform engineers to design deployment strategies, scope work, manage agile deliverables, and meet milestones.
What You'll Bring
• Extensive DevOps Experience: 8–12+ years in DevOps, SRE, or platform engineering, including senior/lead roles. Experience designing end-to-end infrastructure systems, solving scale/performance challenges, and operating platforms in production.
• Cloud & Infrastructure Expertise: Strong skills in cloud platforms (AWS, Azure, or GCP) and AI/ML components such as Databricks, Azure ML, and MLflow. Deep experience with Infrastructure as Code using Terraform and orchestration tools like Terragrunt.
• Container & Orchestration Mastery: Expertise in Kubernetes and Docker, including how they optimise ML development workflows. Experience with container security, networking, and cluster management at scale.
• AI/ML Platform Knowledge: Understanding of ML workflow requirements—model registries, feature stores, AI agents, Retrieval-Augmented Generation (RAG) techniques, and frameworks like LangChain/LlamaIndex.
• Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
• Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
Required Skills
• Education & Background: Bachelor's degree in Computer Science, Engineering, or a related field. 8–12+ years of proven experience architecting and operating production-grade infrastructure, especially infrastructure supporting AI/ML workloads.
• Infrastructure as Code: Expert in Terraform and IaC orchestration tools like Terragrunt. Strong experience with configuration management and GitOps practices.
• Programming & Scripting: Advanced Bash and Python skills and strong software engineering fundamentals (version control, CI, code reviews). Familiarity with Go or other systems programming languages is a plus.
• CI/CD & Automation: Hands-on experience with Jenkins, GitHub Actions, GitLab CI, or similar tools. Strong understanding of pipeline design, artifact management, and deployment strategies.
• Monitoring & Observability: Experience with monitoring stacks such as Prometheus, Grafana, Splunk, and ELK. Skilled in building dashboards, alerts, and tuning observability for ML-specific use cases.
• Cloud Infrastructure: Experience deploying systems on AWS/Azure/GCP. Familiar with cloud-native services, serverless computing, and managed Kubernetes offerings (EKS, AKS, GKE). Comfortable with Linux internals and shell scripting.
• Security & Networking: Knowledge of security best practices for MLOps, including data privacy, compliance, access controls, and encryption. Understanding of secure service-to-service communication, including mTLS.
• Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.
Preferred Skills
• Databricks Experience: Hands-on experience with Databricks, including workspace administration, cluster management, Unity Catalog, Delta Lake, and Lakehouse architectures. Familiarity with Databricks workflows, jobs orchestration, and MLflow integration.
• Advanced Cloud & ML Platform Expertise: Experience with Azure ML, SageMaker, or similar ML platforms. Familiarity with model serving, feature stores, and ML pipeline orchestration.
• ML Frameworks Familiarity: Knowledge of ML frameworks like TensorFlow, PyTorch, or Scikit-learn to better support ML engineering teams.
• Enterprise Security: Experience working in complex enterprise environments with strict security and compliance requirements. Strong networking fundamentals, including configuring and maintaining secure mTLS-based communication between services.
• DevOps & Platform Innovation: Experience implementing self-service platform automation, developer portals, or internal developer platforms (IDPs).
• Continuous Learning: Motivation to explore emerging technologies, especially in AI, generative AI, and cloud-native infrastructure. Certifications, personal projects, or open-source contributions are a plus.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Every person working for, or on behalf of, Mastercard is therefore responsible for information security and must:
Abide by Mastercard's security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach; and
Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.
