ML Engineer / Tech Lead

Workplace: Hybrid (2 days per week in the office), Warszawa/Łódź

Workload: Full-time

Contract type: Contract of Mandate / B2B

 

Role Overview

We’re looking for a Principal Machine Learning Systems Engineer (Tech Lead) to join a global leader in geospatial analytics. 

In this role, you’ll become the technical owner and architect of the Machine Learning workflow orchestration platform – a mission-critical ecosystem that underpins the entire AI division. You’ll design, build, and maintain scalable infrastructure capable of supporting massive data and compute workloads, while leading a 5-person engineering team in Poland.

This position is ideal for a hands-on technical leader who thrives at the intersection of systems design, scalability, and team leadership. You’ll spend roughly 30–40% of your time coding, while setting technical direction and mentoring your team.

 

Key Responsibilities

  • Act as the technical architect and owner of the ML orchestration platform powering all AI workflows.
  • Design and build large-scale distributed systems running on Kubernetes (up to 10,000 nodes).
  • Ensure platform reliability, scalability, and high availability in production environments.
  • Lead and mentor the 5-person engineering team (including 2 senior and 2 mid-level engineers), fostering technical excellence and autonomy.
  • Collaborate closely with AI, SRE, and Data Engineering teams to streamline model training and deployment pipelines.
  • Define and drive best practices in CI/CD, observability, and cloud infrastructure management.
  • Stay hands-on with coding and architecture reviews (~30–40% of your time in technical delivery).
  • Take ownership of incident response, on-call rotations, and reliability improvements.

 

Required Skills & Experience

Must Have:

  • Kubernetes (K8s) – Deep, production-level expertise is non-negotiable: able to design and deploy clusters from scratch and manage massive workloads (10,000+ nodes).
  • Cloud Experience (AWS / GCP) – Strong hands-on experience managing infrastructure in cloud environments.
  • SRE / DevOps Background – Solid understanding of reliability engineering, monitoring, on-call operations, and CI/CD.
  • Programming Skills (Python preferred) – Strong coding ability; Go or Rust experts open to learning Python are also welcome.

 

Nice-to-Have:

  • Experience with MLOps platforms and lifecycle management.
  • Familiarity with workflow orchestration tools (e.g., Airflow, Kubeflow).

 

What we offer:

  • Access to LinkedIn Learning

  • B2B contract + benefits

ID: 131
Published on: 09/10/2025