Senior Platform Engineer (Kubernetes & Data Infrastructure)

9 January 2026

Job description

About Sybilion

Sybilion builds AI-driven market forecasting for process industries (chemicals, packaging, pulp & paper, textiles, and broader manufacturing). We help procurement, supply chain, and commercial teams make better buy/sell decisions by turning messy external signals and internal operational data into clear, defensible forecasts that teams trust and act on.

Our stack includes Python-based microservices, PostgreSQL data infrastructure, and ML/AI workflows that support forecasting models and decision tooling.

About the Role

We’re hiring someone to own both our platform and data infrastructure: Kubernetes administration, Linux systems, CI/CD, observability, and PostgreSQL administration for our data lakes and ML pipelines. You’ll keep production reliable, fast, secure, and scalable, while supporting the day-to-day needs of our engineers and ML workflows.

This is an on-site role in Maia (Porto). We value in-person collaboration and move quickly.

What You’ll Do

Platform / Kubernetes / Systems
Design, deploy, and operate Kubernetes clusters in production (networking, storage, security)
Operate Linux server infrastructure (Ubuntu/RHEL), including patching, hardening, and reliability
Manage Docker image lifecycle (builds, optimisation, registry management, security scanning)
Implement and maintain CI/CD pipelines for microservices deployments and infrastructure changes
Build and maintain Infrastructure as Code (Terraform, Ansible, Helm) and Git workflows
Operate and improve monitoring, logging, and alerting (Prometheus/Grafana, ELK/EFK/Loki, etc.)
Manage secrets and credentials securely (Vault, Sealed Secrets, or equivalent)
Ensure high availability, capacity planning, incident response, and disaster recovery readiness
Support GPU-enabled workloads and ML/LLM deployments (resource allocation, utilisation, scaling)

PostgreSQL / Data Infrastructure
Administer and optimise PostgreSQL databases and data lake infrastructure (performance, reliability, cost)
Own backup/recovery and disaster recovery procedures (including point-in-time recovery)
Design schemas, indexing strategies, and query optimisation approaches; analyse execution plans
Manage migrations and versioning (schema changes, rollout strategies, rollback plans)
Implement replication/failover/clustering patterns for high availability
Own database security: access controls, encryption at rest/in transit, audit logging, compliance needs

Python Microservices / Data Pipelines / ML Workflows
Support deployment and troubleshooting of Python microservices (FastAPI/Flask/Django or similar)
Help maintain Python environments and dependency management (pip/poetry/conda/mamba)
Support ETL/ELT pipelines feeding our data lake and ML training workflows
Implement data quality checks and validation where needed
Partner with engineers and the ML team to improve runtime performance, reliability, and operational visibility

Must-Have Experience (Required)
5+ years of hands-on production experience with Linux, Docker, Kubernetes, and PostgreSQL
Strong Kubernetes administration skills (clusters, networking, ingress, storage, RBAC, security)
Strong PostgreSQL administration skills (performance tuning, backups, replication/HA, security)
Strong Linux systems skills (operations, troubleshooting, hardening)
CI/CD experience (GitHub Actions/GitLab CI/Jenkins or similar)
Infrastructure as Code experience (Terraform and/or Ansible; Helm for Kubernetes)
Observability experience (metrics, logs, alerting; root-cause analysis)
Solid Python literacy for debugging services and automating operational tasks
Strong communication skills in English and comfort working independently end-to-end
Willingness to participate in an on-call rotation for critical systems

Preferred (Nice to Have)
Startup background (you’ve worked in small teams, moved fast, and owned outcomes end-to-end)
Experience running ML infrastructure (MLflow, Kubeflow, Airflow, KServe/TorchServe, etc.)
GPU cluster experience (NVIDIA GPU Operator or similar) and model serving optimisation
Experience with service mesh (Istio/Linkerd)
Experience with cloud managed databases (AWS RDS, GCP Cloud SQL, Azure Database)
Familiarity with data lake / warehouse patterns and data versioning (DVC/MLflow tracking)
Experience with Redis/MongoDB or other complementary data systems

Soft Skills We Value
Strong problem-solving and analytical mindset
Calm, structured incident handling and good judgement under pressure
Proactive improvement orientation (you spot issues before they become outages)
