Data Engineer (Python)
TL;DR
Rapidly validate new data initiatives by prototyping connectors and pipelines while leveraging modern data technologies like Kafka and NiFi.
Company
Orcrist builds the Orcrist Intelligence Platform (OIP), a Kubernetes-based data intelligence system delivered as SaaS or self-hosted/on-prem (including air-gapped deployments). We run streaming and batch pipelines that power search, ML enrichment, and investigative workflows for mission-critical customers.
Role
Rapidly validate new data initiatives end-to-end without sacrificing adoptability. On the Innovation team, you'll prototype representative connectors and pipelines (batch + streaming), generate credible performance and operability readouts, and ship a handoff package that Foundation or a delivery team can productize.
What you'll do
- Prototype ingestion and connector patterns (batch + streaming) using NiFi, Kafka, Kafka Connect/Streams, and CDC approaches.
- Design “prototype-grade but adoptable” schemas and data models with clear semantics and evolution discipline.
- Build incremental lakehouse datasets (Hudi/Iceberg/Delta patterns) and produce queryable outputs for realistic latency/throughput evaluation.
- Build in data quality and provenance from the start (validation checks, metadata hooks, operability basics).
- Containerize and deploy prototypes on Kubernetes; deliver minimal runbooks/configs that make adoption straightforward.
- Produce adoption artifacts: schemas, reference implementations, technical design notes, and an integration backlog.
About You
- 3+ years of data engineering experience (level-dependent), with real pipeline delivery beyond ad-hoc scripts.
- Strong Python + SQL; comfortable building transformations, validation tooling, and pipeline glue code.
- Practical streaming/CDC fundamentals (ordering, duplication, replay, idempotency) and Kafka ecosystem experience.
- Familiar with lakehouse/storage and query layers (e.g., Hudi/Iceberg/Delta, Trino/Hive/Postgres) and how to make datasets usable.
- Comfortable working in Kubernetes/container environments and documenting decisions clearly.
- Eligible to work in Germany; EU/NATO citizenship preferred; export-control screening applies.
Nice-to-haves
- Great Expectations or similar data quality tooling; metadata/lineage platforms (OpenMetadata/DataHub/Atlas).
- Experience shipping in on-prem or air-gapped environments; governance/policy awareness for regulated customers.
- German language (B1+) and/or experience with OSINT/GEOINT/multi-INT data shapes.
What We Offer
- Modern data stack with real constraints: Kafka + NiFi + lakehouse + distributed SQL + Kubernetes.
- Remote-first in Germany with regular Berlin prototyping sprints, 30 days vacation, equipment & learning budget.
- High leverage: your prototypes become blueprints multiple teams reuse and productize.
Orcrist Technologies builds the Orcrist Intelligence Platform, a Kubernetes-based data intelligence system available as B2B SaaS or self-hosted. Serving defense, law enforcement, and enterprise teams, we help clients turn complex data into actionable insights for decision-making in challenging environments.