The Role
We’re looking for a senior Python Developer, fully remote (EU only), to build a modular, extensible Python library that automates common data transformation tasks and streamlines data preparation workflows. The goal is to reduce the time and effort data scientists spend on preprocessing for AI/ML applications in the Earth Observation space.
You’ll work on high-performance scientific software and complex data processing systems that transform large-scale satellite and meteorological datasets into high-quality outputs that can be used as inputs for AI/ML models.
We move quickly, trust each other to deliver, and give you space to own your work from day one.
In this role, you will:
- Build Python-based workflows to access, transform, and analyse large-scale geospatial and meteorological datasets for AI/ML applications
- Work directly with cloud-native formats like Zarr and COG, and parallel computing tools like Dask
- Develop workflows that integrate multiple satellite and model datasets, handling varying spatial and temporal resolutions and performing spatial and temporal harmonisation
- Implement automated CI/CD pipelines using GitLab to manage quality checks, tests, and deployment
- Carry out performance benchmarking, scalability testing, and code optimisation to ensure efficient processing of multi-terabyte datasets
- Write clear documentation with ReadTheDocs and create user-facing usage examples as Jupyter Notebooks
- Collaborate closely with other developers and domain experts to make sure outputs are scientifically robust
- Contribute to internal LLM tooling and automation to improve efficiency
Skills and Experience
Essential:
- Strong Python development skills, particularly for data processing and scientific analysis
- Experience with cloud-native data formats (Zarr, COG) and parallel processing (Dask)
- Confidence working with large Earth Observation datasets from satellites and models
- Familiarity with containerisation (Docker) and deployment in Linux environments
- Fluency with Git, GitLab, and GitLab Pipelines